[In Progress] Fix bug: enable sparse weight setting in trainer_config_helper APIs #985
Closed
backyes wants to merge 1 commit into PaddlePaddle:develop from
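Potential bug, update (SHA1: 28c5010):
The last branch is wrong.
The cause: trainer_count=1 (more precisely, the case where the sparse updater should be turned off) enables the local updater, which does not support per-parameter optimization strategies, only global ones (similar to the L1 regularization issue). This also indirectly shows that sparse momentum cannot coexist with sgdLocalUpdater.
The crash occurs during multi-GPU gradient initialization,
because a matrix of type (matType == MAT_SPARSE_ROW_IDS) has to be initialized there. (Why this matrix does not exist in the single-GPU case is not yet clear.)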
This was referenced Dec 30, 2016
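Thank you for contributing code to PaddlePaddle. Paddle V1/V2 is no longer maintained and the related code has been removed from the develop branch, so we are closing your PR. You are welcome to contribute to Fluid, the latest version of Paddle.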
zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this pull request Sep 25, 2019
wangxicoding pushed a commit to wangxicoding/Paddle that referenced this pull request Dec 9, 2021
fix #948
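Why this PR was opened:
Bug-related:
Questions that need further analysis in order to complete sparse support in the new API:
Judging from the old icode git history (commit af92dcde6afc4454354089e47870c7ef38dfeda3), where does the nnz=4 configuration mentioned above come from?
Can the CSR sparse format be used by default? In theory, the sparse storage of a parameter weight is an internal format that has nothing to do with the data source, so exposing the old API's format parameter to users is not recommended. (There should be no real performance difference between CSR and CSC in computation? @reyoung, could you comment on this?)
In the old API, only FCLayer and SelectiveFCLayer support sparse weight configuration; no other layer does. So should this parameter exist as a general parameter attribute, or should it be implemented as a layer-specific attribute? (One of the original goals of the new API was to simplify what users have to understand, so we should follow that principle as far as possible.)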
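To make the CSR/CSC question concrete, here is a minimal pure-Python sketch that stores the same small weight matrix in both layouts. The helper names `dense_to_csr` and `dense_to_csc` and the example matrix are hypothetical and are not Paddle APIs; the sketch only illustrates that the two formats hold the same nonzero values and differ in whether rows or columns are compressed, and it says nothing about how Paddle stores sparse weights internally.

```python
def dense_to_csr(dense):
    """Compress by rows: nonzero values, their column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends inside `values`
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def dense_to_csc(dense):
    """Compress by columns: nonzero values, their row indices, column pointers."""
    n_rows = len(dense)
    n_cols = len(dense[0]) if dense else 0
    values, row_idx, col_ptr = [], [], [0]
    for j in range(n_cols):
        for i in range(n_rows):
            v = dense[i][j]
            if v != 0:
                values.append(v)
                row_idx.append(i)
        # col_ptr[j + 1] marks where column j ends inside `values`
        col_ptr.append(len(values))
    return values, row_idx, col_ptr


# Hypothetical 4x3 weight matrix with 4 nonzero entries.
weight = [
    [0.0, 1.5, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 4.0, 0.0],
]
csr = dense_to_csr(weight)
csc = dense_to_csc(weight)
```

Both results hold the same four values; only the traversal order and the index arrays differ, which is consistent with treating the format as an internal detail rather than a user-facing parameter.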
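Beyond fixing the API issue, there are some open questions:
In theory, making a parameter weight sparse generally cannot further reduce forward and backward computation time during training. Once the data is configured as sparse, forward should already be a sparse computation (whether backward is too still needs to be confirmed). With sparse update and sparse remote update also enabled, separately setting the parameter weight to sparse should bring no additional performance gain?
If the parameter weight is made sparse in order to produce a sparse model for inference, then it should be enough to store it sparsely at the save-model stage only, with no need to sparsify during training computation?
So what value does this setting actually provide?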
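The save-time alternative raised above can be sketched as follows: keep the weight dense during training and compress it only when writing the model. The functions `to_sparse_checkpoint` and `from_sparse_checkpoint`, the dictionary layout, and the `threshold` parameter are all hypothetical illustrations, not Paddle's save format or API.

```python
def to_sparse_checkpoint(dense, threshold=0.0):
    """Keep only entries with |v| > threshold, stored in a CSR-like dict."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if abs(v) > threshold:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    n_cols = len(dense[0]) if dense else 0
    return {
        "shape": (len(dense), n_cols),
        "values": values,
        "col_idx": col_idx,
        "row_ptr": row_ptr,
    }


def from_sparse_checkpoint(ckpt):
    """Rebuild the dense matrix, filling dropped entries with 0.0."""
    n_rows, n_cols = ckpt["shape"]
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(ckpt["row_ptr"][i], ckpt["row_ptr"][i + 1]):
            dense[i][ckpt["col_idx"][k]] = ckpt["values"][k]
    return dense


# Hypothetical dense weight as it might look after training.
trained = [
    [0.0, 0.7, 0.0],
    [0.0, 0.0, 0.0],
    [-1.2, 0.0, 0.3],
]
ckpt = to_sparse_checkpoint(trained)
restored = from_sparse_checkpoint(ckpt)
```

This only shows that a sparse model file can be produced without any sparse computation during training. It does not settle whether the training-time setting serves another purpose, such as keeping a fixed sparsity pattern, which is exactly the open question.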