Optimize Excel read memory usage #409
Merged: CodeCasterX merged 9 commits into ModelEngine-Group:1.2.x from jsbjfkbsjk:excel-fast-optimization on Sep 16, 2025
loveTsong reviewed Aug 28, 2025
CodeCasterX reviewed Aug 28, 2025
...s/aipp-plugin/src/main/java/modelengine/fit/jober/aipp/service/impl/OperatorServiceImpl.java (outdated)
loveTsong reviewed Aug 29, 2025
...s/aipp-plugin/src/main/java/modelengine/fit/jober/aipp/service/impl/OperatorServiceImpl.java (outdated)
10eae4c to 381b95c
CodeCasterX reviewed Sep 6, 2025
...ract-service/src/main/java/modelengine/fit/jade/aipp/file/extract/AbstractFileExtractor.java (outdated)
CodeCasterX reviewed Sep 7, 2025
...ract-service/src/main/java/modelengine/fit/jade/aipp/file/extract/AbstractFileExtractor.java (outdated)
...ract-service/src/main/java/modelengine/fit/jade/aipp/file/extract/AbstractFileExtractor.java
...ract-service/src/main/java/modelengine/fit/jade/aipp/file/extract/AbstractFileExtractor.java (outdated)
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
app-builder/plugins/aipp-file-extract-excel/src/test/resources/file/content.xlsx (outdated)
...lugins/aipp-plugin/src/main/java/modelengine/fit/jober/aipp/tool/FileExtractorContainer.java (outdated)
CodeCasterX reviewed Sep 8, 2025
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
app-builder/plugins/aipp-file-extract-excel/src/test/resources/file/content.csv
...lugins/aipp-plugin/src/main/java/modelengine/fit/jober/aipp/tool/FileExtractorContainer.java (outdated)
CodeCasterX reviewed Sep 11, 2025
...file-extract-service/src/main/java/modelengine/fit/jade/aipp/file/extract/FileExtractor.java
...lugins/aipp-plugin/src/main/java/modelengine/fit/jober/aipp/tool/FileExtractorContainer.java (outdated)
...lugins/aipp-plugin/src/main/java/modelengine/fit/jober/aipp/tool/FileExtractorContainer.java
...lugins/aipp-plugin/src/main/java/modelengine/fit/jober/aipp/tool/FileExtractorContainer.java (outdated)
...tract-excel/src/test/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractorTest.java (outdated)
...tract-excel/src/test/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractorTest.java (outdated)
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
...e-extract-excel/src/main/java/modelengine/fit/jade/aipp/file/extract/ExcelFileExtractor.java (outdated)
CodeCasterX approved these changes Sep 15, 2025
reeeborn33 approved these changes Sep 16, 2025
RonnyChan96 approved these changes Sep 16, 2025
RonnyChan96 (Contributor) left a comment: ok
loveTsong approved these changes Sep 16, 2025

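🔗 Related Issue
Issue Link: #336
📋 Type of Change
📝 Purpose of the Change
The original extraction approach parses the entire Excel file into memory, which sharply increases memory usage for complex Excel files. Optimizing file extraction therefore makes the application run more smoothly.
📋 Brief Changelog
Introduced the fast excel package to replace org.apache.poi.xssf.usermodel.XSSFWorkbook;
Switched to streaming reads, so the Excel file no longer has to be parsed into memory first.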
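To illustrate the difference between streaming and whole-file parsing, here is a minimal sketch that uses only the JDK's StAX pull parser on worksheet XML. It is not the fast excel API and not the code in this PR; the class name StreamingSheetSketch and the method forEachRow are hypothetical. It only shows the general idea: rows are handed to a callback one at a time, so at most one row is held in memory.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper (not part of this PR): streams rows out of worksheet XML. */
final class StreamingSheetSketch {
    /**
     * Pulls <row> elements from the stream and passes the raw <v> text of each
     * cell to the consumer. Only the current row is kept in memory.
     * Simplification: in a real .xlsx, <v> may hold a shared-string index;
     * resolving it is omitted here.
     *
     * @return the number of rows delivered to the consumer
     */
    static int forEachRow(InputStream sheetXml, Consumer<List<String>> rowConsumer) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(sheetXml);
            int rows = 0;
            List<String> cells = null;   // cells of the row currently being read
            StringBuilder value = null;  // text of the <v> currently being read
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("row".equals(name)) {
                        cells = new ArrayList<>();
                    } else if ("v".equals(name) && cells != null) {
                        value = new StringBuilder();
                    }
                } else if ((event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) && value != null) {
                    value.append(reader.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("v".equals(name) && value != null) {
                        cells.add(value.toString());
                        value = null;
                    } else if ("row".equals(name) && cells != null) {
                        rowConsumer.accept(cells); // hand the row off, then forget it
                        cells = null;
                        rows++;
                    }
                }
            }
            reader.close();
            return rows;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed sheet XML", e);
        }
    }
}
```

By contrast, constructing an XSSFWorkbook builds an object model of the whole workbook before any row can be read, which is the memory cost this change aims to avoid.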
🧪 Verifying this Change
Test Steps
Test Coverage
📸 Screenshots
Text-only conversation:
Memory usage:
Text + file conversation (an 8 MB Excel file with 450,000 entries; only 1,367 rows were read):
Final steady-state memory usage:
Because the backend logic caps the maximum token count at 20,000, the content is truncated when the file is large, so the AI can read only part of the first sheet. When the file is small, the AI can recognize all of the content.
✅ Contributor Checklist
Please ensure your Pull Request meets the following requirements:
Basic Requirements:
Code Quality:
Testing Requirements:
Basic checks pass: mvn -B clean package -Dmaven.test.skip=true, npm install --force && npm run build:pro
Unit tests pass: mvn clean install
Documentation and Compatibility:
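📋 Additional Notes
Next optimization: because the original logic was not streaming, the content is truncated only after it has been read in full. With fast excel, the maximum-token limit can be passed into the extraction function so that reading stops early once the content reaches the limit.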
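The early-termination idea can be sketched as follows, assuming the rows arrive as a lazy iterator. This is not the project's extractor: TokenBudgetSketch, extract, and estimateTokens are hypothetical names, and the "one token per four characters" estimate is a placeholder for whatever tokenizer the backend actually uses.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch (not part of this PR): stop reading rows once a token budget is spent. */
final class TokenBudgetSketch {
    /** Placeholder estimate: one token per 4 characters, rounded up. */
    static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    /**
     * Pulls rows from a lazy source and stops as soon as the next row would
     * exceed maxTokens, so the remaining rows are never read.
     */
    static String extract(Iterator<List<String>> rows, int maxTokens) {
        StringBuilder out = new StringBuilder();
        int used = 0;
        while (rows.hasNext()) {
            String line = String.join("\t", rows.next());
            int cost = estimateTokens(line);
            if (used + cost > maxTokens) {
                break; // budget reached: stop pulling rows from the source
            }
            out.append(line).append('\n');
            used += cost;
        }
        return out.toString();
    }
}
```

With a streaming reader behind the iterator, a call such as extract(rows, 20000) would read only as many rows as fit in the budget, plus the one row that overflows it, instead of reading the whole file and truncating afterwards.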
Reviewer Notes:
Drawbacks of fast excel: