Skip to content

[GLUTEN-9205][CH] Support deletion vector native write#9248

Merged
loneylee merged 9 commits into
apache:mainfrom
loneylee:9205
Apr 9, 2025
Merged

[GLUTEN-9205][CH] Support deletion vector native write#9248
loneylee merged 9 commits into
apache:mainfrom
loneylee:9205

Conversation

@loneylee

@loneylee loneylee commented Apr 7, 2025

Copy link
Copy Markdown
Member

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

(Fixes: #9205)

How was this patch tested?

test by ut

@github-actions

github-actions Bot commented Apr 7, 2025

Copy link
Copy Markdown
@github-actions

github-actions Bot commented Apr 7, 2025

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@loneylee

loneylee commented Apr 7, 2025

Copy link
Copy Markdown
Member Author

@CodiumAI-Agent /improve

@QodoAI-Agent

Copy link
Copy Markdown

PR Code Suggestions ✨

No code suggestions found for the PR.

@loneylee loneylee requested a review from Copilot April 7, 2025 13:02

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 11 out of 15 changed files in this pull request and generated no comments.

Files not reviewed (4)
  • backends-clickhouse/src-delta-32/main/scala/org/apache/gluten/sql/shims/delta32/Delta32Shims.scala: Language not supported
  • backends-clickhouse/src-delta-32/main/scala/org/apache/spark/sql/execution/DeletionVectorWriteTransformer.scala: Language not supported
  • backends-clickhouse/src-delta-32/test/scala/org/apache/spark/gluten/delta/GlutenDeltaParquetDeletionVectorSuite.scala: Language not supported
  • backends-clickhouse/src-delta/main/scala/org/apache/gluten/component/CHDeltaComponent.scala: Language not supported
Comments suppressed due to low confidence (1)

cpp-ch/local-engine/Storages/SubstraitSource/Delta/DeltaWriter.cpp:95

  • Ensure that 'get64(0)' is intended to always fetch the first row's value; if each row should have its own cardinality, consider using the current row index (e.g., get64(row_idx)).
auto cardinality = cardinality_src_columns.column->get64(0); // alisa deletedRowIndexCount
@github-actions

github-actions Bot commented Apr 8, 2025

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@QodoAI-Agent

Copy link
Copy Markdown

PR Code Suggestions ✨

No code suggestions found for the PR.

@github-actions

github-actions Bot commented Apr 8, 2025

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scala class is copied from the delta source, please ref to the object VacuumCommand, add some comments for the contents of the changes which is different from the source, and add the comments // --- modified start and // --- modified end to specify the changes.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call the java api from the native to generate the uuid ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, comment has been added

@github-actions

github-actions Bot commented Apr 8, 2025

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 11 out of 16 changed files in this pull request and generated 1 comment.

Files not reviewed (5)
  • backends-clickhouse/src-delta-32/main/scala/org/apache/gluten/sql/shims/delta32/Delta32Shims.scala: Language not supported
  • backends-clickhouse/src-delta-32/main/scala/org/apache/spark/sql/execution/DeletionVectorWriteTransformer.scala: Language not supported
  • backends-clickhouse/src-delta-32/test/scala/org/apache/spark/gluten/delta/GlutenDeltaParquetDeletionVectorSuite.scala: Language not supported
  • backends-clickhouse/src-delta/main/scala/org/apache/gluten/component/CHDeltaComponent.scala: Language not supported
  • backends-clickhouse/src-delta/main/scala/org/apache/gluten/sql/shims/DeltaShims.scala: Language not supported
Comments suppressed due to low confidence (1)

cpp-ch/local-engine/Storages/SubstraitSource/Delta/DeltaWriter.cpp:247

  • Verify that passing a std::byte value to writeIntBinary is intended; an explicit cast to an integer type might be required to ensure correct binary output.
DB::writeIntBinary(DV_FILE_FORMAT_VERSION_ID_V1, *write_buffer);
Comment on lines +95 to +96
auto cardinality = cardinality_src_columns.column->get64(row_idx); // alisa deletedRowIndexCount

Copilot AI Apr 8, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider correcting the comment from 'alisa' to 'alias' for clarity.

Suggested change
auto cardinality = cardinality_src_columns.column->get64(row_idx); // alisa deletedRowIndexCount
auto cardinality = cardinality_src_columns.column->get64(row_idx); // alias deletedRowIndexCount
Copilot uses AI. Check for mistakes.

@zzcclp zzcclp left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@loneylee loneylee merged commit a4b230b into apache:main Apr 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

4 participants