-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[Kernel] Add CCv2 write and read support for benchmark spec and runner #5379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
72930ea to
4c205e2
Compare
a419409 to
824aa16
Compare
824aa16 to
69ae6fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few comments looks great!
| Option.apply(stagedCommit.getVersion()), // commitVersion | ||
| Option.apply(fileStatus.getSize()), // commitFileSize | ||
| Option.apply(fileStatus.getModificationTime()), // commitFileModTime | ||
| Option.apply(System.currentTimeMillis()), // commitTimestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar, is this supposed to match the ICT or just when committed to catalog?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm good point. For correctness this should probably be ICT, but that may become expensive to load each commit file to get the ICT 🤔
I'll put a note.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just add the timestamp to ccv2_info.json?
| String stagedCommitsDir = Paths.get(tableRoot, "_delta_log", "_staged_commits").toString(); | ||
|
|
||
| String commitUuid = UUID.randomUUID().toString(); | ||
| String stagedCommitFileName = String.format("%020d.%s.json", version, commitUuid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the methods in FileNames for this?
.../kernel-defaults/src/test/java/io/delta/kernel/defaults/benchmarks/WorkloadOutputFormat.java
Outdated
Show resolved
Hide resolved
| public CCv2Info() {} | ||
|
|
||
| /** | ||
| * Constructor for creating CCv2Info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any requirements on the ordering and such of logTail? Might be worth calling out here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StagedCommit contains its version, so it's fine for this to be unordered.r I'll put in a comment to clarify that.
Which Delta project/connector is this regarding?
Description
This PR adds support for Coordinated Commits v2 (CCv2) tables to the Delta Kernel benchmarking framework, enabling performance testing of tables using staged commits with Unity Catalog coordination.
Key changes:
table_info.jsonto indicate whether it is a CCv2 table or not.How was this patch tested?
basic_ccv2test table with complete CCv2 structure (backfilled and staged commits). Added read and write workload specs forbasic_ccv2table.Does this PR introduce any user-facing changes?
No.