Fix: Partition Core Tables #2227
base: main
Conversation
Pull Request Overview
This PR implements table partitioning for the core lookup table (v1_lookup_table) to improve performance and enable efficient data management. The migration follows a standard pattern: create a new partitioned table, backfill data, and cut over to the new structure.
Key changes:
- Modifies the primary key to include inserted_at for partition compatibility
- Creates a partitioned version of the lookup table with weekly range partitions
- Updates trigger functions to work with the new composite primary key
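For illustration, a minimal sketch of what this pattern looks like in Postgres. The `_new` suffix and the `tenant_id` column are assumptions for the example, not taken from the migration:

```sql
-- Sketch only: the real migration's column set and naming may differ.
-- The partition key (inserted_at) must be part of the primary key,
-- which is why the PK becomes (external_id, inserted_at).
CREATE TABLE v1_lookup_table_new (
    external_id UUID        NOT NULL,
    inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    tenant_id   UUID        NOT NULL,  -- assumed column for the example
    PRIMARY KEY (external_id, inserted_at)
) PARTITION BY RANGE (inserted_at);

-- One weekly range partition; the migration generates one of these per week.
CREATE TABLE v1_lookup_table_20250901
    PARTITION OF v1_lookup_table_new
    FOR VALUES FROM ('2025-09-01') TO ('2025-09-08');
```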
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sql/schema/v1-core.sql | Updates table schema and trigger functions to use composite primary key (external_id, inserted_at) |
| cmd/hatchet-migrate/migrate/migrations/20250829173445_v1_0_41.sql | Implements the complete migration with partitioned table creation, data backfill, and cutover logic |
cmd/hatchet-migrate/migrate/migrations/20250829173445_v1_0_41.sql
```sql
SELECT '19700101' INTO startDateStr;
SELECT TO_CHAR(date_trunc('week', (NOW() - INTERVAL '8 days')::DATE), 'YYYYMMDD') INTO endDateStr;
SELECT LOWER(FORMAT('%s_%s', targetTableName, startDateStr)) INTO newTableName;
```
Just to confirm, anything older than 8 days goes into the same partition?
yeah this was my version of "from the beginning of time until last week" just to make sure we catch everything
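Read loosely (the CREATE statement itself isn't shown in this excerpt), the variables above produce a single catch-all partition named `<table>_19700101` that spans from the epoch up to the start of the week containing the date eight days ago, so everything older than roughly a week lands in one partition. Something like:

```sql
-- Approximate DDL the dynamic SQL would resolve to for v1_lookup_table;
-- the upper bound shown is just an example value of endDateStr.
CREATE TABLE v1_lookup_table_19700101
    PARTITION OF v1_lookup_table
    FOR VALUES FROM ('1970-01-01') TO ('2025-08-18');
```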
Force-pushed from 891fd61 to 7efbfc6
* fix: clean up migration
* feat: migration
Force-pushed from 7efbfc6 to 9abb6a3
* feat: initial up migration
* fix: add down
* fix: add partition for today
* fix: weekly partitioning
* fix: add partitioning logic for job
* Revert "fix: weekly partitioning". This reverts commit 5fca616.
* fix: date wrangling
* fix: one more migration fix
* fix: partitions
* fix: ranges
* debug: table name
* fix: partition naming
* feat: add analyze
* feat: add more `ANALYZE`
* fix: run analyze more often
* fix: rm analyze
* chore: gen
* fix: ugly sql > merge conflicts
* fix: migration version
* fix: run analyze in migration
* chore: gen
* feat: partitioning migration for v1_dag_data
* feat: partition wiring
* fix: pk name
* chore: lint
This reverts commit c9f96ab.
@abelanger5 I need to do another self review here but I think these should be good to take a look through at this point at least
Description
Adding three migrations for partitioning existing core tables (v1_dag_data, v1_dag_to_task, and v1_lookup_table). For the lookup table, the steps are:

1. Create a new partitioned copy of the table
2. Backfill the data from the existing table
3. Cut over to the new partitioned table
It's a bit different than with Timescale since we need to create the partitions manually. I was thinking it'd make sense to just have one big partition for all the old data, which will all get dropped in one shot after a week or so, but we could also loop through and create many. Definitely makes me nervous to need the three steps in this one since it's really not easily rerunnable / reversible and it's in the OSS
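For reference, a rough sketch of the shape of steps 2 and 3, assuming the cutover is a transactional rename and the partitioned copy uses a `_new` suffix (the actual migration may do this differently, and the column list is illustrative):

```sql
BEGIN;

-- 2. Backfill the partitioned copy from the existing table.
INSERT INTO v1_lookup_table_new (external_id, inserted_at, tenant_id)
SELECT external_id, inserted_at, tenant_id
FROM v1_lookup_table;

-- 3. Cut over by swapping the table names.
ALTER TABLE v1_lookup_table     RENAME TO v1_lookup_table_old;
ALTER TABLE v1_lookup_table_new RENAME TO v1_lookup_table;

COMMIT;
```

If something fails partway through, the backfill and rename can't just be re-run blindly, which is the rerunnability concern above.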
For the other two, we just attach the existing data as a partition to a new table, which should be pretty fast
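A hedged sketch of that attach-as-partition approach for one of them; the column definitions and bounds are guesses for illustration, not the real schema:

```sql
-- Keep the existing data by turning the current table into a partition
-- of a new partitioned parent, rather than copying rows.
ALTER TABLE v1_dag_data RENAME TO v1_dag_data_old;

CREATE TABLE v1_dag_data (
    dag_id      BIGINT      NOT NULL,  -- assumed columns for the example
    inserted_at TIMESTAMPTZ NOT NULL,
    data        JSONB,
    PRIMARY KEY (dag_id, inserted_at)
) PARTITION BY RANGE (inserted_at);

-- Adding a matching CHECK constraint on v1_dag_data_old beforehand lets
-- Postgres skip the validation scan, which keeps the attach fast.
ALTER TABLE v1_dag_data ATTACH PARTITION v1_dag_data_old
    FOR VALUES FROM ('1970-01-01') TO ('2025-09-01');
```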
Type of change