From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: derrickstolee@github.com, jonathantanmy@google.com, gitster@pobox.com
Subject: [RFC PATCH 0/4] move pruned objects to a separate repository
Date: Wed, 29 Jun 2022 14:45:49 -0400
Message-ID: <cover.1656528343.git.me@ttaylorr.com>
Now that cruft packs are available in v2.37.0, here is an interesting
application of that new feature to enable a two-phase object pruning
approach.
This came out of a discussion within GitHub about ways we could support
storing a set of pruned objects in "limbo" so that they were not
accessible from the repository which pruned them, but instead stored in
a cruft pack in a separate repository which lists the original one as an
alternate.
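For readers unfamiliar with alternates, here is a minimal sketch of that relationship. It assumes a throwaway directory and the hypothetical names "repo" and "expired.git"; the series itself does not set up the alternates file, so this is only an illustration of how one repository can borrow objects from another.

```shell
# Work in a scratch directory so no real repository is touched.
tmp=$(mktemp -d)
cd "$tmp"

# "repo" stands in for the repository being pruned (hypothetical name).
git init -q repo
git init -q --bare expired.git

# Store one loose object in the original repository only.
oid=$(echo "hello" | git -C repo hash-object -w --stdin)

# List the original repository's object directory as an alternate of
# expired.git.
echo "$tmp/repo/.git/objects" >expired.git/objects/info/alternates

# The object now resolves through expired.git, although it is stored in repo.
type=$(git -C expired.git cat-file -t "$oid")
echo "$type"
```

With the alternate in place, "expired.git" only needs to store the objects that the original repository no longer has.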
This series makes it possible to take the collection of all pruned objects and
store them in a cruft pack in a separate repository. This repository
(which I have been referring to as the "expired.git") can then be used
as a donor repository for any missing objects (like the ones described
by the race in [1]).
The first few patches are preparatory. The final one implements writing
the pruned objects separately. The trick is to write another cruft pack
to a separate repository, with two tweaks:
  - the `--cruft-expiration` value is set to "never", since we want to
    keep around all of the objects we expired in the previous step, and

  - the original cruft pack appears as a pack that we are going to keep,
    meaning all unreachable objects that are stored in the original
    cruft pack are excluded from the one we write to the "expired.git"
    repository.
You can try this out yourself by doing something like:
    $ git init --bare ../expired.git
    $ git repack --cruft --cruft-expiration=1.day.ago -d \
        --expire-to=../expired.git/objects/pack/pack
which will create two cruft packs:
  - one in the repository which ran `git repack` containing all
    unreachable objects written within the last day, and

  - another in the "expired.git" repository which contains all
    unreachable objects written prior to the last day.
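One way to check the result is to look for the ".mtimes" file that accompanies each cruft pack. The helper below is a minimal sketch, not part of the series; the function name `list_cruft_packs` is made up for illustration, and it assumes pack files follow the usual "pack-*.pack" naming.

```shell
# Print the cruft packs found in the given pack directory. A cruft pack is
# identified here by the ".mtimes" file stored next to it.
list_cruft_packs () {
	for mtimes in "$1"/pack-*.mtimes
	do
		# Skip the unexpanded glob when the directory has no cruft pack.
		test -e "$mtimes" || continue
		echo "${mtimes%.mtimes}.pack"
	done
}
```

Run from a non-bare repository after the repack above, `list_cruft_packs .git/objects/pack` and `list_cruft_packs ../expired.git/objects/pack` should each print one pack, assuming both cruft packs ended up non-empty.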
This series is an RFC for now since I'm interested in discussing whether
this is a feature that people would actually want to use.
But if it is, I'm happy to polish this up and turn it into a
non-RFC-quality series ;-).
In the meantime, thanks for your review!
[1]: https://lore.kernel.org/git/YryF+vkosJOXf+mQ@nand.local/
Taylor Blau (4):
  builtin/repack.c: pass "out" to `prepare_pack_objects`
  builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack`
  builtin/repack.c: write cruft packs to arbitrary locations
  builtin/repack.c: implement `--expire-to` for storing pruned objects

 Documentation/git-repack.txt |   6 ++
 builtin/repack.c             |  67 ++++++++++++++++---
 t/t7700-repack.sh            | 121 +++++++++++++++++++++++++++++++++++
 3 files changed, 186 insertions(+), 8 deletions(-)
--
2.37.0.1.g1379af2e9d