Make get_tree_size robust against git bombs #7766
Open
jorendorff wants to merge 4 commits into github-linguist:main from
For pathological repositories like git bombs with deeply nested tree structures, the previous implementation could hang indefinitely even with a file limit, because it had to visit every tree object to discover the blobs. This change adds a separate tree count that also triggers the limit, ensuring the method terminates promptly regardless of repository structure. The return value is still the blob count for normal repos, preserving existing behavior.
This reimplements

    def get_tree_size(commit_id, limit)
      tree_count = 0
      blob_count = 0
      get_tree(commit_id).walk(:preorder) do |root, entry|
        if blob_count >= limit || tree_count >= limit
          blob_count = limit # If we have too many trees we return the limit
          raise StopIteration
        end
        case entry[:type]
        when :blob
          blob_count += 1
        when :tree
          tree_count += 1
        end
        true # go into a tree if that's what we were given
      end
      blob_count
    rescue StopIteration
      blob_count
    end
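To see why the extra tree counter matters, here is a minimal, self-contained sketch that does not use Rugged at all. The `Tree` struct, `build_bomb`, and `bounded_blob_count` are hypothetical names invented for this illustration; they only mimic the shape of the walk above on an in-memory structure, under the assumption that a bomb is a chain of trees where each one lists the next tree twice.

```ruby
# Hypothetical in-memory stand-in for git objects; this is not Rugged's API.
# A Tree holds entries, each of which is either another Tree or the symbol :blob.
Tree = Struct.new(:entries)

# Build a "bomb": `depth` unique tree objects, each listing the next one twice.
# Only the innermost tree contains a blob.
def build_bomb(depth)
  node = Tree.new([:blob])
  (depth - 1).times { node = Tree.new([node, node]) }
  node
end

# Count blobs below `root`, giving up as soon as either the number of blobs
# or the number of trees visited reaches `limit`.
def bounded_blob_count(root, limit)
  blob_count = 0
  tree_count = 0
  stack = root.entries.dup # the root itself is not counted as an entry
  until stack.empty?
    # Same rule as in the PR: too many trees also returns the limit.
    return limit if blob_count >= limit || tree_count >= limit

    entry = stack.pop
    if entry.is_a?(Tree)
      tree_count += 1
      stack.concat(entry.entries)
    else
      blob_count += 1
    end
  end
  blob_count
end

small = Tree.new([:blob, :blob, Tree.new([:blob])])
puts bounded_blob_count(small, 1000)          # ordinary tree: real blob count
puts bounded_blob_count(build_bomb(40), 1000) # bomb: stops at the limit
```

Without the `tree_count` check, the second call would still stop once it had found 1000 blobs in this toy layout, but a bomb whose trees contain few or no blobs would force a walk over every expanded tree. Capping trees as well bounds the work by the limit in both cases.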
The previous test created 1000 git trees and took 11s. The new test creates 32 in under 0.5s. It now builds an actual git bomb: if you cloned the repo, you'd get 2^32-1 directories.
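The 2^32-1 figure can be checked with a little arithmetic. The sketch below assumes a layout the PR text does not spell out: 32 unique tree objects, each listing the next one under two different names, with the root counted as a directory. `expanded_tree_count` is a hypothetical helper written for this illustration.

```ruby
# Assumed layout (not confirmed by the PR): `levels` unique tree objects,
# where each tree lists the next one under `fanout` different names.
def expanded_tree_count(levels, fanout)
  # The tree at level k (0-based, root is level 0) appears fanout**k times
  # once the hierarchy is fully expanded on disk.
  (0...levels).sum { |k| fanout**k }
end

puts expanded_tree_count(32, 2) == 2**32 - 1
```

Under that assumption, 32 stored objects expand to 1 + 2 + 4 + ... + 2^31 directories, which is why the test can be both small to create and pathological to walk.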
I ended up with this. This is now ready for review.
Description
For pathological repositories like git bombs with deeply nested tree structures, Rugged's count_recursive method can hang indefinitely even with a file limit, because it has to visit every tree object to find and count the blobs. See libgit2/rugged#995.

That causes RuggedRepository#get_tree_size to hang, so git-linguist stats can hang.

This PR works around that issue by using a custom count method that also applies a tree limit, so that it terminates promptly regardless of repository structure. The return value is still the blob count for normal repos, preserving existing behavior.
Checklist: