Add RVV support for RISC-V and optimize decompression speed with an enhanced Memcopy64 function #212
…peed For RISC-V[skip ci]
…into add_rvv_support
Hi @danilak-G , |
@pwnall @haney @danilak-G , would you pls help to review? Thanks. |
Hi, thanks for the contribution. We will take a look. At a first glance, we don't like |
Thanks a lot for the quick feedback! I moved the #if SNAPPY_HAVE_RVV check to the same level as #if defined(__x86_64__) && defined(__AVX__),
Hey, if anyone has a minute, I'd love a review of my code. We're working to improve Snappy's performance on RISC-V—feedback appreciated! |
Hi, the feedback is still the same. There are multiple #if preprocessor directives in the middle of the decompression loops. This is not bad per se, but it requires quite rigorous review. The timeline is still unknown from our side; snappy is mostly in maintenance mode for now.
This PR adds RVV support for RISC-V and optimizes the Memcopy64 function, improving compression speed by ~49.5%.
Optimize Snappy 1.2.2 performance
Added RVV support for RISC-V in Snappy, optimizing Memcopy64 by leveraging RVV vector load/store intrinsics (e.g., vle8_v_u8m1, vse8_v_u8m1) to reduce memory copy overhead and improve decompression performance. lzbench 2.1 tests on silesia.tar (GCC 13.2.1, 64-bit Linux) show:
Test Parameters:
Notes:
compatibility.
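For readers unfamiliar with RVV, the kind of vector load/store copy described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the helper name Copy64Sketch is hypothetical, the source and destination are assumed not to overlap, and the `__riscv_`-prefixed intrinsic spellings assume a toolchain that follows the current RVV intrinsics naming. On targets without the vector extension, the sketch falls back to std::memcpy.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

// Hypothetical helper (not the PR's actual Memcopy64): copies exactly
// 64 bytes from src to dst. The two regions are assumed not to overlap.
inline void Copy64Sketch(char* dst, const char* src) {
#if defined(__riscv_vector)
  const uint8_t* s = reinterpret_cast<const uint8_t*>(src);
  uint8_t* d = reinterpret_cast<uint8_t*>(dst);
  size_t remaining = 64;
  // Strip-mined loop: the hardware reports how many bytes (vl) it can
  // handle per iteration, so the same code works for any vector length.
  while (remaining > 0) {
    size_t vl = __riscv_vsetvl_e8m1(remaining);
    vuint8m1_t v = __riscv_vle8_v_u8m1(s, vl);  // vector load of vl bytes
    __riscv_vse8_v_u8m1(d, v, vl);              // vector store of vl bytes
    s += vl;
    d += vl;
    remaining -= vl;
  }
#else
  // Portable fallback for targets without the RISC-V vector extension.
  std::memcpy(dst, src, 64);
#endif
}
```

Keeping the architecture check inside one small helper like this leaves the calling loops free of preprocessor branches.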
Snappy 1.2.2 unittest ([ PASSED ] 21 tests.)
[==========] Running 21 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CorruptedTest
[ RUN ] CorruptedTest.VerifyCorrupted
Crazy decompression lengths not checked on 64-bit build
[ OK ] CorruptedTest.VerifyCorrupted (14 ms)
[----------] 1 test from CorruptedTest (14 ms total)
[----------] 17 tests from Snappy
[ RUN ] Snappy.SimpleTests
[ OK ] Snappy.SimpleTests (22 ms)
[ RUN ] Snappy.AppendSelfPatternExtensionEdgeCases
[ OK ] Snappy.AppendSelfPatternExtensionEdgeCases (3 ms)
[ RUN ] Snappy.AppendSelfPatternExtensionEdgeCasesExhaustive
[ OK ] Snappy.AppendSelfPatternExtensionEdgeCasesExhaustive (4706 ms)
[ RUN ] Snappy.MaxBlowup
[ OK ] Snappy.MaxBlowup (10 ms)
[ DISABLED ] Snappy.DISABLED_MoreThan4GB
[ RUN ] Snappy.RandomData
[ OK ] Snappy.RandomData (20182 ms)
[ RUN ] Snappy.FourByteOffset
[ OK ] Snappy.FourByteOffset (1 ms)
[ RUN ] Snappy.IOVecSourceEdgeCases
[ OK ] Snappy.IOVecSourceEdgeCases (0 ms)
[ RUN ] Snappy.IOVecSinkEdgeCases
[ OK ] Snappy.IOVecSinkEdgeCases (0 ms)
[ RUN ] Snappy.IOVecLiteralOverflow
[ OK ] Snappy.IOVecLiteralOverflow (0 ms)
[ RUN ] Snappy.IOVecCopyOverflow
[ OK ] Snappy.IOVecCopyOverflow (0 ms)
[ RUN ] Snappy.ReadPastEndOfBuffer
[ OK ] Snappy.ReadPastEndOfBuffer (0 ms)
[ RUN ] Snappy.ZeroOffsetCopy
[ OK ] Snappy.ZeroOffsetCopy (0 ms)
[ RUN ] Snappy.ZeroOffsetCopyValidation
[ OK ] Snappy.ZeroOffsetCopyValidation (0 ms)
[ RUN ] Snappy.FindMatchLength
[ OK ] Snappy.FindMatchLength (0 ms)
[ RUN ] Snappy.FindMatchLengthRandom
[ OK ] Snappy.FindMatchLengthRandom (1089 ms)
[ RUN ] Snappy.VerifyCharTable
[ OK ] Snappy.VerifyCharTable (0 ms)
[ RUN ] Snappy.TestBenchmarkFiles
[ OK ] Snappy.TestBenchmarkFiles (409 ms)
[----------] 17 tests from Snappy (26426 ms total)
[----------] 3 tests from SnappyCorruption
[ RUN ] SnappyCorruption.TruncatedVarint
[ OK ] SnappyCorruption.TruncatedVarint (0 ms)
[ RUN ] SnappyCorruption.UnterminatedVarint
[ OK ] SnappyCorruption.UnterminatedVarint (0 ms)
[ RUN ] SnappyCorruption.OverflowingVarint
[ OK ] SnappyCorruption.OverflowingVarint (0 ms)
[----------] 3 tests from SnappyCorruption (0 ms total)
[----------] Global test environment tear-down
[==========] 21 tests from 3 test suites ran. (26441 ms total)
YOU HAVE 1 DISABLED TEST
[ PASSED ] 21 tests.