-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add DecompressInto for decompression of payloads with known sizes #130
Conversation
@Viq111 — Please let me know your thoughts. Happy to rework the interface as needed. |
@Viq111 — Is this something that you might be willing to merge upstream, or should I fork? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your contribution!
Sorry for the delay, the end of quarter is always a bit busier.
After the comments are resolved, the PR lgtm!
@@ -161,3 +161,18 @@ func Decompress(dst, src []byte) ([]byte, error) { | |||
defer r.Close() | |||
return ioutil.ReadAll(r) | |||
} | |||
|
|||
// DecompressInto decompresses src into dst. Unlike Decompress, DecompressInto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you have Decompress
calls DecompressInto
at l.146 so we don't have to repeat the call to C.ZSTD_decompress
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
zstd.go
Outdated
// payload before attempting decompression. | ||
// | ||
// If dst is too small, DecompressInto returns an error. | ||
func DecompressInto(dst, src []byte) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to make it a bit closer to golang stdlib, I'd probably have the method signature also returns the nb of bytes written (DecompressInto(dst, src []byte) (int, error)
like https://pkg.go.dev/io#Copy). You can always ignore it for your use-case but I think it will usage a bit more wide at no extra cost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, done.
In some systems, compressed payloads are prefixed or tagged with the size of the original payload. This allows allocation of a perfectly-sized buffer to hold the decompressed payload. The current Decompress interface prohibits this type of usage, because Decompress allocates if the destination buffer is smaller than the worst-case bound. In practice, a perfectly-sized buffer will always be smaller than what's required for the worst-case payload. This commit introduces an additional DecompressInto entrypoint. DecompressInto accepts a destination buffer and requires that the decompressed payload fit into the buffer. If it does not, it returns the original error returned by zstd.
No worries! Thanks for reviewing :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No change on benchmarks:
goos: darwin
goarch: arm64
pkg: github.com/DataDog/zstd
│ old │ new │
│ sec/op │ sec/op vs base │
Compression-10 66.37m ± ∞ ¹ 67.24m ± ∞ ¹ ~ (p=0.421 n=5)
Decompression-10 9.424m ± ∞ ¹ 9.336m ± ∞ ¹ ~ (p=0.222 n=5)
geomean 1.901m 1.916m +0.84%
¹ need >= 6 samples for confidence interval at level 0.95
│ old │ new │
│ B/s │ B/s vs base │
Compression-10 143.3Mi ± ∞ ¹ 141.4Mi ± ∞ ¹ ~ (p=0.421 n=5)
Decompression-10 1008.9Mi ± ∞ ¹ 1018.5Mi ± ∞ ¹ ~ (p=0.222 n=5)
geomean 259.7Mi 257.6Mi -0.83%
¹ need >= 6 samples for confidence interval at level 0.95
@ribasushi — It wasn't intentional on my side; my project does not use the zstd context API and I only had passing awareness of its existence. |
In some systems, compressed payloads are prefixed or tagged with the size of the original payload. This allows allocation of a perfectly-sized buffer to hold the decompressed payload. The current Decompress interface prohibits this type of usage, because Decompress allocates if the destination buffer is smaller than the worst-case bound. In practice, a perfectly-sized buffer will always be smaller than what's required for the worst-case payload.
This commit introduces an additional DecompressInto entrypoint. DecompressInto accepts a destination buffer and requires that the decompressed payload fit into the buffer. If it does not, it returns the original error returned by zstd.