While testing a pebble-backed on-disk queue implementation (using sha 454f971) for vectorized external storage, I ran into some surprising performance characteristics when changing the capacity of a `pebbleMapBatchWriter` (i.e. how many k/vs are `Set` before calling `Flush` on a batch). The options for `Pebble` are defined in `NewTempEngine`:
https://github.com/cockroachdb/cockroach/blob/6e1539ac1488f8596376cc3ac32c9b0b334600a5/pkg/storage/engine/temp_engine.go#L117

The benchmark is a single-threaded writer that writes 512MiB of data and then reads all that data back. As I increase the buffer size, runtime stays around ~3s, except that a batch size of 32MiB makes it suddenly spike to >10s; it is ~3s on either side of that value.

The benchmark is `BenchmarkQueues` on my branch here: https://github.com/asubiotto/cockroach/commit/470481215325e20733a516bdeaa866f0d13ede56#diff-7599d388fb764b9163afbc7b01484ff1R124

This issue can be reproduced by running:

```
make bench PKG=./pkg/col/colserde BENCHES=BenchmarkQueue TESTFLAGS="-v -cpuprofile cpu.out -memprofile mem.out -store=pebble -bufsize=32MiB -blocksize=512KiB -datasize=512MiB"
```

and

```
make bench PKG=./pkg/col/colserde BENCHES=BenchmarkQueue TESTFLAGS="-v -cpuprofile cpu16.out -memprofile mem16.out -store=pebble -bufsize=16MiB -blocksize=512KiB -datasize=512MiB"
```

The value size (`-blocksize`) is 512KiB in this benchmark.

The only difference I see in the CPU profiles is a much larger percentage of time spent in `memclrNoHeapPointers`, possibly pointing to higher GC pressure in the 32MiB case. The allocation profiles look largely similar, although `*Batch.grow` stands out in both; it seems like we should be able to reuse memory there.

It's possible that this is a problem with the code I wrote, but given that the only variable that changes between benchmarks is the size of a buffered batch, that seems unlikely.
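For context, the write path being benchmarked boils down to buffering k/v pairs into a Pebble batch and committing once the batch reaches the configured buffer size (`-bufsize`). The sketch below is only a minimal illustration of that pattern, not the actual `pebbleMapBatchWriter` code; the `bufferedWriter` type and its constructor are made-up names and the threshold check is simplified:

```go
package main

import "github.com/cockroachdb/pebble"

// bufferedWriter is an illustrative stand-in for the batch writer described
// above: it Sets k/v pairs into a pebble.Batch and commits once the batch
// grows past bufSize bytes.
type bufferedWriter struct {
	db      *pebble.DB
	batch   *pebble.Batch
	bufSize int // e.g. 32 << 20 for the problematic 32MiB case
}

func newBufferedWriter(db *pebble.DB, bufSize int) *bufferedWriter {
	return &bufferedWriter{db: db, batch: db.NewBatch(), bufSize: bufSize}
}

// Set buffers a key/value pair and flushes once the batch exceeds bufSize.
func (w *bufferedWriter) Set(k, v []byte) error {
	if err := w.batch.Set(k, v, nil); err != nil {
		return err
	}
	if w.batch.Len() >= w.bufSize {
		return w.Flush()
	}
	return nil
}

// Flush commits the buffered writes and resets the batch so its backing
// memory is reused rather than reallocated on the next fill (the kind of
// reuse that would avoid the *Batch.grow allocations noted above).
func (w *bufferedWriter) Flush() error {
	if err := w.batch.Commit(pebble.NoSync); err != nil {
		return err
	}
	w.batch.Reset()
	return nil
}
```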
I'm not sure why there is such a big discrepancy at 32MB, though that buffer size does correspond to the large batch threshold (1/2 of the memtable size). If we set the buffer size to 31MB performance is good. If we set the memtable size to 128MB performance is good. The only difference I see is that when using a large batch we cycle through empty memtables which puts slightly more pressure on the GC. Why this causes such a dramatic slowdown is unclear.
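For reference, the memtable size mentioned here is a `pebble.Options` knob, so the experiment above amounts to moving the write buffer to the other side of the half-memtable threshold (or moving the threshold itself). A rough sketch of where that option is set, assuming a 64MiB memtable; the helper name and values are illustrative, not what `NewTempEngine` actually configures:

```go
package main

import "github.com/cockroachdb/pebble"

// openTempEngine is a hypothetical helper showing where MemTableSize lives.
// With a 64MiB memtable, a >=32MiB batch crosses the large-batch threshold
// (1/2 of the memtable size) described in the comment above; raising the
// memtable to 128MiB, or keeping the write buffer at 31MiB, stays under it.
func openTempEngine(dir string) (*pebble.DB, error) {
	opts := &pebble.Options{
		MemTableSize: 64 << 20, // illustrative value
	}
	return pebble.Open(dir, opts)
}
```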
With the move to manual memory management for the Cache and memtable (#523, #527, #529), this is likely no longer a problem. I attempted to verify that, but the BenchmarkDiskQueue benchmark no longer has support for Pebble.