row: joinreader has insufficient memory accounting #65904

jordanlewis · 2021-05-31T22:36:17Z

The joinReader, our index and lookup join processor, does not sufficiently account for much of the memory that it allocates and retains. When run in highly concurrent environments with memory pressure, this will cause the server to run out of memory due to the lack of backpressure.

For an easy example that can overwhelm a database in seconds, run TPCH query 4 with concurrency 1024. See #64906 for more background and other examples of this issue, including a sample setup that can reproduce this issue.

Here are some places where we do not account for memory sufficiently in joinReader.

All input rows are copied without memory accounting.

cockroach/pkg/sql/rowexec/joinreader.go

Line 578 in ff28a1c

jr.scratchInputRows = append(jr.scratchInputRows, jr.rowAlloc.CopyRow(row))
The spans generated by each input batch are not memory accounted. With some strategies like the index join strategy, there can be 10s of thousands of spans present in memory per join since the input batch is sized up to 4 megabytes of input data. And, the spans are sized in proportion to user data (since they are made of user-defined keys).

cockroach/pkg/sql/rowexec/joinreader_span_generator.go

Line 123 in bc465fb

return g.scratchSpans, nil
The span generators store maps that are unaccounted for. These need to be memory accounted, as they are sized in proportion to user data.

cockroach/pkg/sql/rowexec/joinreader_span_generator.go

Line 427 in bc465fb

g.keyToInputRowIndices[string(generatedSpan.Key)] = append(inputRowIndices, i)
The joinReaderOrderingStrategy uses an unaccounted-for multimap. It's just integers, but it grows in proportion to the batch size in rows.

cockroach/pkg/sql/rowexec/joinreader_strategies.go

Line 305 in bc465fb

inputRowIdxToLookedUpRowIndices [][]int

I think it's likely that there are more spots that are unaccounted for besides these, as well.

I took a cursory look at accounting for some of these issues. I found that it is difficult to do properly, because memory is not clearly owned by components of the processor. Memory is sometimes allocated by the joinReader itself, sometimes the joinReader's strategies, and sometimes the spanGenerators. The strategies and generators also retain pointers to some of the backing memory for reuse of the containers, but they don't have a clear spot at which they can zero the pointers inside of the containers to avoid retaining objects past their supposed lifetimes.

At a minimum, since the lifetimes are tricky, we need to over-account for all of this memory by ratcheting up a memory account as we see batches coming through and needing all of this scratch space.

Separately, I want to propose that we start rewriting the joinReader in the vectorized, starting with the simplest algorithm (index join). Doing this would immediately solve our memory accounting issues. It would also have ancillary benefits and I think would significantly improve performance. I will file a separate issue for that.

Epic: CRDB-8562

The text was updated successfully, but these errors were encountered:

This commit improves memory accounting in joinReader by accounting for the in-memory data structures used throughout execution. Fixes cockroachdb#65904 Release note: None

67771: rowexec: improve memory accounting in joinReader r=rytaft a=rytaft **memsize,rowexec: add `String` to `memsize` constants** Add `String` to the `memsize` constants and remove the `sizeOfString` constant from `aggregator.go`. Release note: None **rowexec: improve memory accounting in `joinReader`** This commit improves memory accounting in `joinReader` by accounting for the in-memory data structures used throughout execution. Fixes #65904 Release note: None Co-authored-by: Rebecca Taft <[email protected]>

This commit improves memory accounting in joinReader by accounting for the in-memory data structures used throughout execution. Fixes cockroachdb#65904 Release note: None

jordanlewis added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label May 31, 2021

This was referenced May 31, 2021

colexec: implement index join #65905

Closed

sql: can OOM a cluster with concurrent TPC-H load #64906

Closed

rytaft self-assigned this Jun 1, 2021

rytaft assigned mgartner Jun 1, 2021

jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021

exalate-issue-sync bot unassigned rytaft Jul 14, 2021

mgartner assigned rytaft Jul 14, 2021

rytaft mentioned this issue Jul 19, 2021

rowexec: improve memory accounting in joinReader #67771

Merged

craig bot closed this as completed in 19002f9 Jul 24, 2021

rytaft mentioned this issue Jul 24, 2021

release-21.1: rowexec: improve memory accounting in joinReader #68032

Closed

mgartner added this to SQL Queries Jul 24, 2023

mgartner moved this to Done in SQL Queries Jul 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

row: joinreader has insufficient memory accounting #65904

row: joinreader has insufficient memory accounting #65904

jordanlewis commented May 31, 2021 •

edited

Loading

row: joinreader has insufficient memory accounting #65904

row: joinreader has insufficient memory accounting #65904

Comments

jordanlewis commented May 31, 2021 • edited Loading

jordanlewis commented May 31, 2021 •

edited

Loading