Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix[mqbs::DataStoreRecordKeyHashAlgo]: prevent trivial collisions #387

Merged
merged 7 commits into from
Aug 5, 2024

Conversation

678098
Copy link
Collaborator

@678098 678098 commented Aug 2, 2024

mqbs::DataStoreRecordKey has 2 fields d_sequenceNum (uint64_t) and d_primaryLeaseId (uint32_t):

unsigned int d_primaryLeaseId;

Before this PR, we calc a hash for this class as a sum d_sequenceNum + d_primaryLeaseId converted to uint64_t:

DataStoreRecordKeyHashAlgo::operator()(const TYPE& type) const
{
    return type.d_sequenceNum +
           static_cast<bsls::Types::Uint64>(type.d_primaryLeaseId);
}

Note that these 2 fields are monotonically incremented, also, when we increment primaryLeaseId, we set sequenceNum to 0:

if (primaryLeaseId > d_primaryLeaseId) {
// Reset the sequence number only if primary leaseId has been bumped
// up.
d_sequenceNum = 0;
}

So it's possible to have multiple colliding hashes like this:

primaryLeaseId | sequenceNum | sum (hash)
=========================================
             0 |           5 |          5
             1 |           4 |          5
             2 |           3 |          5
             3 |           2 |          5
             4 |           1 |          5
             5 |           0 |          5

Or another scenario, when an entire sequence of collisions is generated:

primaryLeaseId | sequenceNum | sum (hash)
=========================================
             1 |           0 |          1
             1 |           1 |          2
             1 |           2 |          3
             1 |           3 |          4
             1 |           4 |          5
>== primary changes, ++primaryLeaseId ==<
             2 |           0 |          2 COLLISION
             2 |           1 |          3 COLLISION
             2 |           2 |          4 COLLISION
             2 |           3 |          5 COLLISION
             2 |           4 |          6

The easy way to address this, is to left-shift d_primaryLeaseId, so this uint32_t value will be added to the highest bits of the hash:

DataStoreRecordKeyHashAlgo::operator()(const TYPE& type) const
{
    return type.d_sequenceNum +
           (static_cast<bsls::Types::Uint64>(type.d_primaryLeaseId) << 32);
}

From the real usage of this structure, we might expect small values for d_primaryLeaseId less than 1000 and d_sequenceNum up to 10s of millions:

02AUG2024_18:10:36.918 (139806285756160) INFO mqbs_filestore.cpp:4664 PartitionId [2] (cluster: local): Received SyncPt indicating rollover: [ primaryLeaseId = 520 sequenceNum = 55479461 dataFileOffsetDwords = 805266269 qlistFileOffsetWords = 206 ], at journal offset: 14131124. Initiating rollover.

So it's very unprobable that d_sequenceNumber in the hash will reach bits occupied by the shifted d_primaryLeaseId.

@678098 678098 requested a review from a team as a code owner August 2, 2024 18:28
678098 added 3 commits August 2, 2024 19:30
Signed-off-by: Evgeny Malygin <[email protected]>
Signed-off-by: Evgeny Malygin <[email protected]>
Signed-off-by: Evgeny Malygin <[email protected]>
@678098
Copy link
Collaborator Author

678098 commented Aug 2, 2024

#388
Related

678098 added 3 commits August 5, 2024 14:44
Signed-off-by: Evgeny Malygin <[email protected]>
Signed-off-by: Evgeny Malygin <[email protected]>
Signed-off-by: Evgeny Malygin <[email protected]>
Copy link
Collaborator

@kaikulimu kaikulimu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good improvement!

@678098 678098 merged commit d84c800 into main Aug 5, 2024
31 checks passed
@678098 678098 deleted the 678098-patch-4 branch August 5, 2024 23:11
alexander-e1off pushed a commit to alexander-e1off/blazingmq that referenced this pull request Oct 24, 2024
alexander-e1off pushed a commit to alexander-e1off/blazingmq that referenced this pull request Oct 24, 2024
alexander-e1off pushed a commit to alexander-e1off/blazingmq that referenced this pull request Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants