Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mqbc::StorageMgr, RecoveryMgr: Thread safety improvements #367

Merged
merged 16 commits into from
Aug 28, 2024

Conversation

kaikulimu
Copy link
Collaborator

This series of changes improved the thread safety of components integral to the FSM mode, including mqbc::StorageManager, mqbc::RecoveryManager, and other neighboring components.

@kaikulimu kaikulimu requested a review from a team as a code owner July 22, 2024 20:11
@@ -79,13 +79,13 @@ class ClusterDataIdentity {
private:
// DATA

bsl::string d_name;
const bsl::string d_name;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid adding these consts here, this can lead to unusually restrictive class design issues later. If the intent is to ensure these fields are never modified after class construction, then we should provide an interface that guarantees that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Our interface already has such guarantees, so I will remove the consts.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d

Comment on lines 241 to 252
/// Get a modifiable reference to this object's event scheduler.
bdlmt::EventScheduler* scheduler();

/// Get a modifiable reference to this object's buffer factory.
bdlbb::BlobBufferFactory* bufferFactory();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the return value be NULL here? If not, should we even be returning a pointer type? We should specify.

Copy link
Collaborator Author

@kaikulimu kaikulimu Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they cannot be NULL here. I will add some asserts in the constructor, and change from pointer to reference here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that mqbc::ClusterData itself only has pointers to components such as bdlmt::EventScheduler and bdlbb::BlobBufferFactory (see here). Take the example of bdlbb::BlobBufferFactory, the actual owner is mqba::Application::d_bufferFactory, and it's passed as pointer from mqba::Application -> mqbblp::ClusterCatalog -> mqbblp::Cluster -> mqbc::ClusterData. Given that, would it make sense to keep them as pointers?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will also make copy constructor and copy assignment operator private, since this object is not supposed to be copyable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d

@kaikulimu kaikulimu force-pushed the fsm-test__20240722__thread_safety branch from c754199 to f3d03d7 Compare July 22, 2024 20:35
/// Initialize the queue key info map based on information in the specified
/// `clusterState`.
virtual void initializeQueueKeyInfoMap(
const mqbc::ClusterState* clusterState) BSLS_KEYWORD_OVERRIDE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clusterState is a const parameter, it's not optional and required every time. Is it possible to convert it to const reference?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d

}

void StorageManager::initializeQueueKeyInfoMap(
BSLS_ANNOTATION_UNUSED const mqbc::ClusterState* clusterState)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively you might remove the annotation and arg variable name to silence the unused variable compiler warning

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will keep it as BSLS_ANNOTATION_UNUSED to be consistent with other functions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d

Copy link
Collaborator

@pniedzielski pniedzielski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see most of the places where we've fixed thread-safety, but I have a few questions about some changes.

src/groups/mqb/mqbs/mqbs_filestore.cpp Show resolved Hide resolved
src/groups/mqb/mqbc/mqbc_recoverymanager.cpp Show resolved Hide resolved
// PRECONDITION
BSLS_ASSERT_SAFE(d_dispatcher_p->inDispatcherThread(d_cluster_p));

BSLS_ASSERT_OPT(false && "This method should only be invoked in FSM mode");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really what we want? This pair of asserts seems wrong...the second one can fire even if we are in the dispatcher thread?

Copy link
Collaborator Author

@kaikulimu kaikulimu Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only caller of initializeQueueKeyInfoMap() is mqbc::ClusterStateManager, and it should always use mqbc version of StorageManager. By design, only the FSM version of this method, mqbc::StorageManager::initializeQueueKeyInfoMap(), can be invoked. If you can think of a better way to express this, please let me know.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. In this case, the false makes sense. Perhaps, "Only the FSM version of this method from mqbc::StorageManager should be invoked." as the string?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Will change

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d

src/groups/mqb/mqbc/mqbc_storagemanager.cpp Outdated Show resolved Hide resolved
src/groups/mqb/mqbi/mqbi_storagemanager.h Outdated Show resolved Hide resolved
src/groups/mqb/mqbc/mqbc_storagemanager.t.cpp Show resolved Hide resolved
@kaikulimu kaikulimu force-pushed the fsm-test__20240722__thread_safety branch from df6ca25 to 3bcd80b Compare August 23, 2024 20:39
@kaikulimu
Copy link
Collaborator Author

@678098 @hallfox @pniedzielski Back for your review

Copy link
Collaborator

@pniedzielski pniedzielski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me!

Copy link

@bmq-oss-ci bmq-oss-ci bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build 184 of commit 39fa4e7 has completed with FAILURE

bmq-oss-ci[bot]

This comment was marked as duplicate.

bmq-oss-ci[bot]

This comment was marked as duplicate.

bmq-oss-ci[bot]

This comment was marked as duplicate.

bmq-oss-ci[bot]

This comment was marked as duplicate.

bmq-oss-ci[bot]

This comment was marked as duplicate.

bmq-oss-ci[bot]

This comment was marked as duplicate.

bmq-oss-ci[bot]

This comment was marked as duplicate.

bmq-oss-ci[bot]

This comment was marked as duplicate.

bmq-oss-ci[bot]

This comment was marked as duplicate.

@kaikulimu kaikulimu force-pushed the fsm-test__20240722__thread_safety branch from 39fa4e7 to d0301c8 Compare August 26, 2024 21:53
Copy link

@bmq-oss-ci bmq-oss-ci bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build 192 of commit d0301c8 has completed with FAILURE

@kaikulimu kaikulimu merged commit 0f3466d into bloomberg:main Aug 28, 2024
24 checks passed
@kaikulimu kaikulimu deleted the fsm-test__20240722__thread_safety branch August 29, 2024 16:28
syuzvinsky pushed a commit to syuzvinsky/blazingmq that referenced this pull request Aug 30, 2024
)

* mqbs:FileStore: Improve rc logging

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageManager.t: Remove the concept of replica healing stages

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbmock::StorageManager [new]

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::ClusterData: Clean up manipulators and accessors

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::RecoveryMgr: Thread safety improvements

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* Fix mqbc::StorageMgr: Initialize queue key info map in cluster thread

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageMgr.t: Set partition primary also in cluster state

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageMgr: Thread safety improvements

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* Apply clang-format

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* PR#367: Address feedback

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::ClusterData: Return reference not pointer for non-nullables

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc: Make copy constructor and copy assignment private

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* blp::StorageManager: Improve assert logging

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbmock::StorageMgr: Fix compilation errors

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::RecoveryManager: Apply clang format

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

---------

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>
alexander-e1off pushed a commit to alexander-e1off/blazingmq that referenced this pull request Oct 24, 2024
)

* mqbs:FileStore: Improve rc logging

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageManager.t: Remove the concept of replica healing stages

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbmock::StorageManager [new]

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::ClusterData: Clean up manipulators and accessors

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::RecoveryMgr: Thread safety improvements

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* Fix mqbc::StorageMgr: Initialize queue key info map in cluster thread

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageMgr.t: Set partition primary also in cluster state

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageMgr: Thread safety improvements

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* Apply clang-format

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* PR#367: Address feedback

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::ClusterData: Return reference not pointer for non-nullables

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc: Make copy constructor and copy assignment private

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* blp::StorageManager: Improve assert logging

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbmock::StorageMgr: Fix compilation errors

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::RecoveryManager: Apply clang format

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

---------

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>
alexander-e1off pushed a commit to alexander-e1off/blazingmq that referenced this pull request Oct 24, 2024
)

* mqbs:FileStore: Improve rc logging

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageManager.t: Remove the concept of replica healing stages

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbmock::StorageManager [new]

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::ClusterData: Clean up manipulators and accessors

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::RecoveryMgr: Thread safety improvements

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* Fix mqbc::StorageMgr: Initialize queue key info map in cluster thread

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageMgr.t: Set partition primary also in cluster state

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageMgr: Thread safety improvements

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* Apply clang-format

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* PR#367: Address feedback

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::ClusterData: Return reference not pointer for non-nullables

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc: Make copy constructor and copy assignment private

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* blp::StorageManager: Improve assert logging

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbmock::StorageMgr: Fix compilation errors

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::RecoveryManager: Apply clang format

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

---------

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>
alexander-e1off pushed a commit to alexander-e1off/blazingmq that referenced this pull request Oct 24, 2024
)

* mqbs:FileStore: Improve rc logging

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageManager.t: Remove the concept of replica healing stages

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbmock::StorageManager [new]

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::ClusterData: Clean up manipulators and accessors

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::RecoveryMgr: Thread safety improvements

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* Fix mqbc::StorageMgr: Initialize queue key info map in cluster thread

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageMgr.t: Set partition primary also in cluster state

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::StorageMgr: Thread safety improvements

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* Apply clang-format

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* PR#367: Address feedback

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::ClusterData: Return reference not pointer for non-nullables

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc: Make copy constructor and copy assignment private

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* blp::StorageManager: Improve assert logging

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbmock::StorageMgr: Fix compilation errors

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

* mqbc::RecoveryManager: Apply clang format

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>

---------

Signed-off-by: Yuan Jing Vincent Yan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants