Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[21612] Data sharing performance improvement #5209

Open
wants to merge 35 commits into
base: master
Choose a base branch
from

Conversation

paxifaer
Copy link
Contributor

@paxifaer paxifaer commented Sep 7, 2024

Description

Compared with the traditional separate locking method, Acquire-Release read-write consistency provides a better solution in terms of performance, readability, scalability and memory consistency. When using ordinary locking mechanisms, it is often necessary to introduce global locks to protect shared resources, but this is inefficient. Acquire-Release allows multiple threads to safely operate shared data without complete mutual exclusion.

Contributor Checklist

  • Commit messages follow the project guidelines.
  • The code follows the style guidelines of this project.
  • Tests that thoroughly check the new feature have been added/Regression tests checking the bug and its fix have been added; the added tests pass locally
  • Any new/modified methods have been properly documented using Doxygen.
  • Any new configuration API has an equivalent XML API (with the corresponding XSD extension)
  • Changes are backport compatible: they do NOT break ABI nor change library core behavior.
  • Changes are API compatible.
  • New feature has been added to the versions.md file (if applicable).
  • New feature has been documented/Current behavior is correctly described in the documentation.
  • Applicable backports have been included in the description.

Reviewer Checklist

  • The PR has a milestone assigned.
  • The title and description correctly express the PR's purpose.
  • Check contributor checklist is correct.
  • If this is a critical bug fix, backports to the critical-only supported branches have been requested.
  • Check CI results: changes do not issue any warning.
  • Check CI results: failing tests are unrelated with the changes.

@paxifaer paxifaer closed this Sep 7, 2024
Mario-DL and others added 26 commits September 7, 2024 10:33
…ch (eProsima#5071)

* Refs #20181: Add BB test

Signed-off-by: Mario Dominguez <mariodominguez@eprosima.com>

* Refs #20181: Add Fix

Signed-off-by: Mario Dominguez <mariodominguez@eprosima.com>

* Refs #20181: linter

Signed-off-by: Mario Dominguez <mariodominguez@eprosima.com>

* Refs #20181. Pass in secure_endpoints as lambda capture.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>

* Refs #20181. New approach.

Automatically sending DATA(p) when receiving a DATA(p) could lead to an infinite ping-pong between the two participants.
This resulted in some cases in the transport threads eating all CPU resources.

The new approach matches the discovered participant to the builtin non-secure PDP writer, so it will receive the DATA(p) of the local participant in the next periodic announcement.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>

* Refs #20181. Unmatch non-secure before matching secure.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>

---------

Signed-off-by: Mario Dominguez <mariodominguez@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Co-authored-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: paxifaer <807128216@qq.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Co-authored-by: Jesús Poderoso <120394830+JesusPoderoso@users.noreply.github.com>
Signed-off-by: paxifaer <807128216@qq.com>
Signed-off-by: paxifaer <807128216@qq.com>
Signed-off-by: JesusPoderoso <jesuspoderoso@eprosima.com>
Signed-off-by: paxifaer <807128216@qq.com>
* Use steady_clock instead of high_resolution_clock for status checks (high_resolution_clock might not be steady depending on STL impl)

Signed-off-by: Matthias Schneider <ma30002000@yahoo.de>

* Use steady_clock instead for system_clock for calculating timeouts

Signed-off-by: Matthias Schneider <ma30002000@yahoo.de>

* Use correct clock's duration for duration_cast

Signed-off-by: Matthias Schneider <ma30002000@yahoo.de>

* Use Time_t::now()

Signed-off-by: Matthias Schneider <ma30002000@yahoo.de>

* Fix build.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>

* Refs #21314. Refactor on DataWriterImpl.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>

* Refs #21314. Refactor on DataReaderImpl.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>

* Refs #21314. Refactor on StatefulWriter.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>

* Refs #21314. Protect current_time_since_unix_epoch against clock adjustments.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>

* Revert "Use steady_clock instead of high_resolution_clock for status checks (high_resolution_clock might not be steady depending on STL impl)"

This reverts commit d69eb91.

---------

Signed-off-by: Matthias Schneider <ma30002000@yahoo.de>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Co-authored-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: paxifaer <807128216@qq.com>
…d complete TypeObject (eProsima#5181)

* Refs #21509. Fix

Signed-off-by: Ricardo González Moreno <ricardo@richiware.dev>

* Refs #21322. Fixes for ROS2

Signed-off-by: Ricardo González Moreno <ricardo@richiware.dev>

* Refs #21322. Fix unit tests

Signed-off-by: Ricardo González Moreno <ricardo@richiware.dev>

---------

Signed-off-by: Ricardo González Moreno <ricardo@richiware.dev>
Signed-off-by: paxifaer <807128216@qq.com>
Signed-off-by: JesusPoderoso <jesuspoderoso@eprosima.com>
Signed-off-by: paxifaer <807128216@qq.com>
…e read-write consistency provides a better solution in terms of performance, readability, scalability and memory consistency. When using ordinary locking mechanisms, it is often necessary to introduce global locks to protect shared resources, but this is inefficient. Acquire-Release allows multiple threads to safely operate shared data without complete mutual exclusion.

Signed-off-by: paxifaer <807128216@qq.com>
@JesusPoderoso
Copy link
Contributor

Hi @paxifaer, thanks for our contribution.
The eProsima team will review the PR in the following days.

@JesusPoderoso JesusPoderoso added this to the v3.0.2 milestone Sep 9, 2024
@JesusPoderoso JesusPoderoso changed the title Data sharing [21612] Data sharing performance improvement Sep 9, 2024
@JesusPoderoso JesusPoderoso added the needs-review PR that is ready to be reviewed label Sep 10, 2024
@paxifaer
Copy link
Contributor Author

Hi @paxifaer, thanks for our contribution. The eProsima team will review the PR in the following days.

Anybody review?

@JesusPoderoso
Copy link
Contributor

Hi @paxifaer

Anybody review?

Please, be patient. We have a considerable workload caused by the v3 release. In the following weeks, we will review your PR.

@paxifaer
Copy link
Contributor Author

Hi @paxifaer

Anybody review?

Please, be patient. We have a considerable workload caused by the v3 release. In the following weeks, we will review your PR.

hhh,ok,I just worry about that you forget it.I will be patient.Go head.

@MiguelCompany
Copy link
Member

@paxifaer In the meantime, could you please rebase this on top of master, leaving only the relevant changes (i.e. the last commit)?

@paxifaer
Copy link
Contributor Author

@paxifaer In the meantime, could you please rebase this on top of master, leaving only the relevant changes (i.e. the last commit)?

Is it okay for me to do this? I have merged the latest code

@MiguelCompany
Copy link
Member

@Mergifyio rebase

…e read-write consistency provides a better solution in terms of performance, readability, scalability and memory consistency. When using ordinary locking mechanisms, it is often necessary to introduce global locks to protect shared resources, but this is inefficient. Acquire-Release allows multiple threads to safely operate shared data without complete mutual exclusion.

Signed-off-by: paxifaer <807128216@qq.com>
Copy link
Contributor

mergify bot commented Oct 2, 2024

rebase

✅ Branch has been successfully rebased

// Using Acquire-Release semantics can avoid some blocking problems in traditional locking, such as deadlock and priority inversion.
// This usually means higher concurrency performance because threads don't have to wait for locks to be released.
// Through Acquire-Release, you can achieve finer-grained control over specific variables or data structures without having to lock the entire resource or object, which can reduce contention and improve concurrency.
writer_pools_changed_.store(false,std::memory_order_release);

// Loop on the writers looking for data not read yet
for (auto it = writer_pools_.begin(); it != writer_pools_.end(); ++it)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried about the validity of this iterator changing during the next few lines (until we get to std::shared_ptr<ReaderPool> pool = it->pool;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there was a problem with effectiveness here, it would have been exposed before. I'll keep an eye on this later, and if there's a problem I'll fix it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The places where writer_pools_ is modified are protected by taking mutex_.
Your PR is removing the guard on the mutex_ during the traversal of writer_pools_ here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The places where writer_pools_ is modified are protected by taking mutex_. Your PR is removing the guard on the mutex_ during the traversal of writer_pools_ here.

Don't worry about this, I use the lock_free mechanism here. It is thread safe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writer_pools_ is a std::vector<WriterInfo> which is not thread-safe. If DataSharingListener::remove_datasharing_writer() is called on one thread while another thread is inside DataSharingListener::process_new_data(), an invalidated iterator on the vector could be dereferenced

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. But I tested it many times under multi-thread conditions and there was no problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-locked

Comment on lines 136 to 140
notification_->notification_->new_data.store(false,std::memory_order_release);
// Using Acquire-Release semantics can avoid some blocking problems in traditional locking, such as deadlock and priority inversion.
// This usually means higher concurrency performance because threads don't have to wait for locks to be released.
// Through Acquire-Release, you can achieve finer-grained control over specific variables or data structures without having to lock the entire resource or object, which can reduce contention and improve concurrency.
writer_pools_changed_.store(false,std::memory_order_release);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix uncrustify here

@MiguelCompany
Copy link
Member

@paxifaer Please check my review here

Signed-off-by: paxifaer <807128216@qq.com>
…nto data_sharing

Signed-off-by: paxifaer <807128216@qq.com>
Signed-off-by: paxifaer <807128216@qq.com>
Signed-off-by: paxifaer <807128216@qq.com>
@paxifaer
Copy link
Contributor Author

paxifaer commented Oct 3, 2024

@paxifaer Please check my review here

I adapted the code the way you did. Why does it still go wrong?

@paxifaer
Copy link
Contributor Author

paxifaer commented Oct 3, 2024

@paxifaer Please check my review here

Have your uncrustify configuration files changed?

@rsanchez15 rsanchez15 modified the milestones: v3.1.0, v3.1.1 Oct 3, 2024
…y in test.

Signed-off-by: paxifaer <807128216@qq.com>
Signed-off-by: paxifaer <807128216@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-review PR that is ready to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants