Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

p2p/simulations: fix a deadlock and clean up adapters #17891

Merged
merged 7 commits into from
Oct 11, 2018

Conversation

fjl
Copy link
Contributor

@fjl fjl commented Oct 10, 2018

This fixes a rare deadlock with the inproc adapter:

  • A node is stopped, which acquires Network.lock.
  • The protocol code being simulated (swarm/network in my case)
    waits for its goroutines to shut down.
  • One of those goroutines calls into the simulation to add a peer, which
    waits for Network.lock.

The fix for the deadlock is really simple, just release the lock before stopping the
simulation node.

Other changes in this PR clean up the exec adapter so it reports
node startup errors better and remove the docker adapter because it just adds
overhead.

This avoids log parsing and allows reporting startup errors to the
simulation host.

A small change in package node was needed because simulation nodes use
port zero. Node.{HTTP,WS}Endpoint now return the live endpoints after
startup by checking the TCP listener.
This fixes a rare deadlock with the inproc adapter:

- A node is stopped, which acquires Network.lock.
- The protocol code being simulated (swarm/network in my case)
  waits for its goroutines to shut down.
- One of those goroutines calls into the simulation to add a peer, which
  waits for Network.lock.
@fjl fjl requested a review from zsfelfoldi as a code owner October 10, 2018 22:29
@fjl fjl changed the title 2p/simulations: avoid holding Network lock while stopping node p2p/simulations: avoid holding Network lock while stopping node Oct 10, 2018
@fjl fjl changed the title p2p/simulations: avoid holding Network lock while stopping node p2p/simulations: fix a rare deadlock with the inproc adapter Oct 10, 2018
@fjl fjl requested a review from acud October 10, 2018 22:30
@fjl
Copy link
Contributor Author

fjl commented Oct 10, 2018

@justelad the adapters code would be a lot simpler if we removed the docker adapter. Why does it exist? I don't see the benefit of running the simulation binary through docker vs. just executing it directly.

@nonsense
Copy link
Member

cc @lmars as I believe he has written some, if not most, of this code.

@acud
Copy link
Member

acud commented Oct 11, 2018

@lmars do we really need to be running docker builds in a test run?

@fjl fjl requested a review from zelig as a code owner October 11, 2018 08:52
@fjl
Copy link
Contributor Author

fjl commented Oct 11, 2018

Docker adapter is now removed after short discussion with @nonsense.

@fjl fjl changed the title p2p/simulations: fix a rare deadlock with the inproc adapter p2p/simulations: fix a deadlock and clean up adapters Oct 11, 2018
@lmars
Copy link
Contributor

lmars commented Oct 11, 2018

FYI the DockerAdapter was added to make it easy to run simulations remotely (either on a single Docker host or on a Docker Swarm cluster), and was useful for running the overlay simulation in Docker and visualising the simulation with the visualisation demo.

The vision for the simulations code was that it was more about spinning up simulation clusters with a particular configuration, triggering actions and then observing the result. We were planning to add a generic "remote" adapter that could be pointed at say AWS to simulate more realistic scenarios (mass node churn, net splits, potential attacks etc.) and see what happens. This is why we added the p2psim CLI as a more interactive way to manage a simulation cluster.

I feel like removing the Docker adapter is a bit of a step back from that vision, narrowing the scope to just be about running local clusters (which aren't really realistic). If we're happy with this narrowing of scope (or feel like running more realistic simulations should be done elsewhere), I agree the Docker adapter adds unnecessary complexity.

@fjl
Copy link
Contributor Author

fjl commented Oct 11, 2018

What we discussed is that the docker adapter is useless in its current form. My hunch is that a 'remote' adapter would require significant changes anyway. We'll see what will be needed if/when anyone finds the time to actually implement a remote adapter.

@lmars
Copy link
Contributor

lmars commented Oct 11, 2018

I agree. At least you now know I had good intentions 😄

@nonsense nonsense self-requested a review October 11, 2018 13:53
@fjl fjl merged commit dcae0d3 into ethereum:master Oct 11, 2018
@karalabe karalabe added this to the 1.8.18 milestone Nov 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants