p2p/simulations: fix a deadlock and clean up adapters #17891
Conversation
This avoids log parsing and allows reporting startup errors to the simulation host. A small change in package node was needed because simulation nodes use port zero. Node.{HTTP,WS}Endpoint now return the live endpoints after startup by checking the TCP listener.
This fixes a rare deadlock with the inproc adapter:
- A node is stopped, which acquires Network.lock.
- The protocol code being simulated (swarm/network in my case) waits for its goroutines to shut down.
- One of those goroutines calls into the simulation to add a peer, which waits for Network.lock.
@justelad the adapters code would be a lot simpler if we removed the docker adapter. Why does it exist? I don't see the benefit of running the simulation binary through docker vs. just executing it directly.
Force-pushed from df4efc5 to 093afaf.
Force-pushed from 093afaf to d8c86d7.
cc @lmars as I believe he has written some, if not most, of this code.
@lmars do we really need to be running docker builds in a test run?
The Docker adapter is now removed after a short discussion with @nonsense.
FYI the DockerAdapter was added to make it easy to run simulations remotely (either on a single Docker host or on a Docker Swarm cluster), and was useful for running the overlay simulation in Docker and visualising the simulation with the visualisation demo.

The vision for the simulations code was that it was more about spinning up simulation clusters with a particular configuration, triggering actions and then observing the result. We were planning to add a generic "remote" adapter that could be pointed at, say, AWS to simulate more realistic scenarios (mass node churn, net splits, potential attacks etc.) and see what happens. This is why we added the

I feel like removing the Docker adapter is a bit of a step back from that vision, narrowing the scope to just be about running local clusters (which aren't really realistic). If we're happy with this narrowing of scope (or feel like running more realistic simulations should be done elsewhere), I agree the Docker adapter adds unnecessary complexity.
What we discussed is that the docker adapter is useless in its current form. My hunch is that a 'remote' adapter would require significant changes anyway. We'll see what will be needed if/when anyone finds the time to actually implement a remote adapter.
I agree. At least you now know I had good intentions 😄
This fixes a rare deadlock with the inproc adapter:
- A node is stopped, which acquires Network.lock.
- The protocol code being simulated waits for its goroutines to shut down.
- One of those goroutines calls into the simulation to add a peer, which waits for Network.lock.

The fix for the deadlock is really simple: just release the lock before stopping the simulation node.

Other changes in this PR clean up the exec adapter so it reports node startup errors better, and remove the docker adapter because it just adds overhead.