This repository has been archived by the owner on Aug 2, 2021. It is now read-only.

unhealthy snapshot passes health check in kademlia.go #1263

Closed
acud opened this issue Feb 22, 2019 · 4 comments

@acud
Member

acud commented Feb 22, 2019

A 6-node snapshot generated by swarm-snapshot passes the health check in kademlia.GetHealthInfo even though it is unhealthy:

Thu Feb 21 13:27:31 UTC 2019 KΛÐΞMLIΛ hive: queen's address: 57d759
population: 2 (5), NeighbourhoodSize: 2, MinBinSize: 2, MaxBinSize: 4
============ DEPTH: 0 ==========================================
000  0                              |  2 c6c1 (1) e90d (0)
001  1 02de                         |  1 02de (0)
002  1 6ef4                         |  1 6ef4 (0)
003  0                              |  0
004  0                              |  0
005  0                              |  1 52b0 (0)
006  0                              |  0
007  0                              |  0
008  0                              |  0
009  0                              |  0
010  0                              |  0
011  0                              |  0
012  0                              |  0
013  0                              |  0
014  0                              |  0
015  0                              |  0
=========================================================================

The Swarm snapshot binary calls the WaitTillHealthy method, which apparently returns a false positive.
In GetHealthInfo, there is the following method call:
gotnn, countgotnn, culpritsgotnn := k.connectedNeighbours(pp.NNSet), which apparently returns gotnn=true when it should not.
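
One way a check of this shape can report true too early is when the expected neighbour set it receives is incomplete. The sketch below is a standalone illustration, not the Swarm code: connectedNeighbours here is a hypothetical free function that only mirrors the three return values, and the peer lists assume the table above reads as "connected | known", with 02de and 6ef4 connected and c6c1, e90d and 52b0 known but not connected.

```go
package main

import "fmt"

// connectedNeighbours is a hypothetical stand-in, not the Swarm method of the
// same name. It reports whether every expected nearest neighbour is connected,
// how many of them are connected, and which ones are missing.
func connectedNeighbours(expected []string, connected map[string]bool) (bool, int, []string) {
	var missing []string
	count := 0
	for _, addr := range expected {
		if connected[addr] {
			count++
		} else {
			missing = append(missing, addr)
		}
	}
	return len(missing) == 0, count, missing
}

func main() {
	// Assumed reading of the table: only these two peers are connected.
	connected := map[string]bool{"02de": true, "6ef4": true}

	// An incomplete expected set is trivially satisfied.
	gotnn, countgotnn, culpritsgotnn := connectedNeighbours([]string{"02de", "6ef4"}, connected)
	fmt.Println(gotnn, countgotnn, culpritsgotnn)

	// The full set of known peers exposes the missing connections.
	all := []string{"c6c1", "e90d", "02de", "6ef4", "52b0"}
	gotnn, countgotnn, culpritsgotnn = connectedNeighbours(all, connected)
	fmt.Println(gotnn, countgotnn, culpritsgotnn)
}
```

Whether this is what actually happens inside GetHealthInfo is not established here; the sketch only shows that the result depends entirely on how complete the expected set is.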

@acud added the kademlia label on Feb 22, 2019
@nolash
Contributor

nolash commented Feb 22, 2019

Also, simulation.WaitTillHealthy() seems to check only that connections to the nearest neighbours (NN) have been made. A node can therefore be reported healthy with only 2 connected peers, even when the network has many more nodes. Any simulation test that passes with topologies generated from this case passes on a false premise.
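
To make the difference concrete, here is a minimal sketch that contrasts a check trusting the NN flag alone with one possible stricter rule. Everything in it is hypothetical: the bin type, both function names, and the rule itself (every bin with known peers must hold at least one connection) are assumptions for illustration, not Swarm's API or its definition of health. The bin values assume the table in the first comment reads as "connected | known".

```go
package main

import "fmt"

// bin is a hypothetical summary of one Kademlia bin: how many peers the node
// knows at that proximity order and how many of them it is connected to.
type bin struct {
	known     int
	connected int
}

// healthyNNOnly mirrors the behaviour described above: it trusts the
// nearest-neighbour flag alone and ignores the bins.
func healthyNNOnly(gotNN bool, _ []bin) bool {
	return gotNN
}

// healthyStrict is one possible stricter rule (an assumption, not Swarm's):
// besides the nearest-neighbour flag, every bin with known peers must hold
// at least one connection.
func healthyStrict(gotNN bool, bins []bin) bool {
	if !gotNN {
		return false
	}
	for _, b := range bins {
		if b.known > 0 && b.connected == 0 {
			return false
		}
	}
	return true
}

func main() {
	// Bins 000, 001, 002 and 005 from the table, as {known, connected}.
	bins := []bin{{2, 0}, {1, 1}, {1, 1}, {1, 0}}
	fmt.Println(healthyNNOnly(true, bins))
	fmt.Println(healthyStrict(true, bins))
}
```

With these inputs the NN-only check accepts the node while the stricter rule rejects it, because bins 000 and 005 have known peers but no connections.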

@gluk256

gluk256 commented Feb 22, 2019

here is a sample file

@zelig
Member

zelig commented Feb 22, 2019

Thanks guys, rabbit hole, eh? Well, this is pretty clearly not healthy. Could someone write a unit test that catches this?

@acud
Member Author

acud commented Feb 22, 2019

@zelig, @holisticode, the actual snapshot file might not necessarily reveal the problem with WaitTillHealthy, as the Healthy function might be called before the node knows all its NNs.
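
If the timing is the issue, one possible mitigation is to not accept the first positive result. The sketch below assumes a hypothetical helper, waitStable, which is not part of Swarm: it polls a health check and succeeds only after the check has returned true several times in a row, so a single early true taken before the node has discovered its peers is not enough. The simulated check sequence is invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitStable is a hypothetical helper, not part of Swarm. It polls check
// until it has returned true `needed` times in a row, and returns an error
// once the timeout has elapsed without reaching that streak.
func waitStable(check func() bool, needed int, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	streak := 0
	for {
		if check() {
			streak++
			if streak >= needed {
				return nil
			}
		} else {
			// Any negative result resets the streak.
			streak = 0
		}
		if time.Now().After(deadline) {
			return errors.New("health check did not stabilise before timeout")
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulated check: true on the first call (the node knows few peers),
	// false while more peers are discovered, then true after connecting.
	results := []bool{true, false, false, true, true, true}
	calls := 0
	check := func() bool {
		r := results[len(results)-1]
		if calls < len(results) {
			r = results[calls]
		}
		calls++
		return r
	}
	err := waitStable(check, 3, time.Millisecond, time.Second)
	fmt.Println(err, calls)
}
```

This only reduces the chance of accepting a premature result; it does not fix a check that stays wrongly positive.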
