Skip to content
This repository has been archived by the owner on Aug 2, 2021. It is now read-only.

swarm/network: Revised depth and health for Kademlia #1051

Closed
wants to merge 26 commits into from

Conversation

nolash
Copy link
Contributor

@nolash nolash commented Dec 12, 2018

The contents of this PR are:

  1. A stricter definition of kademlia depth.
  2. An adapted definition of health, with corresponding changes to the related methods.
  3. Moving health-related methods to PeerPot as receiver.
  4. Reactivate tests that were skipped because of health problems.

1. Depth

For your pleasure we attempt both a formal definition and a definition legible for puny humans.


Humans:

Proximity order - po - is a metric of distance between two peer addresses. You obtain the numerical value of po by:

  • Performing XOR on the two addresses
  • Count the number of leading bits in the result, from left to right.

In terms of depth, deeper means higher numeric po value, and shallower means lower numeric po value.

When a peer is deeper it is also closer.

Your neighbors are your closest peers

The depth is the whichever is shallower of:

  • the shallowest bin that is empty
  • the bin of the shallowest of your neighbors

To find your neighbors:

  1. Identify the MinProxBinSize number of your deepest peers.
  2. Identify the shallowest bin of those peers.
  3. Your neighbors are all the peers in that bin and deeper.

(the name MinProxBinSize should probably change to NeighbourhoodSize, but let's not get ahead of ourselves)


Formal: (pending review)

depth d such that for every i < d, |{p, PO(p) = i }|>= 1 and |{p, PO(p) >= d}|>=MinProxBinSize

2. Health

The full parameter has been removed both from the PeerPot and as a condition of health in the test code. This is because the evaluating whether a kademlia was "full" involved allowing empty bins shallower than the depth. With the new definition of depth, this cannot occur.

Thus, currently the only condition for a healthy node is if the node is connected to the neighbours it should be connected to. "Should have" means that we know of all the nodes in the network, and thus can tell which ones are the closest of all of them and thus that the node should be connected to them.

Saturation has been added as a property of the PeerPot object.

The "got" ("which peers do I have") and "missing" ("who peers should I have had but don't") results are also added to the knowNeighbours (prev: knowNearestNeighbours) method, to conform with the connectedNeighbours (prev: gotNearestNeighbours) method.

3. Move test methods

The methods knowNeighbours, connectedNeighbours and Healthy now have PeerPot as receiver. PeerPot extends Kademlia. The reasoning for this is as follows:

Methods involving PeerPot are dependent on kademlia, and instantiation of PeerPot depends on a kademlia parameter. By reversing the relationship, fewer arguments are needed when calling the methods. In general, it seems to simplify the code.

Previously a MinProxBinSize parameter was being passed to the health functions externally. In every case this was the same as the hardcoded MinProxBinSize value in the default kademlia params. It is assumed that the value never will be anything else, and if it is, it should anyway be the same value as the PeerPot was created with.

Furthermore, health-related methods are only used in tests, and should be concealed from production code. Moving the methods from the Kademlia object to the PeerPot object serves as preparation for this.

4. Test reactivation

Currently the following tests are failing:

  • in swarm/network/kademlia_test.go: TestSuggestPeerFindPeers skipped
  • tests in swarm/network/simulations/discovery/

Furthermore, tests auditing depth and healthy should be revised the make sure they test the new definitions correctly. These include:

@nolash nolash changed the title swarm/network: Revised depth calculation for Kademlia swarm/network: Revised depth and health for Kademlia Dec 14, 2018
@nolash
Copy link
Contributor Author

nolash commented Dec 17, 2018

The SuggestPeersFindPeers test will be skipped for this PR, but will be fixed in immediately following PR. #1063

Copy link
Member

@zelig zelig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WaitTillHealthy changes should be in another separate PR I think.
ready to submit after linter error fixed https://travis-ci.org/ethersphere/go-ethereum/jobs/469257146#L513

KnowNN bool // whether node knows all its nearest neighbours
CountKnowNN int // amount of nearest neighbors connected to
CulpritsKnowNN [][]byte // which known NNs are missing
GotNN bool // whether node is connected to all its nearest neighbours
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should rename these fields to connected too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This terminology of Got and Culprits is not clear to me

@nolash
Copy link
Contributor Author

nolash commented Dec 18, 2018

@zelig I disagree, I'm afraid. The only change made to the Simulation is a parameter that let's you specify a minproxbinsize that may even be different from the ones the kademlias in the simulation has. I think this is highly relevant to depth correctness.

// It is used in Healthy function for testing only
func (k *Kademlia) saturation(n int) int {
// saturation iterates through all peers and
// returns the smallest po value in which the node has less than n peers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is n peers? There is no n in this function? Is n == k.MinBinSize?

Copy link
Contributor Author

@nolash nolash Dec 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saturation is treated in next PR. Let's clarify it there. (even if I actually changed the comment here)

swarm/network/kademlia.go Outdated Show resolved Hide resolved
swarm/network/kademlia.go Outdated Show resolved Hide resolved
swarm/network/kademlia.go Outdated Show resolved Hide resolved
swarm/network/kademlia.go Outdated Show resolved Hide resolved
KnowNN bool // whether node knows all its nearest neighbours
CountKnowNN int // amount of nearest neighbors connected to
CulpritsKnowNN [][]byte // which known NNs are missing
GotNN bool // whether node is connected to all its nearest neighbours
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This terminology of Got and Culprits is not clear to me

@@ -186,72 +290,98 @@ func TestSuggestPeerBug(t *testing.T) {
}

func TestSuggestPeerFindPeers(t *testing.T) {
t.Skip("The SuggestPeers implementation seems to have weaknesses exposed by the change in the new depth calculation. The results are no longer predictable")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hang on - we introduce a change here which breaks a test and we simply skip the test? Is that healthy :) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we fix from the ground up we will inevitably break things further up. In the interest of keeping deltas small between PRs, it is skipped. Fixing it is not necessarily trivial.

Anyway see comment from @zelig below:

You can still submit separate on ethersphere but on upstream lets not do this.

swarm/network/kademlia_test.go Show resolved Hide resolved
swarm/pss/api.go Outdated Show resolved Hide resolved
Copy link
Contributor

@holisticode holisticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job, I am happy to see things clearing up with everything related to kademlia.

I am mostly requesting some clarification of some comments

Copy link
Member

@zelig zelig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it needs a rebase since at least javier's PR is merged in.

@@ -388,6 +390,7 @@ func (k *Kademlia) EachBin(base []byte, pof pot.Pof, o int, eachBinFunc func(con
// EachConn is an iterator with args (base, po, f) applies f to each live peer
// that has proximity order po or less as measured from the base
// if base is nil, kademlia base address is used
// It returns peers in order deepest to shallowest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% this terminology is the right one. It is apparent in this context.
Maybe in the context of a kademlia table the container represents a view of the network from the point of view of the node. With the queen of the hive sitting in deep inside, we can talk about proximity orders in terms of deep and shallow. But if the pivot is another peer it is strange to say you iterate our hive from deep to shallow. It is just best to say from closest to farthest from the pivot or DECENDING PROXIMITY

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just best to say from closest to farthest from the pivot or DECENDING PROXIMITY

The main objective of introducing this terminology is both to simplify and disambiguate. In the spirit of the former, I don't think the quote above serves the purpose.

If the term depth can't serve both purposes, then let's use two terms.

// iterate through nearest neighbors in the peerpot map
// if we can't find the neighbor in the map we created above
// then we don't know all our neighbors
// (which sadly is all too common in modern society)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

khm

swarm/network/kademlia.go Outdated Show resolved Hide resolved
swarm/network/kademlia.go Outdated Show resolved Hide resolved
@zelig
Copy link
Member

zelig commented Dec 22, 2018

merged as ethereum/go-ethereum#18354

@zelig zelig closed this Dec 22, 2018
@frncmx frncmx deleted the depth-the-next-generation branch January 22, 2019 09:33
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants