Skip to content
This repository has been archived by the owner on Aug 2, 2021. It is now read-only.

swarm/pss: forwarding function refactoring #1043

Closed
wants to merge 13 commits into from

Conversation

gluk256
Copy link

@gluk256 gluk256 commented Dec 4, 2018

fixes #995

@zelig
Copy link
Member

zelig commented Dec 4, 2018

prerequisite for ethereum/go-ethereum#18239

swarm/pss/pss.go Outdated Show resolved Hide resolved
Copy link
Member

@acud acud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gluk256 I just reviewed this PR but unfortunately there's no background or description about what it solves, what was the bug and so. Please let's have more context so that we can review more effectively.

swarm/pss/pss.go Show resolved Hide resolved
swarm/pss/pss.go Outdated
to := make([]byte, addressLength)
copy(to[:len(msg.To)], msg.To)
neighbourhoodDepth := p.Kademlia.NeighbourhoodDepth()
luminousRadius := len(msg.To) * 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I think the correct term would be luminosityRadius

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change this name as noted @gluk256 because it's not grammatically correct.

swarm/pss/pss.go Outdated
// address, if there are any; otherwise only to one peer, closest to the recipient address.
// in any case, msg must be sent to at least one peer.
p.Kademlia.EachConn(to, addressLength*8, func(sp *network.Peer, po int, _ bool) bool {
isAddrMatch := (po == luminousRadius)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the po can be bigger here, no?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, changed

swarm/pss/pss.go Outdated
sent++
}
}
mustContinue := (isAddrMatch || sent == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this before the if line 959 and replace in if-condition please, please.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not the same, because sent++ might be incremented

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. So maybe:

if !p.trySend(sp, msg) {
    return true
}
[...]
return isAddrMatch

instead?

swarm/pss/pss.go Outdated
if isDstInProxBin {
// forward to all the nearest neighbours of the forwarding node
p.Kademlia.EachConn(p.BaseAddr(), addressLength*8, func(sp *network.Peer, _ int, isproxbin bool) bool {
if isproxbin {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We agreed to remove isproxbin as it's wrong and it's only used here. Please calculate this inside the iterator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit after talking about this: We need to refactor EachConn to eliminate possibility of wrong use with wrong param.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

swarm/pss/pss.go Outdated
p.fwdPoolMu.RLock()
pp := p.fwdPool[sp.Info().ID]
p.fwdPoolMu.RUnlock()
mustContinue := isproxbin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this assignment as first thing in the iterator and use that var instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is better, because otherwise i will need to write a comment in plain language

Copy link
Author

@gluk256 gluk256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is description of what this function is supposed to do:
If the recipient address (or partial address) is within the neighbourhood depth of the forwarding node, then it will be forwarded to all the nearest neighbours of the forwarding node. In case of partial address, it should be forwarded to all the peers matching the partial address, if there are any; otherwise only to one peer, closest to the recipient address. In any case, if the message forwarding fails, the node should try to forward it to the next best peer, until the message is successfully forwarded to at least one peer.

@acud
Copy link
Member

acud commented Dec 6, 2018

@gluk256 and could you please say what it was doing before the fix?
also, if we could PR this into the documentation/write this into the wiki that would be great.

swarm/pss/pss.go Outdated
sent++
}
}
mustContinue := (isAddrMatch || sent == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. So maybe:

if !p.trySend(sp, msg) {
    return true
}
[...]
return isAddrMatch

instead?

swarm/pss/pss.go Outdated
to := make([]byte, addressLength)
copy(to[:len(msg.To)], msg.To)
neighbourhoodDepth := p.Kademlia.NeighbourhoodDepth()
luminousRadius := len(msg.To) * 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change this name as noted @gluk256 because it's not grammatically correct.

swarm/pss/pss.go Outdated
pp := p.fwdPool[sp.Info().ID]
p.fwdPoolMu.RUnlock()
mustContinue := isPeerInProxBin
return mustContinue
Copy link
Contributor

@nolash nolash Dec 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point about the comment.

But "explaining through varnames" is rather unusual in our code.

We tend to add a comment instead (that is, when we remember to). I think it makes sense to try to maintain a minimum of style coordination through our code so that the reader doesn't have to interpret that as well as the logic.

So I think:

// if we are in proxbin continue forwarding
return isPeerInProxBin

Is better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my algo, WDYT?

swarm/pss/pss.go Outdated
p.fwdPoolMu.RLock()
pp := p.fwdPool[sp.Info().ID]
p.fwdPoolMu.RUnlock()
mustContinue := isPeerInProxBin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need

swarm/pss/pss.go Outdated
// luminosity is the opposite of darkness. the more bytes are removed from the address, the higher is darkness,
// but the luminosity is less. here luminosity equals the number of bits present in the destination address.
luminousRadius := len(msg.To) * 8
if luminousRadius >= neighbourhoodDepth {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is quite convoluted esp without comments.

i think this is so much simpler:

pof := pot.DefaultPof(neighbourhoodDepth) // pof function matching upto neighbourhoodDepth bits
depth, full := pof(to, p.BaseAddr(), 0)
// if full match, then depth == neighbourhoodDepth, we broadcast to all neighbours.
if !full  {
   // if not full match, we broadcast upto luminousityRadius
   // if to is not partial address, luminosityRadius is 256 so po<depth will be true after the  first
   depth =  luminosityRadius
}  
p.Kademlia.EachConn(to, addressLength*8, func(sp *network.Peer, po int, _ bool) bool {
    if po < depth && sent > 0 {
        return false
    }
    if p.trySend(sp, msg) {
        sent++
    }
}

swarm/pss/pss.go Outdated
pp := p.fwdPool[sp.Info().ID]
p.fwdPoolMu.RUnlock()
mustContinue := isPeerInProxBin
return mustContinue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my algo, WDYT?

Copy link
Member

@zelig zelig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@acud acud removed the in progress label Dec 12, 2018
testForwardMsg(900, t, ps, peerAddresses[19][:1], peerAddresses, all[16:])
}

func testForwardMsg(num int, t *testing.T, ps *Pss, addr []byte, addresses []pot.Address, expected []int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment on this function to explain what it is testing and what is expected would be nice.

swarm/pss/pss.go Outdated
func (p *Pss) forward(msg *PssMsg) error {
metrics.GetOrRegisterCounter("pss.forward", nil).Inc(1)
// tries to send a message, returns true if successful
func trySendMsg(p *Pss, sp *network.Peer, msg *PssMsg) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this function called trySendMsg and not just sendMsg? Pretty much everything in Go can return an error, so no need to include try in front, as it's clear the action might error?

swarm/pss/pss.go Outdated
// are any; otherwise only to one peer, closest to the recipient address. In any case, if the message
// forwarding fails, the node should try to forward it to the next best peer, until the message is
// successfully forwarded to at least one peer.
func (p *Pss) forward(msg *PssMsg, trySend func(p *Pss, sp *network.Peer, msg *PssMsg) bool) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we are defining trySend function only because we want to override it in tests. If this is the case, wouldn't it be cleaner if trySend is just defined as a var in this package, and the test package overrides it, instead of exposing this behaviour in the interface/signature of the function?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brilliant pattern also used by @janos in new localstore now function

ps := createPss(t, kad)
addPeers(kad, peerAddresses)

const firstNearest = depth * 2 // first peer in the nearest neighbours' bin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be easier to understand if we just use literal numbers?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then it would be very difficult to change the test. remembering all the magic numbers is impossible. now you only need to change one constant (depth).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Should we maybe add a comment saying "add some peers far away from the others". depth * 2 is a bit obscure, no? :)


const firstNearest = depth * 2 // first peer in the nearest neighbours' bin
nearestNeighbours := []int{firstNearest, firstNearest + 1, firstNearest + 2}
//fmt.Println(kad.String()) // print kademlia map for debugging, before any test starts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove unused comment

}
}

func createPss(t *testing.T, kad *network.Kademlia) *Pss {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use pss_test.go:newTestPss()?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because i wanted to use specific sttings (e.g. base address)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but now we have at least three different functions for setting up pss :'(

How about expanding the existing one with a chained method? This is a convention we've agreed to use in Swarm.

return network.NewPeer(bp, kad)
}

func newTestMsg(addr []byte) *PssMsg {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see code already existing in pss_test.go etc already creates messages inline in the test. Having a testmessage "factory" is fine, but then we should endeavour to use it the same way everywhere.

Also, I suggest it should generate random data for the payload data, and take a topic param.

Refactoring all that is of course not within scope of this PR.

Perhaps as a first step towards consolidation is that we put this method (and similar generic methods) in a file called common_test.go.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gluk256 You didn't want to do anything with this?

swarm/pss/pss.go Outdated
}
}
if !isPssEnabled {
log.Trace("peer doesn't have matching pss capabilities, skipping", "peer", info.Name, "caps", info.Caps)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just occurred to me this should likely be a log.Error. Running swarm with Pss used to be optional, but it is no longer.

)

func TestForwardBasic(t *testing.T) {
base := newBaseAddress() // 0xFFFFFF.......
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't just using zeros be easier?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we compare distances (xor), it's easier for me to identify 1s then 0s.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, yeah I sympathize. But for people reading the code later, though, I think the more browsing back and forth in the code we force them to do, the harder it is to get the overview.

testForwardMsg(900, t, ps, peerAddresses[19][:1], peerAddresses, all[16:])
}

func testForwardMsg(num int, t *testing.T, ps *Pss, addr []byte, addresses []pot.Address, expected []int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the num parameter here do?

And please consider more informative varnames than addr and addresses :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

swarm/pss/forwarding_test.go Outdated Show resolved Hide resolved
var fail bool
s := fmt.Sprintf("test id: %d, msg address: %x..., radius: %d", num, addr[:len(addr)%4], 8*len(addr))

// false negatives
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please describe what false negatives and false positives mean in this context. I assume:

  • false negative is expected message didn't reach peer
  • false positive is unexpected message reached peer

swarm/pss/forwarding_test.go Show resolved Hide resolved
swarm/pss/pss.go Outdated
// are any; otherwise only to one peer, closest to the recipient address. In any case, if the message
// forwarding fails, the node should try to forward it to the next best peer, until the message is
// successfully forwarded to at least one peer.
func (p *Pss) forward(msg *PssMsg, trySend func(p *Pss, sp *network.Peer, msg *PssMsg) bool) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brilliant pattern also used by @janos in new localstore now function

swarm/pss/pss.go Outdated
neighbourhoodDepth := p.Kademlia.NeighbourhoodDepth()

// luminosity is the opposite of darkness. the more bytes are removed from the address, the higher is darkness,
// but the luminosity is less. here luminosity equals the number of bits present in the destination address.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think given is best

Copy link
Contributor

@nolash nolash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am mortified to realize that I actually didn't read through and evaluate that the test cases themselves are ok. Expect this in a separate review following shortly.

swarm/pss/forwarding_test.go Show resolved Hide resolved
ps := createPss(t, kad)
addPeers(kad, peerAddresses)

const firstNearest = depth * 2 // first peer in the nearest neighbours' bin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Should we maybe add a comment saying "add some peers far away from the others". depth * 2 is a bit obscure, no? :)

}

for i := 0; i < firstNearest; i++ {
// send random messages with different proximity orders
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not just different, but every bin in order even.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i added two peers for no other reason than to simulate the system as close to production as possible. concerning the firstNearest = depth * 2, this formula helps to change the depth var, without need to go through the code and fix all other variables as well.

testForwardMsg(300+i, t, ps, dst[:], peerAddresses, nearestNeighbours)
}

// send msg with proximity order higher than the last nearest neighbour
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ambiguous. Is "last" deeper or shallower?

}

// this function tests the forwarding of a single message. the recipient address is passed as param,
// along with addreses of all peers, and indexes of those peers which are expected to receive the message.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addresses, indexes -> indices

}
}

func createPss(t *testing.T, kad *network.Kademlia) *Pss {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but now we have at least three different functions for setting up pss :'(

How about expanding the existing one with a chained method? This is a convention we've agreed to use in Swarm.

return ps
}

func newBaseAddress() pot.Address {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you want to keep this, then?

return network.NewPeer(bp, kad)
}

func newTestMsg(addr []byte) *PssMsg {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gluk256 You didn't want to do anything with this?

metrics.GetOrRegisterCounter("pss.pp.send.error", nil).Inc(1)
log.Error(err.Error())
return true
// soft threshold for msg broadcast
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I promise to think of a more elaborate and enlightening comment here, but let's not let that stop the merging.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@nolash nolash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM.

ps := createPss(t, kad)
addPeers(kad, peerAddresses)

const firstNearest = depth // first peer in the nearest neighbours' bin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please let's stick to the terms shallower/deeper .. first is ambiguous

testForwardMsg(600+i, t, ps, peerAddresses[i][:part], peerAddresses, nearestNeighbours)
}

// partial address with proximity order higher than the last nearest neighbour
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too; higher

all := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
testForwardMsg(800, t, ps, []byte{}, peerAddresses, all)

// luminous radius of one byte (8 bits)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we instead have a case where the luminosity radius covers more than just one shallower than depth.

It's maybe over-cautios, but the reason would be to fully guarantee that the iterator doesn't do either of:

  • Include the first shallower bin after depth and terminate.
  • Continue "spamming" only if you hit depth on the second iteration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But feel free to ignore this if you think it's not needed.

// this function substitutes the real send function, since
// we only want to test the peer selection functionality
func dummySendMsg(_ *Pss, sp *network.Peer, _ *PssMsg) bool {
a := pot.NewAddressFromBytes(sp.Address())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just Hex the Address

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't understand: what should be done here? and what's wrong with existing implementation?

return true
}

// setDummySendMsg replaces sendMessage function for testing purposes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you dont need these functions, just use this pattern:

in the code:

// default value for production
var sendFunc = send

// the send function used as arg to forward in production via assignment to `sendFunc`
func send(...) ...

in the test:

testSendFunc := func(.....
...
}
// this defer does closure with the current value of sendFunc so it will reset to its orig value when test returns
defer func(t) { sendFunc = t }(sendFunc)
sendFunc = testSendFunc

func TestForwardBasic(t *testing.T) {
base := newBaseAddress() // 0xFFFFFF.......
setDummySendMsg()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my comment above for a simpler pattern to use here

swarm/pss/pss.go Outdated
@@ -886,8 +886,17 @@ func (p *Pss) send(to []byte, topic Topic, msg []byte, asymmetric bool, key []by
return nil
}

// sendMessage is a helper function that tries to send a message and returns true on success
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call it sendFunc and see the pattern proposed

testForwardMsg(900, t, ps, peerAddresses[19][:1], peerAddresses, all[16:])

// luminous radius of one byte (8 bits)
testForwardMsg(900, t, ps, baseAddrBytes[:1], peerAddresses, all[8:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we are missing a lot of cases:

  • test which shows that within the bin, the forward function selects the closest peer to the address first
  • test which shows that if luminosity is < depth but not 0 then it sends to all peers deeper than luminosity
  • test which shows what happens if send returns false (send error)

Therefore I really suggest using a pattern.

type testCase struct{
   name string
    recipient []byte
    errors: []int
    expected: []int
    peers:  [][]byte
}

testCases := []{
{
  expected: all,
  peers: peers,
},
...
{
  recipient: RandomAddrAt(pivot, 2),
  errors: []bool{false, true},
  expected: []int{3},
  peers: peers, 
}
...
for _, tc := range testCases {
t.Run(tc.Name, func(t *testing.T) {
    if err := testForward(tc); err != nil {
       t.Fatal(err)
    }
}
func testForwardMsg(tc testCase) error {
   testSendFunc := func(...
     // use tc fields
    }
   // set sendFunc, defer reset
   // call forward on pss
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test which shows that if luminosity is < depth but not 0 then it sends to all peers deeper than luminosity

this is covered already no?

swarm/pss/pss.go Outdated
var sendFunc func(p *Pss, sp *network.Peer, msg *PssMsg) bool = sendMessageProd

// tries to send a message, returns true if successful
func sendMessageProd(p *Pss, sp *network.Peer, msg *PssMsg) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

send or sendMsg pls

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fix pss forwarding
5 participants