Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tcp btl: bind() on connect() side, interface conflict #7115

Closed
markalle opened this issue Oct 29, 2019 · 5 comments
Closed

tcp btl: bind() on connect() side, interface conflict #7115

markalle opened this issue Oct 29, 2019 · 5 comments
Labels

Comments

@markalle
Copy link
Contributor

On my machines if I run one rank on each of two hosts and use

  -mca btl_tcp_if_include ib1,ib0

the ordering in which the interfaces are used results in rank1 trying to connect to rank0 using a btl that's set up with a connect-side bind() to ib0 but the endpoint it's trying to connect to for the peer is on ib1.

We seem to create one BTL for each network interface and then we have the connect-side bind to btl->tcp_ifaddr which makes me think the BTL is intended to be exclusive to that interface. But over in mca_btl_tcp_proc_insert() when it's filling in information for a btl_endpoint where the btl is already selected, it loops over all the local interfaces in making its selection.

If I add some code in the i,j loop to skip over any local interface i such that its address local_interface->ipv4_address doesn't match the BTL's selection btl_endpoint->endpoint_btl->tcp_ifaddr then my above ib1,ib0 run results in consistent use of ib0. Eg the endpoint ignores the ib1 entries and picks an IP on ib0.

I can make a PR for this, but I want to get some discussion first about what's the right solution. Are BTLs supposed to be exclusive to the btl->tcp_ifaddr they're set up with? That seems reasonable to me, but it might be significant change in behavior, eg in my example above a BTL that would have been running over ib1 now picks ib0 for its peer. And even though I don't think the weights[] code in mca_btl_tcp_proc_insert() is doing anything super fancy like trying to pick the fastest interface, I'm concerned that maybe it is and my change is potentially is picking a slower interface for the endpoint.

An alternate 'fix' is to let the endpoint get setup with whatever interface it wants and ignore the btl->tcp_ifaddr setting if it's going to conflict. Eg don't bind on the connect-side if opal_net_samenetwork() shows us that the upcoming connect is doomed to hang.

Both of the above fixes worked for me.

Can anyone advise on what fits best with the intended design? Is there meant to be one TCP btl for each local interface and each btl is exclusive to that interface?

@markalle markalle changed the title bind() on connect() side, interface conflict tcp btl: bind() on connect() side, interface conflict Oct 29, 2019
@gpaulsen
Copy link
Member

@bwbarrett, @jsquyres others?

@jsquyres
Copy link
Member

There was discussion on the webex yesterday: AWS has seen similar issues, and has some in-house fixes that they're still testing. @wckzhang has more information.

@markalle
Copy link
Contributor Author

It might be related, but geoffrey's description sounded more like that was about reachability of different hosts on different interfaces. Is there any work-in-progress I can browse for that to see what it does for my case?

I'm concerned here about a mismatch between the interface a BTL is created to use vs what it actually uses. mca_btl_tcp_create() loops over all the interfaces creating a BTL for each and setting btl->tcp_ifaddr which is what the connect-side eventually binds to. But the endpoints it creates for use with this btl can pick between any of the local interfaces.

It's not that it thinks ib0 can connect to ib1, it correctly identifies that they can't talk. It's just ignoring the information about what interface it's going to bind to and picking an incompatible address for the endpoint.

@jjhursey
Copy link
Member

jjhursey commented Nov 2, 2019

@markalle PR #7134 was just filed against master that seems to be targeted at this issue. We should try it out to see if it resolves this specific issue.

@markalle
Copy link
Contributor Author

Thanks, I've browsed and run 7134 and I'm satisfied that it does fix this issue.

I still have two concerns about 7134 but those are separate from the bind() mismatch here. So I guess I'm okay with closing this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants