Separate Chaining Keys #30

Closed
wants to merge 8 commits

Conversation

ShahanaFarooqui (Contributor)

  • Imposing max buffer read & write size limit
  • Adding reqId in the JSON request for better debugging with commando (see the sketch after this list)
  • Adding timestamp for some logging messages for better debugging
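
As a rough illustration of the reqId and timestamp points above (the field names and id generation below are assumptions for this sketch, not necessarily the exact payload Lnmessage builds):

```ts
// Illustrative sketch only: a Commando-style JSON request carrying an explicit
// request id, plus a timestamped log line, so request and response can be correlated.
import { randomBytes } from 'crypto'

const reqId = randomBytes(8).toString('hex') // hypothetical id generation

const commandoRequest = JSON.stringify({
  id: reqId,        // echoed back by the node, which makes debugging much easier
  method: 'getinfo',
  params: {},
  rune: '<rune>'    // authorisation rune for the node
})

// Timestamped logging as described in the third bullet
console.log(`${new Date().toISOString()} [reqId: ${reqId}] sending request`, commandoRequest)
```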

@lnbc1QWFyb24 (Owner) left a comment


Thanks for this PR Shahana, much appreciated. I tested locally and it looks all good. I just have a couple of questions:

If I understand correctly, when we reconnect the WebSocket due to reaching the buffer limit, we are leaving the original connection open (assuming this is to allow any in-flight requests to return).

  • Does this cause issues where we are connecting to the node with the same public key twice? I would have thought that CoreLN would reject a second connection from the same node public key?
  • Do we need to clean up the _bytesWrittenMap to prevent a memory leak as the number of socket connections grows? Do we also need to clean up the old socket connections?

I wasn't aware of this Commando limit; are there some references you could point me to so that I can better understand it? My understanding from the PR is that Commando will only allow 1mb of data to be transmitted per connection? When using Lnmessage in Clams, we are constantly re-setting the connection based on the window being in view, so I can see why we have likely not run into this issue yet.

src/index.ts Outdated
@@ -190,7 +199,25 @@ class LnMessage {
      }

      this.socket.onclose = async () => {
        this._log('error', 'WebSocket is closed')
        this.reconnectWebSocket.bind(this)
@lnbc1QWFyb24 (Owner)

Do we need to bind this function since it is already declared on the instance?

@ShahanaFarooqui (Contributor, Author)

I needed this binding because I wanted to reuse the same logic used on the WebSocket's close event at lines 490 & 569 without closing the socket.

@lnbc1QWFyb24 changed the base branch from master to develop on May 9, 2023 at 22:43
@ShahanaFarooqui (Contributor, Author)

> If I understand correctly, when we reconnect the WebSocket due to reaching the buffer limit, we are leaving the original connection open (assuming this is to allow any in-flight requests to return).

Yeah, the idea is to allow in-flight data, but I wasn't able to observe that behaviour in my testing.

> • Does this cause issues where we are connecting to the node with the same public key twice? I would have thought that CoreLN would reject a second connection from the same node public key?

No, I tested it thoroughly and it doesn't cause any issues. Commando only discards an old request with the same reqId.

> • Do we need to clean up the _bytesWrittenMap to prevent a memory leak as the number of socket connections grows? Do we also need to clean up the old socket connections?

We are initialising _bytesWrittenMap once in the constructor and resetting the same map on the socket open event, so I believe it should not result in a memory leak. But I am not sure about the old socket connection; AFAIK it closes itself after some time.
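
A minimal sketch of that lifecycle, assuming hypothetical names and a read/write keying (this is just the pattern described above, not the actual implementation):

```ts
// Sketch only: track bytes per direction and zero the counters on every socket open,
// so the same map is reused and never grows with the number of connections.
class BytesWrittenTracker {
  private _bytesWrittenMap = new Map<'read' | 'write', number>()

  constructor() {
    this.reset() // initialised once in the constructor
  }

  reset() {
    // Called on the socket's open event: reuse the same map, just reset the counters
    this._bytesWrittenMap.set('read', 0)
    this._bytesWrittenMap.set('write', 0)
  }

  add(direction: 'read' | 'write', bytes: number) {
    this._bytesWrittenMap.set(direction, (this._bytesWrittenMap.get(direction) ?? 0) + bytes)
  }

  limitReached(direction: 'read' | 'write', limit: number): boolean {
    return (this._bytesWrittenMap.get(direction) ?? 0) >= limit
  }
}
```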

> I wasn't aware of this Commando limit; are there some references you could point me to so that I can better understand it? My understanding from the PR is that Commando will only allow 1mb of data to be transmitted per connection? When using Lnmessage in Clams, we are constantly re-setting the connection based on the window being in view, so I can see why we have likely not run into this issue yet.

This is where commando sets the buffer size. But it is the buffer size of the data, and I have yet to understand how appending content breaks the data flow completely. However, resetting the connection is the only working solution I have found for now.

There might be a better solution to the problem, and I will be happy to meet if that helps in explaining the issue and the solution. Feel free to refer to the repo which I created for easier debugging. You can reproduce the current issue in under 15 seconds of running this code.

@lnbc1QWFyb24 (Owner)

Sweet, thanks for the additional info. I will try your example repo to replicate and play around with it.

I am worried about the WS hanging around, as Lnmessage handles ping/pong messages to keep it alive, so it might just live on!

I am surprised that CoreLN allows multiple connections per peer, as the spec seems to indicate that there must be one connection per peer, but maybe that refers more to the fact that different peers should not share a single connection. Just curious, have you run listpeers on the node during your testing to see if two entries show up for the same pubkey from Lnmessage?

I am thinking that since it is a buffer limit rather than a connection limit (if I understand it correctly), another approach we could try is to manage how much we add to the buffer and queue up messages when it gets full, rather than resetting the connection.

So we could have a map of reqId -> byte size. When we get a response for a reqId, we can delete it from the map, as I think we can assume that the request is no longer in the Commando buffer since it has processed the request and sent a response.
When we want to make a Commando request, we add up the byte sizes of all in-flight requests from the reqId -> bytes map and check whether this message will tip us over the buffer limit. If we are under, send the request and add it to the map. If we are equal to or over, then queue the message to be sent when the buffer has room for this request.
When we get a response, we remove the reqId from the map, look up queued messages, and see if we have capacity in the buffer to send any of them.
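
A rough sketch of that approach (purely illustrative: the limit constant, the send hook, and the data structures below are assumptions, not Lnmessage's actual internals):

```ts
// Illustrative sketch of the proposed queueing approach.
// COMMANDO_BUFFER_LIMIT and sendToSocket are assumptions for this example.
const COMMANDO_BUFFER_LIMIT = 1024 * 1024 // assumed ~1mb Commando buffer per connection

const inflightBytes = new Map<string, number>() // reqId -> byte size of the request
const queued: { reqId: string; payload: Uint8Array }[] = []

const inflightTotal = () =>
  [...inflightBytes.values()].reduce((sum, bytes) => sum + bytes, 0)

function send(reqId: string, payload: Uint8Array, sendToSocket: (p: Uint8Array) => void) {
  if (inflightTotal() + payload.length < COMMANDO_BUFFER_LIMIT) {
    // Under the limit: record the in-flight bytes and send immediately
    inflightBytes.set(reqId, payload.length)
    sendToSocket(payload)
  } else {
    // At or over the limit: hold the message until responses free up room
    queued.push({ reqId, payload })
  }
}

function onResponse(reqId: string, sendToSocket: (p: Uint8Array) => void) {
  // The node has processed this request, so its bytes should no longer occupy the buffer
  inflightBytes.delete(reqId)

  // Drain as many queued messages as now fit within the limit
  while (queued.length && inflightTotal() + queued[0].payload.length < COMMANDO_BUFFER_LIMIT) {
    const next = queued.shift()!
    inflightBytes.set(next.reqId, next.payload.length)
    sendToSocket(next.payload)
  }
}
```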

Let me know what you think of an approach like that and I can work with you on an implementation to see if it fixes your issue.

Also, I am curious how the bug is showing up in the CoreLN app, and what requests are you making on load that end up being over 1mb of requests?

@lnbc1QWFyb24 (Owner)

Also, if this is blocking your app release and this PR in its current state fixes your issue, I would be happy to release an alpha version of the package with just this fix in the meantime so you can be unblocked. Let me know if that would be helpful. For a proper release I definitely want to understand the problem better and also ensure the solution is as robust as possible.

@lnbc1QWFyb24 changed the title from "Reconnect on reaching max buffer limit" to "Separate Chaining Keys" on May 13, 2023
@lnbc1QWFyb24 (Owner)

Closing in favour of #32
