[bug]: lnd stops responding and becomes unkillable on openbsd 7.2 #7409

Closed
lnproxy opened this issue Feb 15, 2023 · 6 comments
Labels: bug (Unintended code behaviour), needs triage

lnproxy commented Feb 15, 2023

Background

I've been running lnd on openbsd for months with great results. Recently I upgraded to openbsd 7.2 from 7.1 and have not been able to get lnd working since.

lnd will run normally for some time, but eventually it stops adding new lines to .lnd/logs/bitcoin/mainnet/lnd.log, responds to every lncli request with the error below, and does not die even with kill -9:

[lncli] rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: context deadline exceeded"

This never happened on openbsd 7.1 so it's probably not related to: golang/go#34988

Your environment

  • version of lnd: lnd version 0.15.5-beta
  • which operating system (uname -a on *Nix): OpenBSD 7.2 GENERIC.MP#1049 amd64
  • version of btcd, bitcoind, or other backend: Bitcoin Core version v24.0.1

Steps to reproduce

Just start lnd, unlock the wallet, and wait.

Expected behaviour

I'd expect lnd to continue running normally, or at least shut down gracefully.

Actual behaviour

As described above, lnd freezes completely and stops responding, even to OS signals. The PC requires a hard reboot to get back to a functional state. Even calling ps -p on the hung lnd's pid causes the whole system to freeze.

Even with logging at the most detailed level, there's nothing out of place in the logs.

lnproxy added the bug and needs triage labels Feb 15, 2023

jrick commented Feb 15, 2023

you're hitting this: https://marc.info/?t=166694295600001&r=1&w=2

guggero (Collaborator) commented Feb 15, 2023

@jrick thanks a lot for the link! I quickly scanned the conversation and I didn't see any hint of a possible workaround. So this needs to be patched in OpenBSD itself or could something be done on the bbolt side?

EDIT: I just saw the conversation in etcd-io/bbolt#404, so it looks like we can fix this by bumping bbolt to the latest master version. Thanks a lot!

jrick commented Feb 15, 2023

i don't know of any userland workaround, but if you run the bbolt unit tests, it is always the same test that causes the hang. so whatever it is doing, try avoiding that? but the real fix has to be done in the kernel.

the syscall change is unrelated, it won't fix this bug.
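
To pin down which bbolt test triggers the hang, one option is to run the tests with go test -v and look for a test that printed "=== RUN" but never printed a "--- PASS", "--- FAIL" or "--- SKIP" line. The following is a rough sketch under assumptions: the helper name unfinishedTests and the test names in the sample are hypothetical, and it assumes the verbose output was captured somewhere that survives the freeze (for example, on another machine over ssh), since the affected host needs a hard reboot.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// unfinishedTests returns, in start order, the tests that "go test -v"
// reported as started ("=== RUN") but never reported as finished
// ("--- PASS:", "--- FAIL:" or "--- SKIP:").
func unfinishedTests(output string) []string {
	var started []string
	finished := make(map[string]bool)

	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		// Fields drops the indentation that go test adds for subtests.
		fields := strings.Fields(sc.Text())
		if len(fields) < 3 {
			continue
		}
		switch {
		case fields[0] == "===" && fields[1] == "RUN":
			started = append(started, fields[2])
		case fields[0] == "---":
			switch fields[1] {
			case "PASS:", "FAIL:", "SKIP:":
				finished[fields[2]] = true
			}
		}
	}

	var pending []string
	for _, name := range started {
		if !finished[name] {
			pending = append(pending, name)
		}
	}
	return pending
}

func main() {
	// Hypothetical captured output; TestAlpha and TestBeta are made-up names.
	sample := "=== RUN   TestAlpha\n" +
		"--- PASS: TestAlpha (0.01s)\n" +
		"=== RUN   TestBeta\n"
	fmt.Println(unfinishedTests(sample)) // prints [TestBeta]
}
```

If a subtest hangs, its parent test shows up as unfinished too, so the last entry in the result is usually the most specific candidate.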

guggero (Collaborator) commented Feb 15, 2023

> the syscall change is unrelated, it won't fix this bug.

Ah okay, I misinterpreted the comments in the PR.

lnproxy (Author) commented Feb 15, 2023

Thank you @jrick and @guggero. I'll close the issue since it seems not to be an lnd bug.

lnproxy closed this as completed Feb 15, 2023

lnproxy (Author) commented Mar 9, 2023

In case anyone runs into this: lnd works great on the latest openbsd snapshot. You can upgrade to it with sysupgrade -s.
