Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConnectionBalancer Bug #16113

Closed
dmsqls opened this issue Apr 22, 2021 · 4 comments
Closed

ConnectionBalancer Bug #16113

dmsqls opened this issue Apr 22, 2021 · 4 comments

Comments

@dmsqls
Copy link

dmsqls commented Apr 22, 2021

Hi. Report a bug in ConnectionBalancer

When pickTargetHandler is called, Count 1 increases as shown below.

BalancedConnectionHandler&
ExactConnectionBalancerImpl::pickTargetHandler(BalancedConnectionHandler&) {
--- skip ---
    min_connection_handler->incNumConnections();             ///< increment connection count
  }

  return *min_connection_handler;
}

However, the connection count is always 0, and the connection balancer is not processed as expected. (See log below)

[DEBUG] WORKER[0] CONNECTION COUNT[0]
[DEBUG] WORKER[1] CONNECTION COUNT[0]
[DEBUG] incNumConnections!!!!!!            ///< increment count 1
[DEBUG] decNumConnections!!!!!!           ///< decrement ??
[DEBUG] incNumConnections!!!!!!           ///< increment count 1 ???
///< next connection receive
[DEBUG] WORKER[0] CONNECTION COUNT[0]      //< why 0???
[DEBUG] WORKER[1] CONNECTION COUNT[0]
[DEBUG] incNumConnections!!!!!!
[DEBUG] decNumConnections!!!!!!
[DEBUG] incNumConnections!!!!!!

Temporarily modified as follows and it resolved.

BalancedConnectionHandler&
ExactConnectionBalancerImpl::pickTargetHandler(BalancedConnectionHandler&) {
--- skip ---
    min_connection_handler->incNumConnections();
#ifdef _BUG_CONN_BALANCER_ 
    min_connection_handler->incNumConnections();            ///< add
#endif

  }

  return *min_connection_handler;
}

Please review.

Thanks.

Reference information

Envoy Binary Version
./envoy version: 430fb4cd4004831170885184066045a3902acc4b/1.17.0/Modified/RELEASE/BoringSSL

Configuration

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: test-filter
  namespace: istio-system
spec:
  configPatches:
  - applyTo: LISTENER
    match:
      context: SIDECAR_OUTBOUND
      listener: {}
    patch:
      operation: MERGE
      value:
        connection_balance_config:
          exact_balance: {}
@dmsqls dmsqls added bug triage Issue requires triage labels Apr 22, 2021
@mattklein123
Copy link
Member

cc @lambdai I think this is the bug you are currently fixing?

@asraa asraa added area/connection and removed triage Issue requires triage labels Apr 23, 2021
@dmsqls
Copy link
Author

dmsqls commented Apr 26, 2021

cc @lambdai I think this is the bug you are currently fixing

No, I'm not currently fixing it.

@lambdai
Copy link
Contributor

lambdai commented Apr 26, 2021

@dmsqls Thanks for the analysis.
Will be fixed by #15842

Before the fix, only the balance depends on the connections owned by the virtualoutbound listener. The number of connections is usually 0 there.
With my fix and the envoy filer crd, the connection will be balanced as your expectation

@dmsqls
Copy link
Author

dmsqls commented Apr 27, 2021

Thank you. It works as expected. 👍

@dmsqls dmsqls closed this as completed Apr 27, 2021
htuch pushed a commit that referenced this issue Apr 29, 2021
…15842)

If listener1 redirects the connection to listener2, the balancer field in listener2 decides whether to rebalance.
Previously we rely on the rebalancing at listener1, however, the rebalance is weak because listener1 is likely to
not own any connection and the rebalance is no-op.

Risk Level: MID. Rebalance may introduce latency. User needs to clear rebalancer field of listener2 to recover the original behavior.

Fix #15146 #16113

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
gokulnair pushed a commit to gokulnair/envoy that referenced this issue May 6, 2021
…nvoyproxy#15842)

If listener1 redirects the connection to listener2, the balancer field in listener2 decides whether to rebalance.
Previously we rely on the rebalancing at listener1, however, the rebalance is weak because listener1 is likely to
not own any connection and the rebalance is no-op.

Risk Level: MID. Rebalance may introduce latency. User needs to clear rebalancer field of listener2 to recover the original behavior.

Fix envoyproxy#15146 envoyproxy#16113

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Gokul Nair <gnair@twitter.com>
gokulnair pushed a commit to gokulnair/envoy that referenced this issue May 6, 2021
…nvoyproxy#15842)

If listener1 redirects the connection to listener2, the balancer field in listener2 decides whether to rebalance.
Previously we rely on the rebalancing at listener1, however, the rebalance is weak because listener1 is likely to
not own any connection and the rebalance is no-op.

Risk Level: MID. Rebalance may introduce latency. User needs to clear rebalancer field of listener2 to recover the original behavior.

Fix envoyproxy#15146 envoyproxy#16113

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Gokul Nair <gnair@twitter.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants