USHIFT-4377: introduce microshift ingress performance enhancement
Signed-off-by: Evgeny Slutsky <eslutsky@redhat.com>
eslutsky committed Oct 7, 2024
1 parent 58f0eaa commit cfccfa8
Showing 1 changed file with 205 additions and 0 deletions.
205 changes: 205 additions & 0 deletions enhancements/microshift/microshift-router-configuration-performance.md
@@ -0,0 +1,205 @@
---
title: microshift-router-configuration
authors:
- "@eslutsky"
reviewers:
- "@pacevedom"
- "@copejon"
- "@ggiguash"
- "@pmtk"
- "@pliurh"
- "@Miciah"
approvers:
- "@jerpeter1"
api-approvers:
- None
creation-date: 2024-09-23
last-updated: 2024-10-07
tracking-link:
- https://issues.redhat.com/browse/USHIFT-4091
---

# MicroShift Router Operations & Performance Configuration Options

## Summary
MicroShift's default router is created as part of the platform, but it does not
allow any of its specific parameters to be configured. For example, you cannot
specify the policy for HTTP traffic compression or enable the HTTP/2 protocol.

To allow these operations and many more, a set of configuration options is
proposed.

Configuration changes will propagate to environment variables in the router deployment [manifest](https://github.com/pacevedom/microshift/blob/8f76e21b9a3f0044c83eae4e2c177871c8235103/assets/components/openshift-router/deployment.yaml).

The options are taken from the operator CRDs.
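
As an illustration of that propagation, a tuning option could surface as an environment variable on the router container. The fragment below is a sketch only: the variable names are assumed from the upstream OpenShift router and are not specified by this proposal, and the rest of the Deployment is omitted.

```yaml
# Hypothetical excerpt of the router container spec in the Deployment manifest.
# Variable names are an assumption based on the upstream OpenShift router.
env:
  # Would carry ingress.tuningOptions.serverTimeout
  - name: ROUTER_DEFAULT_SERVER_TIMEOUT
    value: "30s"
  # Would carry ingress.tuningOptions.serverFinTimeout
  - name: ROUTER_DEFAULT_SERVER_FIN_TIMEOUT
    value: "1s"
  # Would carry ingress.tuningOptions.maxConnections
  - name: ROUTER_MAX_CONNECTIONS
    value: "50000"
```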

## Motivation

Customers need the ability to adjust how long HAProxy holds connections open,
either by extending the timeout to accommodate slower backends or clients, or by
shortening the timeout, allowing connections to be closed more aggressively.


### User Stories

#### User Story 1

My application starts processing requests from clients, but the connection is
getting closed before it can respond.

I set `ingress.tuningOptions.serverTimeout` in the configuration file to a
higher value to accommodate the slow response from the server.
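
A minimal sketch of that change, assuming the option sits under `ingress.tuningOptions` as in the API Extensions section. The `2m` value is purely illustrative.

```yaml
ingress:
  tuningOptions:
    # Illustrative value, raised from the proposed 30s default
    serverTimeout: 2m
```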

#### User Story 2

The router has many connections open because an application running on my
cluster doesn't close connections properly.

I set `ingress.tuningOptions.serverTimeout` and
`ingress.tuningOptions.serverFinTimeout` in the configuration file to a lower
value, forcing those connections to close sooner if my application stops
responding to them.
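
A minimal sketch of that change, assuming both options live under `ingress.tuningOptions` in the MicroShift configuration file, as in the API Extensions section. The values are illustrative, not recommendations.

```yaml
ingress:
  tuningOptions:
    # Illustrative values, lowered from the proposed 30s and 1s defaults
    serverTimeout: 10s
    serverFinTimeout: 500ms
```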



### Goals
Allow users to configure additional HAProxy/router performance parameters; see the Proposal table for details.


### Non-Goals
N/A

## Proposal

See the table below for the proposed configuration changes:

| new configuration | description | default |
| -------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| httpCompressionMimeTypes                     | list of MIME types that should have compression applied. <br>At least one MIME type must be specified. | none    |
| forwardedHeaderPolicy                        | specifies when and how the Ingress Controller sets the `Forwarded`, `X-Forwarded-For`, `X-Forwarded-Host`, `X-Forwarded-Port`, `X-Forwarded-Proto`, and `X-Forwarded-Proto-Version` HTTP headers. | Append  |
| tuningOptions -> headerBufferBytes | describes how much memory should be reserved (in bytes) for IngressController connection sessions. | 32768 |
| tuningOptions -> headerBufferMaxRewriteBytes | describes how much memory should be reserved from headerBufferBytes for HTTP header rewriting and appending for IngressController connection sessions. | 8192 |
| tuningOptions -> healthCheckInterval | defines how long the router waits between two consecutive health checks on its configured backends. | 5000ms |
| tuningOptions -> clientTimeout               | defines how long a connection will be held open while waiting for a client response. | 30s     |
| tuningOptions -> clientFinTimeout | defines how long a connection will be held open while waiting for the client response to the server/backend closing the connection. | 1s |
| tuningOptions -> serverTimeout | defines how long a connection will be held open while waiting for a server/backend response. | 30s |
| tuningOptions -> serverFinTimeout | defines how long a connection will be held open while waiting for the server/backend response to the client closing the connection. | 1s |
| tuningOptions -> tunnelTimeout | defines how long a tunnel connection (including websockets) will be held open while the tunnel is idle. | 1h |
| tuningOptions -> tlsInspectDelay | defines how long the router can hold data to find a matching route. | 5s |
| tuningOptions -> threadCount                 | defines the number of threads created per HAProxy process. | 4       |
| tuningOptions -> maxConnections | defines the maximum number of simultaneous connections that can be established per HAProxy process. | 50000 |
| LogEmptyRequests | specifies how connections on which no request is received should be logged. | Log |
| HTTPEmptyRequestsPolicy | indicates how HTTP connections for which no request is received should be handled. | Respond |
| HTTP2IsEnabled                               | enables HTTP/2 in the ingress controller. | false   |

### Workflow Description
**cluster admin** is a human user responsible for configuring a MicroShift
cluster.

1. The cluster admin adds specific configuration for the router prior to
MicroShift's start.
2. After MicroShift starts, the system ingests the configuration and sets
everything up according to it.


### API Extensions
As described in the proposal, there is an entirely new section in the configuration:
```yaml
ingress:
  tuningOptions:
    headerBufferBytes: 32768
    headerBufferMaxRewriteBytes: 8192
    healthCheckInterval: 5000ms
    clientTimeout: 30s
    clientFinTimeout: 1s
    serverTimeout: 30s
    serverFinTimeout: 1s
    tunnelTimeout: 1h
    tlsInspectDelay: 5s
    threadCount: 4
    maxConnections: 50000
  httpCompressionMimeTypes: none
  LogEmptyRequests: Log
  forwardedHeaderPolicy: Append
  HTTPEmptyRequestsPolicy: Respond
  HTTP2IsEnabled: false
```
For more information check each individual section.
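
The block above lists the proposed defaults. As a sketch of overriding the non-timeout options, the fragment below enables compression for two MIME types and turns on HTTP/2. Several points are assumptions rather than part of this proposal: that these keys sit directly under `ingress`, that `httpCompressionMimeTypes` takes a YAML list, and that `Replace` and `Ignore` are accepted values (they are borrowed from the OpenShift IngressController API).

```yaml
ingress:
  # Assumed list form; the MIME types are illustrative
  httpCompressionMimeTypes:
    - text/html
    - application/json
  # Assumed values, borrowed from the OpenShift IngressController API
  forwardedHeaderPolicy: Replace
  LogEmptyRequests: Ignore
  HTTPEmptyRequestsPolicy: Ignore
  HTTP2IsEnabled: true
```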
### Topology Considerations
TBD
#### Standalone Clusters
N/A
#### Single-node Deployments or MicroShift
Enhancement is solely intended for MicroShift.
### Implementation Details/Notes/Constraints
### Risks and Mitigations
TBD
### Drawbacks
- Setting a timeout too high may keep dead connections open for longer, which
adds to the memory footprint of the router.
- Setting a timeout too low can close connections before the server or client
has had enough time to respond.
## Design Details
TBD
## Open Questions
N/A
## Test Plan
All of the changes listed here will be included in the current e2e scenario
testing harness in MicroShift.
## Graduation Criteria
Targeting GA for the MicroShift 4.18 release.
### Dev Preview -> Tech Preview
- Ability to utilize the enhancement end to end
- End user documentation, relative API stability
- Sufficient test coverage
### Tech Preview -> GA
- More testing (upgrade, downgrade)
- Sufficient time for feedback
- Available by default
- User facing documentation created in [openshift-docs](https://github.com/openshift/openshift-docs/)
### Removing a deprecated feature
N/A
## Upgrade / Downgrade Strategy
When upgrading from 4.17 or earlier to 4.18, the new configuration fields will remain
unset, causing the existing defaults to be used.
When downgrading from 4.18 to 4.17 or earlier, the specified timeout values will
be discarded, and the previous defaults will be used.
## Version Skew Strategy
N/A
## Operational Aspects of API Extensions
### Failure Modes
TBD
## Support Procedures
N/A
## Implementation History
N/A
## Alternatives
N/A
