Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve FlowControllingHttpContentProducer state machine #118

Closed
mikkokar opened this issue Mar 23, 2018 · 0 comments
Closed

Improve FlowControllingHttpContentProducer state machine #118

mikkokar opened this issue Mar 23, 2018 · 0 comments

Comments

@mikkokar
Copy link
Contributor

mikkokar commented Mar 23, 2018

The problem

Confusing warning messages from the FlowControllingHttpContentProducer when origin server closes the TCP connection:

    WARN  com.hotels.styx.client.netty.connectionpool.FlowControllingHttpContentProducer 2018-03-03 02:02:02,000 [Styx-Client-Worker-14-Thread] 
    - message="Inappropriate event=ChannelIdleEvent", prefix=myserver.com/1.1.5.1:22222 -> /2.1.1.2:55555, Request(method=POST, url=/some/page.html, i0=00000000-0000-0000-0000-000000000000), state=TERMINATED, receivedChunks=0, receivedBytes=0, emittedChunks=0, emittedBytes=0

This suggests the TCP connection is closed due to idleness. But in fact the warning also pops up when the channel has been subjected to a "delayed closure". This happens when the origin server closes the TCP connection to indicate response content length (See RFC 7230, Chapter 3.3.3. Message Body Length, item 7.) Tomcat apps have been seen to behave this way when they respond with HTTP error status code (4xx or 5xx).

A clarification regarding "delayed closure". It is the Flow Controller state machine associated to the TCP connection that is subjected to the delayed closure. The TCP connection itself is closed immediately. The Flow Controller state machine regulates the delivery of the HTTP message body to the rest of the styx honouring the back pressure and Rx.Java observable protocol. The purpose of the delayed closure is to allow styx server some time to consume the response in order to proxy it back.

Detailed description

Back in the days when we realised it was necessary to implement the delayed closure as described above, we hijacked the FlowController's "idle state event" signal to inform the state machine to tear down any resources. This is in NettyToStyxResponsePropagator.java:

    @Override
    public void channelInactive(ChannelHandlerContext ctx) throws Exception {
        ensureContentProducerIsCreated(ctx);

        ctx.channel().eventLoop().schedule(
                () -> contentProducer.ifPresent(producer -> producer.idleStateEvent(
                        new ResponseTimeoutException(
                                origin,
                                "channelInactive",
                                producer.receivedBytes(),
                                producer.receivedChunks(),
                                producer.emittedBytes(),
                                producer.emittedChunks()))),
                idleTimeoutMillis,
                MILLISECONDS);

        TransportLostException cause = new TransportLostException(ctx.channel().remoteAddress(), origin);
        contentProducer.ifPresent(producer -> producer.channelInactive(cause));
    }

The misleading error message is caused by this re-use of the idle state event, and the time has come to address this issue.

Acceptance criteria

  • Introduce a new delayed teardown (or something similar) event to replace the usage of idle state event to tear down the Flow Controller state machine.
  • The handling of delayed teardown in the Flow Controller state machine:
    • The 'delayed teardown' must be handled in BUFFERING_COMPLETED and EMITTING_BUFFERED_CONTENT states.
    • In BUFFERING_COMPLETED state: release all buffered content and transition to TERMINATED state.
    • In EMITTING_BUFFERED_CONTENT state, release all buffered content, emit error on content observable, and transition to TERMINATED state.
    • In COMPLTED state, swallow the event to prevent an unnecessary warning message.
    • Leave it unhandled in all other states, so that an informational warning message is logged.
  • The 'idle state event' is only used to inform the state machine about prolonged inactivity on the TCP connection.
@dvlato dvlato self-assigned this Mar 26, 2018
dvlato pushed a commit to dvlato/styx that referenced this issue Mar 26, 2018
…te machine.

New event DelayedTearDownEvent created to replace IdleStateEvent.
dvlato pushed a commit to dvlato/styx that referenced this issue Mar 27, 2018
…te machine.

New event DelayedTearDownEvent created to replace IdleStateEvent.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants