Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gzip middleware leads to warp "Client closed connection prematurely" #997

Open
pbrisbin opened this issue Jun 28, 2024 · 3 comments
Open

Comments

@pbrisbin
Copy link
Member

We have an app using WAI, Warp, and hs-opetelemetry-instrumentation-wai and -yesod.

Ever since making this change:

Our open-telemetry tracing reports a high number of 200s with the error:

InvalidRequest: Warp: Client closed connection prematurely

new versions increasing error-rate

increase in "200" errors

As far as we can tell, the client of such requests receives a 200 (green in this trace) but the server continues to run beyond then (orange) and reports the error.

It's as if the response is returned, but then the request handling in the service doesn't end. It would make sense the LB (satisfied with the 200 it has already forwarded along) closes the connection, and I guess that would generate this error... but why? And why only once we started on-the-fly gzipping?

Since clients are getting 200s, there is no actual negative effects here, but the errors are confusing too see and contribute to our error-rate metrics causing false-positive alerts.

@pbrisbin
Copy link
Member Author

pbrisbin commented Jul 2, 2024

I did a little more digging in the code. It seems the function in question is:

compressE
    :: Response
    -> (Response -> IO ResponseReceived)
    -> IO ResponseReceived
compressE res sendResponse =
    wb $ \body -> sendResponse $
        responseStream s (fixHeaders hs) $ \sendChunk flush -> do
            (blazeRecv, _) <- B.newBuilderRecv B.defaultStrategy
            deflate <- Z.initDeflate 1 (Z.WindowBits 31)
            let sendBuilder builder = do
                    popper <- blazeRecv builder
                    fix $ \loop -> do
                        bs <- popper
                        unless (S.null bs) $ do
                            sendBS bs
                            loop
                sendBS bs = Z.feedDeflate deflate bs >>= deflatePopper
                flushBuilder = do
                    sendBuilder Blaze.flush
                    deflatePopper $ Z.flushDeflate deflate
                    flush
                deflatePopper popper = fix $ \loop -> do
                    result <- popper
                    case result of
                        Z.PRDone -> return ()
                        Z.PRNext bs' -> do
                            sendChunk $ byteString bs'
                            loop
                        Z.PRError e -> throwIO e

            body sendBuilder flushBuilder
            sendBuilder Blaze.flush
            deflatePopper $ Z.finishDeflate deflate
  where
    (s, hs, wb) = responseToStream res

So it seems enabling compression changes the response to streaming-fashion. This could explain how the ALB is getting a complete 200 response before this code is finished whatever it's doing. If it decides to close the connection at that point, I think we have our answer. I don't really know how to fix it though.

@pbrisbin
Copy link
Member Author

pbrisbin commented Jul 2, 2024

I wonder if using GzipCacheFolder (which is just GzipCompress but caching things in some folder) would effectively work around the issue. Using that construct won't switch the response to streaming and, once any given response has been compressed/cached, it'll just be serving the file.

@pbrisbin
Copy link
Member Author

pbrisbin commented Jul 9, 2024

Update: GipCacheFolder does work much better. But it has introduced different errors that seem like a similar situation (200 response with an exception):

IOException: Network.SendFile.Linux.sendfileloop: resource vanished (Connection reset by peer)

They happen much less frequently though, and seem limited to specific kinds of requests (large responses, and from only frontend/JS clients).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant