
Server returns 400: "invalid character in chunk size header" for valid request #5220

Closed
romasku opened this issue Nov 11, 2020 · 15 comments · Fixed by #7764


romasku commented Nov 11, 2020

🐞 Describe the bug
When the server ignores the content of a request with Transfer-Encoding: chunked, parsing of the next request on the same connection can fail.

💡 To Reproduce

The following script:

import asyncio

from aiohttp import web, ClientSession


async def start_server():
    async def hello(request):
        return web.Response(text="OK", status=200)

    app = web.Application()
    app.add_routes([web.get('/', hello)])
    app.add_routes([web.put('/', hello)])

    runner = web.AppRunner(app)
    await runner.setup()
    site = web.TCPSite(runner, 'localhost', 8080)
    await site.start()


async def main():
    await start_server()

    async def data_gen():
        for _ in range(10):
            yield b'just data'
            await asyncio.sleep(0.1)  # Make sending of content slow

    async with ClientSession() as session:
        while True:
            # Do put
            async with session.put("http://localhost:8080", data=data_gen()):
                pass
            # Do get
            async with session.get("http://localhost:8080") as resp:
                if resp.status != 200:
                    print("failed with status: ", resp.status)
                    print("content is: ", await resp.content.read())
                    return
            await asyncio.sleep(0.1)

asyncio.run(main())

Prints the following:

$ python3 reproduce.py 
Error handling request
Traceback (most recent call last):
  File ".../venv/lib/python3.7/site-packages/aiohttp/web_protocol.py", line 315, in data_received
    messages, upgraded, tail = self._request_parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 546, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message='invalid character in chunk size header'
failed with status:  400
content is:  b'invalid character in chunk size header'

If I add await request.content.read() to the handler, the problem disappears.
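For reference, a minimal sketch of that workaround, draining the body inside the handler from the reproducer above:

async def hello(request):
    # Drain the (possibly chunked) request body so the connection is left
    # at a clean message boundary before it is reused for the next request.
    await request.content.read()
    return web.Response(text="OK", status=200)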

💡 Expected behavior

The server should not fail to parse a valid request.

📋 Your version of Python

$ python --version
Python 3.7.9

📋 Your version of the aiohttp/yarl/multidict distributions

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.7.2
...
$ python -m pip show multidict
Name: multidict
Version: 4.7.6
...
$ python -m pip show yarl
Name: yarl
Version: 1.6.2
...
romasku added the bug label Nov 11, 2020
@asvetlov (Member)

Thanks for the reproducer.
I recall bug reports that sound similar to your issue, but they lacked details; your script can definitely help pin the problem down.

@asvetlov (Member)

I suspect web_protocol.py doesn't close the socket when a web handler finishes while part of the processed request's body is still unread. That easily confuses the HTTP parser.

derlih (Contributor) commented Nov 14, 2020

I've reproduced this issue with:

import asyncio

from aiohttp import web


async def test_5220(aiohttp_client) -> None:
    async def handler(request):
        return web.Response(text="OK", status=200)

    app = web.Application()
    app.add_routes([web.get("/", handler)])
    app.add_routes([web.put("/", handler)])

    client = await aiohttp_client(app)

    async def data_gen():
        for _ in range(10):
            yield b"just data"
            await asyncio.sleep(0.1)

    async with client.put("/", data=data_gen(), raise_for_status=True):
        pass
    async with client.get("/", raise_for_status=True):
        pass
ERROR    aiohttp.server:web_protocol.py:425 Error handling request
Traceback (most recent call last):
  File "/home/dmitry/Sources/aiohttp/aiohttp/web_protocol.py", line 345, in data_received
    messages, upgraded, tail = self._request_parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 546, in aiohttp._http_parser.HttpParser.feed_data
    raise ex
aiohttp.http_exceptions.BadHttpMessage: 400, message='invalid character in chunk size header'

derlih (Contributor) commented Nov 14, 2020

Here is the packet dump: dump.pcapng.gz
It looks like the client continues to execute the PUT after it has started the GET.

greshilov (Contributor) commented Nov 14, 2020

This issue is really tricky.

I'll try to explain what is happening!

First of all, the PUT request is chunked because of the data=data_gen() argument. Here is what the first portion of the data looks like:

PUT / HTTP/1.1
Host: localhost:8080
Accept: */*
Accept-Encoding: gzip, deflate
User-Agent: Python/3.8 aiohttp/4.0.0a1
Content-Type: application/octet-stream
Transfer-Encoding: chunked

9
just data

After sending this data the client stops, but the connection is not closed, because HTTP/1.1 connections are keep-alive by default.
Here's where the first problem comes in: the client never sends the terminating b'0' chunk before finishing the request.
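For context, a complete chunked body has to end with a zero-length chunk (RFC 7230, section 4.1). A small sketch of the framing as Python byte strings, matching the dump above:

# Chunked framing: <size in hex>\r\n<payload>\r\n, terminated by a zero-size chunk.
chunk = b"9\r\njust data\r\n"    # what the client sends for each piece of data
terminator = b"0\r\n\r\n"         # the final chunk that is never sent in this scenario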

On the server side, however, this wouldn't be an issue if the lingering mechanism weren't enabled by default.
Lingering was introduced in #1050. Its whole idea is to keep reading from the socket after the handler has finished, to prevent errors on the client side (read more in the original PR).

So, as you may have guessed, when the client starts the GET request, the server is still in the lingering state, with the parser waiting for the next chunk of data or the b'0' terminator from the previous request :(
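If lingering is indeed the trigger, one quick way to check (a sketch, assuming web.AppRunner forwards the lingering_time keyword down to the underlying request handler, where 0 disables lingering) is to run the reproducer with lingering turned off and see whether the 400 disappears:

    # In start_server() from the reproducer above; lingering_time=0 is an
    # assumption about the keyword being forwarded to the RequestHandler.
    runner = web.AppRunner(app, lingering_time=0)
    await runner.setup()
    site = web.TCPSite(runner, 'localhost', 8080)
    await site.start()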

I've tried to fix this in #5238

@asvetlov (Member)

@greshilov thanks for digging into the problem.

I think #5238 fixes the wrong side.
Maybe it makes sense (let me double-check the PR), but I believe the server should correctly process even such "malformed" requests as the ones in the reproducers above.

While reviewing the web_protocol.py code I found that the lingering_time handling (written by myself, BTW) is buggy and can be improved. While writing a patch I realized that the whole protocol class can be significantly simplified.
Thanks to #4771, the master and 3.8 versions of this file are almost identical, so the change can be backported easily.

So, my plan is:

  1. Refactor error handling to avoid spawning a task for handle_parse_error(); instead, push a special "error" message into self._messages and process it in the normal way (see the toy sketch after this list). Working with two potentially concurrent tasks (one for regular request handling and another for parser error processing) is hard.
  2. Move the lingering code into the recently added finish_response() method; always use that method for sending the response and for half-closing, then drop the connection after a timeout, as RFC 7230 suggests (Lingering close in ServerHttpProtocol (was #927) #1050).
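A toy, purely illustrative sketch of point 1 (this is not the real web_protocol.py code; all names except self._messages / handle_parse_error() are invented): the parser error becomes just another message in the queue, handled in order by the same loop instead of a second concurrent task.

import asyncio
from dataclasses import dataclass


@dataclass
class ParseError:
    reason: str


async def request_loop(messages: asyncio.Queue) -> None:
    # Single consumer: regular requests and parser errors flow through the
    # same queue, so there is no separate error-handling task to coordinate.
    while True:
        msg = await messages.get()
        if isinstance(msg, ParseError):
            print("send 400 and close the connection:", msg.reason)
            break
        print("handle request:", msg)


async def main() -> None:
    messages: asyncio.Queue = asyncio.Queue()
    await messages.put("GET / HTTP/1.1")
    await messages.put(ParseError("invalid character in chunk size header"))
    await request_loop(messages)


asyncio.run(main())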

@webknjaz (Member)

Ref: #5269.

@greshilov (Contributor)

@asvetlov glad to hear it! In addition to the new, correct behaviour on the server, shouldn't we also protect aiohttp client users in such cases?

async with client.put("/", data=data_gen(), raise_for_status=True):
    pass
# <-- At this point, after the context manager exits, the client is still connected and has an uncompleted request!
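Until something like that lands, a possible client-side mitigation is simply not to reuse connections at all (a sketch, relying on TCPConnector's force_close flag, which closes the socket after each request instead of keeping it alive):

import aiohttp


async def make_session() -> aiohttp.ClientSession:
    # force_close=True opens a fresh connection per request, so an
    # unterminated chunked PUT cannot poison the parsing of the next request.
    return aiohttp.ClientSession(connector=aiohttp.TCPConnector(force_close=True))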

@asvetlov (Member)

I think on both sides we should drop the connection if the payload was not fully read, even when keep-alive is enabled. The only difference is that the server side should use the half-close → linger → close sequence; the client can just drop the connection immediately.
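A toy sketch of that half-close → linger → close sequence on the server side (illustrative only, written against plain asyncio transports rather than aiohttp's actual code):

import asyncio


async def halfclose_linger_close(transport: asyncio.Transport, lingering_time: float = 10.0) -> None:
    # 1. Half-close: stop sending but keep the read side open.
    if transport.can_write_eof():
        transport.write_eof()
    # 2. Linger: give the peer time to finish sending (real code would keep
    #    reading and discarding incoming data during this window).
    await asyncio.sleep(lingering_time)
    # 3. Close: finally drop the connection.
    transport.close()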

@Evanyifan

I think on both sides we should drop the connection if the payload was not fully read, even when keep-alive is enabled. The only difference is that the server side should use the half-close → linger → close sequence; the client can just drop the connection immediately.

Is there a ready-made solution for this problem now? I have also encountered the same problem.
#5443 (comment)


luochen1990 commented Aug 29, 2023

IMHO, if this issue is not easy to fix, then the workaround of os.environ.setdefault('AIOHTTP_NO_EXTENSIONS', '1') should be applied by default.

@Dreamsorcerer (Member)

I don't see how that's relevant...

@luochen1990

@Dreamsorcerer Sorry, I've added the related post now.

Dreamsorcerer (Member) commented Aug 29, 2023

Still not sure how that's relevant. That is an issue where there actually is an invalid character in the header (when read client-side), and it was closed because we've enabled lenient options on the llhttp parser (though it may currently still be an issue if a response doesn't use CRLF for line breaks: nodejs/llhttp#241). This issue is about reading from a keep-alive connection that should have been closed (on the server side).

@luochen1990

Thank you for your explanation. I agree with you now. It was me who confused these two issues. Should I delete my posts or just keep them here?

Dreamsorcerer added a commit that referenced this issue Nov 3, 2023
Fixes #5220.

I believe this is a better fix than #5238. That PR detects that we
didn't finish sending a chunked response and then closes the connection.
This PR ensures that we simply complete the chunked response by sending
the EOF bytes, allowing the connection to remain open and be reused
normally.
Dreamsorcerer added two backport commits with the same message (cherry picked from commit 9c07121) on Nov 3, 2023, and xiangxli pushed a commit to xiangxli/aiohttp that referenced this issue on Dec 4, 2023.