
Explain why example docker-compose.yml file is not suited for production #16041

Closed
david-woelfle opened this issue May 25, 2021 · 7 comments
Labels
kind:feature Feature Requests

Comments

@david-woelfle

Hi everyone,

thank you so much for your outstanding work on this project! While going through the docs I found the section on Running Airflow in Docker, where a docker-compose.yml file is provided. In the header of this file it is stated:

WARNING: This configuration is for local development. Do not use it in a production deployment.

Could you explain why this is the case? That is, why should this setup not be used in production, and what must be done or changed if someone wants to run a production Airflow environment based on this configuration example?

@david-woelfle david-woelfle added the kind:feature Feature Requests label May 25, 2021
@boring-cyborg

boring-cyborg bot commented May 25, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@JavierLopezT
Contributor

+1 to this explanation.

For the record, we are using docker-compose in production for almost 100 DAGs, and our deployment is very stable and performs well.

@potiuk
Member

potiuk commented May 25, 2021

Publishing something that is labelled as production-ready is a lot of effort, and the maintenance effort required of the community to maintain it is much bigger. That's why we have to be very careful about labeling something as "production-ready" when we officially publish it as a community. We have to be prepared to support all kinds of users, respond to their issues, fix them, and possibly add new features continuously.

Just look at how much time it took to graduate the Helm chart (3 people worked full time for the last ~2 months, I believe, to get it to the state where we could label it "officially ready"). It took me a few months to release the first version of the production Docker images, then over a year to iterate on them, update them, respond to issues, and handle many cases that were initially unforeseen but that users raised, and that we responded to and added. And only now, I think, are we close to making it the image released as the "Official Docker image" (#10107). Just the project to get it there, https://github.com/apache/airflow/projects/3, has 35 issues in the "Done" state, and two more are needed to complete it.

Different users have different expectations, configurations, databases, executors, deployments, scalability requirements, etc. What works for you, @JavierLopezT, might not work for 100 other users, and they might have different expectations. Don't forget how opinionated you are in the way you run YOUR deployment, and how those opinions may be different for many other people.

When we label something as "production-ready" we should be ready to respond to such issues. We need automated tests covering regressions in case anything changes, we need a formal release process, and we need to be able to analyse, diagnose, and fix problems when they arise.

The current docker-compose is far from production-ready. It is more of a "quick start" if you want to try Airflow. No more, no less. A number of issues have already been raised along the lines of "The docker-compose does not work with LocalExecutor" or "The docker-compose does not work with MySQL". And yes, it does not, and this is by design. It is not supposed to. It's not production-ready. That is not its purpose.

There are other issues that are already created around that:

I think what @mik-laj proposed, a wizard that generates a docker-compose file based on the user's expectations, is a good start in the direction of a "production-ready docker-compose". But we have a long way to go to get there.

@potiuk potiuk closed this as completed May 25, 2021
@david-woelfle
Author

Thank you @potiuk for the detailed answer! Now it makes much more sense to me.

@JavierLopezT
Contributor

Crystal clear, @potiuk. Is it OK with you if I open an MR with a summary of your words, for inclusion in the docker-compose file?

@potiuk
Member

potiuk commented May 25, 2021

Hmm, I think it would have to be a general description rather than one specific to docker-compose. We have a couple of those "not production ready" things in the Airflow code, and I am not sure where to put it.

But maybe we could indeed write some words about it in the README or somewhere. I am not sure what others think about it. Anyone from @apache/airflow-committers have an opinion?

@mik-laj
Member

mik-laj commented May 25, 2021

This file is also missing a few things that make this docker-compose unsafe for production:

  • CPU/memory resource limits: each container has access to all system resources.
  • SSL: the connection to the container should be encrypted by Traefik or another proxy, or we should configure SSL in the webserver ([webserver] web_server_ssl_* options).
  • Containers use the local file system, but we should use volumes in a production environment.

We should also mention the possible ways of deploying DAGs, e.g. Git Sync.
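As a rough illustration of the hardening points above, a production-oriented compose file might add resource limits, named volumes, and a git-sync sidecar for DAG deployment. The fragment below is a sketch only, not an official Airflow configuration; image tags, repository URL, and service names are example values:

```yaml
# Illustrative fragment only; not an official Airflow configuration.
services:
  airflow-webserver:
    image: apache/airflow:2.1.0
    deploy:
      resources:
        limits:          # cap CPU/memory instead of letting the container
          cpus: "2.0"    # consume all host resources
          memory: 2g
    volumes:
      - logs:/opt/airflow/logs      # named volume instead of the local filesystem
  dag-sync:
    # Example git-sync sidecar that pulls DAGs into a shared volume.
    image: k8s.gcr.io/git-sync/git-sync:v3.3.0
    args: ["--repo=https://example.com/dags.git", "--root=/dags"]
    volumes:
      - dags:/dags
volumes:
  logs:
  dags:
```

SSL termination would additionally be handled either by a proxy such as Traefik in front of the webserver, or by the [webserver] web_server_ssl_* options mentioned above.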

I also recommend the recent discussion on Slack, where I explained the assumptions of this guide:
https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1621801810231000?thread_ts=1621711385.211600&cid=CCQ7EGB1P

This example docker-compose file has been prepared for the most popular configuration. I've only tested it with CeleryExecutor. As for other executor configurations, I think that is beyond the scope of this guide. The purpose of this guide was to make the first launch of Airflow easy for someone who is unfamiliar with it, so that they can test and check how Airflow works.
One way to do this is to limit the actions the user has to take. Now, to start Airflow they just need to run two simple commands:

curl...
docker-compose up

You don't need to select a database engine or executor, or set other configuration options. You don't even need to know what those are to run Airflow.
We should prepare separate guides on how to configure docker-compose for other setups. I even started working on a tool that would let us generate several docker-compose file sets based on user-supplied options, but I stopped working on it when Polidea was acquired by Snowflake.
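A very rough sketch of what such a generator could look like: it would take a few user choices (executor, database) and emit a compose structure. Every function and option name below is invented for illustration and is not from any real tool:

```python
# Hypothetical sketch of a docker-compose "wizard"; all names here are
# invented for illustration and are not part of any real Airflow tooling.

def generate_compose(executor: str = "CeleryExecutor",
                     database: str = "postgres") -> dict:
    """Return a minimal compose structure for the chosen options."""
    services = {
        "airflow-webserver": {"image": "apache/airflow", "ports": ["8080:8080"]},
        "airflow-scheduler": {"image": "apache/airflow"},
        database: {"image": {"postgres": "postgres:13",
                             "mysql": "mysql:8"}[database]},
    }
    if executor == "CeleryExecutor":
        # Celery needs a message broker and at least one worker service.
        services["redis"] = {"image": "redis:6"}
        services["airflow-worker"] = {"image": "apache/airflow"}
    return {"services": services}

# With LocalExecutor, the broker and worker services are simply omitted.
celery = generate_compose()
local = generate_compose(executor="LocalExecutor", database="mysql")
```

The point of the sketch is only that the per-user variation potiuk describes (executor, database, scaling) reduces to a small set of branches in a generator, which is why a wizard is a plausible path toward a "production-ready docker-compose".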
