docker-slim build causes "Could not find a suitable TLS CA certificate bundle" in Python requests #101

Open
nottrobin opened this issue Dec 11, 2019 · 14 comments

Comments

@nottrobin

nottrobin commented Dec 11, 2019

I know nothing about docker-slim, I don't have a clue how it works, but it looks exciting so I'm giving it a try on our Dockerfile for ubuntu.com.

I build the original image as follows:

$ git clone git@github.com:canonical-web-and-design/ubuntu.com.git
...
$ cd ubuntu.com
$ DOCKER_BUILDKIT=1 docker build --tag ubuntu-com .
...
$ docker images ubuntu-com:latest
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
ubuntu-com          latest              0162e0fc46e5        21 seconds ago      253MB

Then I run docker-slim, which appears to succeed and does indeed more than halve the size of the image:

$ ~/Downloads/dist_linux/docker-slim build --expose 80 ubuntu-com
docker-slim[build]: info=http.probe message='using default probe'
docker-slim[build]: state=started
docker-slim[build]: info=params target=ubuntu-com continue.mode=probe
docker-slim[build]: state=image.inspection.start
docker-slim[build]: info=image id=sha256:0162e0fc46e559ff9b8b12dd57b4ea5d28670b0e388fcc234fee158255f20ee9 size.bytes=253235824 size.human=253 MB
docker-slim[build]: info=image.stack index=0 name='ubuntu-com:latest' id='sha256:0162e0fc46e559ff9b8b12dd57b4ea5d28670b0e388fcc234fee158255f20ee9'
docker-slim[build]: state=image.inspection.done
docker-slim[build]: state=container.inspection.start
docker-slim[build]: info=container status=created name=dockerslimk_2276_20191211104333 id=130d60dacd6dc660602af00a0014af97fa60b7974b75046ad198a2e5191ecca0
docker-slim[build]: info=cmd.startmonitor status=sent
docker-slim[build]: info=event.startmonitor.done status=received
docker-slim[build]: info=container name=dockerslimk_2276_20191211104333 id=130d60dacd6dc660602af00a0014af97fa60b7974b75046ad198a2e5191ecca0 target.port.list=[32774] target.port.info=[80/tcp => 0.0.0.0:32774] message='YOU CAN USE THESE PORTS TO INTERACT WITH THE CONTAINER'
docker-slim[build]: state=http.probe.starting message='WAIT FOR HTTP PROBE TO FINISH'
docker-slim[build]: info=continue.after mode=probe message='no input required, execution will resume when HTTP probing is completed'
docker-slim[build]: info=prompt message='waiting for the HTTP probe to finish'
docker-slim[build]: state=http.probe.running
docker-slim[build]: info=http.probe.ports count=1 targets='32774'
docker-slim[build]: info=http.probe.commands count=1 commands='GET /'
docker-slim[build]: info=http.probe.call status=200 method=GET target=http://127.0.0.1:32774/ attempt=1  time=2019-12-11T10:43:46Z
docker-slim[build]: info=http.probe.summary total=1 failures=0 successful=1
docker-slim[build]: state=http.probe.done 
docker-slim[build]: info=event message='HTTP probe is done'
docker-slim[build]: state=container.inspection.finishing
docker-slim[build]: state=container.inspection.artifact.processing
docker-slim[build]: state=container.inspection.done
docker-slim[build]: state=building message='building minified image'
docker-slim[build]: state=completed
docker-slim[build]: info=results status='MINIFIED BY 2.29X [253235824 (253 MB) => 110452274 (110 MB)]'
docker-slim[build]: info=results  image.name=ubuntu-com.slim image.size='110 MB' data=true
docker-slim[build]: info=results  artifacts.location='/home/robin/Downloads/dist_linux/.docker-slim-state/images/0162e0fc46e559ff9b8b12dd57b4ea5d28670b0e388fcc234fee158255f20ee9/artifacts'
docker-slim[build]: info=results  artifacts.report=creport.json
docker-slim[build]: info=results  artifacts.dockerfile.original=Dockerfile.fat
docker-slim[build]: info=results  artifacts.dockerfile.new=Dockerfile
docker-slim[build]: info=results  artifacts.seccomp=ubuntu-com-seccomp.json
docker-slim[build]: info=results  artifacts.apparmor=ubuntu-com-apparmor-profile
docker-slim[build]: state=done
docker-slim[build]: info=report file='slim.report.json'

$ docker images ubuntu-com.slim:latest
REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
ubuntu-com.slim     latest              834c263189e7        About a minute ago   110MB

But now if I run the site from the new image:

$ docker run -ti -p 8222:80 ubuntu-com.slim:latest
2019-12-11 10:37:03.147Z INFO talisker.sentry "Raven is not configured (logging is disabled). Please see the documentation for more information."
2019-12-11 10:37:03.184Z INFO gunicorn.error "Starting gunicorn 19.10.0"                                                                                                                                          
2019-12-11 10:37:03.185Z INFO gunicorn.error "Listening at: http://0.0.0.0:80 (7)"                                                                                                                                 
2019-12-11 10:37:03.185Z INFO gunicorn.error "Using worker: sync"                                  
2019-12-11 10:37:03.187Z INFO gunicorn.error "Booting worker with pid: 11"                                                                                                                                         
2019-12-11 10:37:03.239Z INFO gunicorn.error "Booting worker with pid: 12"                                                                                                                                         
2019-12-11 10:37:03.252Z INFO gunicorn.error "Booting worker with pid: 13"                                                                                                                                         
2019-12-11 10:37:03.284Z INFO gunicorn.error "Booting worker with pid: 14"                                                                                                                                         
2019-12-11 10:37:03.378Z INFO gunicorn.error "Booting worker with pid: 15"                                                                                                                                         
2019-12-11 10:37:04.262Z INFO talisker.flask "updating raven config from flask app"                                       
2019-12-11 10:37:04.263Z INFO talisker.sentry "Raven is not configured (logging is disabled). Please see the documentation for more information."                                                                  
2019-12-11 10:37:04.282Z INFO talisker.flask "updating raven config from flask app"                                                    
2019-12-11 10:37:04.283Z INFO talisker.sentry "Raven is not configured (logging is disabled). Please see the documentation for more information."
2019-12-11 10:37:04.379Z INFO talisker.flask "updating raven config from flask app"                
2019-12-11 10:37:04.379Z INFO talisker.sentry "Raven is not configured (logging is disabled). Please see the documentation for more information."
2019-12-11 10:37:04.494Z INFO talisker.flask "updating raven config from flask app"         
2019-12-11 10:37:04.494Z INFO talisker.sentry "Raven is not configured (logging is disabled). Please see the documentation for more information."
2019-12-11 10:37:04.511Z INFO talisker.flask "updating raven config from flask app"                
2019-12-11 10:37:04.512Z INFO talisker.sentry "Raven is not configured (logging is disabled). Please see the documentation for more information."

Then I browse to http://127.0.0.1:8222/blog, the "blog" feed fails to load, and I see these errors in the image output:

2019-12-11 10:37:10.861Z ERROR talisker.requests "http request failure" url=https://admin.insights.ubuntu.com/wp-json/wp/v2/posts? qs="?per_page=<len 1>&page=<len 1>&tags_exclude=<len 14>&sticky=<len 4>&_embed=<len 4>" qs_size=73 method=GET host=admin.insights.ubuntu.com service=ubuntu.com request_id=adacabfb-fd60-4368-aa23-48f7e6de26f9
Traceback (most recent call last):
  File "/root/.local/lib/python3.6/site-packages/talisker/requests.py", line 173, in send
    return func(request, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/requests/adapters.py", line 416, in send
    self.cert_verify(conn, request.url, verify, cert)
  File "/root/.local/lib/python3.6/site-packages/requests/adapters.py", line 228, in cert_verify
    "invalid path: {}".format(cert_loc))
OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /root/.local/lib/python3.6/site-packages/certifi/cacert.pem
2019-12-11 10:37:10.864Z ERROR flask.app "Exception on /blog/latest-news [GET]" service=ubuntu.com request_id=adacabfb-fd60-4368-aa23-48f7e6de26f9
Traceback (most recent call last):
  File "/root/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/root/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/root/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/root/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/root/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/root/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/root/.local/lib/python3.6/site-packages/canonicalwebteam/blog/flask.py", line 63, in latest_news
    limit=flask.request.args.get("limit", "3"),
  File "/root/.local/lib/python3.6/site-packages/canonicalwebteam/blog/common_view_logic.py", line 268, in get_latest_news
    sticky=True,
  File "/root/.local/lib/python3.6/site-packages/canonicalwebteam/blog/wordpress_api.py", line 89, in get_articles
    response = api_session.get(url)
  File "/root/.local/lib/python3.6/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/talisker/requests.py", line 190, in request
    return func(method, url, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/root/.local/lib/python3.6/site-packages/talisker/requests.py", line 173, in send
    return func(request, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/requests/adapters.py", line 416, in send
    self.cert_verify(conn, request.url, verify, cert)
  File "/root/.local/lib/python3.6/site-packages/requests/adapters.py", line 228, in cert_verify
    "invalid path: {}".format(cert_loc))
OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /root/.local/lib/python3.6/site-packages/certifi/cacert.pem

The container uses requests to query a WordPress API at https://admin.insights.ubuntu.com/wp-json/wp/v2/posts, and this works in the original ubuntu-com:latest image. The slimming process appears to remove a file the Requests library needs to verify HTTPS certificates: the traceback points at certifi's cacert.pem, which is missing from the slimmed image.
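
The traceback names the exact file Requests could not open, so a quick check run inside both the original and the slimmed image can confirm whether the bundle was stripped. This is a minimal sketch, assuming Python 3 is available in the image; `ca_bundle_report` is a hypothetical helper name, not part of Requests or docker-slim:

```python
import os
import ssl


def ca_bundle_report(extra_paths=()):
    """Map each candidate CA bundle path to whether it exists on disk."""
    candidates = list(extra_paths)
    try:
        # certifi is third-party; Requests uses its bundled cacert.pem.
        import certifi
        candidates.append(certifi.where())
    except ImportError:
        pass
    # Also report the locations OpenSSL itself would look at by default.
    defaults = ssl.get_default_verify_paths()
    for path in (defaults.cafile, defaults.capath,
                 defaults.openssl_cafile, defaults.openssl_capath):
        if path:
            candidates.append(path)
    # dict.fromkeys drops duplicates while keeping the original order.
    return {path: os.path.exists(path) for path in dict.fromkeys(candidates)}
```

Calling it with the path from the traceback, for example `ca_bundle_report(["/root/.local/lib/python3.6/site-packages/certifi/cacert.pem"])`, should report `False` for that entry in the slimmed image if the file really was removed.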

@nottrobin nottrobin changed the title from [docker-slim build causes "Could not find a suitable TLS CA certificate bundle" in Python] to [docker-slim build causes "Could not find a suitable TLS CA certificate bundle" in Python requests] on Dec 11, 2019
@kcq
Member

kcq commented Dec 11, 2019

Thank you for opening the issue, @nottrobin! Sorry for the headaches! Let me try to repro the condition to investigate this a bit more...

@kcq kcq added the triage label Dec 12, 2019
@TJM

TJM commented Dec 12, 2019

For what it's worth, I am getting a similar issue trying to minify morpheusdata/morpheus-cli (ruby). I even tried using --include-path=/etc/ssl, to no avail. The slimmed image seems fine otherwise, but it is unable to validate SSL certificates: it reports valid SSL certs as self-signed.

Error Communicating with the Appliance. SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain)

I was really glad to see someone else with an SSL error, because this one might be hard to reproduce. :)

NOTE: I did "docker exec" into the container while it was sitting on the "Press " prompt, and ran several commands that would have connected to the same "appliance" and validated the SSL certificate.

@nottrobin
Author

Please let me know if there's any way I could help dig into this further.

@TJM

TJM commented Dec 16, 2019

I am wondering if you are logging into your container while it is in "build" mode and running it through its different options or whatever. I am not sure exactly how it works, but I assume it is watching to see what files are opened by the running app. I am starting to believe that it doesn't work to "docker exec" into it :-/

@kcq
Member

kcq commented Jan 21, 2020

@nottrobin Sorry, didn't get a chance to respond sooner (was traveling at the time). Some application stacks might require additional coverage when the http probe in docker-slim runs. By default, the http probe sends a GET / request to the target http interface. In your case that was not enough: the /blog endpoint also needs to be covered, so your webapp gets a chance to load the cert file. To make the blog endpoint work I added an extra http probe command like this: docker-slim build --expose 80 --http-probe-cmd /blog ubuntu-com. When docker-slim is minifying the image you should now see something like this on your screen:

docker-slim[build]: info=http.probe.call status=200 method=GET target=http://127.0.0.1:32774/blog attempt=1  time=2020-01-21T15:58:29Z
docker-slim[build]: info=http.probe.call status=200 method=GET target=http://127.0.0.1:32774/ attempt=1  time=2020-01-21T15:58:29Z

If you have a lot of extra http probe commands or if you prefer using a config file you can also use the --http-probe-cmd-file parameter. There are a few notes about http probing and how to specify additional http probe commands here (including an http probe command file example): https://github.com/docker-slim/docker-slim#http-probe-commands
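
When the list of endpoints grows, it can help to generate the command line rather than type it. The sketch below only assembles the flags already shown in this thread (`--expose`, `--http-probe-cmd`); it assumes `--http-probe-cmd` can be repeated once per endpoint, and the function names are hypothetical:

```python
import shlex


def slim_build_argv(image, expose=None, probe_paths=()):
    """Build a docker-slim build command line as an argv list."""
    argv = ["docker-slim", "build"]
    if expose is not None:
        argv += ["--expose", str(expose)]
    # Assumption: --http-probe-cmd may be given once per extra endpoint.
    for path in probe_paths:
        argv += ["--http-probe-cmd", path]
    argv.append(image)
    return argv


def as_shell(argv):
    """Render the argv list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

`as_shell(slim_build_argv("ubuntu-com", expose=80, probe_paths=["/blog"]))` reproduces the command quoted above, and the argv list can be passed to `subprocess.run` directly if you would rather not go through a shell.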

I'll add an example to the examples repo for your webapp container if you don't mind :)

@kcq
Member

kcq commented Jan 21, 2020

@TJM, @nottrobin docker-slim has several continue-after modes. By default, when http probing is enabled (it's enabled when there is at least one exposed port), docker-slim runs its http probes and finishes its execution once they complete. You can also configure it to wait, which lets you run your own tests against the container (tests that invoke various endpoints in your webapp): docker-slim build --continue-after enter your-image. With enter, docker-slim waits for you to press enter before it finishes processing. With timeout, it waits for the number of seconds you specify. With signal, it waits for you or your automation tooling to send the USR1 signal to docker-slim.
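
While docker-slim is waiting in one of these modes, the "own tests" can be as small as a script that requests each route through the published port. A rough sketch using only the standard library; the function name, the `fetch` parameter, and the example port are illustrative assumptions:

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def exercise_endpoints(base_url, paths, fetch=urllib.request.urlopen):
    """Request each path once and return {path: HTTP status or error text}."""
    results = {}
    for path in paths:
        url = urljoin(base_url, path)
        try:
            with fetch(url, timeout=10) as response:
                results[path] = response.status
        except urllib.error.HTTPError as exc:
            # Non-2xx responses still exercise the code path; keep the status.
            results[path] = exc.code
        except urllib.error.URLError as exc:
            results[path] = f"error: {exc.reason}"
    return results
```

For the build shown earlier this might be called as `exercise_endpoints("http://127.0.0.1:32774", ["/", "/blog", "/blog/latest-news"])` before letting docker-slim continue; the port is whatever docker-slim prints in `target.port.info` for that run.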

@kcq
Member

kcq commented Jan 21, 2020

One of the future enhancements will allow docker-slim to discover the http routes in your application, so it can auto-generate additional http probe commands on its own. It does require stack-specific static analysis. Definitely doable, but need help to make it happen sooner :-)

@kcq
Member

kcq commented Feb 2, 2020

@nottrobin added an example for your app image here: https://github.com/docker-slim/examples/tree/master/3rdparty/ubuntu-com

Note that I also needed to add a couple of extra directories with static web artifacts (/srv/templates and /srv/static) for the web app to be fully operational. Another alternative there is to have a more complete set of HTTP probes that would discover all of those static resources on their own.

@kcq kcq removed the triage label Feb 3, 2020
@kcq
Member

kcq commented Feb 3, 2020

@TJM I'm trying to repro your condition... and there's a WIP example for it here: https://github.com/docker-slim/examples/tree/master/3rdparty/morpheus-cli I'm not an expert with morpheus and its cli, though. What would be a good set of morpheus cli commands to execute? We can put those cli commands in a shell script, mount it when running the docker-slim build command, and set it as the entrypoint, so it gets to run when docker-slim is inspecting the temporary container it creates.

What are the command line parameters you used trying to minify your morpheus-cli container image? This is what I used for my single command example: docker-slim build --http-probe=false --show-clogs=true --cmd="remote add demo https://demo.morpheusdata.com -N" morpheusdata/morpheus-cli, which should be similar to doing this with docker: docker run -it --rm morpheusdata/morpheus-cli remote add demo https://demo.morpheusdata.com -N

P.S.
docker exec doesn't help because the commands it starts run as separate processes, outside the process tree that's being observed.

@TJM

TJM commented Feb 3, 2020

@kcq If docker exec won't work, then, for sure, my testing was defunct. As it is a command line tool that interacts with the API of an appliance, there are many different commands that could be run. I was trying to just do a remote add and remote list -a. I was actually bind mounting my local config file so that I didn't have to do the "add" or authentication part. I have asked the morpheus devs for help on a list of commands that we should be running instead of guessing. :)

At a minimum, I think we would need to add all the CA certs. If there is some way to detect when python/ruby/go/etc is opening CA certs, and load all of them, it would probably be a more complete solution.

Tommy

@kcq
Member

kcq commented Feb 3, 2020

@TJM docker-slim does keep the certs that get used. For example, with the current WIP example (that tries to execute remote add demo https://demo.morpheusdata.com -N) it saves /usr/share/ca-certificates/mozilla/DigiCert_Global_Root_CA.crt and the links that point to it in /etc/ssl/certs.

@TJM

TJM commented Feb 3, 2020

@kcq Right, that works as expected, but this is a CLI tool that could be used to connect to any other morpheus appliance. That will only work if your appliance's cert was signed by that specific CA. However, if it is using the global system certs, perhaps it is as simple as ensuring everything in /etc/ssl is included?

~tommy

@kcq
Member

kcq commented Feb 6, 2020

@TJM It's not always enough to include /etc/ssl. In this case the actual certs are in another directory (/usr/share/ca-certificates) and /etc/ssl contains only links to them. Try doing something like this: docker-slim build --http-probe=false --show-clogs=true --include-path /etc/ssl --include-path /usr/share/ca-certificates --cmd="remote add demo https://demo.morpheusdata.com -N" morpheusdata/morpheus-cli
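
Because the cert directory holds symlinks, one way to find out which directories need their own --include-path is to resolve the links in the original image's filesystem. A small sketch, assuming Python can be run against that filesystem (morpheus-cli is a Ruby image, so this may have to run on an extracted copy); `real_cert_dirs` is a hypothetical helper:

```python
import os


def real_cert_dirs(cert_dir):
    """Return the sorted directories holding the files cert_dir's entries resolve to."""
    targets = set()
    with os.scandir(cert_dir) as entries:
        for entry in entries:
            # realpath follows symlinks; dangling links resolve to missing files.
            resolved = os.path.realpath(entry.path)
            if os.path.isfile(resolved):
                targets.add(os.path.dirname(resolved))
    return sorted(targets)
```

On an image laid out as described above, `real_cert_dirs("/etc/ssl/certs")` would be expected to include a directory under /usr/share/ca-certificates, and each returned directory is a candidate for an extra --include-path.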

@TJM

TJM commented Feb 6, 2020

Ah yes... it's stuff like that where it feels like you still need intimate knowledge of the underlying container OS. The -N is a syntax error, but without it you get to the next barrier, which is that it can't load morpheus/cli/login.rb. I tried loading a known good config file and giving it the command "remote check -a", but then I get another error. I feel like you should at least be able to run the same command that you ran during the slimming, but alas. I am hoping the morpheus-cli team can provide a list of commands that we can use instead of just arbitrarily picking something like "remote add xxx url".
