Enable to generate and package a CDS archive #2471

Open
rmannibucau opened this issue May 18, 2020 · 47 comments

@rmannibucau
Contributor

rmannibucau commented May 18, 2020

Hi,

CDS lets Java applications boot much faster (particularly when scanning or reflection is involved), so it would be neat to integrate it with Jib.
It takes 2-3 commands to generate the archive; after that, you just bundle it and adjust the JVM arguments.

https://blog.codefx.org/java/application-class-data-sharing/#Creating-A-JDK-Class-Data-Archive explains it quite well.

Since the classpath must be stable and must not use folders or wildcards, doing it in Jib seems the most reliable and relevant option IMHO.
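A minimal sketch of those commands as they work on Java 11, wrapped in a function that only prints them so the snippet is safe to run anywhere. The classpath, main class, and archive directory are hypothetical placeholders; the flags are the ones described in the linked article.

```shell
#!/bin/sh
# Print the three AppCDS steps for a given classpath and main class.
# All values passed in are placeholders, not values from a real build.
appcds_commands() {
  cp=$1; main=$2; dir=$3
  # 1. Trial run: record every class the application loads.
  printf 'java -XX:DumpLoadedClassList=%s/classes.lst -cp %s %s\n' "$dir" "$cp" "$main"
  # 2. Dump: build the archive from the recorded class list.
  printf 'java -Xshare:dump -XX:SharedClassListFile=%s/classes.lst -XX:SharedArchiveFile=%s/app.jsa -cp %s\n' "$dir" "$dir" "$cp"
  # 3. Run: start the application with the archive mapped in.
  printf 'java -Xshare:auto -XX:SharedArchiveFile=%s/app.jsa -cp %s %s\n' "$dir" "$cp" "$main"
}

appcds_commands "/app/libs/a.jar:/app/libs/b.jar" "com.example.MyMain" "/app/appcds"
```

The same explicit classpath string is used in all three steps, which is why a stable, wildcard-free classpath matters.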

Romain

@chanseokoh
Member

Never heard of CDS, but this does seem worth looking into at some point. Thanks for the feedback! Does CDS only work with -jar my.jar, or can it also work with the classpath launch style (-cp ... com.example.MyMain)?

@rmannibucau
Contributor Author

Wrote a quick post about it: https://rmannibucau.metawerx.net/post/java-class-data-sharing-docker-startup

Long story short, CDS is supported by any Java app, but the beginning of the classpath must be stable and match the archive, so Jib can't use lib/*.jar, for example.
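To illustrate what "no wildcard" means in practice, here is a small sketch that expands a directory of JARs into an explicit classpath. The function name is hypothetical, and the alphabetical order produced by the shell glob is only one possible stable order; what matters is that the archive is dumped and used with the same list.

```shell
#!/bin/sh
# Expand "<dir>/*.jar" into an explicit, colon-separated classpath.
# The shell sorts glob matches, so the result is stable for a fixed set of files.
expand_jars() {
  dir=$1
  cp=""
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # the glob stays literal when nothing matches
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
}
```

For example, `java -cp "$(expand_jars /app/libs)" com.example.MyMain` passes every JAR by name instead of `/app/libs/*`.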

I wrapped Jib in a custom main and the gain is ~30% on Java 11 (I'm using Zulu) for my app (CDI), so definitely worth enabling :).

@chanseokoh
Member

chanseokoh commented May 18, 2020

Thanks for the info. From which Java version can we use this feature?

> jib can't use lib/*.jar for example.

@GoogleContainerTools/java-tools-build I remember the discussion that we can't get rid of the classpath wildcard libs/* until Java 9+ because of the max argument length limit or the max command line length (particularly short on Windows). However, looks like modern Linux kernels support up to 2MB and seems like this is not an issue in practice. (We don't have to think about Windows.)

I have lost the confidence. (And who knows if Windows would matter if, e.g., running a Linux container on a Windows dev machine?)

@rmannibucau
Contributor Author

@chanseokoh I think it was around Java 10 (not sure the commands are 100% the same in the first versions). These commands work well on Java 11.
BTW I would expect this feature to be off by default since it requires executing docker commands, so somebody enabling it would do so intentionally, and the classpath would fit the command line anyway, I guess?

@saturnism

saturnism commented Aug 28, 2020

I took Jib for a spin to see how to do this mechanically (w/o automation). The basic premises are:

  1. JDK arch/distribution needs to be the same
  2. We need JARs - no exploded classpath, and no nested JARs. Just JARs listed in classpath.

It turns out that Jib's packaged containerization mode is a great fit to be able to produce just the JARs! The only issue is AppCDS's -cp can't take wildcard, so we need to list out individual JARs, which is discussed in #2733. Starting with Jib 2.7.0, this can be done by setting <container><expandClasspathDependencies>true.

I was experimenting w/ my Hello World Spring Boot App https://github.com/saturnism/jvm-helloworld-by-example/tree/master/helloworld-springboot-tomcat

  1. Containerize w/ Packaged Mode, to local Docker daemon, and also need to use a debug base image so I can generate the classpath list using shell script.
    mvn package com.google.cloud.tools:jib-maven-plugin:2.7.0:dockerBuild \
      -Dimage=helloworld-experiment \
      -Djib.containerizingMode=packaged \
      -Djib.container.expandClasspathDependencies=true \
      -Djib.from.image=gcr.io/distroless/java-debian10:11-debug
    
  2. Generate the class list, and the archive in the image
    # The Java CLASSPATH is the third element of the default image `ENTRYPOINT`
    # in the Jib-built image, e.g., "java -cp <...classpath...> com.example.MyMain".
    JIB_CLASSPATH=$( docker inspect helloworld-experiment --format '{{(index .Config.Entrypoint 2)}}' )
    
    docker run --entrypoint=sh --name=helloworld-experiment helloworld-experiment \
      -c "mkdir -p /app/appcds \
          && java -XX:DumpLoadedClassList=/app/appcds/classes.lst \
                  -cp '$JIB_CLASSPATH' \
                  com.example.helloworld.HelloworldApplication --appcds=true \
          && java -Xshare:dump \
                  -XX:SharedClassListFile=/app/appcds/classes.lst \
                  -XX:SharedArchiveFile=/app/appcds/archive.jsa \
                  -cp '$JIB_CLASSPATH'"
    
  3. Commit the changes
    docker commit helloworld-experiment helloworld-experiment
    docker rm helloworld-experiment
    
  4. Produce the new container image w/ a different entrypoint. Was hoping to use Jib CLI for this, but ran into some issues.
    # Produce the classpath again
    JIB_ENTRYPOINT='"/usr/bin/java","-Xshare:on","-XX:SharedArchiveFile=/app/appcds/archive.jsa","-cp","'${JIB_CLASSPATH}'","com.example.helloworld.HelloworldApplication"'
    
    cat << EOF > Dockerfile.appcds
    FROM helloworld-experiment
    
    CMD []
    ENTRYPOINT [ $JIB_ENTRYPOINT ]
    EOF
    
    docker build -f Dockerfile.appcds -t helloworld-experiment:appcds .
    
  5. You can then run the container image w/ AppCDS
    docker run -ti --rm helloworld-experiment:appcds
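One way to check that the archive is actually being used is to start the JVM with `-Xlog:class+load:file=<log>` and count the classes reported as coming from the shared archive. This sketch assumes the unified logging output of JDK 11, where such lines contain `source: shared objects file`; the function name and the log path are hypothetical.

```shell
#!/bin/sh
# Count classes that the JVM reports as loaded from the CDS archive.
# $1 is a log produced with: java -Xlog:class+load:file=<log> ...
count_shared_classes() {
  grep -c 'source: shared objects file' "$1"
}
```

A count close to the number of entries in classes.lst suggests the archive is mapped; a count of 0 suggests it was ignored.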
    

@chanseokoh
Member

#2866 added the option jib.container.expandClasspathDependencies, and setting it to true will enumerate the dependency classpath (not yet released).

@chanseokoh
Member

@saturnism @rmannibucau @koeberlue @holledauer @olivierboudet @bric3 @guillaumeblaquiere @bilak we've released Jib 2.7.0 which added a new configuration option (jib.container.expandClasspathDependencies (Gradle) / <container><expandClasspathDependencies> (Maven)) that enables expanding classpath dependencies in the default java command for an image ENTRYPOINT. Turning on the option (off by default) will enumerate all the dependencies, which will match the dependency loading order in Maven or Gradle builds. For example, the ENTRYPOINT becomes

java ... -cp /app/resources:/app/classes:/app/libs/spring-boot-starter-web-2.0.3.RELEASE.jar:/app/libs/shared-library-0.1.0.jar:/app/libs/spring-boot-starter-json-2.0.3.RELEASE.jar:... com.example.Main

instead of the default

java ... -cp /app/resources:/app/classes:/app/libs/* com.example.Main

Expanding the dependency list will help the AppCDS use case above.

Note that an expanded dependency list can become very long in practice, and we are not sure if there may be a potential issue due to a long command line ("argument list too long" or "command line is too long").
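A rough way to gauge that risk is to compare the length of the expanded classpath with the limit the system reports. The sketch below is only an approximation: `ARG_MAX` covers all arguments plus the environment, the fallback value is an arbitrary placeholder for systems without `getconf`, and the sample classpath is made up.

```shell
#!/bin/sh
# Succeeds when a classpath of the given length (in characters) is below a limit.
# Usage: fits_limit <length> <limit>
fits_limit() {
  [ "$1" -lt "$2" ]
}

# Made-up classpath, standing in for the expanded ENTRYPOINT classpath.
CLASSPATH_SAMPLE="/app/resources:/app/classes:/app/libs/a.jar:/app/libs/b.jar"
# Limit reported by the system; 131072 is only a placeholder fallback.
limit=$(getconf ARG_MAX 2>/dev/null || echo 131072)

if fits_limit "${#CLASSPATH_SAMPLE}" "$limit"; then
  echo "classpath fits within the reported limit"
else
  echo "classpath may be too long"
fi
```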

As with other Jib configurations, this option can also be set through the system property (-Djib.container.expandClasspathDependencies=true|false).

@rmannibucau
Contributor Author

Does it work with extra classpath? CDS works with a classpath prefix, which must be expanded, but the end can stay a wildcard, which helps to mount plugins. It would be great to have that feature without going with the jib-core programmatic option.

@chanseokoh
Member

chanseokoh commented Dec 7, 2020

@rmannibucau no, Jib will just add the list of strings set by extraClasspath as-is, whether it contains a wildcard (*) or not. They are custom classpath, and it's not feasible for Jib to determine or enforce some order of expanding wildcards in custom classpath. (According to #2733 (comment), the loading order of * seems to depend on filesystems (and potentially JVMs)). So it's interesting that AppCDS can safely use a wildcard?

@rmannibucau
Contributor Author

@chanseokoh it is more about ensuring the extra classpath is appended to the libs rather than prepended (I recall it was prepended at some point - https://github.com/GoogleContainerTools/jib/pull/1642/files#diff-a5317ef6dce278f4451fa8e298358067261e7d10b3cca71c093673202ebb6d5cR276 ). If prepended, it breaks CDS; if appended, CDS keeps working well.

chanseokoh added the enhancement label and removed the question (User inquiries) label on Dec 8, 2020
@chanseokoh
Member

chanseokoh commented Dec 8, 2020

Oh, now I understand. For AppCDS to work, it's enough for only some front portion of the entire classpath to be identical and it's fine to have different classpath entries for the back portion, including using a wildcard, right?
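The prefix rule as described in this thread can be sketched as a small check; the function name is hypothetical and the classpaths in the usage note are examples only.

```shell
#!/bin/sh
# Succeeds when the dump-time classpath is identical to, or a leading portion of,
# the run-time classpath (compared as whole ":"-separated entries).
is_classpath_prefix() {
  dump=$1; run=$2
  case "$run" in
    "$dump"|"$dump":*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With a dump-time classpath of `/app/libs/a.jar:/app/libs/b.jar`, a run-time classpath of `/app/libs/a.jar:/app/libs/b.jar:/plugins/*` passes the check, while `/plugins/*:/app/libs/a.jar:/app/libs/b.jar` (extra entries prepended) does not.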

Hmm... yeah, intentionally we prepend extraClasspath so that resources and classes from there take precedence. I wish it were easy to fix #894, which would have made extraClasspath obsolete. Maybe #894 could be supported with a new Jib extension?

@rmannibucau
Contributor Author

@chanseokoh well, we can do anything with extensions, but it kind of breaks using Jib, and using multiple extensions will quickly get hard, so let's maybe try to keep it "core" until it is a specific feature? I see three simple options (in terms of usage and impl):

  1. use a placeholder with known keywords: ${projectClasspath}:${extraLibs}
  2. (preferred since easier for everybody to use and impl) add an enum PREPEND/APPEND in extraClasspath
  3. (not directly linked to the order but more to this issue) if expanded, extraClasspath goes at the end implicitly; this is more of a workaround, but it works.

Having written several mains manipulating the entrypoint, I'm really unhappy with this solution, and it does not merge well with a concurrent extension doing the same, so I hope it hits jib-core/maven-plugin soon.

@chanseokoh
Member

Thanks for the input. Perhaps it makes sense to reduce the scope of #894 and enable simple keyword substitution only for <entrypoint> (after which extraClasspath can be deprecated).

Just in case, Jib extensions don't run concurrently but in the order they are defined.

@rmannibucau
Contributor Author

Yes, in order, but combining them is hard: it is easy to break a previous one, and in practice it is easier to use the exec Maven plugin with jib-core :(.

@chanseokoh
Member

chanseokoh commented Dec 14, 2020

@saturnism I've updated your AppCDS demo using Jib 2.7.0 which has the option <container><expandClasspathDependencies> to expand the wildcard (*) in the classpath.

@chanseokoh
Member

chanseokoh commented Jun 9, 2021

@saturnism @rmannibucau @koeberlue @holledauer @olivierboudet @bric3 @guillaumeblaquiere @bilak @bademux @ykayacan @bilak @biro456 @zecit @chlund

Jib 3.1.1 is released and it creates two new JVM argument files inside an image, where they hold the computed classpath and the main class respectively. Although I haven't tried, I think this will greatly simplify the process to generate and use AppCDS. See the doc for more details. I'd like to try it myself and update the example above, but I'm not sure when I will find time for that.
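A sketch of how those files might be used with the Java 9+ `@argfile` syntax. The two file paths are an assumption from memory of the Jib documentation and should be verified there; the function name and archive location are hypothetical, and the function only prints the command.

```shell
#!/bin/sh
# Assumed locations of the argument files Jib 3.1.1+ writes into the image;
# check the Jib documentation for the authoritative paths.
JIB_CP_FILE=/app/jib-classpath-file
JIB_MAIN_FILE=/app/jib-main-class-file

# Print a java command line with extra JVM flags placed in front of the
# classpath and main class that are read from the argument files.
jib_java_command() {
  printf 'java %s -cp @%s @%s\n' "$*" "$JIB_CP_FILE" "$JIB_MAIN_FILE"
}

# Trial run that records the loaded classes.
jib_java_command -XX:DumpLoadedClassList=/app/appcds/classes.lst
# Later runs that map the archive.
jib_java_command -Xshare:on -XX:SharedArchiveFile=/app/appcds/archive.jsa
```

Reading the classpath from a file would remove the need to extract it from the image ENTRYPOINT with docker inspect, as the earlier demo does.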

@rmannibucau
Contributor Author

Geronimo Arthur already handles GraalVM wrapping Jib, only CDS remains ;)

@chanseokoh
Member

Ah, very interesting project. Thanks for the pointer.

@olivierboudet

In my case, I can't get it to work, but perhaps it is a JDK issue.
I am using Jib 3.1.1 with a Spring Boot application.
When running with the flags -Xshare:dump -XX:SharedClassListFile=/tmp/cds/classes.lst -XX:SharedArchiveFile=/tmp/cds/application.jsa, I encounter the following error:

Rewriting and linking classes: done
Error: non-empty directory '/app/resources'
Error: non-empty directory '/app/classes'
Hint: enable -Xlog:class+path=info to diagnose the failure
Error occurred during initialization of VM
Cannot have non-empty directory in paths

This is described in this issue but I am using OpenJDK 11.0.11 so it should be fixed.

@rmannibucau
Contributor Author

@olivierboudet maybe use packaged mode?

@olivierboudet

> @olivierboudet maybe use packaged mode?

You are totally right, it works with packaged mode, but I don't see the point of using Jib without building layered images.

@rmannibucau
Contributor Author

@olivierboudet nothing prevents you from using layered images: put the spring-boot stack in a first image (a project without a spring app, using SpringApplication as a fake main), then your app with the spring-boot stack as provided dependencies, and voilà. Side note: packaged mode is not related to layers; I was talking about using target/yourapp.jar instead of resources/ and classes/.

@helpermethod

Hi!

Because this topic revolves around AppCDS, I would like to know whether Jib will work OOTB with Spring Boot 3.3's CDS support?

https://github.com/spring-projects/spring-boot/wiki/Spring-Boot-3.3.0-M3-Release-Notes#cds-support

@artemptushkin

@chanseokoh @elefeint can I gently tag you to answer the question above, please? SB 3.3.0 has been released and we'd like to leverage it.

@elefeint
Contributor

elefeint commented Jun 3, 2024

Sorry, I am 2 years into a new adventure, so I can no longer make any decisions on these projects.

@loosebazooka
Member

@meltsufin for routing

@meltsufin
Contributor

Is there a proposal for how this could be supported in Jib?
cc: @mpeddada1

@wleese

wleese commented Jun 7, 2024

I currently use the following workflow in a GitLab CICD pipeline to get this to work (this being what Spring Framework recommends: https://docs.spring.io/spring-framework/reference/integration/cds.html)

  1. In a job, build the Jib image with containerizingMode = packaged
  2. In the next job, run that image with -XX:ArchiveClassesAtExit=application.jsa -Dspring.context.exit=onRefresh
  3. In the final job, pick up that application.jsa file, add it to the existing image and change the entrypoint to include -XX:SharedArchiveFile=/application.jsa
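Step 2 can be sketched as follows. The `run` helper and `train_archive` are hypothetical names, and the classpath and main class are placeholders supplied by the caller; by default the helper only prints the command (dry run) so the snippet is safe to try outside a container.

```shell
#!/bin/sh
# Print a command; execute it only when DRY_RUN is set to something other than 1.
run() {
  printf '+ %s\n' "$*"
  [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# Training run: Spring exits after the context refresh and the JVM writes
# a dynamic CDS archive (application.jsa) on exit.
# $1: classpath, $2: main class
train_archive() {
  run java -XX:ArchiveClassesAtExit=application.jsa -Dspring.context.exit=onRefresh -cp "$1" "$2"
}
```

Setting `DRY_RUN=0` inside the image makes `train_archive` execute the command and produce application.jsa in the working directory.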

All this is with Jib and without ever using Docker (something we cannot do in our GitLab CICD environment).

It works quite well, except for step 3.
Here I run:

      mkdir cds_output; cp application.jsa cds_output/
      ./mvnw -Drevision="${VERSION}" \
        -Djib.to.image="${DOCKER_IMAGE}:${VERSION}" \
        -Djib.from.image="${DOCKER_IMAGE}:${VERSION}" \
        -Djib.containerizingMode=packaged \
        -Djib.extraDirectories.paths=./cds_output \
        -Djib.container.jvmFlags=-XX:SharedArchiveFile=/application.jsa \
        jib:build

Now the jvmFlags overwrites whatever the user has already set in their pom.xml.
It would be of great help if there was an option to append to the jvmFlags, so that both options in pom.xml and via the commandline could work together.
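Until such an option exists, one possible workaround on the command-line side is to merge the flags before passing them to Jib. This is only a sketch: `join_jvm_flags` and the `BASE_JVM_FLAGS` variable in the usage note are hypothetical, and it requires keeping the base flags somewhere the pipeline can read them.

```shell
#!/bin/sh
# Join non-empty arguments with commas, the separator used by the
# jib.container.jvmFlags system property.
join_jvm_flags() {
  out=""
  for flag in "$@"; do
    [ -n "$flag" ] || continue
    out="${out:+$out,}$flag"
  done
  printf '%s\n' "$out"
}
```

Usage would look like `-Djib.container.jvmFlags="$(join_jvm_flags "$BASE_JVM_FLAGS" -XX:SharedArchiveFile=/application.jsa)"`, where `BASE_JVM_FLAGS` holds the comma-separated flags that would otherwise live in pom.xml.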

Generally speaking, it would be good for Jib to provide a way to easily take an image previously created by Jib, and to append in various ways. That would benefit the CDS workflow of starting an application via a container, capturing some data and then putting it into a new image (optionally overwriting the old one) and adding some jvm flags.

@chanseokoh
Member

> Now the jvmFlags overwrites whatever the user has already set in their pom.xml.
> It would be of great help if there was an option to append to the jvmFlags, so that both options in pom.xml and via the commandline could work together.

For that particular matter, I can think of some hacks.

  <properties>
    <!-- comma-separated flags -->
    <jib-extra-jvm-flags></jib-extra-jvm-flags>
    <jib.container.jvmFlags>${jib-extra-jvm-flags}-Xms512m</jib.container.jvmFlags>
  </properties>

  <profiles>
    <profile>
     <!-- Activate this profile with `mvn -Pcds ...`. -->
      <id>cds</id>
      <properties>
        <!-- Should end with a comma (`,`). -->
        <jib-extra-jvm-flags>-XX:SharedArchiveFile=/application.jsa,</jib-extra-jvm-flags>
      </properties>
    </profile>
  </profiles>

or

  <properties>
    <!-- single dummy flag -->
    <jib-extra-jvm-flag>-Ddummy-system-property</jib-extra-jvm-flag>
  </properties>

  <profiles>
    <profile>
     <!-- Activate this profile with `mvn -Pcds ...`. -->
      <id>cds</id>
      <properties>
        <jib-extra-jvm-flag>-XX:SharedArchiveFile=/application.jsa</jib-extra-jvm-flag>
      </properties>
    </profile>
  </profiles>

  ...
          <container>
            <jvmFlags>
              <jvmFlag>${jib-extra-jvm-flag}</jvmFlag>
              <jvmFlag>-Xms512m</jvmFlag>
            </jvmFlags>

@rmannibucau
Contributor Author

Don't use an empty tag (<jib-extra-jvm-flags></jib-extra-jvm-flags>) because some plugins (release, for example) will rewrite it as null (<jib-extra-jvm-flags/>).
My workaround for that is to set one jvmFlag per potential customization and use -Dapplication.flag1=true as the default instead of an empty value.

Tip: using -Dcontainer=true can be more relevant/less weird as a default, but it shouldn't be used from the app to enable the customization.

@artemptushkin

@wleese could you explain why we should build it with containerizingMode: packaged?

I can not figure out why you pack. Thank you in general for the issues and support here.

@rmannibucau
Contributor Author

rmannibucau commented Jun 13, 2024

@artemptushkin CDS works with *.jar generally speaking, so packaged mode is needed to avoid keeping the classes folder.
Spring Boot's unpacking is about the fat jar, which is inefficient, not about the jars inside the fat jar, which is what packaged mode is about.

@artemptushkin

artemptushkin commented Jun 14, 2024

When running the app in the container with the .jsa file specified, I'm getting:

Dynamic archive cannot be used: static archive header checksum verification failed.
[0.013s][error  ][cds] An error has occurred while processing the shared archive file.
[0.013s][error  ][cds] Failed to initialize dynamic archive
Error occurred during initialization of VM
Unable to use shared archive.

The sha512sum and md5sum of the file in the image and of the file generated in the pipeline are equal; the only difference is the modification time (Modify: 1970-01-01 00:00:01.000000000 +0000 in the image). Do you think that can have an impact?

I wonder what the actual algorithm behind "header checksum verification" is.

@artemptushkin

Okay, now I'm wondering (sorry, this is outside the Jib context) how CDS can be more efficient if we have to use packaged mode instead of exploded mode, which is confirmed to be the better (more efficient) practice.

@rmannibucau
Contributor Author

@artemptushkin it means you compare the time to open and browse a zip or the filesystem (I/O + CPU, #jars << 1000) vs. the time to load classes from zips/the filesystem vs. from memory (CDS). The key thing is that #classes >> #jars in most apps, so CDS is often way faster, at a memory cost, for most applications, and since it includes JVM classes too, this is often also the case for very small apps. Since zips are indexed and startup CPU is generally not an issue, exploded vs. packaged does not make a huge difference (in particular compared to having a "free" in-memory classloader).

Side note: this is indeed totally wrong for Spring fat jars, which are zips within a zip; there it is way too costly unless you load it all into memory at startup, but Spring Boot does not do that and instead does a lazy double zip opening, which is insanely slow.

@skin27

skin27 commented Jun 21, 2024

Some resources on how to run Spring Boot with Docker + CDS that may be helpful:

Don't know if this helps, but I thought I would share it.

@artemptushkin

@skin27 Thank you, I also noticed that they published Sebastian's presentation video from Spring I/O. And btw, I succeeded with CDS and Jib - I hope I will publish a full working example.

@artemptushkin

artemptushkin commented Jul 15, 2024

I got back to the issue and I saw your support - thank you! I published an example of how I configured the Spring Boot app with CDS, AOT, and Jib Gradle plugin

I documented it briefly but it could have been better

@sureshg

sureshg commented Jul 15, 2024

@artemptushkin do we really need this in the Gradle Kotlin script? I assume that should be taken care of by the Spring Boot plugin, right?

@artemptushkin

@sureshg I'd also like it to be handled automatically, but it's not; I'm not sure whether it's an issue with the AOT plugin or kind of expected. This is just what I found so far by reverse engineering. So yes, it's needed.

@roman4ello

@artemptushkin Thank you for your example, but after running the script:

./build-cds.sh

when I execute:

docker run spring-boot-aot-cds-gradle-jib-example:latest

I got message:

Unable to find image 'spring-boot-aot-cds-gradle-jib-example:latest' locally.

although in reality it is present, which is strange. I just wanted to share this with you.

@artemptushkin

@roman4ello thanks for sharing. Usually it's about the platform: the image was probably built for one architecture (amd64/arm64) and Docker runs another by default. I'm not sure why, but this is typically why Docker cannot find an image.
