Run MLProjects on docker containers (mlflow#555)
Add ability to specify and run MLflow projects dependent on a docker environment
marcusrehm authored and smurching committed Jan 18, 2019
1 parent 3a37840 commit d7d6d5d
Showing 21 changed files with 5,428 additions and 31 deletions.
2 changes: 1 addition & 1 deletion dev-requirements.txt
@@ -7,4 +7,4 @@ codecov
coverage
pypi-publisher
scikit-learn
scipy
scipy
32 changes: 25 additions & 7 deletions docs/source/projects.rst
@@ -26,12 +26,12 @@ Name
A human-readable name for the project.

Dependencies
Libraries needed to run the project. MLflow currently uses the
`Conda <https://conda.io/docs>`_ package manager, which supports both Python packages and native
libraries (for example, CuDNN or Intel MKL), to specify dependencies. MLflow will use the
Conda installation given by the ``MLFLOW_CONDA_HOME`` environment variable if specified
(e.g. running Conda commands by invoking ``$MLFLOW_CONDA_HOME/bin/conda``), and default to
running ``conda`` otherwise.
Libraries needed to run the project. MLflow supports specifying dependencies either through a
`Docker <https://docs.docker.com/>`_ environment, in which case the project runs inside a container,
or through the `Conda <https://conda.io/docs>`_ package manager, which supports both Python packages
and native libraries (for example, CuDNN or Intel MKL). MLflow will use the Conda installation given
by the ``MLFLOW_CONDA_HOME`` environment variable if specified (e.g. running Conda commands by
invoking ``$MLFLOW_CONDA_HOME/bin/conda``), and default to running ``conda`` otherwise.
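As a minimal sketch of the Conda lookup described above (the helper name ``get_conda_bin_path`` is
hypothetical, not MLflow's own API):

.. code:: python

    import os

    def get_conda_bin_path():
        # Prefer the Conda installation pointed to by MLFLOW_CONDA_HOME, if set.
        conda_home = os.environ.get("MLFLOW_CONDA_HOME")
        if conda_home:
            return os.path.join(conda_home, "bin", "conda")
        # Otherwise, fall back to whatever `conda` resolves to on the PATH.
        return "conda"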

Entry Points
Commands that can be executed within the project, and information about their
@@ -64,6 +64,10 @@ following conventions to determine its parameters:
is specified in ``conda.yaml``, if present. If no ``conda.yaml`` file is present, MLflow
will use a Conda environment containing only Python (specifically, the latest Python available to
Conda) when running the project.
* Alternatively, you may provide a Docker environment for project execution, which allows for capturing
non-Python dependencies such as Java libraries.
`See here <https://github.com/mlflow/mlflow/tree/master/examples/docker>`_ for an example of an
MLflow project with a Docker environment.
* Any ``.py`` and ``.sh`` file in the project can be an entry point, with no parameters explicitly
declared. When you execute such a command with a set of parameters, MLflow will pass each
parameter on the command line using ``--key value`` syntax, as sketched below.
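As a rough sketch of that convention (``build_entry_point_command`` is a hypothetical helper, not
part of MLflow's API), the command for such an entry point could be assembled as:

.. code:: python

    def build_entry_point_command(script, params):
        # Append each user-supplied parameter as `--key value`.
        args = []
        for key, value in params.items():
            args += ["--{}".format(key), str(value)]
        return ["python", script] + args

    # build_entry_point_command("train.py", {"alpha": 0.5})
    # -> ["python", "train.py", "--alpha", "0.5"]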
@@ -76,6 +80,9 @@ YAML syntax. The MLproject file looks like this:
name: My Project

conda_env: my_env.yaml
# Can have a docker_env instead of a conda_env, e.g.
# docker_env:
#    image: mlflow-docker-example

entry_points:
  main:
@@ -88,7 +95,7 @@ YAML syntax. The MLproject file looks like this:
      data_file: path
    command: "python validate.py {data_file}"
As you can see, the file can specify a name and a different environment file, as well as more
As you can see, the file can specify a name and a conda or docker environment, as well as more
detailed information about each entry point. Specifically, each entry point has a *command* to
run and *parameters* (including data types). We describe these two pieces next.

@@ -219,6 +226,17 @@ where ``<uri>`` is a Git repository URI or a folder. You can pass Git credential
``MLFLOW_GIT_PASSWORD`` environment variables.


Execution on Docker containers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can run MLflow projects inside a Docker container instead of a Conda environment. To do so,
specify a ``docker_env`` attribute with an ``image`` field in the MLproject file, as described below.
MLflow mounts the project's local directory as a volume inside the container at the path
``/mlflow/projects/code``.

.. code::

    docker_env:
      image: mlflow-run-image
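As a rough illustration of the volume mount described above (the project path, image name, and
entry-point command below are placeholders, and the exact flags MLflow passes to ``docker run`` may
differ), the container invocation is conceptually equivalent to:

.. code:: python

    import os
    import subprocess

    project_dir = os.path.abspath("path/to/my_project")  # local project directory
    image = "mlflow-run-image"                           # image named under docker_env
    entry_point_cmd = ["python", "train.py", "--alpha", "0.5"]

    # Mount the project at /mlflow/projects/code and run the entry point there.
    subprocess.check_call(
        ["docker", "run",
         "-v", "{}:/mlflow/projects/code".format(project_dir),
         "-w", "/mlflow/projects/code",
         image] + entry_point_cmd)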
Iterating Quickly
-----------------

2 changes: 2 additions & 0 deletions examples/README.md
@@ -23,3 +23,5 @@ and stores (logs) them as MLflow artifacts.
* `sklearn_logistic_regression` is a simple MLflow example with hooks to log training data to MLflow
tracking server.
* `tensorflow` is an end-to-end one run example from train to predict.
* `docker` demonstrates how to create and run an MLflow project using Docker (rather than Conda)
  to manage project dependencies.
Empty file added examples/docker/.dockerignore
Empty file.
8 changes: 8 additions & 0 deletions examples/docker/Dockerfile
@@ -0,0 +1,8 @@
FROM continuumio/miniconda:4.5.4

RUN pip install mlflow==0.8.1 \
&& pip install azure-storage==0.36.0 \
&& pip install numpy==1.14.3 \
&& pip install pandas==0.22.0 \
&& pip install scikit-learn==0.19.1 \
&& pip install cloudpickle
11 changes: 11 additions & 0 deletions examples/docker/MLproject
@@ -0,0 +1,11 @@
name: docker-example

docker_env:
  image: mlflow-docker-example

entry_points:
  main:
    parameters:
      alpha: float
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py --alpha {alpha} --l1-ratio {l1_ratio}"
41 changes: 41 additions & 0 deletions examples/docker/README.rst
@@ -0,0 +1,41 @@
Dockerized Model Training with MLflow
-------------------------------------
This directory contains an MLflow project that trains a linear regression model on the UC Irvine
Wine Quality Dataset. The project uses a Docker image to capture the dependencies needed to run the
training code. Running a project in a Docker environment (as opposed to Conda) allows for capturing
non-Python dependencies, e.g. Java libraries. In the future, we also hope to add tools to MLflow
for running dockerized projects, e.g. on a Kubernetes cluster for scale-out.


Running this Example
^^^^^^^^^^^^^^^^^^^^

Install MLflow via ``pip install mlflow``, and install `Docker <https://www.docker.com/get-started>`_.
Then, build the Docker image containing MLflow via ``docker build examples/docker -t mlflow-docker-example``
and run the example project via ``mlflow run examples/docker -P alpha=0.5``.
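If you prefer to launch the run from Python, a roughly equivalent call (a sketch using the
``mlflow.projects`` API; it assumes the image has already been built as above) is:

.. code:: python

    import mlflow

    # Programmatic counterpart of `mlflow run examples/docker -P alpha=0.5`.
    submitted_run = mlflow.projects.run(
        uri="examples/docker",
        entry_point="main",
        parameters={"alpha": 0.5},
    )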

What happens when the project is run?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Let's start by looking at the MLproject file, which specifies the Docker image in which to run the
project via a ``docker_env`` field:

.. code:: yaml

    docker_env:
      image: mlflow-docker-example

Here, ``image`` can be any valid argument to ``docker run``, such as the tag, ID, or
URL of a Docker image (see the `Docker docs <https://docs.docker.com/engine/reference/run/#general-form>`_).
The above example references a locally stored image (``mlflow-docker-example``) by tag.

Running ``mlflow run examples/docker`` builds a new Docker image based on ``mlflow-docker-example``
that also contains our project code, then executes the default (``main``) project entry point
within the container via ``docker run``.
The built image is tagged as ``mlflow-docker-example-<git-version>``, where ``<git-version>`` is the
Git commit ID of the project.

Environment variables such as ``MLFLOW_TRACKING_URI`` are
propagated inside the container during project execution. When running against a local tracking URI,
e.g. a local ``mlruns`` directory, MLflow mounts the host system's tracking directory inside the
container so that metrics and params logged during project execution are accessible afterwards.
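For example (a sketch; the tracking server URL below is a placeholder), you can point the run at a
specific tracking server before launching it, and the setting will be visible inside the container:

.. code:: python

    import os
    import mlflow

    # Propagated into the Docker container when the project runs; omit this to
    # default to a local ./mlruns directory, which MLflow mounts into the container.
    os.environ["MLFLOW_TRACKING_URI"] = "http://localhost:5000"

    mlflow.projects.run(uri="examples/docker", parameters={"alpha": 0.5})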

72 changes: 72 additions & 0 deletions examples/docker/train.py
@@ -0,0 +1,72 @@
# The data set used in this example is from http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

import os
import warnings
import sys
import argparse

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet

import mlflow
import mlflow.sklearn


def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2



if __name__ == "__main__":
    warnings.filterwarnings("ignore")
    np.random.seed(40)

    parser = argparse.ArgumentParser()
    parser.add_argument('--alpha')
    parser.add_argument('--l1-ratio')
    args = parser.parse_args()

    # Read the wine-quality csv file (make sure you're running this from the root of MLflow!)
    wine_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "wine-quality.csv")
    data = pd.read_csv(wine_path)

    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)

    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    alpha = float(args.alpha)
    l1_ratio = float(args.l1_ratio)

    with mlflow.start_run():
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        predicted_qualities = lr.predict(test_x)

        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)

        mlflow.sklearn.log_model(lr, "model")