Update release version to 0.5.0 (mlflow#323)
aarondav committed Aug 18, 2018
1 parent 384001f commit faf43d9
Showing 9 changed files with 92 additions and 31 deletions.
37 changes: 37 additions & 0 deletions CHANGELOG.rst
@@ -1,6 +1,43 @@
Changelog
=========

0.5.0 (2018-08-17)
------------------

MLflow 0.5.0 offers some major improvements, including first-class support for Keras and PyTorch models, SFTP support as an artifact store, a new scatterplot visualization to compare runs, and a more complete Python SDK for experiment and run management.

Breaking changes:

- The Tracking API has been split into two pieces, a "basic logging" API and a "tracking service" API. The "basic logging" API deals with logging metrics, parameters, and artifacts to the currently-active run, and is accessible in ``mlflow`` (e.g., ``mlflow.log_param``). The tracking service API allows managing experiments and runs (especially historical runs) and is available in ``mlflow.tracking``. The tracking service API will be analogous to the upcoming R and Java Tracking Service SDKs. Please be aware of the following breaking changes (a short before/after sketch follows this list):

- ``mlflow.tracking`` no longer exposes the basic logging API, only ``mlflow``. So, code that was written like ``from mlflow.tracking import log_param`` will have to be ``from mlflow import log_param`` (note that almost all examples were already doing this).
- Access to the service API goes through the ``mlflow.tracking.get_service()`` function, which relies on the same tracking server set by either the environment variable ``MLFLOW_TRACKING_URI`` or by code with ``mlflow.tracking.set_tracking_uri()``. So code that used to look like ``mlflow.tracking.get_run()`` will now have to do ``mlflow.tracking.get_service().get_run()``. This does not apply to the basic logging API.
- ``mlflow.ActiveRun`` has been converted into a lightweight wrapper around ``mlflow.entities.Run`` to enable the Python ``with`` syntax. This means that there are no longer any special methods on the object returned when calling ``mlflow.start_run()``; such calls should be migrated to the tracking service API.

- The Python entities returned by the tracking service API are now accessible in ``mlflow.entities`` directly. Where previously you may have used ``mlflow.entities.experiment.Experiment``, you would now just use ``mlflow.entities.Experiment``. The previous version still exists, but is deprecated and may be hidden in a future version.
- REST API endpoint `/ajax-api/2.0/preview/mlflow/artifacts/get` has been moved to `$static_prefix/get-artifact`. This change is versioned together with the JavaScript client, so it should not be noticeable unless you were calling the REST API directly (#293, @andrewmchen)
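
A short before/after sketch of the API split described above (a minimal illustration against a default local tracking setup; the parameter names are made up for the example):

.. code:: python

    import mlflow
    import mlflow.tracking

    # Basic logging API: now exposed only on the top-level ``mlflow`` module
    # (previously also importable as ``from mlflow.tracking import log_param``).
    with mlflow.start_run():
        mlflow.log_param("alpha", "0.5")

    # Tracking service API: experiment/run management now goes through get_service()
    # (previously e.g. ``mlflow.tracking.get_run(run_id)``).
    service = mlflow.tracking.get_service()
    for experiment in service.list_experiments():
        print(experiment.experiment_id, experiment.name)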

Features:

- [Models] Keras integration: we now support logging Keras models directly in the ``log_model`` API, model format, and serving APIs (#280, @ToonKBC)
- [Models] PyTorch integration: we now support logging PyTorch models directly in the ``log_model`` API, model format, and serving APIs (#264, @vfdev-5); a brief logging sketch for both integrations follows this list
- [UI] Scatterplot added to "Compare Runs" view to help compare runs using any two metrics as the axes (#268, @ToonKBC)
- [Artifacts] SFTP artifact store added (#260, @ToonKBC)
- [Sagemaker] Users can specify a custom VPC when deploying SageMaker models (#304, @dbczumar)
- Pyfunc serialization now includes the Python version, and warns if the major version differs (can be suppressed by using ``load_pyfunc(suppress_warnings=True)``) (#230, @dbczumar)
- Pyfunc serve/predict will activate the conda environment stored in the MLmodel file. This can be disabled by adding ``--no-conda`` to ``mlflow pyfunc serve`` or ``mlflow pyfunc predict`` (#225, @0wu)
- Python SDK formalized in ``mlflow.tracking``. This includes adding SDK methods for ``get_run``, ``list_experiments``, ``get_experiment``, and ``set_terminated``. (#299, @aarondav)
- ``mlflow run`` can now be run against projects with no ``conda.yaml`` specified. By default, an empty conda environment will be created -- previously, it would just fail. You can still pass ``--no-conda`` to avoid entering a conda environment altogether (#218, @smurching)
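
As referenced in the Keras and PyTorch bullets above, a minimal logging sketch (assumes Keras is installed; the tiny model is purely illustrative, and ``mlflow.pytorch.log_model`` follows the same pattern):

.. code:: python

    import keras
    import mlflow
    import mlflow.keras

    # A trivial Keras model, used only to illustrate the logging call.
    model = keras.models.Sequential([keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer="sgd", loss="mse")

    with mlflow.start_run():
        # Saves the model in the Keras flavor as an artifact of the active run.
        mlflow.keras.log_model(model, "model")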

Bug fixes:

- Fix numpy array serialization for int64 and other related types, allowing pyfunc to return such results (#240, @arinto)
- Fix the DBFS artifact store when ``log_artifacts`` is called with binary data (#295, @aarondav)
- Fix Run Command shown in UI to reproduce a run when the original run is targeted at a subdirectory of a Git repo (#294, @adrian555)
- Filter out ubiquitous dtype/ufunc warning messages (#317, @aarondav)
- Minor bug fixes and documentation updates (#261, @stbof; #279, @dmatrix; #313, @rbang1; #320, @yassineAlouini; #321, @tomasatdatabricks; #266, #282, #289, @smurching; #267, #265, @aarondav; #256, #290, @ToonKBC; #273, #263, @mateiz; #272, #319, @adrian555; #277, @aadamson; #283, #296, @andrewmchen)


0.4.2 (2018-08-07)
------------------

17 changes: 17 additions & 0 deletions docs/source/tracking.rst
@@ -72,6 +72,8 @@ Logging Data to Runs
You can log data to runs using either the MLflow Python or REST API. This section
shows the Python API.

.. _basic_logging_functions:

Basic Logging Functions
^^^^^^^^^^^^^^^^^^^^^^^

@@ -156,6 +158,21 @@ environment variable.
mlflow.log_param("a", 1)
mlflow.log_metric("b", 2)
Managing Experiments and Runs with the Tracking Service API
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

MLflow provides a more detailed Tracking Service API for managing experiments and runs directly, which is available in the :doc:`mlflow.tracking<python_api/mlflow.tracking>` package. This makes it possible to query data about past runs, log additional information about them, create experiments and more.

Example usage:

.. code:: python

    from mlflow.tracking import get_service

    service = get_service()
    experiments = service.list_experiments()  # returns a list of mlflow.entities.Experiment
    run = service.create_run(experiments[0].experiment_id)  # returns mlflow.entities.Run
    service.log_param(run.info.run_uuid, "hello", "world")
    service.set_terminated(run.info.run_uuid)

.. _tracking_ui:

2 changes: 0 additions & 2 deletions docs/source/tutorial.rst
@@ -18,8 +18,6 @@ is from UCI's `machine learning repository <http://archive.ics.uci.edu/ml/datase
:local:
:depth: 1

.. _what-youll-need
What You'll Need
----------------
To run this tutorial, you'll need to:
2 changes: 2 additions & 0 deletions mlflow/entities/run.py
@@ -17,10 +17,12 @@ def __init__(self, run_info, run_data):

@property
def info(self):
""":return: :py:class:`mlflow.entities.RunInfo`"""
return self._info

@property
def data(self):
""":return: :py:class:`mlflow.entities.RunData`"""
return self._data

def to_proto(self):
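
The ``info`` and ``data`` properties documented above can be exercised through the tracking service; a small sketch, assuming a default local store where experiment ``0`` exists:

.. code:: python

    from mlflow.tracking import get_service

    service = get_service()
    run = service.create_run(experiment_id=0)             # mlflow.entities.Run
    service.log_param(run.info.run_uuid, "alpha", "0.5")
    service.set_terminated(run.info.run_uuid)

    fetched = service.get_run(run.info.run_uuid)
    print(fetched.info.status)   # run.info is an mlflow.entities.RunInfo
    print(fetched.data.params)   # run.data is an mlflow.entities.RunData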
5 changes: 3 additions & 2 deletions mlflow/pytorch.py
@@ -98,7 +98,8 @@ def load_model(path, run_id=None, **kwargs):
"""
Load a PyTorch model from a local file (if run_id is None) or a run.
:param path: Local filesystem path or Run-relative artifact path to the model saved
by `mlflow.pytorch.log_model`.
by :py:func:`mlflow.pytorch.log_model`.
:param run_id: Run ID. If provided it is combined with path to identify the model.
:param kwargs: kwargs to pass to `torch.load` method
"""
@@ -115,7 +116,7 @@ def load_pyfunc(path, **kwargs):
corresponding (n x k) torch.FloatTensor (or torch.cuda.FloatTensor) as input to the PyTorch
model. ``predict`` returns the model's predictions (output tensor) in a single-column DataFrame.
:param path: Local filesystem path to the model saved by `mlflow.pytorch.log_model`.
:param path: Local filesystem path to the model saved by :py:func:`mlflow.pytorch.log_model`.
:param kwargs: kwargs to pass to `torch.load` method.
"""
return _PyTorchWrapper(_load_model(os.path.dirname(path), **kwargs))
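
A round-trip sketch for the ``path``/``run_id`` combination described in the docstring above (assumes torch is installed; the toy model and artifact path are illustrative):

.. code:: python

    import torch
    import mlflow
    import mlflow.pytorch

    model = torch.nn.Linear(4, 1)  # toy model, for illustration only

    with mlflow.start_run() as run:
        mlflow.pytorch.log_model(model, "model")
        run_uuid = run.info.run_uuid

    # Run-relative artifact path plus run_id, as in the docstring above.
    loaded = mlflow.pytorch.load_model("model", run_id=run_uuid)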
2 changes: 1 addition & 1 deletion mlflow/tracking/fluent.py
@@ -106,7 +106,7 @@ def log_param(key, value):
Log the passed-in parameter under the current run, creating a run if necessary.
:param key: Parameter name (string)
:param value: Parameter value (string)
:param value: Parameter value (string, but will be string-ified if not)
"""
run_id = _get_or_start_run().info.run_uuid
get_service().log_param(run_id, key, value)
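
A one-line illustration of the updated ``log_param`` docstring (the non-string value is accepted and string-ified):

.. code:: python

    import mlflow

    with mlflow.start_run():
        mlflow.log_param("epochs", 10)  # stored as the string "10"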
50 changes: 28 additions & 22 deletions mlflow/tracking/service.py
@@ -26,23 +26,24 @@ def __init__(self, store):
self.store = store

def get_run(self, run_id):
""":return: mlflow.entities.Run associated with this run id"""
""":return: :py:class:`mlflow.entities.Run` associated with this run id"""
_validate_run_id(run_id)
return self.store.get_run(run_id)

def create_run(self, experiment_id, user_id=None, run_name=None, source_type=None,
source_name=None, entry_point_name=None, start_time=None,
source_version=None, tags=None):
"""Creates a new mlflow.entities.Run object, which can be associated with
"""Creates a new :py:class:`mlflow.entities.Run` object, which can be associated with
metrics, parameters, artifacts, etc.
Unlike mlflow.run(), this does not actually run any code, just creates objects.
Unlike mlflow.start_run(), this does not change the "active run" used by
mlflow.log_param and friends.
:param: user_id If not provided, we will use the current user as a default.
:param: start_time If not provided, we will use the current timestamp.
:param: tags A dictionary of key-value pairs which will be converted into
RunTag objects.
:return: mlflow.entities.Run which was created
Unlike :py:func:`mlflow.projects.run`, does not actually run code, just creates objects.
Unlike :py:func:`mlflow.start_run`, this does not change the "active run" used by
:py:func:`mlflow.log_param` and friends.
:param user_id: If not provided, we will use the current user as a default.
:param start_time: If not provided, we will use the current timestamp.
:param tags: A dictionary of key-value pairs which will be converted into
RunTag objects.
:return: :py:class:`mlflow.entities.Run` which was created
"""
tags = tags if tags else {}
return self.store.create_run(
@@ -58,22 +59,23 @@ def create_run(self, experiment_id, user_id=None, run_name=None, source_type=Non
)

def list_runs(self, experiment_id):
""":return: list of mlflow.entities.Run (with only RunInfo filled) within the experiment"""
""":return: list of :py:class:`mlflow.entities.Run` (with only RunInfo filled)"""
run_infos = self.store.list_run_infos(experiment_id)
return [Run(run_info.run_uuid, run_info) for run_info in run_infos]

def list_experiments(self):
""":return: list of mlflow.entities.Experiment"""
""":return: list of :py:class:`mlflow.entities.Experiment`"""
return self.store.list_experiments()

def get_experiment(self, experiment_id):
""":return: mlflow.entities.Experiment associated with the given experiemnt id"""
""":return: :py:class:`mlflow.entities.Experiment`"""
return self.store.get_experiment(experiment_id)

def create_experiment(self, name, artifact_location=None):
"""Creates an experiment.
:param: name must be unique
:artifact_location: If not provided, the backing server will pick an appropriate default.
:param name: must be unique
:param artifact_location: If not provided, the server will pick an appropriate default.
:return: integer id of the created experiment
"""
return self.store.create_experiment(
Expand All @@ -98,22 +100,26 @@ def log_param(self, run_id, key, value):

def log_artifact(self, artifact_uri, local_path, artifact_path=None):
"""Writes a local file to the remote artifact_uri.
:param: local_path of the file to write
:param: artifact_path If provided, will be directory in artifact_uri to write to"""
:param local_path: of the file to write
:param artifact_path: If provided, will be directory in artifact_uri to write to"""
artifact_repo = ArtifactRepository.from_artifact_uri(artifact_uri, self.store)
artifact_repo.log_artifact(local_path, artifact_path)

def log_artifacts(self, artifact_uri, local_dir, artifact_path=None):
"""Writes a directory of files to the remote artifact_uri.
:param: local_dir of the file to write
:param: artifact_path If provided, will be directory in artifact_uri to write to"""
:param local_dir: of the file to write
:param artifact_path: If provided, will be directory in artifact_uri to write to"""
artifact_repo = ArtifactRepository.from_artifact_uri(artifact_uri, self.store)
artifact_repo.log_artifacts(local_dir, artifact_path)

def set_terminated(self, run_id, status=None, end_time=None):
"""Sets a Run's status to terminated
:param: status A string value of mlflow.entities.RunStatus. Defaults to FINISHED.
:param: end_time If not provided, defaults to the current time."""
:param status: A string value of :py:class:`mlflow.entities.RunStatus`.
Defaults to FINISHED.
:param end_time: If not provided, defaults to the current time."""
end_time = end_time if end_time else int(time.time() * 1000)
status = status if status else "FINISHED"
self.store.update_run_info(run_id, run_status=RunStatus.from_string(status),
@@ -122,7 +128,7 @@ def set_terminated(self, run_id, status=None, end_time=None):

def get_service(tracking_uri=None):
"""
:param: tracking_uri Address of local or remote tracking server. If not provided,
:param tracking_uri: Address of local or remote tracking server. If not provided,
this will default to the store set by mlflow.tracking.set_tracking_uri. See
https://mlflow.org/docs/latest/tracking.html#where-runs-get-recorded for more info.
:return: mlflow.tracking.MLflowService"""
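
Putting the service methods above together, a minimal end-to-end sketch against a local store (the experiment name and file contents are made up for the example):

.. code:: python

    import os
    import tempfile
    from mlflow.tracking import get_service

    service = get_service()  # honors MLFLOW_TRACKING_URI / set_tracking_uri()

    # create_experiment returns the integer id of the new experiment.
    exp_id = service.create_experiment("tracking-service-demo")
    run = service.create_run(exp_id)
    service.log_param(run.info.run_uuid, "alpha", "0.5")

    # Write a small local file and upload it as a run artifact.
    local_path = os.path.join(tempfile.mkdtemp(), "note.txt")
    with open(local_path, "w") as f:
        f.write("hello from the tracking service API\n")
    service.log_artifact(run.info.artifact_uri, local_path)

    service.set_terminated(run.info.run_uuid)  # status defaults to FINISHED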
2 changes: 1 addition & 1 deletion mlflow/version.py
@@ -1,4 +1,4 @@
# Copyright 2018 Databricks, Inc.


VERSION = '0.4.2'
VERSION = '0.5.0'
6 changes: 3 additions & 3 deletions test-requirements.txt
@@ -14,7 +14,7 @@ pytest-cov
rstcheck==3.2
scipy
tensorflow
http://download.pytorch.org/whl/cpu/torch-0.4.1-cp27-cp27mu-linux_x86_64.whl ; python_version == '2.7'
http://download.pytorch.org/whl/cpu/torch-0.4.1-cp36-cp36m-linux_x86_64.whl ; python_version == '3.6'
torch
torchvision
pysftp
keras
keras
