Update release version to 0.5.0 (mlflow#323)
aarondav committed Aug 18, 2018
1 parent 384001f commit faf43d9
Showing 9 changed files with 92 additions and 31 deletions.
37 changes: 37 additions & 0 deletions CHANGELOG.rst
@@ -1,6 +1,43 @@
Changelog
=========

0.5.0 (2018-08-17)
------------------

MLflow 0.5.0 offers some major improvements, including first-class support for Keras and PyTorch models, SFTP support as an artifact store, a new scatterplot visualization to compare runs, and a more complete Python SDK for experiment and run management.

Breaking changes:

- The Tracking API has been split into two pieces, a "basic logging" API and a "tracking service" API. The "basic logging" API deals with logging metrics, parameters, and artifacts to the currently-active run, and is accessible in ``mlflow`` (e.g., ``mlflow.log_param``). The tracking service API allows managing experiments and runs (especially historical runs) and is available in ``mlflow.tracking``. The tracking service API will be analogous to the upcoming R and Java Tracking Service SDKs. Please be aware of the following breaking changes (a short before/after sketch follows this list):

- ``mlflow.tracking`` no longer exposes the basic logging API, only ``mlflow``. So, code that was written like ``from mlflow.tracking import log_param`` will have to be ``from mlflow import log_param`` (note that almost all examples were already doing this).
- Access to the service API goes through the ``mlflow.tracking.get_service()`` function, which relies on the same tracking server set by either the environment variable ``MLFLOW_TRACKING_URI`` or by code with ``mlflow.tracking.set_tracking_uri()``. So code that used to look like ``mlflow.tracking.get_run()`` will now have to do ``mlflow.tracking.get_service().get_run()``. This does not apply to the basic logging API.
- ``mlflow.ActiveRun`` has been converted into a lightweight wrapper around ``mlflow.entities.Run`` to enable the Python ``with`` syntax. This means that there are no longer any special methods on the object returned when calling ``mlflow.start_run()``; such calls should be migrated to the tracking service API.

- The Python entities returned by the tracking service API are now accessible in ``mlflow.entities`` directly. Where previously you may have used ``mlflow.entities.experiment.Experiment``, you would now just use ``mlflow.entities.Experiment``. The previous version still exists, but is deprecated and may be hidden in a future version.
- REST API endpoint `/ajax-api/2.0/preview/mlflow/artifacts/get` has been moved to `$static_prefix/get-artifact`. This change is versioned together with the JavaScript client, so it should not be noticeable unless you were calling the REST API directly (#293, @andrewmchen)
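
A short before/after sketch of the API split described above (a minimal illustration against a default local tracking setup; the parameter names are made up for the example):

.. code:: python

    import mlflow
    import mlflow.tracking

    # Basic logging API: now exposed only on the top-level ``mlflow`` module
    # (previously also importable as ``from mlflow.tracking import log_param``).
    with mlflow.start_run():
        mlflow.log_param("alpha", "0.5")

    # Tracking service API: experiment/run management now goes through get_service()
    # (previously e.g. ``mlflow.tracking.get_run(run_id)``).
    service = mlflow.tracking.get_service()
    for experiment in service.list_experiments():
        print(experiment.experiment_id, experiment.name)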

Features:

- [Models] Keras integration: we now support logging Keras models directly in the ``log_model`` API, model format, and serving APIs (#280, @ToonKBC)
- [Models] PyTorch integration: we now support logging PyTorch models directly in the ``log_model`` API, model format, and serving APIs (#264, @vfdev-5); a brief logging sketch for both integrations follows this list
- [UI] Scatterplot added to "Compare Runs" view to help compare runs using any two metrics as the axes (#268, @ToonKBC)
- [Artifacts] SFTP artifact store added (#260, @ToonKBC)
- [Sagemaker] Users can specify a custom VPC when deploying SageMaker models (#304, @dbczumar)
- Pyfunc serialization now includes the Python version, and warns if the major version differs (can be suppressed by using ``load_pyfunc(suppress_warnings=True)``) (#230, @dbczumar)
- Pyfunc serve/predict will activate the conda environment stored in the MLmodel file. This can be disabled by adding ``--no-conda`` to ``mlflow pyfunc serve`` or ``mlflow pyfunc predict`` (#225, @0wu)
- Python SDK formalized in ``mlflow.tracking``. This includes adding SDK methods for ``get_run``, ``list_experiments``, ``get_experiment``, and ``set_terminated``. (#299, @aarondav)
- ``mlflow run`` can now be run against projects with no ``conda.yaml`` specified. By default, an empty conda environment will be created -- previously, it would just fail. You can still pass ``--no-conda`` to avoid entering a conda environment altogether (#218, @smurching)
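
As referenced in the Keras and PyTorch bullets above, a minimal logging sketch (assumes Keras is installed; the tiny model is purely illustrative, and ``mlflow.pytorch.log_model`` follows the same pattern):

.. code:: python

    import keras
    import mlflow
    import mlflow.keras

    # A trivial Keras model, used only to illustrate the logging call.
    model = keras.models.Sequential([keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer="sgd", loss="mse")

    with mlflow.start_run():
        # Saves the model in the Keras flavor as an artifact of the active run.
        mlflow.keras.log_model(model, "model")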

Bug fixes:

- Fix numpy array serialization for int64 and other related types, allowing pyfunc to return such results (#240, @arinto)
- Fix the DBFS artifact store when ``log_artifacts`` is called with binary data (#295, @aarondav)
- Fix Run Command shown in UI to reproduce a run when the original run is targeted at a subdirectory of a Git repo (#294, @adrian555)
- Filter out ubiquitous dtype/ufunc warning messages (#317, @aarondav)
- Minor bug fixes and documentation updates (#261, @stbof; #279, @dmatrix; #313, @rbang1; #320, @yassineAlouini; #321, @tomasatdatabricks; #266, #282, #289, @smurching; #267, #265, @aarondav; #256, #290, @ToonKBC; #273, #263, @mateiz; #272, #319, @adrian555; #277, @aadamson; #283, #296, @andrewmchen)


0.4.2 (2018-08-07)
------------------

17 changes: 17 additions & 0 deletions docs/source/tracking.rst
@@ -72,6 +72,8 @@ Logging Data to Runs
You can log data to runs using either the MLflow Python or REST API. This section
shows the Python API.

.. _basic_logging_functions:

Basic Logging Functions
^^^^^^^^^^^^^^^^^^^^^^^

@@ -156,6 +158,21 @@ environment variable.
mlflow.log_param("a", 1)
mlflow.log_metric("b", 2)
Managing Experiments and Runs with the Tracking Service API
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

MLflow provides a more detailed Tracking Service API for managing experiments and runs directly, which is available in the :doc:`mlflow.tracking<python_api/mlflow.tracking>` package. This makes it possible to query data about past runs, log additional information about them, create experiments and more.

Example usage:

.. code:: python

    from mlflow.tracking import get_service

    service = get_service()
    experiments = service.list_experiments()  # returns a list of mlflow.entities.Experiment
    run = service.create_run(experiments[0].experiment_id)  # returns mlflow.entities.Run
    service.log_param(run.info.run_uuid, "hello", "world")
    service.set_terminated(run.info.run_uuid)

.. _tracking_ui:

2 changes: 0 additions & 2 deletions docs/source/tutorial.rst
@@ -18,8 +18,6 @@ is from UCI's `machine learning repository <http://archive.ics.uci.edu/ml/datase
:local:
:depth: 1

.. _what-youll-need
What You'll Need
----------------
To run this tutorial, you'll need to:
2 changes: 2 additions & 0 deletions mlflow/entities/run.py
@@ -17,10 +17,12 @@ def __init__(self, run_info, run_data):

@property
def info(self):
""":return: :py:class:`mlflow.entities.RunInfo`"""
return self._info

@property
def data(self):
""":return: :py:class:`mlflow.entities.RunData`"""
return self._data

def to_proto(self):
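
The ``info`` and ``data`` properties documented above can be exercised through the tracking service; a small sketch, assuming a default local store where experiment ``0`` exists:

.. code:: python

    from mlflow.tracking import get_service

    service = get_service()
    run = service.create_run(experiment_id=0)             # mlflow.entities.Run
    service.log_param(run.info.run_uuid, "alpha", "0.5")
    service.set_terminated(run.info.run_uuid)

    fetched = service.get_run(run.info.run_uuid)
    print(fetched.info.status)   # run.info is an mlflow.entities.RunInfo
    print(fetched.data.params)   # run.data is an mlflow.entities.RunData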
5 changes: 3 additions & 2 deletions mlflow/pytorch.py
@@ -98,7 +98,8 @@ def load_model(path, run_id=None, **kwargs):
"""
Load a PyTorch model from a local file (if run_id is None) or a run.
:param path: Local filesystem path or Run-relative artifact path to the model saved
by `mlflow.pytorch.log_model`.
by :py:func:`mlflow.pytorch.log_model`.
:param run_id: Run ID. If provided it is combined with path to identify the model.
:param kwargs: kwargs to pass to `torch.load` method
"""
@@ -115,7 +116,7 @@ def load_pyfunc(path, **kwargs):
corresponding (n x k) torch.FloatTensor (or torch.cuda.FloatTensor) as input to the PyTorch
model. ``predict`` returns the model's predictions (output tensor) in a single-column DataFrame.
:param path: Local filesystem path to the model saved by `mlflow.pytorch.log_model`.
:param path: Local filesystem path to the model saved by :py:func:`mlflow.pytorch.log_model`.
:param kwargs: kwargs to pass to `torch.load` method.
"""
return _PyTorchWrapper(_load_model(os.path.dirname(path), **kwargs))
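
A round-trip sketch for the ``path``/``run_id`` combination described in the docstring above (assumes torch is installed; the toy model and artifact path are illustrative):

.. code:: python

    import torch
    import mlflow
    import mlflow.pytorch

    model = torch.nn.Linear(4, 1)  # toy model, for illustration only

    with mlflow.start_run() as run:
        mlflow.pytorch.log_model(model, "model")
        run_uuid = run.info.run_uuid

    # Run-relative artifact path plus run_id, as in the docstring above.
    loaded = mlflow.pytorch.load_model("model", run_id=run_uuid)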
2 changes: 1 addition & 1 deletion mlflow/tracking/fluent.py
@@ -106,7 +106,7 @@ def log_param(key, value):
Log the passed-in parameter under the current run, creating a run if necessary.
:param key: Parameter name (string)
:param value: Parameter value (string)
:param value: Parameter value (string, but will be string-ified if not)
"""
run_id = _get_or_start_run().info.run_uuid
get_service().log_param(run_id, key, value)
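
A one-line illustration of the updated ``log_param`` docstring (the non-string value is accepted and string-ified):

.. code:: python

    import mlflow

    with mlflow.start_run():
        mlflow.log_param("epochs", 10)  # stored as the string "10"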
50 changes: 28 additions & 22 deletions mlflow/tracking/service.py
@@ -26,23 +26,24 @@ def __init__(self, store):
self.store = store

def get_run(self, run_id):
""":return: mlflow.entities.Run associated with this run id"""
""":return: :py:class:`mlflow.entities.Run` associated with this run id"""
_validate_run_id(run_id)
return self.store.get_run(run_id)

def create_run(self, experiment_id, user_id=None, run_name=None, source_type=None,
source_name=None, entry_point_name=None, start_time=None,
source_version=None, tags=None):
"""Creates a new mlflow.entities.Run object, which can be associated with
"""Creates a new :py:class:`mlflow.entities.Run` object, which can be associated with
metrics, parameters, artifacts, etc.
Unlike mlflow.run(), this does not actually run any code, just creates objects.
Unlike mlflow.start_run(), this does not change the "active run" used by
mlflow.log_param and friends.
:param: user_id If not provided, we will use the current user as a default.
:param: start_time If not provided, we will use the current timestamp.
:param: tags A dictionary of key-value pairs which will be converted into
RunTag objects.
:return: mlflow.entities.Run which was created
Unlike :py:func:`mlflow.projects.run`, does not actually run code, just creates objects.
Unlike :py:func:`mlflow.start_run`, this does not change the "active run" used by
:py:func:`mlflow.log_param` and friends.
:param user_id: If not provided, we will use the current user as a default.
:param start_time: If not provided, we will use the current timestamp.
:param tags: A dictionary of key-value pairs which will be converted into
RunTag objects.
:return: :py:class:`mlflow.entities.Run` which was created
"""
tags = tags if tags else {}
return self.store.create_run(
@@ -58,22 +59,23 @@ def create_run(self, experiment_id, user_id=None, run_name=None, source_type=Non
)

def list_runs(self, experiment_id):
""":return: list of mlflow.entities.Run (with only RunInfo filled) within the experiment"""
""":return: list of :py:class:`mlflow.entities.Run` (with only RunInfo filled)"""
run_infos = self.store.list_run_infos(experiment_id)
return [Run(run_info.run_uuid, run_info) for run_info in run_infos]

def list_experiments(self):
""":return: list of mlflow.entities.Experiment"""
""":return: list of :py:class:`mlflow.entities.Experiment`"""
return self.store.list_experiments()

def get_experiment(self, experiment_id):
""":return: mlflow.entities.Experiment associated with the given experiemnt id"""
""":return: :py:class:`mlflow.entities.Experiment`"""
return self.store.get_experiment(experiment_id)

def create_experiment(self, name, artifact_location=None):
"""Creates an experiment.
:param: name must be unique
:artifact_location: If not provided, the backing server will pick an appropriate default.
:param name: must be unique
:param artifact_location: If not provided, the server will pick an appropriate default.
:return: integer id of the created experiment
"""
return self.store.create_experiment(
Expand All @@ -98,22 +100,26 @@ def log_param(self, run_id, key, value):

def log_artifact(self, artifact_uri, local_path, artifact_path=None):
"""Writes a local file to the remote artifact_uri.
:param: local_path of the file to write
:param: artifact_path If provided, will be directory in artifact_uri to write to"""
:param local_path: of the file to write
:param artifact_path: If provided, will be directory in artifact_uri to write to"""
artifact_repo = ArtifactRepository.from_artifact_uri(artifact_uri, self.store)
artifact_repo.log_artifact(local_path, artifact_path)

def log_artifacts(self, artifact_uri, local_dir, artifact_path=None):
"""Writes a directory of files to the remote artifact_uri.
:param: local_dir of the file to write
:param: artifact_path If provided, will be directory in artifact_uri to write to"""
:param local_dir: of the file to write
:param artifact_path: If provided, will be directory in artifact_uri to write to"""
artifact_repo = ArtifactRepository.from_artifact_uri(artifact_uri, self.store)
artifact_repo.log_artifacts(local_dir, artifact_path)

def set_terminated(self, run_id, status=None, end_time=None):
"""Sets a Run's status to terminated
:param: status A string value of mlflow.entities.RunStatus. Defaults to FINISHED.
:param: end_time If not provided, defaults to the current time."""
:param status: A string value of :py:class:`mlflow.entities.RunStatus`.
Defaults to FINISHED.
:param end_time: If not provided, defaults to the current time."""
end_time = end_time if end_time else int(time.time() * 1000)
status = status if status else "FINISHED"
self.store.update_run_info(run_id, run_status=RunStatus.from_string(status),
@@ -122,7 +128,7 @@ def set_terminated(self, run_id, status=None, end_time=None):

def get_service(tracking_uri=None):
"""
:param: tracking_uri Address of local or remote tracking server. If not provided,
:param tracking_uri: Address of local or remote tracking server. If not provided,
this will default to the store set by mlflow.tracking.set_tracking_uri. See
https://mlflow.org/docs/latest/tracking.html#where-runs-get-recorded for more info.
:return: mlflow.tracking.MLflowService"""
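
Putting the service methods above together, a minimal end-to-end sketch against a local store (the experiment name and file contents are made up for the example):

.. code:: python

    import os
    import tempfile
    from mlflow.tracking import get_service

    service = get_service()  # honors MLFLOW_TRACKING_URI / set_tracking_uri()

    # create_experiment returns the integer id of the new experiment.
    exp_id = service.create_experiment("tracking-service-demo")
    run = service.create_run(exp_id)
    service.log_param(run.info.run_uuid, "alpha", "0.5")

    # Write a small local file and upload it as a run artifact.
    local_path = os.path.join(tempfile.mkdtemp(), "note.txt")
    with open(local_path, "w") as f:
        f.write("hello from the tracking service API\n")
    service.log_artifact(run.info.artifact_uri, local_path)

    service.set_terminated(run.info.run_uuid)  # status defaults to FINISHED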
2 changes: 1 addition & 1 deletion mlflow/version.py
@@ -1,4 +1,4 @@
# Copyright 2018 Databricks, Inc.


VERSION = '0.4.2'
VERSION = '0.5.0'
6 changes: 3 additions & 3 deletions test-requirements.txt
@@ -14,7 +14,7 @@ pytest-cov
rstcheck==3.2
scipy
tensorflow
http://download.pytorch.org/whl/cpu/torch-0.4.1-cp27-cp27mu-linux_x86_64.whl ; python_version == '2.7'
http://download.pytorch.org/whl/cpu/torch-0.4.1-cp36-cp36m-linux_x86_64.whl ; python_version == '3.6'
torch
torchvision
pysftp
keras
keras
