update samples from Release-6 as a part of 1.1.5rc0 SDK stable release

fomensah · Mar 11, 2020 · e443fd1
1 parent 2165cf3

Showing 17 changed files with 331 additions and 30 deletions.
diff --git a/configuration.ipynb b/configuration.ipynb
@@ -103,7 +103,7 @@
       "source": [
         "import azureml.core\n",
         "\n",
-        "print(\"This notebook was created using version 1.1.2rc0 of the Azure ML SDK\")\n",
+        "print(\"This notebook was created using version 1.1.5rc0 of the Azure ML SDK\")\n",
         "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
       ]
     },
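
Reviewer sketch (not part of the commit): the cell above only prints both versions and leaves the comparison to the reader. A minimal way to compare them programmatically is shown below, assuming version strings of the form `N.N.N` with an optional `rcN` suffix. The helper names `parse_version` and `is_older` are hypothetical and do not come from the SDK.

```python
import re

# Version string taken from the updated print statement in the notebook.
NOTEBOOK_SDK_VERSION = "1.1.5rc0"


def parse_version(version):
    """Turn '1.1.5rc0' into a sortable tuple: ((1, 1, 5), is_final, rc_number)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:rc(\d+))?", version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release = tuple(int(part) for part in match.group(1).split("."))
    rc = match.group(2)
    # False sorts before True, so a release candidate sorts before
    # the final release carrying the same number.
    return (release, rc is None, int(rc) if rc is not None else 0)


def is_older(installed, required=NOTEBOOK_SDK_VERSION):
    """Return True when `installed` sorts strictly before `required`."""
    return parse_version(installed) < parse_version(required)
```

In the notebook this could be called as `is_older(azureml.core.VERSION)` to warn the user before running the remaining cells. Version strings with other suffixes (for example `.post1`) are outside what this sketch handles and raise `ValueError`.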

diff --git a/how-to-use-azureml/automated-machine-learning/README.md b/how-to-use-azureml/automated-machine-learning/README.md
@@ -144,7 +144,7 @@ jupyter notebook
     - Dataset: forecasting for a bike-sharing
     - Example of training an automated ML forecasting model on multiple time-series
 
-- [automl-forecasting-function.ipynb](forecasting-high-frequency/automl-forecasting-function.ipynb)
+- [auto-ml-forecasting-function.ipynb](forecasting-high-frequency/auto-ml-forecasting-function.ipynb)
     - Example of training an automated ML forecasting model on multiple time-series
 
 - [auto-ml-forecasting-beer-remote.ipynb](forecasting-beer-remote/auto-ml-forecasting-beer-remote.ipynb)

diff --git a/how-to-use-azureml/automated-machine-learning/automl_env.yml b/how-to-use-azureml/automated-machine-learning/automl_env.yml
@@ -21,6 +21,7 @@ dependencies:
 - pip:
   # Required packages for AzureML execution, history, and data preparation.
   - azureml-defaults
+  - azureml-dataprep[pandas]
   - azureml-train-automl
   - azureml-train
   - azureml-widgets

diff --git a/how-to-use-azureml/automated-machine-learning/automl_env_mac.yml b/how-to-use-azureml/automated-machine-learning/automl_env_mac.yml
@@ -22,6 +22,7 @@ dependencies:
 - pip:
   # Required packages for AzureML execution, history, and data preparation.
   - azureml-defaults
+  - azureml-dataprep[pandas]
   - azureml-train-automl
   - azureml-train
   - azureml-widgets

diff --git a/...equency/automl-forecasting-function.ipynb b/...quency/auto-ml-forecasting-function.ipynb
diff --git a/...frequency/automl-forecasting-function.yml b/...requency/auto-ml-forecasting-function.yml
@@ -1,4 +1,4 @@
-name: automl-forecasting-function
+name: auto-ml-forecasting-function
 dependencies:
 - fbprophet==0.5
 - py-xgboost<=0.80

diff --git a/...achine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.yml b/...achine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.yml
@@ -4,6 +4,7 @@ dependencies:
 - py-xgboost<=0.80
 - pip:
   - azureml-sdk
+  - pandas==0.23.4
   - azureml-train-automl
   - azureml-widgets
   - matplotlib
diff --git a/...featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.ipynb b/...featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.ipynb
@@ -532,8 +532,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "#### Create conda configuration for model explanations experiment\n",
-        "We need `azureml-explain-model`, `azureml-train-automl` and `azureml-core` packages for computing model explanations for your AutoML model on remote compute."
+        "#### Create conda configuration for model explanations experiment from automl_run object"
       ]
     },
     {
@@ -552,13 +551,9 @@
         "# Set compute target to AmlCompute\n",
         "conda_run_config.target = compute_target\n",
         "conda_run_config.environment.docker.enabled = True\n",
-        "azureml_pip_packages = [\n",
-        "    'azureml-train-automl', 'azureml-core', 'azureml-explain-model'\n",
-        "]\n",
         "\n",
         "# specify CondaDependencies obj\n",
-        "conda_run_config.environment.python.conda_dependencies = CondaDependencies.create(\n",
-        "    pip_packages=azureml_pip_packages)"
+        "conda_run_config.environment.python.conda_dependencies = automl_run.get_environment().python.conda_dependencies"
       ]
     },
     {

diff --git a/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.yml b/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.yml
@@ -2,6 +2,7 @@ name: auto-ml-regression
 dependencies:
 - pip:
   - azureml-sdk
+  - pandas==0.23.4
   - azureml-train-automl
   - azureml-widgets
   - matplotlib
diff --git a/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.ipynb b/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.ipynb
@@ -341,9 +341,6 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "import json\n",
-        "\n",
-        "\n",
         "input_payload = json.dumps({\n",
         "    'data': [\n",
         "        [ 0.03807591,  0.05068012,  0.06169621, 0.02187235, -0.0442235,\n",
@@ -376,16 +373,101 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "### Model profiling\n",
+        "### Model Profiling\n",
         "\n",
-        "You can also take advantage of the profiling feature to estimate CPU and memory requirements for models.\n",
+        "Profile your model to understand how much CPU and memory the service, created as a result of its deployment, will need. Profiling returns information such as CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage. You can profile your model (or more precisely the service built based on your model) on any CPU and/or memory combination where 0.1 <= CPU <= 3.5 and 0.1GB <= memory <= 15GB. If you do not provide a CPU and/or memory requirement, we will test it on the default configuration of 3.5 CPU and 15GB memory.\n",
         "\n",
-        "```python\n",
-        "profile = Model.profile(ws, \"profilename\", [model], inference_config, test_sample)\n",
-        "profile.wait_for_profiling(True)\n",
-        "profiling_results = profile.get_results()\n",
-        "print(profiling_results)\n",
-        "```"
+        "In order to profile your model you will need:\n",
+        "- a registered model\n",
+        "- an entry script\n",
+        "- an inference configuration\n",
+        "- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
+        "\n",
+        "At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
+        "\n",
+        "Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based on one hundred instances of the same request data. In real-world scenarios, however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from azureml.core import Datastore\n",
+        "from azureml.core.dataset import Dataset\n",
+        "from azureml.data import dataset_type_definitions\n",
+        "\n",
+        "\n",
+        "# create a string that can be utf-8 encoded and\n",
+        "# put in the body of the request\n",
+        "serialized_input_json = json.dumps({\n",
+        "    'data': [\n",
+        "        [ 0.03807591,  0.05068012,  0.06169621, 0.02187235, -0.0442235,\n",
+        "         -0.03482076, -0.04340085, -0.00259226, 0.01990842, -0.01764613]\n",
+        "    ]\n",
+        "})\n",
+        "dataset_content = []\n",
+        "for i in range(100):\n",
+        "    dataset_content.append(serialized_input_json)\n",
+        "dataset_content = '\\n'.join(dataset_content)\n",
+        "file_name = 'sample_request_data.txt'\n",
+        "f = open(file_name, 'w')\n",
+        "f.write(dataset_content)\n",
+        "f.close()\n",
+        "\n",
+        "# upload the txt file created above to the Datastore and create a dataset from it\n",
+        "data_store = Datastore.get_default(ws)\n",
+        "data_store.upload_files(['./' + file_name], target_path='sample_request_data')\n",
+        "datastore_path = [(data_store, 'sample_request_data' +'/' + file_name)]\n",
+        "sample_request_data = Dataset.Tabular.from_delimited_files(\n",
+        "    datastore_path,\n",
+        "    separator='\\n',\n",
+        "    infer_column_types=True,\n",
+        "    header=dataset_type_definitions.PromoteHeadersBehavior.NO_HEADERS)\n",
+        "sample_request_data = sample_request_data.register(workspace=ws,\n",
+        "                                                   name='diabetes_sample_request_data',\n",
+        "                                                   create_new_version=True)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Now that we have an input dataset we are ready to go ahead with profiling. In this case we are testing the previously introduced sklearn regression model on 1 CPU and 0.5 GB memory. The memory usage and recommendation presented in the result are measured in gigabytes. The CPU usage and recommendation are measured in CPU cores."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from datetime import datetime\n",
+        "\n",
+        "\n",
+        "environment = Environment('my-sklearn-environment')\n",
+        "environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
+        "    'azureml-defaults',\n",
+        "    'inference-schema[numpy-support]',\n",
+        "    'joblib',\n",
+        "    'numpy',\n",
+        "    'scikit-learn'\n",
+        "])\n",
+        "inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n",
+        "# if cpu and memory_in_gb parameters are not provided\n",
+        "# the model will be profiled on default configuration of\n",
+        "# 3.5CPU and 15GB memory\n",
+        "profile = Model.profile(ws,\n",
+        "            'rgrsn-%s' % datetime.now().strftime('%m%d%Y-%H%M%S'),\n",
+        "            [model],\n",
+        "            inference_config,\n",
+        "            input_dataset=sample_request_data,\n",
+        "            cpu=1.0,\n",
+        "            memory_in_gb=0.5)\n",
+        "\n",
+        "profile.wait_for_completion(True)\n",
+        "details = profile.get_details()"
       ]
     },
     {
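
Reviewer sketch (not part of the commit): the dataset cell added above relies on every serialized request occupying exactly one line, because the file is later split on `'\n'`. The standalone version below writes the same one-hundred-row file with a context manager and reads it back to confirm each row is a valid request body. The helper names `write_request_file` and `read_requests` are hypothetical, and the temporary directory stands in for the notebook's working directory. No Azure ML calls are involved.

```python
import json
import os
import tempfile


def write_request_file(path, payload, copies=100):
    """Write `copies` newline-separated copies of the JSON-serialized payload."""
    line = json.dumps(payload)
    # With the default indent=None, json.dumps emits no raw newlines,
    # so each request stays on a single row of the file.
    if "\n" in line:
        raise ValueError("serialized request must fit on one line")
    with open(path, "w") as f:
        f.write("\n".join([line] * copies))
    return line


def read_requests(path):
    """Parse every row of the file back into a Python object."""
    with open(path) as f:
        return [json.loads(row) for row in f.read().split("\n")]


# Same request body as the notebook cell in this diff.
payload = {
    "data": [
        [0.03807591, 0.05068012, 0.06169621, 0.02187235, -0.0442235,
         -0.03482076, -0.04340085, -0.00259226, 0.01990842, -0.01764613]
    ]
}
sample_path = os.path.join(tempfile.mkdtemp(), "sample_request_data.txt")
write_request_file(sample_path, payload)
rows = read_requests(sample_path)
```

Reading the file back before uploading it is a cheap check that every row will deserialize as a request body once the dataset is split on the newline separator.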

diff --git a/how-to-use-azureml/deployment/deploy-to-local/register-model-deploy-local.ipynb b/how-to-use-azureml/deployment/deploy-to-local/register-model-deploy-local.ipynb
@@ -145,6 +145,110 @@
         "                                   environment=environment)"
       ]
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Model Profiling\n",
+        "\n",
+        "Profile your model to understand how much CPU and memory the service, created as a result of its deployment, will need. Profiling returns information such as CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage. You can profile your model (or more precisely the service built based on your model) on any CPU and/or memory combination where 0.1 <= CPU <= 3.5 and 0.1GB <= memory <= 15GB. If you do not provide a CPU and/or memory requirement, we will test it on the default configuration of 3.5 CPU and 15GB memory.\n",
+        "\n",
+        "In order to profile your model you will need:\n",
+        "- a registered model\n",
+        "- an entry script\n",
+        "- an inference configuration\n",
+        "- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
+        "\n",
+        "At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
+        "\n",
+        "Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based on one hundred instances of the same request data. In real-world scenarios, however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import json\n",
+        "from azureml.core import Datastore\n",
+        "from azureml.core.dataset import Dataset\n",
+        "from azureml.data import dataset_type_definitions\n",
+        "\n",
+        "\n",
+        "# create a string that can be put in the body of the request\n",
+        "serialized_input_json = json.dumps({\n",
+        "    'data': [\n",
+        "        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n",
+        "        [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]\n",
+        "    ]\n",
+        "})\n",
+        "dataset_content = []\n",
+        "for i in range(100):\n",
+        "    dataset_content.append(serialized_input_json)\n",
+        "dataset_content = '\\n'.join(dataset_content)\n",
+        "file_name = 'sample_request_data_diabetes.txt'\n",
+        "f = open(file_name, 'w')\n",
+        "f.write(dataset_content)\n",
+        "f.close()\n",
+        "\n",
+        "# upload the txt file created above to the Datastore and create a dataset from it\n",
+        "data_store = Datastore.get_default(ws)\n",
+        "data_store.upload_files(['./' + file_name], target_path='sample_request_data_diabetes')\n",
+        "datastore_path = [(data_store, 'sample_request_data_diabetes' +'/' + file_name)]\n",
+        "sample_request_data_diabetes = Dataset.Tabular.from_delimited_files(\n",
+        "    datastore_path,\n",
+        "    separator='\\n',\n",
+        "    infer_column_types=True,\n",
+        "    header=dataset_type_definitions.PromoteHeadersBehavior.NO_HEADERS)\n",
+        "sample_request_data_diabetes = sample_request_data_diabetes.register(workspace=ws,\n",
+        "                                                   name='sample_request_data_diabetes',\n",
+        "                                                   create_new_version=True)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Now that we have an input dataset we are ready to go ahead with profiling. In this case we are testing the previously introduced sklearn regression model on 1 CPU and 0.5 GB memory. The memory usage and recommendation presented in the result are measured in gigabytes. The CPU usage and recommendation are measured in CPU cores."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from datetime import datetime\n",
+        "from azureml.core import Environment\n",
+        "from azureml.core.conda_dependencies import CondaDependencies\n",
+        "from azureml.core.model import Model, InferenceConfig\n",
+        "\n",
+        "\n",
+        "environment = Environment('my-sklearn-environment')\n",
+        "environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
+        "    'azureml-defaults',\n",
+        "    'inference-schema[numpy-support]',\n",
+        "    'joblib',\n",
+        "    'numpy',\n",
+        "    'scikit-learn'\n",
+        "])\n",
+        "inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n",
+        "# if cpu and memory_in_gb parameters are not provided\n",
+        "# the model will be profiled on default configuration of\n",
+        "# 3.5CPU and 15GB memory\n",
+        "profile = Model.profile(ws,\n",
+        "            'profile-%s' % datetime.now().strftime('%m%d%Y-%H%M%S'),\n",
+        "            [model],\n",
+        "            inference_config,\n",
+        "            input_dataset=sample_request_data_diabetes,\n",
+        "            cpu=1.0,\n",
+        "            memory_in_gb=0.5)\n",
+        "\n",
+        "profile.wait_for_completion(True)\n",
+        "details = profile.get_details()"
+      ]
+    },
     {
       "cell_type": "markdown",
       "metadata": {},