Skip to content

Commit

Permalink
update samples from Release-6 as a part of 1.1.5rc0 SDK stable release
Browse files Browse the repository at this point in the history
  • Loading branch information
vizhur committed Mar 11, 2020
1 parent 2165cf3 commit e443fd1
Show file tree
Hide file tree
Showing 17 changed files with 331 additions and 30 deletions.
2 changes: 1 addition & 1 deletion configuration.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@
"source": [
"import azureml.core\n",
"\n",
"print(\"This notebook was created using version 1.1.2rc0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.1.5rc0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down
2 changes: 1 addition & 1 deletion how-to-use-azureml/automated-machine-learning/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -144,7 +144,7 @@ jupyter notebook
- Dataset: forecasting for a bike-sharing
- Example of training an automated ML forecasting model on multiple time-series

- [automl-forecasting-function.ipynb](forecasting-high-frequency/automl-forecasting-function.ipynb)
- [auto-ml-forecasting-function.ipynb](forecasting-high-frequency/auto-ml-forecasting-function.ipynb)
- Example of training an automated ML forecasting model on multiple time-series

- [auto-ml-forecasting-beer-remote.ipynb](forecasting-beer-remote/auto-ml-forecasting-beer-remote.ipynb)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ dependencies:
- pip:
# Required packages for AzureML execution, history, and data preparation.
- azureml-defaults
- azureml-dataprep[pandas]
- azureml-train-automl
- azureml-train
- azureml-widgets
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ dependencies:
- pip:
# Required packages for AzureML execution, history, and data preparation.
- azureml-defaults
- azureml-dataprep[pandas]
- azureml-train-automl
- azureml-train
- azureml-widgets
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
name: automl-forecasting-function
name: auto-ml-forecasting-function
dependencies:
- fbprophet==0.5
- py-xgboost<=0.80
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ dependencies:
- py-xgboost<=0.80
- pip:
- azureml-sdk
- pandas==0.23.4
- azureml-train-automl
- azureml-widgets
- matplotlib
Original file line number Diff line number Diff line change
Expand Up @@ -532,8 +532,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create conda configuration for model explanations experiment\n",
"We need `azureml-explain-model`, `azureml-train-automl` and `azureml-core` packages for computing model explanations for your AutoML model on remote compute."
"#### Create conda configuration for model explanations experiment from automl_run object"
]
},
{
Expand All @@ -552,13 +551,9 @@
"# Set compute target to AmlCompute\n",
"conda_run_config.target = compute_target\n",
"conda_run_config.environment.docker.enabled = True\n",
"azureml_pip_packages = [\n",
" 'azureml-train-automl', 'azureml-core', 'azureml-explain-model'\n",
"]\n",
"\n",
"# specify CondaDependencies obj\n",
"conda_run_config.environment.python.conda_dependencies = CondaDependencies.create(\n",
" pip_packages=azureml_pip_packages)"
"conda_run_config.environment.python.conda_dependencies = automl_run.get_environment().python.conda_dependencies"
]
},
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ name: auto-ml-regression
dependencies:
- pip:
- azureml-sdk
- pandas==0.23.4
- azureml-train-automl
- azureml-widgets
- matplotlib
Original file line number Diff line number Diff line change
Expand Up @@ -341,9 +341,6 @@
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"\n",
"input_payload = json.dumps({\n",
" 'data': [\n",
" [ 0.03807591, 0.05068012, 0.06169621, 0.02187235, -0.0442235,\n",
Expand Down Expand Up @@ -376,16 +373,101 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Model profiling\n",
"### Model Profiling\n",
"\n",
"You can also take advantage of the profiling feature to estimate CPU and memory requirements for models.\n",
"Profile your model to understand how much CPU and memory the service, created as a result of its deployment, will need. Profiling returns information such as CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage. You can profile your model (or more precisely the service built based on your model) on any CPU and/or memory combination where 0.1 <= CPU <= 3.5 and 0.1GB <= memory <= 15GB. If you do not provide a CPU and/or memory requirement, we will test it on the default configuration of 3.5 CPU and 15GB memory.\n",
"\n",
"```python\n",
"profile = Model.profile(ws, \"profilename\", [model], inference_config, test_sample)\n",
"profile.wait_for_profiling(True)\n",
"profiling_results = profile.get_results()\n",
"print(profiling_results)\n",
"```"
"In order to profile your model you will need:\n",
"- a registered model\n",
"- an entry script\n",
"- an inference configuration\n",
"- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
"\n",
"At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
"\n",
"Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Datastore\n",
"from azureml.core.dataset import Dataset\n",
"from azureml.data import dataset_type_definitions\n",
"\n",
"\n",
"# create a string that can be utf-8 encoded and\n",
"# put in the body of the request\n",
"serialized_input_json = json.dumps({\n",
" 'data': [\n",
" [ 0.03807591, 0.05068012, 0.06169621, 0.02187235, -0.0442235,\n",
" -0.03482076, -0.04340085, -0.00259226, 0.01990842, -0.01764613]\n",
" ]\n",
"})\n",
"dataset_content = []\n",
"for i in range(100):\n",
" dataset_content.append(serialized_input_json)\n",
"dataset_content = '\\n'.join(dataset_content)\n",
"file_name = 'sample_request_data.txt'\n",
"f = open(file_name, 'w')\n",
"f.write(dataset_content)\n",
"f.close()\n",
"\n",
"# upload the txt file created above to the Datastore and create a dataset from it\n",
"data_store = Datastore.get_default(ws)\n",
"data_store.upload_files(['./' + file_name], target_path='sample_request_data')\n",
"datastore_path = [(data_store, 'sample_request_data' +'/' + file_name)]\n",
"sample_request_data = Dataset.Tabular.from_delimited_files(\n",
" datastore_path,\n",
" separator='\\n',\n",
" infer_column_types=True,\n",
" header=dataset_type_definitions.PromoteHeadersBehavior.NO_HEADERS)\n",
"sample_request_data = sample_request_data.register(workspace=ws,\n",
" name='diabetes_sample_request_data',\n",
" create_new_version=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have an input dataset we are ready to go ahead with profiling. In this case we are testing the previously introduced sklearn regression model on 1 CPU and 0.5 GB memory. The memory usage and recommendation presented in the result is measured in Gigabytes. The CPU usage and recommendation is measured in CPU cores."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime\n",
"\n",
"\n",
"environment = Environment('my-sklearn-environment')\n",
"environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
" 'azureml-defaults',\n",
" 'inference-schema[numpy-support]',\n",
" 'joblib',\n",
" 'numpy',\n",
" 'scikit-learn'\n",
"])\n",
"inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n",
"# if cpu and memory_in_gb parameters are not provided\n",
"# the model will be profiled on default configuration of\n",
"# 3.5CPU and 15GB memory\n",
"profile = Model.profile(ws,\n",
" 'rgrsn-%s' % datetime.now().strftime('%m%d%Y-%H%M%S'),\n",
" [model],\n",
" inference_config,\n",
" input_dataset=sample_request_data,\n",
" cpu=1.0,\n",
" memory_in_gb=0.5)\n",
"\n",
"profile.wait_for_completion(True)\n",
"details = profile.get_details()"
]
},
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,110 @@
" environment=environment)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Model Profiling\n",
"\n",
"Profile your model to understand how much CPU and memory the service, created as a result of its deployment, will need. Profiling returns information such as CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage. You can profile your model (or more precisely the service built based on your model) on any CPU and/or memory combination where 0.1 <= CPU <= 3.5 and 0.1GB <= memory <= 15GB. If you do not provide a CPU and/or memory requirement, we will test it on the default configuration of 3.5 CPU and 15GB memory.\n",
"\n",
"In order to profile your model you will need:\n",
"- a registered model\n",
"- an entry script\n",
"- an inference configuration\n",
"- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
"\n",
"At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
"\n",
"Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"from azureml.core import Datastore\n",
"from azureml.core.dataset import Dataset\n",
"from azureml.data import dataset_type_definitions\n",
"\n",
"\n",
"# create a string that can be put in the body of the request\n",
"serialized_input_json = json.dumps({\n",
" 'data': [\n",
" [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n",
" [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]\n",
" ]\n",
"})\n",
"dataset_content = []\n",
"for i in range(100):\n",
" dataset_content.append(serialized_input_json)\n",
"dataset_content = '\\n'.join(dataset_content)\n",
"file_name = 'sample_request_data_diabetes.txt'\n",
"f = open(file_name, 'w')\n",
"f.write(dataset_content)\n",
"f.close()\n",
"\n",
"# upload the txt file created above to the Datastore and create a dataset from it\n",
"data_store = Datastore.get_default(ws)\n",
"data_store.upload_files(['./' + file_name], target_path='sample_request_data_diabetes')\n",
"datastore_path = [(data_store, 'sample_request_data_diabetes' +'/' + file_name)]\n",
"sample_request_data_diabetes = Dataset.Tabular.from_delimited_files(\n",
" datastore_path,\n",
" separator='\\n',\n",
" infer_column_types=True,\n",
" header=dataset_type_definitions.PromoteHeadersBehavior.NO_HEADERS)\n",
"sample_request_data_diabetes = sample_request_data_diabetes.register(workspace=ws,\n",
" name='sample_request_data_diabetes',\n",
" create_new_version=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have an input dataset we are ready to go ahead with profiling. In this case we are testing the previously introduced sklearn regression model on 1 CPU and 0.5 GB memory. The memory usage and recommendation presented in the result is measured in Gigabytes. The CPU usage and recommendation is measured in CPU cores."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime\n",
"from azureml.core import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"from azureml.core.model import Model, InferenceConfig\n",
"\n",
"\n",
"environment = Environment('my-sklearn-environment')\n",
"environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
" 'azureml-defaults',\n",
" 'inference-schema[numpy-support]',\n",
" 'joblib',\n",
" 'numpy',\n",
" 'scikit-learn'\n",
"])\n",
"inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n",
"# if cpu and memory_in_gb parameters are not provided\n",
"# the model will be profiled on default configuration of\n",
"# 3.5CPU and 15GB memory\n",
"profile = Model.profile(ws,\n",
" 'profile-%s' % datetime.now().strftime('%m%d%Y-%H%M%S'),\n",
" [model],\n",
" inference_config,\n",
" input_dataset=sample_request_data_diabetes,\n",
" cpu=1.0,\n",
" memory_in_gb=0.5)\n",
"\n",
"profile.wait_for_completion(True)\n",
"details = profile.get_details()"
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand Down
Loading

0 comments on commit e443fd1

Please sign in to comment.