Skip to content

Commit

Permalink
update samples from Release-8 as a part of 1.4.0 SDK stable release
Browse files Browse the repository at this point in the history
  • Loading branch information
vizhur committed Apr 27, 2020
1 parent 7970209 commit fd2b09e
Show file tree
Hide file tree
Showing 30 changed files with 774 additions and 72 deletions.
2 changes: 1 addition & 1 deletion configuration.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@
"source": [
"import azureml.core\n",
"\n",
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,6 @@ dependencies:
- azureml-pipeline
- pytorch-transformers==1.0.0
- spacy==2.1.8
- onnxruntime==1.0.0
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz

channels:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,6 @@ dependencies:
- azureml-pipeline
- pytorch-transformers==1.0.0
- spacy==2.1.8
- onnxruntime==1.0.0
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz

channels:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down Expand Up @@ -194,8 +194,8 @@
" '''\n",
" remove = ('headers', 'footers', 'quotes')\n",
" categories = [\n",
" 'alt.atheism',\n",
" 'talk.religion.misc',\n",
" 'rec.sport.baseball',\n",
" 'rec.sport.hockey',\n",
" 'comp.graphics',\n",
" 'sci.space',\n",
" ]\n",
Expand Down Expand Up @@ -345,7 +345,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You can test the model locally to get a feel of the input/output. This step may require additional package installations such as pytorch."
"You can test the model locally to get a feel of the input/output. When the model contains BERT, this step will require pytorch and pytorch-transformers installed in your local environment. The exact versions of these packages can be found in the **automl_env.yml** file located in the local copy of your MachineLearningNotebooks folder here:\n",
"MachineLearningNotebooks/how-to-use-azureml/automated-machine-learning/automl_env.yml"
]
},
{
Expand Down Expand Up @@ -481,7 +482,7 @@
"source": [
"script_folder = os.path.join(os.getcwd(), 'inference')\n",
"os.makedirs(script_folder, exist_ok=True)\n",
"shutil.copy2('infer.py', script_folder)"
"shutil.copy('infer.py', script_folder)"
]
},
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@ dependencies:
- azureml-train-automl
- azureml-widgets
- matplotlib
- azurmel-train
- https://download.pytorch.org/whl/cpu/torch-1.1.0-cp35-cp35m-win_amd64.whl
- sentencepiece==0.1.82
- pytorch-transformers==1.0
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -114,7 +114,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down Expand Up @@ -572,7 +572,7 @@
"\n",
"script_folder = os.path.join(os.getcwd(), 'inference')\n",
"os.makedirs(script_folder, exist_ok=True)\n",
"shutil.copy2('infer.py', script_folder)"
"shutil.copy('infer.py', script_folder)"
]
},
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down Expand Up @@ -453,8 +453,8 @@
"\n",
"script_folder = os.path.join(os.getcwd(), 'forecast')\n",
"os.makedirs(script_folder, exist_ok=True)\n",
"shutil.copy2('forecasting_script.py', script_folder)\n",
"shutil.copy2('forecasting_helper.py', script_folder)"
"shutil.copy('forecasting_script.py', script_folder)\n",
"shutil.copy('forecasting_helper.py', script_folder)"
]
},
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down Expand Up @@ -355,9 +355,24 @@
" label_column_name=target_label,\n",
" **time_series_settings)\n",
"\n",
"remote_run = experiment.submit(automl_config, show_output=False)\n",
"remote_run.wait_for_completion()\n",
"\n",
"remote_run = experiment.submit(automl_config, show_output=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"remote_run.wait_for_completion()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Retrieve the best model to use it further.\n",
"_, fitted_model = remote_run.get_output()"
]
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,8 @@
"\n",
"from azureml.core.workspace import Workspace\n",
"from azureml.core.experiment import Experiment\n",
"from azureml.train.automl import AutoMLConfig"
"from azureml.train.automl import AutoMLConfig\n",
"from azureml.automl.core.featurization import FeaturizationConfig"
]
},
{
Expand All @@ -81,7 +82,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down Expand Up @@ -318,6 +319,36 @@
"target_column_name = 'Quantity'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Customization\n",
"\n",
"The featurization customization in forecasting is an advanced feature in AutoML which allows our customers to change the default forecasting featurization behaviors and column types through `FeaturizationConfig`. The supported scenarios include,\n",
"1. Column purposes update: Override feature type for the specified column. Currently supports DateTime, Categorical and Numeric. This customization can be used in the scenario that the type of the column cannot correctly reflect its purpose. Some numerical columns, for instance, can be treated as Categorical columns which need to be converted to categorical while some can be treated as epoch timestamp which need to be converted to datetime. To tell our SDK to correctly preprocess these columns, a configuration need to be add with the columns and their desired types.\n",
"2. Transformer parameters update: Currently supports parameter change for Imputer only. User can customize imputation methods, the supported methods are constant for target data and mean, median, most frequent and constant for training data. This customization can be used for the scenario that our customers know which imputation methods fit best to the input data. For instance, some datasets use NaN to represent 0 which the correct behavior should impute all the missing value with 0. To achieve this behavior, these columns need to be configured as constant imputation with `fill_value` 0.\n",
"3. Drop columns: Columns to drop from being featurized. These usually are the columns which are leaky or the columns contain no useful data.\n",
"\n",
"This step requires an Enterprise workspace to gain access to this feature. To learn more about creating an Enterprise workspace or upgrading to an Enterprise workspace from the Azure portal, please visit our [Workspace page.](https://docs.microsoft.com/azure/machine-learning/service/concept-workspace#upgrade)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"featurization_config = FeaturizationConfig()\n",
"featurization_config.drop_columns = ['logQuantity'] # 'logQuantity' is a leaky feature, so we remove it.\n",
"# Force the CPWVOL5 feature to be numeric type.\n",
"featurization_config.add_column_purpose('CPWVOL5', 'Numeric')\n",
"# Fill missing values in the target column, Quantity, with zeros.\n",
"featurization_config.add_transformer_params('Imputer', ['Quantity'], {\"strategy\": \"constant\", \"fill_value\": 0})\n",
"# Fill missing values in the INCOME column with median value.\n",
"featurization_config.add_transformer_params('Imputer', ['INCOME'], {\"strategy\": \"median\"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand Down Expand Up @@ -349,8 +380,8 @@
"|**debug_log**|Log file path for writing debugging information|\n",
"|**time_column_name**|Name of the datetime column in the input data|\n",
"|**grain_column_names**|Name(s) of the columns defining individual series in the input data|\n",
"|**drop_column_names**|Name(s) of columns to drop prior to modeling|\n",
"|**max_horizon**|Maximum desired forecast horizon in units of time-series frequency|"
"|**max_horizon**|Maximum desired forecast horizon in units of time-series frequency|\n",
"|**featurization**| 'auto' / 'off' / FeaturizationConfig Indicator for whether featurization step should be done automatically or not, or whether customized featurization should be used. Setting this enables AutoML to perform featurization on the input to handle *missing data*, and to perform some common *feature extraction*.|"
]
},
{
Expand All @@ -362,7 +393,6 @@
"time_series_settings = {\n",
" 'time_column_name': time_column_name,\n",
" 'grain_column_names': grain_column_names,\n",
" 'drop_column_names': ['logQuantity'], # 'logQuantity' is a leaky feature, so we remove it.\n",
" 'max_horizon': n_test_periods\n",
"}\n",
"\n",
Expand All @@ -374,6 +404,7 @@
" label_column_name=target_column_name,\n",
" compute_target=compute_target,\n",
" enable_early_stopping=True,\n",
" featurization=featurization_config,\n",
" n_cross_validations=3,\n",
" verbosity=logging.INFO,\n",
" **time_series_settings)"
Expand Down Expand Up @@ -425,6 +456,33 @@
"model_name = best_run.properties['model_name']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Transparency\n",
"\n",
"View updated featurization summary"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"custom_featurizer = fitted_model.named_steps['timeseriestransformer']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"custom_featurizer.get_featurization_summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down Expand Up @@ -370,7 +370,7 @@
"metadata": {},
"source": [
"#### Initialize the Mimic Explainer for feature importance\n",
"For explaining the AutoML models, use the MimicWrapper from azureml.explain.model package. The MimicWrapper can be initialized with fields in automl_explainer_setup_obj, your workspace and a LightGBM model which acts as a surrogate model to explain the AutoML model (fitted_model here). The MimicWrapper also takes the automl_run object where engineered explanations will be uploaded."
"For explaining the AutoML models, use the MimicWrapper from azureml.explain.model package. The MimicWrapper can be initialized with fields in automl_explainer_setup_obj, your workspace and a surrogate model to explain the AutoML model (fitted_model here). The MimicWrapper also takes the automl_run object where engineered explanations will be uploaded."
]
},
{
Expand All @@ -379,13 +379,14 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.explain.model.mimic.models.lightgbm_model import LGBMExplainableModel\n",
"from azureml.explain.model.mimic_wrapper import MimicWrapper\n",
"explainer = MimicWrapper(ws, automl_explainer_setup_obj.automl_estimator, LGBMExplainableModel, \n",
"explainer = MimicWrapper(ws, automl_explainer_setup_obj.automl_estimator,\n",
" explainable_model=automl_explainer_setup_obj.surrogate_model, \n",
" init_dataset=automl_explainer_setup_obj.X_transform, run=automl_run,\n",
" features=automl_explainer_setup_obj.engineered_feature_names, \n",
" feature_maps=[automl_explainer_setup_obj.feature_map],\n",
" classes=automl_explainer_setup_obj.classes)"
" classes=automl_explainer_setup_obj.classes,\n",
" explainer_kwargs=automl_explainer_setup_obj.surrogate_model_params)"
]
},
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@
from azureml.explain.model.mimic.models.lightgbm_model import LGBMExplainableModel
from azureml.explain.model.mimic_wrapper import MimicWrapper
from automl.client.core.common.constants import MODEL_PATH
from azureml.automl.core.shared.constants import MODEL_EXPLANATION_TAG
from azureml.explain.model.scoring.scoring_explainer import TreeScoringExplainer, save


Expand Down Expand Up @@ -69,9 +68,6 @@
raw_feature_names=automl_explainer_setup_obj.raw_feature_names,
eval_dataset=automl_explainer_setup_obj.X_test_transform)

# Set tag that explanations completed
automl_run.tag(MODEL_EXPLANATION_TAG, 'True')

print("Engineered and raw explanations computed successfully")

# Initialize the ScoringExplainer
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.4.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -696,6 +696,7 @@
"1. [Save model explanations via Azure Machine Learning Run History](../run-history/save-retrieve-explanations-run-history.ipynb)\n",
"1. Inferencing time: deploy a classification model and explainer:\n",
" 1. [Deploy a locally-trained model and explainer](../scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
" 1. [Deploy a locally-trained keras model and explainer](../scoring-time/train-explain-model-keras-locally-and-deploy.ipynb)\n",
" 1. [Deploy a remotely-trained model and explainer](../scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
]
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -591,6 +591,7 @@
"1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../remote-explanation/explain-model-on-amlcompute.ipynb)\n",
"1. Inferencing time: deploy a classification model and explainer:\n",
" 1. [Deploy a locally-trained model and explainer](../scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
" 1. [Deploy a locally-trained keras model and explainer](../scoring-time/train-explain-model-keras-locally-and-deploy.ipynb)\n",
" 1. [Deploy a remotely-trained model and explainer](../scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
]
},
Expand Down
Loading

0 comments on commit fd2b09e

Please sign in to comment.