pip3 install -r requirements.txt
make mlflow_up
Split the raw data into train/val/test folders and tag the dataset version as v1.0
python3 src/data_processing.py --version v1.0
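The split step can be sketched roughly as below. This is a minimal illustration, not the actual contents of src/data_processing.py; the function name, ratios, and seed are assumptions.

```python
import random

def split_dataset(files, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle file paths deterministically and split them into
    train/val/test lists according to `ratios` (assumed values)."""
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    files = sorted(files)                # stable starting order
    random.Random(seed).shuffle(files)   # reproducible shuffle
    n_train = int(len(files) * ratios[0])
    n_val = int(len(files) * ratios[1])
    return {
        "train": files[:n_train],
        "val": files[n_train:n_train + n_val],
        "test": files[n_train + n_val:],
    }

splits = split_dataset([f"img_{i}.jpg" for i in range(100)])
```

The tagged version (v1.0) would then name the output folder holding these three splits.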
Train the "resnet_18" model on data version v1.0. The model will be logged to MLflow.
python3 src/model_training.py --data_version v1.0 --model_name resnet_18 --device cpu
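The registry step below compares runs on a "best_val_loss" metric, so the training script presumably tracks the running-best validation loss across epochs. A sketch of that bookkeeping (the helper name and loss values are invented for illustration):

```python
def track_best_val_loss(val_losses):
    """Return the running minimum of the per-epoch validation losses,
    i.e. the value a trainer would log as best_val_loss each epoch."""
    best = float("inf")
    history = []
    for loss in val_losses:
        best = min(best, loss)
        history.append(best)
    return history

# Example: validation loss dips, regresses, then improves again.
history = track_best_val_loss([0.9, 0.7, 0.8, 0.5])
```

In the real script each entry of `history` would be logged to the MLflow run rather than collected in a list.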
Register the trained model in MLflow by comparing the "val_loss" metric, tag it with the alias "Production", and save the config file to /src/config/raw_data.json
python3 src/model_registry.py --best_metric best_val_loss --model_alias Production --config_name raw_data
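Conceptually, the registry script searches the logged runs and promotes the one with the lowest value of the chosen metric. A minimal sketch of that selection logic; the run dictionaries loosely mimic MLflow search results, and all names here are assumptions:

```python
def pick_best_run(runs, metric="best_val_loss"):
    """Return the run with the lowest value of `metric`,
    mirroring how a registry script would pick the candidate
    to tag as Production."""
    return min(runs, key=lambda r: r["metrics"][metric])

runs = [
    {"run_id": "a", "metrics": {"best_val_loss": 0.42}},
    {"run_id": "b", "metrics": {"best_val_loss": 0.37}},
]
best = pick_best_run(runs)
```

The selected run's model would then be registered and given the "Production" alias.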
Retrieve the model stored on the MLflow server by "model_name" and "model_alias", then deploy it behind the serving API
make model_name=resnet_18 model_alias=Production port=5000 serving_up
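Inside the serving container, the name/alias pair presumably resolves to an MLflow model URI of the form `models:/<name>@<alias>` (the URI scheme MLflow 2.x uses for alias-based loading). A small sketch of that resolution, with the helper name as an assumption:

```python
def model_uri(model_name, model_alias):
    """Build the 'models:/<name>@<alias>' URI that MLflow's
    load_model() functions accept for alias-based retrieval."""
    return f"models:/{model_name}@{model_alias}"

uri = model_uri("resnet_18", "Production")
```

The serving process would pass `uri` to something like `mlflow.pyfunc.load_model()` at startup.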
Merge the labeled data from /data_source/collected/ with the raw data and split it into train/val/test folders. Tag the dataset version (and the folder name) as v1.1
python3 src/data_processing.py --merge_collected --version v1.1
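The merge step can be thought of as a deduplicated union of the two labeled file lists before re-splitting. A sketch under that assumption (the function name and file names are illustrative only):

```python
def merge_sources(raw_files, collected_files):
    """Union the raw and newly collected labeled file lists,
    dropping duplicates while keeping a deterministic order."""
    return sorted(set(raw_files) | set(collected_files))

merged = merge_sources(["a.jpg", "b.jpg"], ["b.jpg", "c.jpg"])
```

The merged list would then go through the same train/val/test split as before, under the v1.1 tag.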
Train the model on the new dataset and log it to MLflow.
python3 src/model_training.py --data_version v1.1 --model_name resnet_18 --device cpu
Retrieve the runs trained on dataset version v1.1 and use the metric to choose the best model. Then tag it with the alias "Challenger" and log it to MLflow. Save the config file to /src/config/add_collect.json
python3 src/model_registry.py --filter_string "tags.data_version LIKE 'v1.1'" --best_metric best_val_loss --model_alias Challenger --config_name add_collect
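The `--filter_string` narrows the search to runs tagged with the new data version before the metric comparison. The same logic in plain Python, as a hedged sketch (the run dictionaries and helper name are invented; in MLflow the filtering happens server-side via `search_runs`):

```python
def filter_and_pick(runs, data_version="v1.1", metric="best_val_loss"):
    """Keep only runs whose data_version tag matches, then return
    the one with the lowest value of `metric`."""
    candidates = [r for r in runs
                  if r["tags"].get("data_version") == data_version]
    return min(candidates, key=lambda r: r["metrics"][metric])

runs = [
    {"run_id": "old", "tags": {"data_version": "v1.0"},
     "metrics": {"best_val_loss": 0.30}},
    {"run_id": "new", "tags": {"data_version": "v1.1"},
     "metrics": {"best_val_loss": 0.35}},
]
best = filter_and_pick(runs)
```

Note that the v1.0 run is excluded even though its loss is lower: the Challenger must come from the new dataset.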
Restart the serving container to pull the model whose "model_config" is "add_collect" for the new serving
make serving_down
make model_name=resnet_18 model_alias=Challenger port=5000 serving_up
Bring both the MLflow and serving containers up or down
make all_down
make all_up