sailfish009/applied-ml

 
 

Applied ML · MLOps · Production
Join 20K+ developers in learning how to responsibly deliver value with applied ML.

     

If you need to refresh yourself on ML algorithms, check out our ML Foundations repository (🔥  Among the top ML repositories on GitHub)


📦  Product            🔢  Data                     📈  Modeling
Objective              Annotation                   Baselines
Solution               Exploratory data analysis    Experiment tracking
Evaluation             Splitting                    Optimization
Iteration              Preprocessing

📝  Scripting          (cont.)          📦  Application    ✅  Testing
Organization           Styling          CLI                Code
Packaging              Makefile         API                Data
Documentation          Logging                             Models

⏰  Version control    🚀  Production    (cont.)
Git                    Dashboard         Serving
Precommit              Docker            Feature stores
Versioning             CI/CD             Workflows
                       Monitoring        Active learning

📆  new lesson every week!
Subscribe for our monthly updates on new content.


Set up

export venv_name="venv"
make venv name=${venv_name}
source ${venv_name}/bin/activate
make assets

Start JupyterLab

python -m ipykernel install --user --name=tagifai
jupyter labextension install @jupyter-widgets/jupyterlab-manager
jupyter labextension install @jupyterlab/toc
jupyter lab

You can also run all notebooks on Google Colab.

Directory structure

app/
├── api.py        - FastAPI app
├── cli.py        - CLI app
└── schemas.py    - API model schemas
tagifai/
├── config.py     - configuration setup
├── data.py       - data processing components
├── eval.py       - evaluation components
├── main.py       - training/optimization pipelines
├── models.py     - model architectures
├── predict.py    - inference components
├── train.py      - training components
└── utils.py      - supplementary utilities

Documentation can be found here.

Workflow

  1. Prepare environment
export venv_name="venv"
make venv name=${venv_name}
source ${venv_name}/bin/activate
make install-dev
  2. Prepare assets (loading data, previous runs, etc.)
make assets
  3. Optimize using distributions specified in tagifai.main.objective. This also writes the best model's args to config/args.json
tagifai optimize --args-fp config/args.json --study-name optimization --num-trials 100

We'll cover how to train using compute instances on the cloud from Amazon Web Services (AWS) or Google Cloud Platform (GCP) in later lessons. In the meantime, if you don't have access to GPUs, check out the optimize.ipynb notebook for how to train on Colab and transfer to local. We essentially run optimization, then train the best model, and download and transfer its arguments and artifacts. Once we have them on our local machine, we can run tagifai set-artifact-metadata to match all metadata as if the run had happened on our machine.

  4. Train a model (and save all its artifacts) using args from config/args.json
tagifai train-model --args-fp config/args.json --experiment-name best --run-name model
  5. Predict tags for an input sentence. It'll use the best model saved from train-model but you can also specify a run-id to choose a specific model.
tagifai predict-tags --text "Transfer learning with BERT"
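
The optimize command above searches over hyperparameters and writes the best arguments to a JSON file, which the train command then reads. The sketch below illustrates that hand-off with a plain random search using only the standard library. It is not tagifai's implementation: `sample_args`, `toy_objective`, `optimize` and the argument names (`lr`, `dropout`, `hidden_dim`) are hypothetical stand-ins, and the real search space lives in tagifai.main.objective.

```python
import json
import random
import tempfile
from pathlib import Path


def sample_args(rng):
    """Draw one candidate set of arguments (hypothetical search space)."""
    return {
        "lr": 10 ** rng.uniform(-4, -2),  # log-uniform learning rate
        "dropout": rng.uniform(0.1, 0.5),
        "hidden_dim": rng.choice([64, 128, 256]),
    }


def toy_objective(args):
    """Stand-in for a real train + validate run; lower is better."""
    return (args["lr"] - 1e-3) ** 2 + (args["dropout"] - 0.3) ** 2


def optimize(args_fp, num_trials=100, seed=0):
    """Random search that writes the best trial's args to args_fp."""
    rng = random.Random(seed)
    best_args, best_loss = None, float("inf")
    for _ in range(num_trials):
        args = sample_args(rng)
        loss = toy_objective(args)
        if loss < best_loss:
            best_args, best_loss = args, loss
    # Persist the winning arguments so a later training step can load them.
    Path(args_fp).parent.mkdir(parents=True, exist_ok=True)
    Path(args_fp).write_text(json.dumps(best_args, indent=2))
    return best_args, best_loss


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        args_fp = Path(tmp) / "config" / "args.json"
        best_args, best_loss = optimize(args_fp, num_trials=100)
        # A training step would read the same file back.
        print(json.loads(args_fp.read_text()), best_loss)
```

Writing the arguments to a file rather than passing them in memory is what lets optimization run on one machine (for example Colab) and training on another.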

API

uvicorn app.api:app --host 0.0.0.0 --port 5000 --reload --reload-dir tagifai --reload-dir app # start API (make app)
gunicorn -c config/gunicorn.py -k uvicorn.workers.UvicornWorker app.api:app  # gunicorn (make app-prod)
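
The production command above loads its settings from config/gunicorn.py, which Gunicorn reads as an ordinary Python module. The repository's actual file is not shown here, so the following is only a sketch of what such a file could contain. `bind`, `workers`, `timeout` and `loglevel` are real Gunicorn setting names, but every value is an assumption, including port 5000, which is copied from the uvicorn command.

```python
# Hypothetical config/gunicorn.py: values are illustrative assumptions,
# not the repository's actual settings.
import multiprocessing

# Address and port to listen on (port assumed to match the uvicorn command).
bind = "0.0.0.0:5000"

# Gunicorn's docs suggest (2 x cores) + 1 as a starting point for workers.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 60

# Log verbosity: "debug", "info", "warning", "error" or "critical".
loglevel = "info"
```

The worker class is left out of the sketch because the command already passes it with -k uvicorn.workers.UvicornWorker.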

MLflow

mlflow server -h 0.0.0.0 -p 5000 --backend-store-uri assets/experiments/

MkDocs

python -m mkdocs serve

Testing

make test
make test-non-training
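
The two targets above suggest that tests which train a model are kept separate from faster ones. How the Makefile does this is not shown here. One common approach is to tag the slow tests and deselect them on demand, and pytest markers are a frequent way to do it. The minimal sketch below shows the same idea with the standard library's unittest. The `SKIP_TRAINING` environment variable, `clean_text` and both test classes are hypothetical.

```python
import os
import unittest

# Hypothetical switch: set SKIP_TRAINING=1 to leave out slow training tests.
SKIP_TRAINING = os.environ.get("SKIP_TRAINING") == "1"


def clean_text(text):
    """Toy preprocessing helper used only for this example."""
    return " ".join(text.lower().split())


class TestData(unittest.TestCase):
    def test_clean_text(self):
        self.assertEqual(clean_text("  Transfer   LEARNING "), "transfer learning")


class TestTraining(unittest.TestCase):
    @unittest.skipIf(SKIP_TRAINING, "training tests disabled")
    def test_loss_decreases(self):
        # Stand-in for a real training loop: gradient descent on w^2.
        w, losses = 1.0, []
        for _ in range(5):
            losses.append(w ** 2)
            w -= 0.1 * 2 * w
        self.assertLess(losses[-1], losses[0])
```

Saved as a test module, this would run in full with python -m unittest, and without the training test when SKIP_TRAINING=1 is set in the environment.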

FAQ

Why is this free?

While this content is for everyone, it's especially targeted towards people who don't have as much opportunity to learn. I firmly believe that creativity and intelligence are randomly distributed but opportunity is siloed. I want to enable more people to create and contribute to innovation.

Who is the author?

  • I've deployed large-scale ML systems at Apple, as well as smaller systems with constraints at startups, and want to share the common principles I've learned along the way.
  • I created Made With ML so that the community can explore, learn and build ML. Along the way, I learned how to build it into an end-to-end product that's currently used by over 20K monthly active users.
  • Connect with me on Twitter and LinkedIn

To cite this course, please use:
@article{madewithml,
    title  = "Applied ML - Made With ML",
    author = "Goku Mohandas",
    url    = "https://madewithml.com/courses/applied-ml/",
    year   = "2021",
}
