A carefully curated repository of Data Science literature and Jupyter Notebook "skill-builders".
The DSBC team has spent years developing tools and training materials for Applied Math and Data Science. In this project you will find the following tools.
- "Notes on ..." are cheat sheets on math topics for Machine Learning (e.g. algebra, calculus, probability theory) and on Machine Learning topics (e.g. linear regression, hypothesis testing).
- "Cheat-Sheets" cover various Python toolboxes, e.g. NumPy, Matplotlib, scikit-learn, Keras, etc.
- "10 steps to Data Science" is a series of notebooks to teach you the most common tools used in Data Science.
- "10 steps to Machine Learning" is a series of notebooks to teach you some advanced tools used in Machine Learning.
- "Python in 2 days" is a series of notebooks to get you started in Jupyter notebooks and Python.
- "Machine Learning in 1 day" is a series of notebooks focused on a basic toolbox for Machine Learning.
- "Deep Learning in 1 day" is a series of notebooks focused on advanced CNNs, RNNs, GANs and Transformers.
- "Examples" is a series of notebooks that most folks will find useful for a variety of real-life applications of Data Science.
Our new Hi-Res Machine Learning flow-chart helps you navigate the enormous number of algorithms and select the one you need for your task, based on the data that you have and the desired outcome or story you are trying to tell.
Have you heard about Jupyter Notebooks, but don't know how to get started? Here is a quick tutorial.
- First, navigate to the Anaconda website to download the software and install it on your computer (PC or Mac).
- Now, download this GitHub repo to your computer.
- Un-zip the files to the following Anaconda default directory (for PC users): "C:\Users\(your user name)\Notebooks".
- Open Anaconda, and click on the launch icon for Jupyter. This will open the Jupyter UI in your web browser.
- We recommend that you start with "01_Notebooks.ipynb", but for this demo we click on "02_Python.ipynb".
- Once the notebook has opened in your browser, you can read through it and run each cell (to learn how to do this, go back to "01_Notebooks.ipynb").
- You are now ready to tackle all of our notebooks for your self-paced training. Enjoy! We wish you great success in your journey to becoming a Data Scientist!
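If you want to confirm that the un-zip step put the notebooks where Jupyter will look for them, a small Python check along the following lines may help. This is only a sketch: the `list_notebooks` helper is a hypothetical name introduced here, and the `Notebooks` folder in your home directory is an assumption based on the default path mentioned above, so adjust it to wherever you actually un-zipped the repo.

```python
from pathlib import Path


def list_notebooks(folder):
    """Return the relative paths of all .ipynb files under `folder`, sorted by name.

    Hypothetical helper for illustration; it is not part of this repo.
    """
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.ipynb")
        # Skip the autosave copies that Jupyter keeps in hidden checkpoint folders.
        if ".ipynb_checkpoints" not in p.parts
    )


if __name__ == "__main__":
    # Assumed location: a "Notebooks" folder in your home directory.
    notebooks_dir = Path.home() / "Notebooks"
    for name in list_notebooks(notebooks_dir):
        print(name)
```

If the list comes back empty, the files were most likely un-zipped somewhere else, and the notebooks will not show up in the Jupyter file browser either.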
We use TortoiseSVN for versioning. For the versions available, see the tags on this repository.
This project is licensed under the MIT License - see the LICENSE.md file for details.
This repo was built using material from our private industry and academic experience, as well as material borrowed from:
- UCLA ECE 239AS.
- UPenn CIS 229.
- UPenn CIS 520.
- Stanford CS 229.
- Python Data Science Handbook.
- Machine Learning Mastery.
- Towards Data Science.
- Randy Olson's data analysis and machine learning projects.
- Many thanks to Andreas Mueller for some of his examples in the Machine Learning section. We drew inspiration from several of his excellent examples.
- Many thanks to Kaggle for the datasets.
- Numerous others that we cannot remember.