Text Classification Machine Learning Starter Pack

I have come across many machine learning tutorials that use the 20,000-article dataset, but not many that show how to use your own data.

This starter pack helps you get started with your own data. You can delete the subfolders in categories and add your own training data in their place.
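As a rough illustration of the folder layout this implies, the sketch below assumes one subfolder per category inside categories, each holding plain-text sample files. The helper name count_samples is hypothetical and is not part of this repo; it only uses the Python standard library.

```python
from pathlib import Path


def count_samples(root="categories"):
    """Return {category_name: number_of_files} for each subfolder of root.

    Assumes one subfolder per category; this layout is an assumption
    based on the README, not something read from ml_starter.py.
    """
    counts = {}
    for sub in sorted(Path(root).iterdir()):
        if sub.is_dir():
            # Count only regular files; nested folders are ignored.
            counts[sub.name] = sum(1 for f in sub.iterdir() if f.is_file())
    return counts
```

Calling count_samples() from the main folder after swapping in your own subfolders is a quick way to confirm that every category has samples and that the names match what you expect.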

This starter kit uses scikit-learn (install link) and Jupyter Notebook (install link).

The current sample data set is a collection of verses from the American Standard Bible (public domain text).

Steps to get started

Clone or download this repo.
Run jupyter notebook from the main folder.
In Jupyter Notebook, select New at the top right to create a notebook, then copy and paste the code from ml_starter.py into it.
When ready, replace the subfolders in categories with your own and adjust the categories definition in ml_starter.py to match.
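For orientation, the kind of code these steps lead to can be sketched as follows. This is a minimal sketch, not necessarily what ml_starter.py does: it assumes one subfolder per category under categories, and the choice of TfidfVectorizer with MultinomialNB, as well as the function names train_from_folders and predict_category, are illustrative assumptions.

```python
from sklearn.datasets import load_files
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def train_from_folders(root="categories"):
    """Train a text classifier from a folder with one subfolder per category."""
    # load_files treats each subfolder name as a label and reads every file in it.
    data = load_files(root, encoding="utf-8", decode_error="replace")
    # Assumed model: TF-IDF features fed into a multinomial naive Bayes classifier.
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    model.fit(data.data, data.target)
    return model, data.target_names


def predict_category(model, target_names, text):
    """Return the predicted category name for a single string."""
    return target_names[model.predict([text])[0]]
```

In a notebook cell, model, names = train_from_folders() followed by predict_category(model, names, "some new text") would return one of the subfolder names. Because load_files derives labels from the folder names, this variant needs no separate categories definition.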

Because the sample data set is small (7 categories with 10 samples each), the results are not very accurate.

A larger sample size should improve accuracy.
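One way to check how accuracy changes as you add samples is stratified k-fold cross-validation. The sketch below is an assumption-laden example rather than part of this repo: the function name estimate_accuracy is hypothetical, and the TF-IDF plus naive Bayes model is a stand-in for whatever classifier you actually use.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def estimate_accuracy(texts, labels, folds=5):
    """Return per-fold accuracy from stratified k-fold cross-validation.

    texts is a list of strings and labels a list of category labels of the
    same length. Each category needs at least `folds` samples.
    """
    # Assumed model; swap in the pipeline you actually train.
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    # Stratification keeps the category proportions similar in every fold.
    cv = StratifiedKFold(n_splits=folds, shuffle=True, random_state=0)
    return cross_val_score(model, texts, labels, cv=cv, scoring="accuracy")
```

With 10 samples per category, folds=5 leaves 2 samples per category in each test fold, so the per-fold scores will be noisy; the mean of the returned array is the more useful number to compare before and after adding data.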

djitz/text-classification-ml-starter
