A Streamlit data app for scraping texts from URLs and training a word2vec model

wolfgangB33r/ai-text-model-studio

Web Text Scraping and AI Model Training Studio

A Streamlit data app for scraping texts from URLs and training a word2vec model. Read the companion blog.

Features

  • Automatic scraping of texts from given web URLs
  • Extraction of sentences and their words
  • Cleanup of scraped sentences
  • Download of scraped and cleaned sentences as a JSON file
  • Training of a word2vec model
  • Persistence of the trained model, offered as a download
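
The feature list above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the app's actual code: the function names (`html_to_text`, `fetch_text`, `to_sentences`, `train_word2vec`), the regex-based sentence splitting, and the output file name are all hypothetical. It also assumes gensim as the word2vec library, which this README does not state; the import is kept inside `train_word2vec` so the rest runs with the standard library alone.

```python
import json
import re
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Reduce an HTML document to its text content."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_text(url, timeout=10):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))


def to_sentences(text):
    """Split text into sentences, then clean each into lowercase word tokens."""
    sentences = []
    for raw in re.split(r"[.!?]+", text):
        words = re.findall(r"[a-z0-9']+", raw.lower())
        if len(words) >= 2:  # drop empty and one-word fragments
            sentences.append(words)
    return sentences


def train_word2vec(sentences, path="word2vec.model"):
    """Train and persist a model; assumes gensim 4.x is installed."""
    from gensim.models import Word2Vec  # third-party, imported lazily

    model = Word2Vec(sentences=sentences, vector_size=100, window=5,
                     min_count=1, workers=1)
    model.save(path)
    return model


if __name__ == "__main__":
    page = "<html><body><p>Streamlit makes data apps. It is simple to use!</p></body></html>"
    cleaned = to_sentences(html_to_text(page))
    # The cleaned sentences serialize directly to JSON for download.
    print(json.dumps(cleaned, indent=2))
```

To scrape a live page, `to_sentences(fetch_text(url))` yields the same structure, and passing the result to `train_word2vec` writes the model to disk. A real corpus needs many more sentences than this demo, and usually a higher `min_count`, to give useful vectors.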

Build the Docker container

docker build --tag text-studio .
docker run -it text-studio /bin/sh
docker run -p 8501:8501 text-studio
