ASA is an app that takes a CSV file from the user as input and assigns a sentiment label to each row. It then returns an output CSV file with a new column containing the sentiment labels.
It uses two models to assign the sentiment labels: a DistilBERT model that I fine-tuned on the IMDB dataset, and the VADER sentiment analyzer.
Which model to choose depends on the number of rows in the CSV file, since using DistilBERT on larger files significantly increases the runtime.
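As a rough illustration of that trade-off, the sketch below counts the data rows in a CSV and suggests a model. The function names and the `ROW_THRESHOLD` value are hypothetical; this README does not give a specific cutoff, so pick one that matches the runtime you can tolerate.

```python
import csv
import io

# Hypothetical cutoff: the README does not state a specific number.
ROW_THRESHOLD = 5000


def count_rows(csv_text: str) -> int:
    """Count data rows (excluding the header line) in CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header, if there is one
    return sum(1 for _ in reader)


def choose_model(n_rows: int, threshold: int = ROW_THRESHOLD) -> str:
    """Suggest DistilBERT for smaller files and VADER for larger ones."""
    return "distilbert" if n_rows <= threshold else "vader"


sample = "id,review\n1,Great movie\n2,Too long\n"
print(choose_model(count_rows(sample)))
```

The threshold is passed as a parameter so it can be tuned without touching the function body.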
- First install PyTorch, which is needed to use the DistilBERT model.
- Then install the requirements using
pip install -r requirements.txt
- Finally, run the Python file download_packages.py. It will download the needed NLTK packages and the DistilBERT model from my Google Drive.
You can start the app by running the app.py file.
On the first page, upload the CSV file, provide the name of the column whose sentiment you want analysed, and then click on submit.
These are the last few columns of the input CSV file that I've chosen for this example.
After some time, a new page will open. Here you can press the Download button to download the output CSV file, which will be named {input_filename}_output.csv. If you scroll down, you can see a count plot, a pie chart, and two word clouds for the negative and positive labels.
The resulting CSV file now has an additional column named 'sentiment'.
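The transformation from input to output file can be sketched with the standard library alone. This is a minimal illustration, not the app's actual code: `output_name`, `add_sentiment_column`, and `toy_label` are hypothetical names, `toy_label` is a placeholder standing in for DistilBERT or VADER, and stripping the extension before appending `_output.csv` is an assumption about how `{input_filename}` is interpreted.

```python
import csv
import io
from pathlib import Path
from typing import Callable


def output_name(input_filename: str) -> str:
    """Build '{input_filename}_output.csv' (assumes the extension is dropped)."""
    return f"{Path(input_filename).stem}_output.csv"


def add_sentiment_column(
    csv_text: str, column: str, label_fn: Callable[[str], str]
) -> str:
    """Return CSV text with a 'sentiment' column appended to every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        raise KeyError(f"column {column!r} not found in the CSV header")

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames + ["sentiment"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row["sentiment"] = label_fn(row[column])
        writer.writerow(row)
    return out.getvalue()


def toy_label(text: str) -> str:
    """Placeholder classifier; a real model would be called here instead."""
    return "positive" if "good" in text.lower() else "negative"


print(output_name("reviews.csv"))
print(add_sentiment_column("id,review\n1,Good film\n2,Boring\n", "review", toy_label))
```

Passing the classifier in as `label_fn` keeps the CSV handling independent of which model produces the labels.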
There are two main files: app.py and functions.py.
The app.py file contains the code for the Flask app itself, while the functions.py file contains all the required functions for loading the DistilBERT model and the VADER sentiment analyzer, reading the CSV file, and assigning the sentiment labels to create the output CSV file.
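For the VADER path, one plausible way to turn the analyzer's output into a label is sketched below. VADER reports a compound score between -1 and 1; mapping it to two classes with a cutoff of 0 is an assumption made for illustration, and `label_from_compound` is a hypothetical helper, not necessarily what functions.py does.

```python
def label_from_compound(compound: float, cutoff: float = 0.0) -> str:
    """Map a VADER compound score in [-1, 1] to a two-class label.

    With NLTK, the score would typically come from
    SentimentIntensityAnalyzer().polarity_scores(text)["compound"].
    The cutoff of 0.0 is an assumed default, not taken from the app.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    return "positive" if compound >= cutoff else "negative"


for score in (0.6, -0.3):
    print(score, label_from_compound(score))
```

A stricter setup could add a neutral band around zero, but the plots described above mention only negative and positive labels.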