The aim of this work is to find similar texts given an input text. Text vectorization is done using CountVectorizer.

PriyankaSett/news_recommendation

News Recommendation

The aim of this project is to predict similar text given an input text.

This project works with textual data. Basic text preprocessing removes URLs, emoticons, punctuation, and digits. Text vectorization is performed using CountVectorizer from the scikit-learn library.
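The preprocessing steps described above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the project's actual code: the function name `clean_text`, the regular expressions, and the emoji code-point ranges are all assumptions chosen for illustration. Lowercasing is also an added assumption. The cleaned strings would then be passed to CountVectorizer.

```python
import re
import string

# Assumed patterns; the project may use different ones.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Common emoji and pictograph code-point ranges (not exhaustive).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# A few simple ASCII emoticons such as :) ;-) :D =P
ASCII_EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp](?!\w)")
# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(text):
    """Remove URLs, emoticons, punctuation and digits from a string."""
    text = URL_RE.sub(" ", text)              # URLs first, before punctuation is stripped
    text = EMOJI_RE.sub(" ", text)            # Unicode emoji
    text = ASCII_EMOTICON_RE.sub(" ", text)   # ASCII emoticons
    text = text.translate(PUNCT_TABLE)        # punctuation
    text = re.sub(r"\d+", " ", text)          # digits
    # Lowercase and collapse whitespace (an assumption, not stated in the README).
    return " ".join(text.lower().split())


if __name__ == "__main__":
    print(clean_text("Breaking: markets up 3% today!!! 😀 :D read https://example.com/a?b=1"))
```

URLs are removed before punctuation on purpose: stripping punctuation first would break a URL into fragments that the URL pattern can no longer match.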

Next, the recommendations are built using different 'Similarity Algorithms'. The distance between two texts can be measured in several ways, such as Hamming distance, cosine similarity, Euclidean distance, the Jaccard coefficient, and Manhattan distance. This project uses only the measures listed above to find the similarity between texts.
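To make the measures listed above concrete, here is a small sketch that computes each of them on term-count vectors. It is an illustration under stated assumptions, not the project's implementation: `count_vectors` is a plain-Python stand-in for the count vectors that CountVectorizer would produce, all function names are hypothetical, and the Hamming and Jaccard definitions used here (positions where counts differ, and shared terms over all terms) are one common choice among several.

```python
import math
from collections import Counter


def count_vectors(doc_a, doc_b):
    """Aligned term-count vectors over the sorted, shared vocabulary.

    Stand-in for CountVectorizer, kept dependency-free for illustration.
    """
    counts_a, counts_b = Counter(doc_a.split()), Counter(doc_b.split())
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[t] for t in vocab], [counts_b[t] for t in vocab]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def hamming_distance(u, v):
    # Assumed definition: number of vocabulary positions where the counts differ.
    return sum(a != b for a, b in zip(u, v))


def jaccard_coefficient(u, v):
    # Assumed definition: terms present in both texts / terms present in either.
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0


def rank_by_cosine(query, corpus):
    """Return the corpus texts ordered from most to least similar to the query."""
    scored = [(cosine_similarity(*count_vectors(query, doc)), doc) for doc in corpus]
    return [doc for _, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)]


if __name__ == "__main__":
    u, v = count_vectors("stocks rise as markets rally",
                         "markets rally as oil prices rise")
    print(cosine_similarity(u, v), euclidean_distance(u, v),
          manhattan_distance(u, v), hamming_distance(u, v),
          jaccard_coefficient(u, v))
```

Note that cosine similarity and the Jaccard coefficient grow as texts become more alike, while the Euclidean, Manhattan, and Hamming distances shrink, so a ranking by distance has to sort in ascending rather than descending order.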
