nandajavarma/document-similarity

Install the required modules

pip install -r requirements.txt

Create the (title, vector) pair for each PDF

python create_vector.py pdf1 pdf2
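
The README does not say what the vector contains. As a minimal sketch, assuming the vector is a plain term-count (bag-of-words) vector and that the PDF text has already been extracted to a string, a (title, vector) pair could be built as follows. The function name `title_and_vector` is hypothetical and is not taken from create_vector.py.

```python
import re
from collections import Counter


def title_and_vector(title, text):
    """Return (title, term-count vector) for already-extracted document text.

    Assumption: the vector is a bag-of-words count; the real script may differ.
    """
    # Lowercase and keep alphanumeric runs as tokens.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return title, Counter(tokens)


if __name__ == "__main__":
    title, vector = title_and_vector(
        "pdf1", "LSH finds similar documents; similar documents collide."
    )
    print(title, dict(vector))
```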

To check the similarity of documents

python similarity.py pdf1 pdf2 pdf3 [pdf4 pdf5 ...]

The result shows how similar pdf1 is to each of the other PDFs
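
The comparison of the first document against the rest can be sketched with MinHash, a signature scheme commonly used with LSH. This is an illustration under stated assumptions, not the code in similarity.py: the shingle size, the signature length `NUM_HASHES`, and all function names are hypothetical, and the input is plain text rather than PDF files.

```python
import hashlib
import re

NUM_HASHES = 64  # assumed signature length; more hashes give a tighter estimate


def shingles(text, k=3):
    """Return the set of k-word shingles in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def minhash_signature(items):
    """Return one minimum hash value per salted hash function."""
    signature = []
    for seed in range(NUM_HASHES):
        salt = seed.to_bytes(8, "big")  # a distinct salt gives a distinct hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for s in items
        ))
    return signature


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, which estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def compare_first_to_rest(texts):
    """Score the first text against every other text, in order."""
    signatures = [minhash_signature(shingles(t)) for t in texts]
    return [estimated_similarity(signatures[0], s) for s in signatures[1:]]


if __name__ == "__main__":
    docs = [
        "locality sensitive hashing groups similar documents together",
        "locality sensitive hashing groups similar documents together",
        "a recipe for baking sourdough bread at home overnight",
    ]
    print(compare_first_to_rest(docs))
```

A full LSH index would additionally split each signature into bands and bucket documents by band, so that only likely matches are compared; the sketch above compares signatures directly, which is enough for a handful of files.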

Document similarity learning experiment with LSH
