I have uploaded the file "Assignment_1_231110004.ipynb".
Three files are required to be in the same directory as the notebook, namely:
- corpus.txt :- this is the 100 MB data file
- text.txt :- the 25 sentences that were on https://bangla.iitk.ac.in/cs689/main
- truth.txt :- these are the word groups which I created for the lines of the text.txt file
The output files will also be written to the same directory.
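Before running the notebook, it may help to confirm that the three input files are actually present. The snippet below is only a small sketch of such a check and is not part of the notebook; the helper name `missing_inputs` is made up for illustration, and it assumes the files sit in the current working directory unless another directory is passed in.

```python
from pathlib import Path

# The three input files listed above.
REQUIRED_FILES = ["corpus.txt", "text.txt", "truth.txt"]


def missing_inputs(directory="."):
    """Return the required file names that are not present in `directory`.

    Hypothetical helper for illustration; the notebook itself may not do this check.
    """
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


missing = missing_inputs()
if missing:
    print("Missing input files:", ", ".join(missing))
else:
    print("All input files found.")
```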
I ran the code on 20000 characters only, since my laptop is not very powerful and it crashed multiple times.
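One way to limit a run to a fixed number of characters is to read only a prefix of the data file instead of loading all of it. This is a rough sketch and may differ from what the notebook does; the function name `read_prefix` is hypothetical, and it assumes the 20000 characters are taken from the start of corpus.txt and that the file is UTF-8 encoded.

```python
def read_prefix(path, max_chars=20000, encoding="utf-8"):
    """Read at most `max_chars` characters from the start of a text file.

    Assumptions (not confirmed by the notebook): the prefix is taken from
    the beginning of the file and the file is UTF-8 encoded.
    """
    with open(path, "r", encoding=encoding) as f:
        # In text mode, read(n) returns at most n characters (not bytes),
        # so the rest of the 100 MB file is never loaded into memory.
        return f.read(max_chars)
```

For example, `read_prefix("corpus.txt")` would return the first 20000 characters, and a larger `max_chars` can be passed on a machine with more memory.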