Build a dataset containing the most relevant places from the website AtlasObscura.com, using scraping functions from the BeautifulSoup Python module.
Build a search engine on the complete dataset.
Execute the query.
Define a new score for your search engine.
Visualize the most relevant places in the dataset using the new score defined.
Answer a theoretical question on sorting algorithms.
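
As a rough illustration of the search engine step, the sketch below builds a vocabulary and an inverted index over a few made-up place descriptions and runs a conjunctive (AND) query against them. It is only a minimal sketch under stated assumptions: the function names `tokenize`, `build_index`, and `search`, the sample documents, and the simple regex tokenizer are all hypothetical and are not taken from main.ipynb, which may preprocess and rank text differently.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Return (vocabulary, inverted_index) for a {doc_id: text} mapping.

    vocabulary maps each term to an integer term id; inverted_index maps
    each term id to the sorted list of doc ids that contain the term.
    """
    vocabulary = {}
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            # Assign the next free id the first time a term is seen.
            term_id = vocabulary.setdefault(token, len(vocabulary))
            postings[term_id].add(doc_id)
    inverted_index = {tid: sorted(ids) for tid, ids in postings.items()}
    return vocabulary, inverted_index


def search(query, vocabulary, inverted_index):
    """Conjunctive query: return the doc ids that contain every query term."""
    result = None
    for token in tokenize(query):
        term_id = vocabulary.get(token)
        if term_id is None:
            return []  # an unknown term cannot match any document
        docs = set(inverted_index[term_id])
        result = docs if result is None else result & docs
    return sorted(result) if result else []


# Hypothetical sample data, not real Atlas Obscura entries.
documents = {
    1: "Abandoned subway station in New York",
    2: "Hidden garden in New York",
    3: "Abandoned castle in Scotland",
}
vocabulary, inverted_index = build_index(documents)
matches = search("abandoned new york", vocabulary, inverted_index)
```

Here `matches` holds the ids of the documents that contain all three query terms. A scored search engine would typically replace the plain set intersection with a ranking step, such as cosine similarity over tf-idf weights.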
- main.ipynb: main notebook, covering parts 1, 2, 3, 4, and 7
- CommandLine.sh: shell script containing the command-line solution
- RankingList.txt: output of the sorting query
- TSV_FILES.zip: one .tsv file for each place on Atlas Obscura, ordered by page; output of part 1.2
- inverted_index.pkl, vocabulary.pkl: files needed for the search engine
- map.png: screenshot of the scatter_mapbox obtained in part 4
- merged.tsv: .tsv file containing the entire dataset
- places_url.txt: text file with all the URLs; output of part 1.1
- web_scraping_functions.py: web scraping functions used for part 1.3
- sorting.py: function used for part 7
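
The two `.pkl` files can presumably be read back with Python's standard `pickle` module. The sketch below shows a save and load round trip on toy data, assuming the files are ordinary pickle dumps of Python objects; the helper names `save_pickle` and `load_pickle` and the dictionary layout are hypothetical, and the real files may store a different structure.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickle(obj, path):
    """Serialize obj to path in binary pickle format."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_pickle(path):
    """Load and return the object stored in a pickle file."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Toy stand-ins for the real structures (assumed layout, not the actual data).
toy_vocabulary = {"abandoned": 0, "castle": 1}
toy_inverted_index = {0: [1, 3], 1: [3]}

# Write to a temporary directory so the real project files are untouched.
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = Path(tmp) / "vocabulary.pkl"
    index_path = Path(tmp) / "inverted_index.pkl"
    save_pickle(toy_vocabulary, vocab_path)
    save_pickle(toy_inverted_index, index_path)
    loaded_vocabulary = load_pickle(vocab_path)
    loaded_index = load_pickle(index_path)
```

Since unpickling can execute arbitrary code, only load `.pkl` files that come from a trusted source.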