Skip to content

Nepali IMDB crawler to extract Nepali movies data and analyze

License

Notifications You must be signed in to change notification settings

NISH1001/nepalimdb

Repository files navigation

nepalimdb

Nepali IMDB crawler to extract Nepali movies data and analyze

Dependencies

The crawler uses BeautifulSoup and requests modules for python3.
Install using requirements.txt as:

pip install -r requirements.txt

Crawler Usage

Run nepalimdb.py script using python:

python nepalimdb.py

Data

The data is dumped as JSON in data/nepali-movies.json.
The data consists of list of json object with each object holding a single movie with information like:

  • imdb_url
  • title
  • year
  • runtime
  • genre
  • rating
  • plot
  • votes

Here's the sample json object:

{
    "imdb_url": "https://www.imdb.com/title/tt6944688/?ref_=adv_li_tt",
    "title": "A Mero Hajur 2",
    "year": 2017,
    "runtime": "138 min",
    "genre": "Drama, Romance",
    "rating": 8.2,
    "plot": "A Man stalks a girl, after while they fall in love, but their relative don't want the girl to be with that man.",
    "votes": 164
},

You can find the data so far: here

Data Analysis

I have done basic analysis in this jupyter notebook.

Contributions

Feel free to use the data and the crawler any way you like. But, if you feel like giving me a credit, mention and this repo.
Pull requests are welcome. Feel free to tweak the code and optimize. You might send a pull request too.

Cheers...

About

Nepali IMDB crawler to extract Nepali movies data and analyze

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published