Breaking Bad is an American crime drama television series created and produced by Vince Gilligan.
Plot
Set and filmed in Albuquerque, New Mexico, the series follows Walter White, an underpaid, overqualified, and dispirited high-school chemistry teacher who is struggling with a recent diagnosis of stage-three lung cancer. Walt turns to a life of crime and partners with a former student, Jesse Pinkman, to produce and distribute crystal meth to secure his family's financial future before he dies, while navigating the dangers of the criminal underworld.
Some interesting things
- In the series viewers watch Walt go from protagonist to antagonist.
- Breaking Bad aired on AMC from 2008 to 2013 with five successful seasons and 62 total episodes.
- In 2013, Breaking Bad entered the Guinness World Records as the most critically acclaimed TV show of all time.
- The cold opens became one of the most legendary parts of the show and were works of art in their own right.
- The top-ranked cold open is in season 2, episode 7, which opens with a music video titled “The Ballad of Heisenberg” about Heisenberg's ascension to the top of a drug empire.
- There’s an incredible cast, award-winning script, several spinoffs, multiple languages, and of course, that creative plot.
- Breaking Bad is one of the most successful series of all time, with a legacy that continues to inform new generations of shows.
- With its influential cold opens, incredible writing, sympathetic acting, and compelling plot, the series changed television forever.
- Just take a chance on the pilot and we know you’ll be hooked!
This project was inspired by DataCamp's A Network Analysis of Game of Thrones. When I looked for existing curated data, I could not find what I needed, so I decided to split the project into 2 parts.
- Data Gathering / Web Scraping
- Relationship Network analysis
- The first step of the project was to scrape data from the breakingbad.fandom.com site.
- A total of 62 episodes were released for the Breaking Bad series.
- Collect the summaries of all 62 episodes and record them by season.
- i.e., for 6 seasons, we want only 6 summary files.
- Collect the character list of each episode.
- Browser automation : Selenium with Python
- Web page parser : Beautiful Soup
- Data cleaning : pandas
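The per-season grouping described above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it does no scraping, the records are made-up placeholders, and the function names and the `season_<n>.txt` file naming are assumptions introduced here.

```python
from collections import defaultdict
from pathlib import Path
import tempfile


def group_by_season(episodes):
    """Join (season, episode_number, summary) records into one text per season."""
    grouped = defaultdict(list)
    # Sort so summaries appear in broadcast order within each season.
    for season, _number, summary in sorted(episodes, key=lambda e: (e[0], e[1])):
        grouped[season].append(summary.strip())
    return {season: "\n\n".join(parts) for season, parts in grouped.items()}


def write_season_files(season_texts, out_dir):
    """Write one summary file per season (hypothetical file naming)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for season, text in sorted(season_texts.items()):
        path = out_dir / f"season_{season}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


# Placeholder records, not real scraped summaries.
episodes = [
    (1, 2, "Summary of the second episode."),
    (1, 1, "Summary of the first episode."),
    (2, 1, "Summary of a later episode."),
]
season_texts = group_by_season(episodes)

with tempfile.TemporaryDirectory() as tmp:
    written = write_season_files(season_texts, tmp)
    print([p.name for p in written])
```

Keeping the grouping separate from the file writing makes it easy to check the per-season text before anything is saved under data/summaries.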
- In Part 1, using Scraper.ipynb, we generated the summaries of all 62 episodes and the character list.
- In this part, using Relationship_Finder.ipynb, we will analyse the relationships between characters.
- From the scraped data, using named entity recognition and our own rules, we'll create a relationship dataset.
- On that relationship data we will compute centrality measures to find the most important character.
- Using community detection, we will try to find out which communities are present and who leads each of them.
- We will use visualization techniques to make the analysis simpler.
- For further analysis, we will export the network data for visualization in Gephi.
- Natural Language Processing : spaCy
- Network Analysis : NetworkX
- Network Visualization : pyvis
- Data cleaning : pandas
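To make the relationship and centrality steps above concrete, here is a small standard-library sketch. It is an assumption-laden stand-in, not the notebook's method: plain substring matching replaces spaCy's named entity recognition, "relationship" is simplified to two characters appearing in the same sentence, degree centrality is computed by hand with the usual degree / (n - 1) normalisation instead of through NetworkX, and the sample sentences are made up. The `Source`, `Target`, `Weight` header follows the column names Gephi expects in an edge table.

```python
from collections import Counter, defaultdict
from itertools import combinations
import csv
import io
import re


def build_relationships(text, characters):
    """Count how often each pair of characters shares a sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    edges = Counter()
    for sentence in sentences:
        # Substring matching is a crude stand-in for NER.
        present = sorted(c for c in characters if c in sentence)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Fraction of the other connected characters each character is linked to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def edges_to_gephi_csv(edges):
    """Serialise the weighted edge list as CSV text for Gephi's edge table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(edges.items()):
        writer.writerow([a, b, weight])
    return buffer.getvalue()


# Made-up sentences for illustration only.
text = (
    "Walter meets Jesse in the desert. Jesse calls Saul. "
    "Walter and Skyler argue at home. Walter visits Saul with Jesse."
)
characters = ["Walter", "Jesse", "Saul", "Skyler"]

edges = build_relationships(text, characters)
centrality = degree_centrality(edges)
print(max(centrality, key=centrality.get))
print(edges_to_gephi_csv(edges))
```

Characters that never share a sentence with anyone do not appear in the result at all, which is one reason a real pipeline would keep an explicit node list alongside the edge list.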
├── LICENSE
├── README.md
├── breaking_bad.yml
├── requirements.txt
│
├── Scrapper.ipynb
├── Relationship_Finder.ipynb
│
├── data
│   ├── character_df_cleaned.csv
│   ├── character_df.csv
│   ├── season_nd_episode_links.txt
│   ├── gephi_files
│   │   └── # directory for storing CSV files for the Gephi visualisation tool
│   └── summaries
│       └── # all 6 seasons summary files stored here
│
├── lib
│   └── # pyvis created directory
│
├── src
│   ├── char_imp
│   │   └── # directory to store character importance plots
│   ├── htmls
│   │   └── # directory to store pyvis plots
│   ├── imgs
│   │   └── # all images used in IPYNB notebooks and README files
│   ├── plots
│   │   └── # directory to store centrality plots
│   └── plt_style
│
└── docker_files
    └── # files related to Docker and Contributions guides
To recreate this project on your own computer, do the following.
I have used JupyterLab throughout the project and assume you will too (if not, feel free to adapt the commands below accordingly).
- Download the whole the_breaking_bad_network directory to your local work space.
wget https://github.com/jishnukoliyadan/the_breaking_bad_network/archive/refs/heads/master.zip -O the_breaking_bad_network-master.zip
unzip the_breaking_bad_network-master.zip
cd the_breaking_bad_network-master
- Once inside the project directory, let's create the conda environment and launch JupyterLab.
conda env create -f breaking_bad.yml
conda activate breaking_bad
jupyter-lab
- Alternatively, to set up the environment with pip and requirements.txt instead of breaking_bad.yml, download the whole the_breaking_bad_network directory to your local workspace.
wget https://github.com/jishnukoliyadan/the_breaking_bad_network/archive/refs/heads/master.zip -O the_breaking_bad_network-master.zip
unzip the_breaking_bad_network-master.zip
cd the_breaking_bad_network-master
- Let's create a conda environment and activate it.
conda create -n breaking_bad python=3.10.8 -y
conda activate breaking_bad
- Install the prerequisite libraries and launch JupyterLab.
pip install -r requirements.txt --upgrade
jupyter-lab
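After either setup route, a quick check that the libraries named in this README can be imported may save a confusing notebook error later. The snippet below is a minimal sketch and not part of the repository: the helper name is hypothetical, and the list uses the usual import names for these packages (`bs4` for Beautiful Soup, `spacy` for spaCy).

```python
from importlib.util import find_spec

# Import names for the libraries listed in this README.
REQUIRED = ["selenium", "bs4", "pandas", "spacy", "networkx", "pyvis"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required libraries are importable.")
```

Run it inside the activated breaking_bad environment, for example from the first cell of a notebook.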
We can also use the Docker image to recreate this project. A detailed explanation of how to use the Docker image can be found in the Docker guide.
All contributions, bug reports, bug fixes, documentation improvements, and enhancements are welcome.
A detailed overview of how to contribute can be found in the contributing guide.
- Official documentation of all libraries
- IMDb, Rotten Tomatoes, Wikipedia
- GeeksforGeeks, Stack Overflow
- Cambridge Intelligence, Engineering for Data Science
- Kaggle : data-collection-web-scrapping-tutorial
- Chapter 3 - Network Structure and Measures : Analyzing the Social Web by Jennifer Golbeck
- Translating Networks: Assessing Correspondence Between Network Visualisation and Analytics
- pijamasurf.com, usmagazine.com
The license can be found in the LICENSE file.