In this project, we will use Python to analyse the relationships between the characters in the Breaking Bad television series.

jishnukoliyadan/the_breaking_bad_network

Breaking Bad : Network Analysis

Breaking Bad (A bird's-eye view)

[Image: Breaking Bad poster]

Breaking Bad is an American crime drama television series created and produced by Vince Gilligan.

Plot

Set and filmed in Albuquerque, New Mexico, the series follows Walter White, an underpaid, overqualified, and dispirited high-school chemistry teacher who is struggling with a recent diagnosis of stage-three lung cancer. Walt turns to a life of crime and partners with a former student, Jesse Pinkman, to produce and distribute crystal meth to secure his family's financial future before he dies, while navigating the dangers of the criminal underworld.

Some interesting things

  • In the series, viewers watch Walt go from protagonist to antagonist.
  • Breaking Bad aired on AMC from 2008 to 2013 with five successful seasons and 62 total episodes.
  • In 2013, Breaking Bad entered the Guinness World Records as the most critically acclaimed TV show of all time.

  • The cold opens became one of the most legendary parts of the show and were works of art in their own right.
  • The number-one ranked cold open is from season 2, episode 7, which features a music video titled “The Ballad of Heisenberg” about his ascension to the top of a drug empire.
  • There’s an incredible cast, award-winning script, several spinoffs, multiple languages, and of course, that creative plot.
  • Breaking Bad is one of the most successful series of all time, with a legacy that continues to inform new generations of shows.
  • With its influential cold opens, incredible writing, sympathetic acting, and compelling plot, the series changed television forever.
  • Just take a chance on the pilot and we know you’ll be hooked!

The Project

This project was inspired by DataCamp's A Network Analysis of Game of Thrones. While looking for existing curated data, I could not find the data I needed, so I decided to split the project into two parts.

  1. Data Gathering / Web Scraping
  2. Relationship Network analysis
  • The first step of the project was to scrape data from the breakingbad.fandom.com site.
  • A total of 62 episodes were released for the Breaking Bad series.

What data do we want to gather?

  • Summaries of all 62 episodes, recorded by season.
    • i.e., for 6 seasons, we want only 6 summary files.
  • The character list of each episode.
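
As a rough illustration of the "one summary file per season" idea, here is a minimal sketch. It is not the code from the scraping notebook: the `(season, episode, summary)` record layout, the function names, and the `season_<n>.txt` file names are all assumptions made for this example, and the sample records are placeholders rather than scraped text.

```python
from collections import defaultdict
from pathlib import Path


def group_by_season(episodes):
    """Merge (season, episode, summary) records into one text per season."""
    grouped = defaultdict(list)
    # Sort by season, then episode, so summaries keep their airing order.
    for season, _episode, summary in sorted(episodes, key=lambda rec: (rec[0], rec[1])):
        grouped[season].append(summary.strip())
    return {season: "\n\n".join(texts) for season, texts in grouped.items()}


def write_season_files(season_texts, out_dir):
    """Write each season's text to <out_dir>/season_<n>.txt (hypothetical names)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for season, text in sorted(season_texts.items()):
        path = out_dir / f"season_{season}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


# Placeholder records standing in for scraped summaries (not real data).
episodes = [
    (1, 2, "Summary text for season 1, episode 2."),
    (1, 1, "Summary text for season 1, episode 1."),
    (2, 1, "Summary text for season 2, episode 1."),
]
season_texts = group_by_season(episodes)
```

Calling `write_season_files(season_texts, "data/summaries")` would then produce one file per season key, however many seasons the scraped data contains.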

Tools used for this part

  • In Part 1, using Scrapper.ipynb, we generated the summaries of all 62 episodes and the character lists.
  • In this part, using Relationship_Finder.ipynb, we will analyse the relationships between characters.

How will we proceed in this part?

  • From the scraped data, using named entity recognition and our own rules, we'll create a relationship dataset.
  • On that relationship data we will compute centrality measures to find the most important character.
  • Using community detection, we will try to find out which communities are present and who leads each of them.
  • We will use visualization techniques to make the analysis simpler.
  • For further analysis, we will export the network data for visualization in Gephi.
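
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the code in Relationship_Finder.ipynb. Plain substring matching against a known character list stands in for named entity recognition, and the two-sentence window is an arbitrary rule chosen for the example. The sample sentences and the function names are made up. Degree centrality is computed by hand as degree / (n - 1), and the `Source`, `Target`, `Weight` header is, as far as I know, what Gephi's spreadsheet importer looks for in an edge table. Community detection is not covered here.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def cooccurrence_edges(sentences, characters, window=2):
    """Count the sliding windows in which each character pair co-occurs."""
    # Characters mentioned in each sentence (simple substring match, not real NER).
    mentions = [{c for c in characters if c in s} for s in sentences]
    edges = Counter()
    for start in range(max(1, len(mentions) - window + 1)):
        present = set().union(*mentions[start:start + window])
        # Sorting makes every pair appear in one canonical (a, b) order.
        for a, b in combinations(sorted(present), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Return degree / (n - 1) for every node of an undirected edge list."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def edges_to_gephi_csv(edges):
    """Serialise weighted edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(edges.items()):
        writer.writerow([a, b, weight])
    return buffer.getvalue()


# Made-up sentences for illustration only (not scraped data).
sentences = [
    "Walter meets Jesse in the garage.",
    "The lab is ready by morning.",
    "Hank questions Walter about the lab.",
    "Skyler calls Hank afterwards.",
]
characters = ["Walter", "Jesse", "Hank", "Skyler"]

edges = cooccurrence_edges(sentences, characters)
centrality = degree_centrality(edges)
most_central = max(centrality, key=centrality.get)
csv_text = edges_to_gephi_csv(edges)
```

Because the windows overlap, a pair that stays close together over several sentences gets a higher weight, which is a crude proxy for how strongly two characters are related.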

Tools used for this part

Project directory structure

├── LICENSE
├── README.md
├── breaking_bad.yml
├── requirements.txt
│
├── Scrapper.ipynb
├── Relationship_Finder.ipynb
│
├── data
│   ├── character_df_cleaned.csv
│   ├── character_df.csv
│   ├── season_nd_episode_links.txt
│   ├── gephi_files
│   │   └── # directory for storing CSV files for the Gephi visualisation tool
│   └── summaries
│       └── # summary files for all 6 seasons are stored here
│
├── lib
│   └── # pyvis created directory
│
├── src
│   ├── char_imp
│   │   └── # directory to store character importance plots
│   ├── htmls
│   │   └── # directory to store pyvis plots 
│   ├── imgs
│   │   └── # all images used in IPYNB notebooks and README files
│   ├── plots
│   │   └── # directory to store centrality plots 
│   └── plt_style
│
└── docker_files
    └── # files related to Docker and Contributions guides

Reproducing the project

To recreate this project on your own computer, do the following.
I have used JupyterLab throughout the project and assume you will too (if not, feel free to change the code below accordingly).

1. Using yml file

wget https://github.com/jishnukoliyadan/the_breaking_bad_network/archive/refs/heads/master.zip -O the_breaking_bad_network-master.zip
unzip the_breaking_bad_network-master.zip
cd the_breaking_bad_network-master
  • Once we are in the the_breaking_bad_network-master directory, let's create the conda environment and launch JupyterLab.
conda env create -f breaking_bad.yml
conda activate breaking_bad
jupyter-lab

2. Using requirements.txt file

wget https://github.com/jishnukoliyadan/the_breaking_bad_network/archive/refs/heads/master.zip -O the_breaking_bad_network-master.zip
unzip the_breaking_bad_network-master.zip
cd the_breaking_bad_network-master
  • Let's create a conda environment and activate it.
conda create -n breaking_bad python=3.10.8 -y
conda activate breaking_bad
  • Install the prerequisite libraries and launch JupyterLab.
pip install -r requirements.txt --upgrade
jupyter-lab

3. Using Docker image

We can also use the Docker image to recreate this project. A detailed explanation of how to use the Docker image can be found in the docker guide.

Contributing to the project

All contributions, bug reports, bug fixes, documentation improvements, and enhancements are welcome.

A detailed overview of how to contribute can be found in the contributing guide.

License

The license can be found in the LICENSE file.