MLB data streamlit visualzation

Introduction

In competitive sports like baseball, personalized analysis on the opponents or your team members is as important as the analysis over all players and matches. For example, coaches can decide strategies or the training content if they have a tool to know the playing style of their opponents before matches.

Therefore, we developed a web service which provides an easy-to-use interactive interface for non-programmers to access personalized analysis of a player. This service presents overall statistics of a player and automatically find notable correlation or association among the pitch/strike events and other conditions.

Try it here: http://140.112.29.149:8501/ (will be closed once the final project is finished)

To run the service: streamlit run streamlit_test.py

Tools

Streamlit: This is a very convenient python framework to develop ML web tools. It’s compatible to other python packages of data science ecosystem. Visit their website if you want more information: https://www.streamlit.io/
Analysis tools: numpy, pandas, mlextend
Visualization: matplotlib, seaborn, plotly

Interface

On the left side, users can choose the analysis type (pitcher or batter), which player, and some filtering conditions such as the starting year of data to be used. And, the automatic analyzation results will be shown on the right.

Provided analysis

We provide some basic distribution analysis like pitch types and strike events at the top of our analysis, therefore users can quickly get a rough concept of that player.

Besides, we also provide some spatial distribution visualization to help users dig into the tendency of the analyzed player. For instance, users can compare the distribution of two kinds of pitch types from that pitcher like the example above.

Since our service is interactive, users can also choose an arbitrary number of types to compare. Other spatial distribution like strike zone are also provided.

Correlation/association mining

Besides those distribution visualization, we provide notable correlation and association. We first choose interesting outcome columns and possible cause columns (e.g. ball spin rate, pitch type, or wind direction). Then, we calculate the correlation or association between each pair of outcome columns and cause columns, filter out those with small absolute values, and show the notable correlation/association.

The codes of this service are available here: https://github.com/Nash2325138/MLB-data-streamlit-visualzation-practice

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
README.md		README.md
streamlit_test.py		streamlit_test.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MLB data streamlit visualzation

Introduction

Tools

Interface

Provided analysis

Correlation/association mining

About

Releases

Packages

Languages

Nash2325138/MLB-data-streamlit-visualzation-practice

Folders and files

Latest commit

History

Repository files navigation

MLB data streamlit visualzation

Introduction

Tools

Interface

Provided analysis

Correlation/association mining

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages