The course site for Data Processing in Python at IES. See the course information in SIS. The course is taught by Martin Hronec, Vítek Macháček and Jan Šíla.
Stable link for online attendance: Join Zoom Meeting https://cesnet.zoom.us/j/92851968819?pwd=L296R2N1T1RNR2VPdVMxQjdQR1Iydz09
Meeting ID: 928 5196 8819 Passcode: pythonFTW
Hand in the midterm through this Google form.
Make sure you access it through your university Google account.
- If you are looking for a partner, use this Google sheet while logged in with your CUNI account. If you already have a partner, please delete your info to make it easier for others.
Date | Topic | Who | Project | HW
---|---|---|---|---
15/2 | Intro, Jupyter, Git (+ GitHub) | Martin | |
21/2 | Seminar (Git) | Martin | | HW 1
22/2 | Strings, Floats, Lists, Dictionaries, Functions | Vitek | | HW 0
1/3 | Numpy, Pandas, Matplotlib | Jan | | HW 2
7/3 | Seminar | Jan | |
8/3 | Object-Oriented Programming | Jan | | HW 3
15/3 | HTML, XML, JSON, requests, APIs, BeautifulSoup | Jan | |
21/3 | IES Web Scraper | Vitek | | HW 4
22/3 | Seminar | Vitek | |
29/3 | Advanced Pandas | Vitek | | HW 5
4/4 | Introduction to Databases | Jan | Project Topic Proposal | HW 6
5/4 | Seminar - MIDTERM | full house | |
11/4 | Packaging and Documentation | Martin | |
12/4 | Testing (and decorators) | Martin | |
19/4 | Seminar | Martin | Project Topic Approval |
26/4 | Guest lecture + Beer after | TBD | |
2/5 | Project Work 2 (Seminar) | full house | Work-in-progress |
3/5 | Project Work 2 | full house | Work-in-progress |
X/X | Project Deadline | full house | |
The requirements for passing the course are the DataCamp assignments (5pts), the midterm (25pts), the work-in-progress presentation (10pts), and the final project, including the final delivery presentation (60pts). At least 50% of the points from the DataCamp assignments and the work-in-progress presentation is required to pass the course.
- Students work in teams of two.
- Deadline: X.X.
- The task is to download any data from an API or directly from the web. The data should be processed and visualized in a Jupyter Notebook, with auxiliary scripts containing function and class definitions stored as .py files. The project should be submitted as a GitHub repository.
- The selection of the data is up to the students. (Conditional on our approval.)
- The Git history serves as proof that both students collaborated.
- More details during the lecture.
- Make sure you include requirements.txt and a configured .gitignore, such as this one.
- Make sure the project is runnable from scratch, i.e. restart your kernel and check that everything is imported and runs.
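As a rough illustration of the "auxiliary scripts" idea above, here is a minimal sketch of what one such .py module could contain. Everything in it is an assumption made for the example: the module name `processing.py`, the `Observation` class, the field names `city` and `temp`, and the sample payload, which stands in for text that a real project would download from its API or website. It uses only the standard library so that it runs without any installation.

```python
"""Hypothetical auxiliary module (e.g. processing.py) imported by the notebook."""
import json
from dataclasses import dataclass
from statistics import mean


@dataclass
class Observation:
    """One cleaned record; the field names are invented for this sketch."""
    city: str
    temperature: float


def parse_payload(raw_json: str) -> list[Observation]:
    """Convert a JSON array of records into Observation objects.

    Records with a missing city or temperature are skipped.
    """
    records = json.loads(raw_json)
    return [
        Observation(city=r["city"], temperature=float(r["temp"]))
        for r in records
        if r.get("city") and r.get("temp") is not None
    ]


def mean_temperature_by_city(observations: list[Observation]) -> dict[str, float]:
    """Group observations by city and average their temperatures."""
    grouped: dict[str, list[float]] = {}
    for obs in observations:
        grouped.setdefault(obs.city, []).append(obs.temperature)
    return {city: mean(values) for city, values in grouped.items()}


# Stand-in for the response body of an API call (made-up data).
SAMPLE_PAYLOAD = (
    '[{"city": "Prague", "temp": 10.0}, {"city": "Prague", "temp": 14.0},'
    ' {"city": "Brno", "temp": 9.5}, {"city": "Brno", "temp": null}]'
)

observations = parse_payload(SAMPLE_PAYLOAD)
summary = mean_temperature_by_city(observations)
print(summary)
```

In the notebook itself, the same functions would be imported from the module and the resulting summary passed on to the visualization step, which keeps the notebook short and the logic reusable.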
- Submitted as a Jupyter notebook in a Git repository. All team members pushed to the repo.
- Code is runnable and replicable (after installation of the necessary packages). Exceptions are accepted only for good reasons (data availability, etc.).
- OOP and code structure
- Analysis and visualization
- Code Readability + Documentation
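One possible way to check the "replicable after installation of necessary packages" criterion before submitting is sketched below. It is an assumption-laden helper, not part of the course requirements: the function names are hypothetical, the sample requirements text and its version numbers are made up, and the parser only handles plain `name` or `name<specifier>version` lines (no extras, environment markers, or `-r` includes). It also assumes the name in requirements.txt matches the installed distribution name.

```python
from importlib import metadata


def parse_requirements(text):
    """Extract bare package names from simple requirements.txt content."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Cut the line at the first version specifier, if any.
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<"):
            line = line.split(sep, 1)[0]
        names.append(line.strip())
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Made-up example content; a real check would read the project's own file.
SAMPLE_REQUIREMENTS = """
pandas>=1.3
requests==2.28.1
# plotting
matplotlib
"""

required = parse_requirements(SAMPLE_REQUIREMENTS)
print("Required:", required)
print("Missing:", missing_packages(required))
```

Running a check like this in a fresh virtual environment, followed by restarting the kernel and running the notebook top to bottom, gives a reasonable indication that someone else can reproduce the project.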
See an example project from last year here.
- Presentation of work-in-progress related to the final project.
Takes place May 5. Live coding (80 minutes), "open browser", no collaboration between students. More details during the lecture the week before.
At least 3 of assignments 1-6 must be submitted on time.
- Introduction to Python - Python Basics
- Introduction to Python - Python Lists
- Introduction to Python - Functions and packages
- Web Scraping in Python - Introduction to HTML
- Web Scraping in Python - XPaths and Selectors
- Web Scraping in Python - CSS Locators, Chaining, and Responses
TBA
Introduction to Git for Data Science
Intermediate Python for Data Science
Manipulating DataFrames with pandas
Merging DataFrames with pandas
Importing Data in Python (Part 1)
Importing Data in Python (Part 2)
Introduction to Data Visualization
Interactive Data Visualization in Bokeh
Introduction to SQL for Data Science
Introduction to Databases in Python
Econometrics II. (JEB110) is an explicit prerequisite for bachelor students.
The course is designed for students that have at least some basic coding experience. It does not need to be very advanced, but they should be aware of concepts such as the `for` loop, `if` and `else`, `variable`, or `function`.
No knowledge of Python is required for entering the course.
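For orientation, the concepts named above look roughly like this in Python. This is only an illustrative sketch (the function `classify` and the sample numbers are invented for it); students who can follow the same logic in any other language should be well prepared.

```python
def classify(number):
    """Function: return 'even' or 'odd' for an integer."""
    if number % 2 == 0:   # if
        return "even"
    else:                 # else
        return "odd"


numbers = [1, 2, 3, 4]    # variable holding a list
labels = []

for n in numbers:         # for loop
    labels.append(classify(n))

print(labels)
```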
Passing the course is rewarded with 5 ECTS credits.
Pro Git book, Atlassian Git tutorials, GitHub resources for learning Git
Resources from the official Python webpage
Python, Pandas, Numpy, requests, BeautifulSoup and Matplotlib.