Midterm - November 29, 18:30 - 19:55, in person, but can be done remotely
- create a Git repo for submission beforehand to save time
- make sure you are enrolled in SIS! If you are not and want to take the midterm, let us know ASAP
Have a working knowledge of how to:
- iterate over multiple objects
- construct a dataset from pieces
- receive and send data over HTTP (requests)
- perform financial analysis
- do basic statistical manipulation
- do basic plotting
- open book: google as much as you wish
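As a rough illustration of the first few skills listed above (iterating over multiple objects, building a dataset from pieces, simple financial and statistical calculations), here is a minimal sketch. The dates and prices are made up for the example, and it uses only the standard library, so it is not a preview of the actual midterm tasks.

```python
from statistics import mean, stdev

# Hypothetical pieces of a dataset (made-up numbers, for illustration only).
dates = ["2021-11-01", "2021-11-02", "2021-11-03", "2021-11-04"]
prices = [100.0, 102.0, 101.0, 104.0]

# Iterate over two lists at once and assemble one record per day.
dataset = [{"date": d, "price": p} for d, p in zip(dates, prices)]

# Simple daily returns: pair each price with the one that follows it.
returns = [(curr / prev) - 1 for prev, curr in zip(prices, prices[1:])]

print(dataset[0])
print(f"mean return: {mean(returns):.4f}, volatility: {stdev(returns):.4f}")
```

In practice the same steps would usually be done with Pandas, which the course covers separately.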
(Hopefully) stable link for online attendance now: Join Zoom Meeting https://cesnet.zoom.us/j/92851968819?pwd=L296R2N1T1RNR2VPdVMxQjdQR1Iydz09
Meeting ID: 928 5196 8819 Passcode: pythonFTW
The course site for Data Processing in Python at IES. See information on SIS. The course is taught by Martin Hronec, Vítek Macháček and Jan Šíla.
Date | Topic | Who | Project | HW |
---|---|---|---|---|
5/10 | Intro, Jupyter, Git (+ GitHub) | Martin | | |
11/10 | Seminar (Git) | Martin | | HW 1 |
12/10 | Strings, Floats, Lists, Dictionaries, Functions | Vitek | | HW 0 |
19/10 | Numpy, Pandas, Matplotlib | Jan | | HW 2 |
25/10 | Seminar | Jan | | |
26/10 | Object-Oriented Programming | Martin | | HW 3 |
2/11 | HTML, XML, JSON, requests, APIs, BeautifulSoup | Jan | | |
8/11 | IES Web Scraper | Vitek | | HW 4 |
9/11 | Seminar | Vitek | | |
22/11 | Advanced Pandas | Vitek | | HW 5 |
23/11 | Introduction to Databases | Jan | Project Topic Proposal | HW 6 |
29/11 | Seminar - MIDTERM | full house | | |
30/11 | Packaging and Documentation | Martin | | |
6/12 | Testing (and decorators) | Martin | | |
7/12 | Seminar | Martin | Project Topic Approval | |
14/12 | Guest lecture | TBD | | |
20/12 | Project Work 2 (Seminar) | full house | Work-in-progress | |
21/12 | Project Work 2 | full house | Work-in-progress | |
TBA | Project Deadline | full house | | |
The requirements for passing the course are the DataCamp assignments (5pts), the midterm (25pts), the work-in-progress presentation (10pts), and the final project, including the final delivery presentation (60pts). At least 50% of the points from the DataCamp assignments and the work-in-progress presentation is required for passing the course.
- Students work in teams of two.
- Deadline: TBA
- The task is to download any data from an API or directly from the web. The data should be processed and visualized in a Jupyter Notebook, with auxiliary scripts containing function and class definitions as .py files. The project should be submitted as a GitHub repository.
- The selection of the data is up to the students. (Conditional on our approval.)
- The Git history serves as proof that both students collaborated.
- More details during the lecture.
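To make the split between the notebook and the auxiliary .py files more concrete, here is a minimal sketch of what such a helper module might contain. The module name `helpers.py`, the class `Observations`, the function `average` and the JSON field names are all hypothetical, and the payload is hard-coded instead of being downloaded so that the example runs offline.

```python
# helpers.py (hypothetical auxiliary module imported from the notebook)
import json


class Observations:
    """Holds records parsed from a JSON payload, e.g. an API response body."""

    def __init__(self, records):
        self.records = records

    @classmethod
    def from_json(cls, text):
        # Expects a JSON array of objects; the field names are assumptions.
        return cls(json.loads(text))

    def values(self, field):
        # Collect one field across records, skipping records that lack it.
        return [r[field] for r in self.records if field in r]


def average(values):
    """Arithmetic mean; returns None for an empty list."""
    return sum(values) / len(values) if values else None


# In a real project the payload would come from the web; here it is hard-coded.
payload = '[{"city": "Prague", "temp": 4.0}, {"city": "Brno", "temp": 6.0}]'
obs = Observations.from_json(payload)
print(average(obs.values("temp")))
```

The notebook would then import these definitions and keep only the analysis and plots, which is one possible way to satisfy the code structure requirement.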
- Submitted as a Jupyter notebook in a Git repository. All team members pushed to the repo.
- Code is runnable and replicable (after installation of the necessary packages). Exceptions are allowed only for good reasons (data availability, etc.).
- OOP and code structure
- Analysis and visualization
- Code Readability + Documentation
See an example project from last year here.
- Presentation of work-in-progress related to the final project.
22/11. Live coding (80 minutes), "open browser", no collaboration between the students. More details during the lecture the week before.
Submitting 3 of the assignments 1-6 on time is required.
10/10 18:20 - Introduction to Git for Data Science
*Deadline extended to Oct 17th at 23:59
- Introduction to Python - Python Basics
- Introduction to Python - Python Lists
- Introduction to Python - Functions and packages
19/10 18:30 - Introduction to Data Science in Python
26/10 18:30 - Object-Oriented Programming in Python
- Web Scraping in Python - Introduction to HTML
- Web Scraping in Python - XPaths and Selectors
- Web Scraping in Python - CSS Locators, Chaining, and Responses
TBA
Introduction to Git for Data Science
Intermediate Python for Data Science
Manipulating DataFrames with pandas
Merging DataFrames with pandas
Importing Data in Python (Part 1)
Importing Data in Python (Part 2)
Introduction to Data Visualization
Interactive Data Visualization in Bokeh
Introduction to SQL for Data Science
Introduction to Databases in Python
Econometrics II. (JEB110) is an explicit prerequisite for bachelor students.
The course is designed for students who have at least some basic coding experience. It does not need to be very advanced, but they should be aware of concepts such as a `for` loop, `if` and `else`, `variable` or `function`.
No knowledge of Python is required for entering the course.
Passing the course is rewarded with 5 ECTS credits.
Pro Git book, Atlassian Git tutorials, GitHub resources for learning Git
Resources from the official Python webpage
Python, Pandas, NumPy, requests, BeautifulSoup and Matplotlib.