
A list of useful resources to learn Data Engineering from scratch


danielvdende/Data-Engineering-HowTo

 
 


How To Become a Data Engineer

Useful articles

Algorithms & Data Structures

SQL

Programming

Databases

Distributed Systems

Books

Courses

Blogs

  • Martin Kleppmann, author of Designing Data-Intensive Applications
  • BaseDS by Vaidehi Joshi, a series about distributed systems

Tools

  • Apache Airflow is a platform to programmatically author, schedule and monitor workflows in Python
  • Apache Spark is a unified analytics engine for large-scale data processing
  • Apache Kafka is a distributed streaming platform
  • Luigi is a Python package that helps you build complex pipelines of batch jobs
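
Workflow tools such as Airflow and Luigi share one core idea: a pipeline is a set of tasks with dependencies, and each task runs only after its upstream tasks finish. The sketch below illustrates that idea using only the Python standard library (`graphlib`, Python 3.9+). It does not use the Airflow or Luigi APIs; `run_pipeline`, the task names, and the sample data are hypothetical and chosen for illustration only.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run every task after its upstream tasks.

    tasks: maps a task name to a callable.
    dependencies: maps a task name to the list of task names it depends on.
    Returns the execution order and the result of each task.
    """
    order = []
    results = {}
    # static_order() yields each task only after all of its dependencies.
    for name in TopologicalSorter(dependencies).static_order():
        upstream = [results[dep] for dep in dependencies.get(name, ())]
        results[name] = tasks[name](*upstream)
        order.append(name)
    return order, results


# Hypothetical three-step batch job: extract -> transform -> load.
tasks = {
    "extract": lambda: [1, 2, 3],
    "transform": lambda rows: [r * 10 for r in rows],
    "load": lambda rows: sum(rows),
}
dependencies = {
    "extract": [],
    "transform": ["extract"],
    "load": ["transform"],
}

order, results = run_pipeline(tasks, dependencies)
print(order)
print(results["load"])
```

Real schedulers add what this sketch leaves out, such as scheduling, retries, monitoring, and running tasks across multiple machines.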

Cloud Platforms

Other
