M Lincon mlincon

Stars

dataEngineering

15 repositories

uhussain / WebCrawlerForOnlineInflation

Price Crawler - Tracking Price Inflation

Python · 179 stars · 53 forks · Updated Jun 23, 2020

GokuMohandas / Made-With-ML

Learn how to design, develop, deploy and iterate on production-grade ML applications.

Jupyter Notebook · 36,679 stars · 5,844 forks · Updated Jul 5, 2024

dylanzenner / business_closures_de_pipeline

Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database

Python · 14 stars · 6 forks · Updated Oct 26, 2021

san089 / goodreads_etl_pipeline

An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.

Python · 1,261 stars · 212 forks · Updated Mar 9, 2020

citusdata / cstore_fdw

Columnar storage extension for Postgres built as a foreign data wrapper. Check out https://github.com/citusdata/citus for a modernized columnar storage implementation built as a table access method.

C · 1,755 stars · 171 forks · Updated Mar 8, 2021

mrpaulandrew / procfwk

A cross tenant metadata driven processing framework for Azure Data Factory and Azure Synapse Analytics achieved by coupling orchestration pipelines with a SQL database and a set of Azure Functions.

C# · 180 stars · 113 forks · Updated Feb 13, 2024

great-expectations / great_expectations

Always know what to expect from your data.

Python · 9,702 stars · 1,502 forks · Updated Jul 25, 2024

awslabs / deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Scala · 3,194 stars · 523 forks · Updated Jul 2, 2024

MassStreetAnalytics / etl-framework

A framework for moving data into a data warehouse.

Jupyter Notebook · 52 stars · 22 forks · Updated Sep 7, 2021

adidas / m3d-api

Metadata Driven Development (m3d) is a cloud and platform agnostic framework for the automated creation, management and governance of data lakes.

Python · 29 stars · 8 forks · Updated May 23, 2023

garystafford / aws-airflow-demo

Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for Apache Airflow (MWAA) on AWS.

Python · 42 stars · 14 forks · Updated Jul 6, 2022

astronomer / airflow-guides

Guides and docs to help you get up and running with Apache Airflow.

JavaScript · 796 stars · 99 forks · Updated Oct 13, 2022

shafiab / HashtagCashtag

My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on lambda architecture, that aggregates Twitter and US stock market data for user sentiment anal…

Scala · 465 stars · 123 forks · Updated Aug 24, 2022

garystafford / emr-demo

Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce.

Python · 38 stars · 15 forks · Updated Sep 1, 2022

josephmachado / e2e_datapipeline_test

Example repo to create end to end tests for data pipeline.

Python · 21 stars · 3 forks · Updated Jun 14, 2024
