Stars
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
One-line data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.
Turns Data and AI algorithms into production-ready web applications in no time.
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable,…
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
🧙 Build, run, and manage data pipelines for integrating and transforming data.
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
VizTracer is a low-overhead logging/debugging/profiling tool that can trace and visualize your Python code execution.
GitPython is a Python library used to interact with Git repositories.
An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks
Operate and manipulate physical quantities in Python
Gin provides a lightweight configuration framework for Python
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
A full spaCy pipeline and models for scientific/biomedical documents.
Efficient data transformation and modeling framework that is backwards compatible with dbt.
Graph the import dependencies in an Objective-C project
SQL Lineage Analysis Tool powered by Python
Documentation, sample code, and other materials for SQLFlow
Uses the tokenized query returned by python-sqlparse to generate query metadata
A Custom Jupyter Widget Library for Power BI
🏥 Medical Text Mining and Information Extraction with spaCy
(Legacy) Command Line Interface for Databricks
Handle, manipulate, and convert data with units in Python
Minimalist Python library for building static websites with Jinja