HuanjieGuo's starred repositories

10 Weeks, 20 Lessons, Data Science for All!

Jupyter Notebook 27,289 5,628 Updated Jun 24, 2024

Curated list of resources about Apache Airflow

Shell 3,623 490 Updated Jun 12, 2024

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 38,202 5,205 Updated Jul 19, 2024

Protocol Buffers - Google's data interchange format

C++ 64,673 15,374 Updated Jul 19, 2024

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs,…

C 65,722 23,603 Updated Jul 19, 2024

Supervisor process control system for Unix (supervisord)

Python 8,352 1,237 Updated Jul 19, 2024

Most popular mocking framework for unit tests written in Java

Java 14,723 2,532 Updated Jul 19, 2024

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Python 35,448 13,847 Updated Jul 19, 2024

The Java gRPC implementation. HTTP/2 based RPC

Java 11,293 3,799 Updated Jul 19, 2024

Apache Jena

Java 1,077 644 Updated Jul 19, 2024

DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.

Java 15,532 5,345 Updated Jul 19, 2024

Alluxio, data orchestration for analytics and machine learning in the cloud

Java 6,744 2,923 Updated Jul 19, 2024

ClickHouse® is a real-time analytics DBMS

C++ 35,612 6,649 Updated Jul 19, 2024

Apache NiFi

Java 4,613 2,644 Updated Jul 19, 2024

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.

Scala 871 597 Updated Jul 10, 2024

Apache Drill is a distributed MPP query layer for self-describing data

Java 1,925 985 Updated Jul 19, 2024

Apache Impala

C++ 1,106 501 Updated Jul 18, 2024

Apache Storm

Java 6,577 4,073 Updated Jul 18, 2024

Fault-tolerant job scheduler for Mesos which handles dependencies and ISO 8601-based schedules

Scala 4,378 529 Updated Jun 29, 2022

Azkaban workflow manager.

Java 4,431 1,585 Updated Jul 3, 2024

A damn simple library for building production-ready RESTful web services.

Java 8,482 3,434 Updated Jul 19, 2024

Mirror of Apache Giraph

Java 617 299 Updated Apr 14, 2023

Toy single-machine implementation of the Pregel graph-based framework

Python 113 40 Updated Jan 5, 2017

Apache Hive

Java 5,426 4,635 Updated Jul 19, 2024

Mirror of Apache Pig

Java 676 449 Updated May 17, 2024

LinkedIn's previous generation Kafka to HDFS pipeline.

Java 883 461 Updated Aug 27, 2020

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log-like data

Java 2,518 1,569 Updated May 13, 2024

Mirror of Apache Sqoop

Java 968 587 Updated Apr 8, 2021

Apache Thrift

C++ 10,247 4,001 Updated Jul 19, 2024

Code repository for O'Reilly Hadoop Application Architectures book

Java 166 101 Updated May 26, 2015