Han leehanchung

👾

Machine Learning Engineer

150 followers · 15 following

https://leehanchung.github.io/

Stars

Retrieval

64 repositories

oaqa / FlexNeuART

Flexible classic and NeurAl Retrieval Toolkit

Java 214 36 Updated Jul 16, 2024

beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

Python 1,562 189 Updated Jul 28, 2024

castorini / pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Python 1,640 365 Updated Sep 29, 2024

shmsw25 / AmbigQA

An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions"

Python 116 22 Updated Apr 23, 2022

stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Python 2,937 376 Updated Sep 4, 2024

quickwit-oss / tantivy

Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust

Rust 11,865 662 Updated Sep 26, 2024

weaviate / weaviate

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of …

Go 10,960 754 Updated Sep 29, 2024

seungkee / google_landmark_retrieval_2020_1st_place_solution

Jupyter Notebook 52 7 Updated Oct 5, 2020

deepset-ai / haystack

🔍 AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your da…

Python 16,944 1,850 Updated Sep 26, 2024

milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications

Go 29,661 2,849 Updated Sep 29, 2024

jina-ai / jina

☁️ Build multimodal AI applications with cloud-native stack

Python 20,983 2,216 Updated Sep 26, 2024

PaddlePaddle / RocketQA

🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.

Python 765 129 Updated Dec 19, 2023

microsoft / MSMARCO-Document-Ranking

MS MARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, and passage/document ranking.

Python 119 13 Updated Jan 3, 2022

arvind-neural / beir_eval

Python 3 1 Updated Jan 31, 2022

allenai / specter

SPECTER: Document-level Representation Learning using Citation-informed Transformers

Python 511 55 Updated Jun 12, 2023

due-benchmark / baselines

The code related to the baselines from NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark."

Python 36 4 Updated Mar 2, 2023

facebookresearch / LASER

Language-Agnostic SEntence Representations

Jupyter Notebook 3,582 460 Updated May 2, 2024

allenai / longformer

Longformer: The Long-Document Transformer

Python 2,031 271 Updated Feb 8, 2023

allenai / ir_datasets

Provides a common interface to many IR ranking datasets.

Python 316 42 Updated Aug 12, 2024

o19s / es-tmdb

Elasticsearch TMDB examples

Python 21 16 Updated Jul 6, 2024

sebastian-hofstaetter / teaching

Open-Source Information Retrieval Courses @ TU Wien

Python 589 84 Updated Jun 12, 2023

Agrover112 / awesome-semantic-search

A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.

335 29 Updated Dec 6, 2023

terrier-org / cikm2021tutorial

Jupyter Notebook 54 9 Updated Apr 10, 2022

sebastian-hofstaetter / neural-ir-explorer

Neural-IR-Explorer: A Content-Focused Tool to Explore Neural Re-Ranking Results

Vue 32 3 Updated Dec 13, 2019

koursaros-ai / nboost

NBoost is a scalable, search-api-boosting platform for deploying transformer models to improve the relevance of search results on different platforms (i.e. Elasticsearch)

Python 675 69 Updated Sep 30, 2020

meilisearch / meilisearch

A lightning-fast search API that fits effortlessly into your apps, websites, and workflow

Rust 46,686 1,804 Updated Sep 26, 2024

princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,374 511 Updated Jul 2, 2024

spotify / annoy

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 13,140 1,161 Updated Jul 29, 2024

qdrant / qdrant

Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 19,966 1,349 Updated Sep 27, 2024

twitter / typeahead.js

typeahead.js is a fast and fully-featured autocomplete library

JavaScript 16,521 3,207 Updated Apr 14, 2023
