leehanchung

Stars

Retrieval

64 repositories

Flexible classic and NeurAl Retrieval Toolkit

Java 214 36 Updated Jul 16, 2024

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

Python 1,562 189 Updated Jul 28, 2024

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Python 1,640 365 Updated Sep 29, 2024

An original implementation of the EMNLP 2020 paper "AmbigQA: Answering Ambiguous Open-domain Questions"

Python 116 22 Updated Apr 23, 2022

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Python 2,937 376 Updated Sep 4, 2024

Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust

Rust 11,865 662 Updated Sep 26, 2024

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of …

Go 10,960 754 Updated Sep 29, 2024

🔍 AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your da…

Python 16,944 1,850 Updated Sep 26, 2024

A cloud-native vector database, storage for next generation AI applications

Go 29,661 2,849 Updated Sep 29, 2024

☁️ Build multimodal AI applications with cloud-native stack

Python 20,983 2,216 Updated Sep 26, 2024

🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.

Python 765 129 Updated Dec 19, 2023

MS MARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, and passage/document ranking

Python 119 13 Updated Jan 3, 2022

Python 3 1 Updated Jan 31, 2022

SPECTER: Document-level Representation Learning using Citation-informed Transformers

Python 511 55 Updated Jun 12, 2023

The code related to the baselines from the NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark."

Python 36 4 Updated Mar 2, 2023

Language-Agnostic SEntence Representations

Jupyter Notebook 3,582 460 Updated May 2, 2024

Longformer: The Long-Document Transformer

Python 2,031 271 Updated Feb 8, 2023

Provides a common interface to many IR ranking datasets.

Python 316 42 Updated Aug 12, 2024

Elasticsearch TMDB examples

Python 21 16 Updated Jul 6, 2024

Open-Source Information Retrieval Courses @ TU Wien

Python 589 84 Updated Jun 12, 2023

A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.

335 29 Updated Dec 6, 2023

Jupyter Notebook 54 9 Updated Apr 10, 2022

Neural-IR-Explorer: A Content-Focused Tool to Explore Neural Re-Ranking Results

Vue 32 3 Updated Dec 13, 2019

NBoost is a scalable, search-api-boosting platform for deploying transformer models to improve the relevance of search results on different platforms (e.g. Elasticsearch)

Python 675 69 Updated Sep 30, 2020

A lightning-fast search API that fits effortlessly into your apps, websites, and workflow

Rust 46,686 1,804 Updated Sep 26, 2024

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,374 511 Updated Jul 2, 2024

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 13,140 1,161 Updated Jul 29, 2024

Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 19,966 1,349 Updated Sep 27, 2024

typeahead.js is a fast and fully-featured autocomplete library

JavaScript 16,521 3,207 Updated Apr 14, 2023