Stars
A Python module for getting the GPU status from NVIDIA GPUs programmatically using nvidia-smi
[ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"
Markdown rendering + LaTeX extras (equations, tables, ...), with conversion features, for the scientific community
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
TF-ID: Table/Figure IDentifier for academic papers
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
A library for embedding documents and clustering them by layout -- augmented with image features! This is a class project for Stanford CS 231n: Computer Vision with Deep Learning (Spring 2022).
A collection of awesome video generation studies.
Fish-like autosuggestions for zsh
Python library for converting Python calculations into rendered LaTeX.
A web-based collaborative LaTeX editor
A Python library that adds LaTeX functionality to the Texttable package.
OCR, layout analysis, reading order, line detection in 90+ languages
Vocabulary list of GPT-4o (o200k_base) and GPT-4/GPT-3.5 (cl100k_base) tokenizers. Special tokens are excluded.
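MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing stories, chat logs, and more.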
Given a scholarly PDF, extract figures, tables, captions, and section titles.
Convert PDF to markdown quickly with high accuracy
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
LaTeXML: a TeX and LaTeX to XML/HTML/ePub/MathML translator.
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models