View alexeykosh's full-sized avatar
🦔

Highlights

  • Pro

Organizations

@lingcorpora @LingConLab

Python 185 22 Updated May 30, 2024

A high-performance, zero-overhead, extensible Python compiler using LLVM

C++ 15,014 517 Updated Oct 1, 2024

🗺️ Data Cleaning and Textual Data Visualization 🗺️

Python 134 13 Updated Jun 18, 2024

Code to download and tokenize wikipedia data.

Python 5 Updated Jul 2, 2024

Tesseract Open Source OCR Engine (main repository)

C++ 61,312 9,412 Updated Sep 19, 2024

An R implementation of Reinforced Poisson Process (RPP) model

R 3 1 Updated Nov 4, 2018

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Python 167,207 44,196 Updated Oct 3, 2024

High accuracy RAG for answering questions from scientific documents with citations

Python 5,998 561 Updated Oct 3, 2024

Python library for ngram collection and frequency smoothing

Python 1 Updated Feb 16, 2023

EGG: Emergence of lanGuage in Games

Jupyter Notebook 287 99 Updated Apr 4, 2024

A module for getting data into python from large data sources

C++ 172 20 Updated Mar 13, 2024

Tools for checking ACL paper submissions

Python 576 47 Updated May 15, 2024

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

Python 808 64 Updated Apr 26, 2024

Scripts for creating a clean copy of the compressed tagged files of the COHA corpus.

Python 3 1 Updated Jul 29, 2020

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 36,496 5,734 Updated Aug 19, 2024

EGG: Emergence of lanGuage in Games

Jupyter Notebook 7 2 Updated Jan 25, 2022

Get raw text from wikipedia dumps

Rust 1 Updated Oct 24, 2022

UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files

C++ 358 75 Updated Sep 11, 2024

KenLM: Faster and Smaller Language Model Queries

C++ 2,498 512 Updated Jul 30, 2024

LYT Mode is for "Linking Your Thinking". It invokes sensemaking and lateral thinking.

CSS 223 21 Updated Feb 28, 2024

Jupyter Notebook 9 Updated Jul 26, 2023

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 26,408 2,901 Updated Oct 3, 2024

An autoregressive character-level language model for making more things

Python 2,496 659 Updated Jun 4, 2024

A curated list of awesome ggplot2 tutorials, packages etc.

1,568 167 Updated Oct 1, 2024

Studying phonotactics and how it relates to other language features

Python 10 1 Updated Jan 30, 2020

Multilingual Generative Pretrained Model

Jupyter Notebook 200 22 Updated May 13, 2024

RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds

Jupyter Notebook 21 1 Updated Feb 4, 2022

Calculating difference between expected and real homophony.

HTML 1 Updated Mar 16, 2022