forked from makcedward/nlpatl

Experimenting with ssl without input prompt for human annotation

NLPatl (NLP Active Learning)

This Python library helps you perform active learning in NLP. NLPatl is built on top of transformers, scikit-learn and other machine learning packages. It can be applied to both the cold start scenario (no labeled data at all) and the limited labeled data scenario.

The goal of NLPatl is to use state-of-the-art (SOTA) NLP models to estimate which data is most valuable, and to make efficient use of subject matter experts (SMEs) by having them label only a limited amount of data.


At the beginning, you have only unlabeled data (and possibly a limited amount of labeled data). NLPatl applies transfer learning to convert your texts into vectors (or embeddings). The vectors then go through unsupervised or supervised learning to estimate the most uncertain (or most valuable) data. SMEs label those samples, and the labels are fed back to the models until enough high-quality data has accumulated.
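To make the "most uncertain data" step concrete, here is a minimal sketch of entropy-based uncertainty sampling, one common way to rank unlabeled samples in the supervised case. It uses only the Python standard library and is not NLPatl's API: the function names `entropy` and `most_uncertain` and the probability values are hypothetical, chosen for illustration. In practice the probabilities would come from a classifier trained on your embeddings.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_uncertain(prob_rows, k):
    """Return the indices of the k rows with the highest entropy, most uncertain first."""
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: entropy(prob_rows[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical classifier outputs for four unlabeled texts over three classes.
predicted = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

# Indices of the two samples an SME would be asked to label first.
to_label = most_uncertain(predicted, k=2)
print(to_label)
```

The selected samples go to the SMEs for labeling, the classifier is retrained on the enlarged labeled set, and the ranking is repeated on the remaining unlabeled data.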

Installation

pip install nlpatl

or

pip install git+https://github.com/makcedward/nlpatl.git

Examples

Release

0.0.3dev, Mar 2022

  • Support the scikit-learn-extra library (clustering)
  • Support NeMo's speaker recognition embeddings layer

Citation

@misc{ma2021nlpatl,
  title={Active Learning for NLP},
  author={Edward Ma},
  howpublished={https://github.com/makcedward/nlpatl},
  year={2021}
}
