# nlp-datasets

Here are 146 public repositories matching this topic...

claudiu1989 / Synonyms-detection

Experiments with word2vec embeddings for synonym detection in the Romanian language.

nlp embeddings romanian nlp-resources nlp-machine-learning nlp-datasets

Updated Sep 10, 2023
Python

Kevinlee49 / analysis-youtube-comment-krisandme

I wanted to identify positive and negative comments on my YouTube videos, so I used NLP to analyze them. The main language is set to Korean, but you can try setting it to English.

nlp nlp-machine-learning nlp-datasets

Updated Aug 23, 2021
Jupyter Notebook

RaiBP / incidental-bilingualism

Python program for detecting unintentional bilingual and translation instances in NLP datasets.

python nlp machine-learning natural-language-processing deep-learning language-detection nlp-resources nlp-datasets code-switching

Updated Feb 11, 2024
Python

readerbench / ro-offense-sequences

nlp romanian offensive-language nlp-datasets romanian-language hate-speech-detection nlp-data nlp-dataset

Updated Jun 27, 2023

SamDineshSD777 / Sentiment-Analysis-on-Product-Reviews

Sentiment analysis on product reviews (project associated with Zummit Infolabs).

nlp natural-language-processing nlp-resources nlp-library nlp-machine-learning nlp-keywords-extraction nlp-datasets

Updated Mar 13, 2023

BrunoGianetti / MyNLPProjects

Storage for my NLP projects.

Updated Feb 15, 2024
Jupyter Notebook

kaanala / python-webcrawler-turkish-news

Webcrawler for Turkish news.

python nlp natural-language-processing turkish scrapy webcrawler turkish-language turkce nlp-datasets dogal-dil-isleme

Updated Sep 3, 2019
Python

anirudhsom / CAPP-Dataset

Official repository for "Demonstrations Are All You Need: Advancing Offensive Content Paraphrasing using In-Context Learning".

acl nlp-datasets paraphrase-generation offensive-content-paraphrasing acl-2024-findings acl-2024

Updated Jun 17, 2024
Jupyter Notebook

vishnuchilamakuru / coursera-reviews-analsis

nlp reviews nlp-keywords-extraction review-sentiments nlp-datasets

Updated Dec 7, 2019
Jupyter Notebook

readerbench / news-ro-offense

A novel Romanian-language dataset for offensive message detection, with comments from a local Romanian news website (Stiri de Cluj) manually annotated into five classes.

nlp romanian nlp-resources offensive-language nlp-datasets romanian-language hate-speech-detection nlp-dataset

Updated Jun 13, 2023

nikitaeverywhere / news-articles-dataset

A dataset of 2,095 plain-text articles in 5 categories, with over 805k words in total.

nlp dataset datasets news-data nlp-datasets articles-data

Updated Jan 30, 2018

josherich / nlp-dataset-explorer

NLP datasets explorer

nlp datasets nlp-datasets

Updated Dec 11, 2022
Vue

tdude92 / reddit-short-stories

4,308 short stories (4 million words) scraped from https://reddit.com/r/WritingPrompts

nlp dataset machine-learning-dataset nlp-datasets

Updated Apr 28, 2021

turkish-nlp-suite / Vitamins-Supplements-Reviews

Repo for the Turkish sentiment analysis dataset "Vitamins and Supplements Customer Reviews".

nlp nlp-datasets sentiment-analysis-dataset turkish-nlp turkce-veriseti medical-nlp turkish-nlp-dataset turkce-sentiment-analysis-veriseti

Updated Jul 11, 2023

saakolch / procedure_of_extracting_data

Data preprocessing and training on the Drug Review Dataset using the Hugging Face library.

classifier-model nlp-datasets nlp-deep-learning

Updated May 19, 2024
Jupyter Notebook

apple-fritter / ploop.sh

➰ Loop through a TSV file and pass columns of data to an external program. A Bash script.

bash tsv wrapper data-science machine-learning corpus machinelearning mit-license bash-script batch-processing nlp-machine-learning corpus-processing nlp-datasets

Updated Apr 23, 2023
Shell

letuananh / texttaglib

A Python library for managing and annotating text corpora in different formats.

nlp text pypi annotations corpus elan nlp-datasets

Updated May 13, 2021
Python

NakulLakhotia / Text-Classification-of-SMS-as-Spam-or-Non-Spam

Classifying an SMS message as spam or non-spam using Natural Language Processing (NLP) and machine learning.

nlp machine-learning text-classification nlp-machine-learning nlp-datasets

Updated Aug 4, 2020
Python

murali1996 / eacl2021-OffensEval-Dravidian

EACL 2021 paper (SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification)

nlp pretrained-models nlp-resources nlp-machine-learning dravidian offensive-language nlp-datasets codeswitching codemixed 2021 codeswitch eacl dravidian-languages

Updated Feb 22, 2021
Python

prateeksawhney97 / NLP-Pipeline-to-Clean-Movie-Reviews-Data

Creating an NLP pipeline to clean movie review data and write the cleaned data to an output file.

clean-code nlp-machine-learning nlp-keywords-extraction nlp-datasets

Updated Nov 12, 2019
Jupyter Notebook
