added health_fact dataset #953

Merged
merged 17 commits into from
Dec 1, 2020
Conversation

bhavitvyamalik
Contributor

Added dataset Explainable Fact-Checking for Public Health Claims (dataset_id: health_fact)

Member

@lhoestq lhoestq left a comment

Cool, thank you for adding this one!

I left a few comments about the usage of the -1 value as a placeholder when labels are missing.

Also, could you please add the dataset card? You can find more info here: https://github.com/huggingface/datasets/blob/master/ADD_NEW_DATASET.md#tag-the-dataset-and-write-the-dataset-card
Only the YAML tags at the top are mandatory.
If filling in the rest of the paragraphs takes too much time, you can just leave those fields as [More Information Needed].

datasets/health_fact/health_fact.py: 6 review comments, all resolved (5 on outdated code)
@bhavitvyamalik
Contributor Author

Hi @lhoestq,
Initially I tried int(-1) only in place of NaN labels and missing values, but I kept getting this error: pyarrow.lib.ArrowTypeError: Expected bytes, got a 'int' object. This is probably because I was passing int values (-1) to fields that are of string type.
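For illustration, here is a minimal sketch of one way to keep placeholders type-consistent, so that string fields never receive an int. It is not the code from this PR: the field names, the label set, and the helpers `is_missing` and `normalize_row` are all assumptions made for the example, and it assumes the label is stored as an integer id.

```python
import math

# Hypothetical field names and label set, chosen for illustration only.
STRING_FIELDS = ("claim", "explanation", "subjects")
LABEL_NAMES = ("false", "mixture", "true", "unproven")


def is_missing(value):
    """Return True for None, NaN floats, and empty or whitespace-only strings."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip() == ""


def normalize_row(row):
    """Give every field a placeholder of the type its column expects."""
    example = {}
    for name in STRING_FIELDS:
        value = row.get(name)
        # String columns stay strings: "" rather than -1 when the value is missing.
        example[name] = "" if is_missing(value) else str(value)
    label = row.get("label")
    # The label is assumed to be an integer id, so -1 is a type-consistent placeholder.
    if is_missing(label) or label not in LABEL_NAMES:
        example["label"] = -1
    else:
        example["label"] = LABEL_NAMES.index(label)
    return example
```

With this approach, -1 only ever appears in the integer label column, and missing text becomes an empty string, which avoids mixing int and string values in the same field.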

Member

@lhoestq lhoestq left a comment

Nice, thank you very much!

@lhoestq lhoestq merged commit cf41215 into huggingface:master Dec 1, 2020
bharatr21 pushed a commit to bharatr21/datasets that referenced this pull request Dec 2, 2020
* added dataset Explainable Fact-Checking for Public Health Claims (health_fact)

* trailing spaces removed in health_fact

* added encoding when opening and reading file

* dataset card added and README updated.

* added dataset card and updated README

* updated script, minor fixes

* updated README.md

* Add NumerSense (huggingface#933)

* add NumerSense dataset

* trailing whitespace

* mention empty strings in test splits

* add src dataset tag

* added dataset Explainable Fact-Checking for Public Health Claims (health_fact)

* trailing spaces removed in health_fact

* added encoding when opening and reading file

* dataset card added and README updated.

* added dataset card and updated README

* updated script, minor fixes

* updated README.md

* bug fixes in main script

Co-authored-by: Joe Davison <josephddavison@gmail.com>
ggdupont pushed a commit to ggdupont/datasets that referenced this pull request Dec 4, 2020
sileod pushed a commit to sileod/datasets that referenced this pull request Dec 7, 2020