
Fix Inconsistent NER Grouping (Pipeline) #4987

Merged: 22 commits, Jul 8, 2020

Conversation

@enzoampil (Contributor) commented Jun 14, 2020

This PR solves issue #4816 by:

  1. Applying entity grouping to entities of the same type that carry different prefixes (i.e. B- and I-)
  2. Ensuring that a separate entity at the last filtered index is no longer excluded from grouping.

Running the sample script below (based on reference issue #4816) returns the expected results. Do note that the entity_group is based on the entity_type of the first entity in the group.

from transformers import pipeline
NER_MODEL = "mrm8488/bert-spanish-cased-finetuned-ner"
nlp_ner = pipeline("ner", model=NER_MODEL,
                   grouped_entities=True,
                   tokenizer=(NER_MODEL, {"use_fast": False}))

t = """Consuelo Araújo Noguera, ministra de cultura del presidente Andrés Pastrana (1998.2002) fue asesinada por las Farc luego de haber permanecido secuestrada por algunos meses."""
nlp_ner(t)

[{'entity_group': 'B-PER', 'score': 0.9710702640669686, 'word': 'Consuelo Araújo Noguera'},
 {'entity_group': 'B-PER', 'score': 0.9997273534536362, 'word': 'Andrés Pastrana'},
 {'entity_group': 'B-ORG', 'score': 0.8589080572128296, 'word': 'Farc'}]

I also ran a second test to confirm that fix number 2 (a separate entity at the last index) works properly, and it now does:

nlp = pipeline('ner', grouped_entities=False)
nlp("Enzo works at the the UN")
[{'entity': 'I-PER', 'index': 1, 'score': 0.9968166351318359, 'word': 'En'},
 {'entity': 'I-PER', 'index': 2, 'score': 0.9957635998725891, 'word': '##zo'},
 {'entity': 'I-ORG', 'index': 7, 'score': 0.9986497163772583, 'word': 'UN'}]

nlp2 = pipeline('ner', grouped_entities=True)
nlp2("Enzo works at the the UN")
[{'entity_group': 'I-PER', 'score': 0.9962901175022125, 'word': 'Enzo'},
 {'entity_group': 'I-ORG', 'score': 0.9986497163772583, 'word': 'UN'}]
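As a quick sanity check on the numbers above, the grouped score appears to be the mean of the member token scores (an observation from the figures in this PR, not a documented guarantee):

```python
# Check that the grouped "Enzo" score equals the mean of its token scores.
# Assumes (from the figures above) that grouping averages member scores.
en_score = 0.9968166351318359       # "En"
zo_score = 0.9957635998725891       # "##zo"
grouped_score = 0.9962901175022125  # "Enzo" in the grouped output

assert abs((en_score + zo_score) / 2 - grouped_score) < 1e-12
```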

You can test these out yourself in this colab notebook.

cc @dav009 @mfuntowicz

codecov bot commented Jun 14, 2020

Codecov Report

Merging #4987 into master will decrease coverage by 1.40%.
The diff coverage is 91.66%.


@@            Coverage Diff             @@
##           master    #4987      +/-   ##
==========================================
- Coverage   77.83%   76.43%   -1.41%     
==========================================
  Files         141      141              
  Lines       24634    24638       +4     
==========================================
- Hits        19175    18832     -343     
- Misses       5459     5806     +347     
Impacted Files Coverage Δ
src/transformers/pipelines.py 76.16% <91.66%> (+0.16%) ⬆️
src/transformers/modeling_tf_mobilebert.py 23.62% <0.00%> (-73.11%) ⬇️
src/transformers/modeling_tf_electra.py 26.92% <0.00%> (-68.47%) ⬇️
src/transformers/tokenization_roberta.py 76.71% <0.00%> (-21.92%) ⬇️
src/transformers/tokenization_utils_base.py 85.75% <0.00%> (-7.85%) ⬇️
src/transformers/tokenization_transfo_xl.py 38.73% <0.00%> (-3.76%) ⬇️
src/transformers/tokenization_utils_fast.py 92.02% <0.00%> (-2.18%) ⬇️
src/transformers/tokenization_openai.py 82.30% <0.00%> (-1.54%) ⬇️
src/transformers/tokenization_bert.py 90.86% <0.00%> (-0.46%) ⬇️
src/transformers/tokenization_utils.py 89.55% <0.00%> (-0.41%) ⬇️
... and 4 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@julien-c (Member)

Looks great, but this sort of code/feature also looks like a perfect candidate for more unit-testing coverage.

What do you think?

@enzoampil (Author)

Agree, can add these as test cases in test_pipelines.

@LysandreJik (Member)

That would be great @enzoampil!

@enzoampil (Author) commented Jun 17, 2020

@julien-c I've added the original issue bug as a test case (Number 1 in the original post). Do note that I only included it in the torch version because mrm8488/bert-spanish-cased-finetuned-ner seems to only work for torch. Please let me know if this is enough for this PR.

For future PRs that add test cases from issues found on top of this one (e.g. those from issue #5077), I was hoping to get some guidance on how we'd add them to the test coverage without making it too heavy. For context, different cases are typically based on different models, which means we'd have to run separate models to add them as test cases.

@julien-c (Member)

I think we should try to make the tests more unitary, meaning for instance that you would feed them fixed model outputs (no actual forward pass) and check that the resulting formatted output is correct.

This might require splitting the __call__ method into smaller, more testable functions, which is totally fine IMO.

@enzoampil (Author)

I see what you mean. Yes, that makes more sense than running different models. Will work on this.

@probavee (Contributor) left a comment

Thank you, I was looking for this!
It worked with a Camembert model with 8 different entities, but only when the text actually contains entities; otherwise it raises an IndexError.

---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

<ipython-input-65-2315ccc0d111> in <module>()
      1 seq6 = "De nombreux particuliers s’interrogent sur la meilleure manière de se dessaisir d’un véhicule accidenté."
----> 2 ner(seq6)

/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in __call__(self, *args, **kwargs)
    948                 if self.model.config.id2label[label_idx] not in self.ignore_labels
    949             ]
--> 950             last_idx, _ = filtered_labels_idx[-1]
    951 
    952             for idx, label_idx in filtered_labels_idx:

IndexError: list index out of range

It runs by setting ignore_labels=[] in the pipeline instance.

[{'entity_group': 'O',
  'score': 0.969656412601471,
  'word': '<s> De nombreux particuliers s’interrogent sur la meilleure manière de se dessaisir d’un véhicule accidenté.</s>'}]

So I suggest adding a small guard condition.
Hope this helps!
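A minimal, self-contained sketch of the suggested guard (the names mirror the traceback above but are hypothetical, not the actual pipeline source):

```python
# Sketch of the guard: bail out before indexing filtered[-1] when every
# label was filtered away (e.g. a sentence with no entities while "O" is
# in ignore_labels). Names mirror the traceback above but are hypothetical.

def filter_label_indices(labels_idx, id2label, ignore_labels):
    filtered = [
        (idx, label_idx)
        for idx, label_idx in enumerate(labels_idx)
        if id2label[label_idx] not in ignore_labels
    ]
    if not filtered:          # the suggested condition
        return None, []
    last_idx, _ = filtered[-1]
    return last_idx, filtered

id2label = {0: "O", 1: "I-PER"}
print(filter_label_indices([0, 0, 0], id2label, ["O"]))  # (None, [])
print(filter_label_indices([0, 1, 0], id2label, ["O"]))  # (1, [(1, 1)])
```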

Review comment on src/transformers/pipelines.py (outdated, resolved)
@enzoampil (Author) commented Jul 4, 2020

@julien-c @LysandreJik I've performed the following adjustments to the PR:

  1. I've separated the group_entities function from the raw NER forward pass altogether, so that it's easy to run tests that feed fixed model outputs and check that the actual formatted output is correct.

group_entities now takes a list[dict] of raw NER model outputs as an argument and converts them to the grouped equivalent.

  2. I've added a new NerPipelineTests class in test_pipelines, which contains all the NER-related tests and includes new tests for the group_entities function.

The test simply confirms that the expected formatted (grouped) output is equivalent to the actual formatted output given the raw model outputs. For the test cases, I used the two samples from the original PR post. It should be straightforward to continue adding test cases moving forward.
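A minimal sketch of this fixture-driven style of test, feeding the fixed raw outputs from the second example in the PR description (the helper below is hypothetical, not the actual group_entities in pipelines.py, and word joining is simplified to sub-word merging):

```python
# Hypothetical sketch of entity grouping over fixed model outputs -- no
# forward pass required. Consecutive tokens whose type matches after
# stripping the B-/I- prefix are merged, and the final group is always
# flushed (fix 2: the trailing entity is no longer dropped).

def group_entities_sketch(raw):
    groups, current = [], []

    def flush():
        if current:
            groups.append({
                "entity_group": current[0]["entity"],  # label of first member
                "score": sum(e["score"] for e in current) / len(current),
                "word": "".join(e["word"].replace("##", "") for e in current),
            })

    for ent in raw:
        if current and (
            ent["entity"].split("-")[-1] == current[-1]["entity"].split("-")[-1]
            and ent["index"] == current[-1]["index"] + 1
        ):
            current.append(ent)
        else:
            flush()
            current = [ent]
    flush()
    return groups

# Raw outputs taken verbatim from the ungrouped example in this PR.
raw = [
    {"entity": "I-PER", "index": 1, "score": 0.9968166351318359, "word": "En"},
    {"entity": "I-PER", "index": 2, "score": 0.9957635998725891, "word": "##zo"},
    {"entity": "I-ORG", "index": 7, "score": 0.9986497163772583, "word": "UN"},
]
grouped = group_entities_sketch(raw)
print(grouped)
```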

Please do let me know what you guys think! 😄

@LysandreJik (Member) left a comment

This looks very clean! What do you think @julien-c, @mfuntowicz? (Let's wait for v3.0.2 before merging)

@julien-c (Member) commented Jul 7, 2020

Yes, looks good. I would add some typings to (at least) group_entities and group_sub_entities, but we can do that in a subsequent PR.

@enzoampil (Author)

@LysandreJik @julien-c Thanks for the feedback. I've added typings for the group_entities and group_sub_entities functions 😄
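For reference, the added typings presumably look something like this (a sketch with hypothetical standalone signatures; the real methods live on the NER pipeline class in src/transformers/pipelines.py):

```python
from typing import List

# Sketch of the typed signatures (hypothetical; see pipelines.py for the
# actual methods, which are defined on the pipeline class):

def group_sub_entities(self, entities: List[dict]) -> dict:
    """Merge one run of sub-tokens into a single entity dict."""
    ...

def group_entities(self, entities: List[dict]) -> List[dict]:
    """Convert raw token-level predictions into grouped entities."""
    ...
```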
