Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DuplicateFinder] Add penalty if no common gram cat group #2445

Merged
merged 6 commits into from
Aug 3, 2023

Conversation

imnasnainaec
Copy link
Collaborator

@imnasnainaec imnasnainaec commented Jul 31, 2023

Closes #604

Todo: generate test data for acceptance testing


This change is Reviewable

@imnasnainaec imnasnainaec marked this pull request as draft July 31, 2023 20:12
@codecov-commenter
Copy link

codecov-commenter commented Jul 31, 2023

Codecov Report

Patch coverage: 77.77% and project coverage change: +0.04% 🎉

Comparison is base (12b6621) 67.91% compared to head (0cf8bd6) 67.96%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2445      +/-   ##
==========================================
+ Coverage   67.91%   67.96%   +0.04%     
==========================================
  Files         240      240              
  Lines        8254     8270      +16     
  Branches      508      508              
==========================================
+ Hits         5606     5621      +15     
- Misses       2289     2290       +1     
  Partials      359      359              
Flag Coverage Δ
backend 79.66% <77.77%> (+0.05%) ⬆️
frontend 54.31% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
Backend/Helper/DuplicateFinder.cs 86.59% <77.77%> (+0.64%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@imnasnainaec imnasnainaec marked this pull request as ready for review July 31, 2023 21:16
Copy link
Contributor

@jasonleenaylor jasonleenaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @imnasnainaec)


Backend/Helper/DuplicateFinder.cs line 298 at r2 (raw file):

        /// or if the grammatical category group is unspecified in every sense of one of the words.
        /// </summary>
        public static bool MightShareGramCatGroups(Word wordA, Word wordB)

Not sure that I like Might. Sounds too wishy washy. I'd rather negate it and go with HasDifferentGramCat, or something similar.

Code quote:

MightShare

Copy link
Collaborator Author

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @jasonleenaylor)


Backend/Helper/DuplicateFinder.cs line 298 at r2 (raw file):

Previously, jasonleenaylor (Jason Naylor) wrote…

Not sure that I like Might. Sounds too wishy washy. I'd rather negate it and go with HasDifferentGramCat, or something similar.

I've changed it toHaveCommonGramCatGroup; does that work better? I was also considering negating HasDifferentGramCatGroups.

If WordA has a Noun sense and WordB has an Unspecified sense, is it better to say they have a common grammatical category group or that they don't have different grammatical category groups?

If WordA has a Noun sense and WordC has one Verb sense and one Unspecified sense, is it better to say they don't have a common grammatical category group or that they do have different grammatical category groups?

Copy link
Contributor

@jasonleenaylor jasonleenaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @imnasnainaec)


Backend/Helper/DuplicateFinder.cs line 298 at r2 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

I've changed it toHaveCommonGramCatGroup; does that work better? I was also considering negating HasDifferentGramCatGroups.

If WordA has a Noun sense and WordB has an Unspecified sense, is it better to say they have a common grammatical category group or that they don't have different grammatical category groups?

If WordA has a Noun sense and WordC has one Verb sense and one Unspecified sense, is it better to say they don't have a common grammatical category group or that they do have different grammatical category groups?

For your first example I would say it is better to say they don't have a different one.
The second example does muddy the waters.

Copy link
Collaborator Author

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jasonleenaylor)


Backend/Helper/DuplicateFinder.cs line 298 at r2 (raw file):

Previously, jasonleenaylor (Jason Naylor) wrote…

For your first example I would say it is better to say they don't have a different one.
The second example does muddy the waters.

Done.

Copy link
Contributor

@jasonleenaylor jasonleenaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @imnasnainaec)

@imnasnainaec imnasnainaec merged commit 37ad51d into master Aug 3, 2023
14 checks passed
@imnasnainaec imnasnainaec deleted the dup-finder-gram-cat branch August 3, 2023 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Include Grammatical Category (Part of Speech) in Merge Duplicates word list
3 participants