Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug where simultaneous edits could lose data #1313

Merged
merged 1 commit into from
Feb 23, 2022
Merged

Conversation

rmunn
Copy link
Collaborator

@rmunn rmunn commented Feb 22, 2022

Description

Fixes #1248.

The cause of the issue was normalization: U+0061 LATIN SMALL LETTER A followed by U+0301 COMBINING ACUTE ACCENT was in the Mongo database (for example), but what got saved into Mongo was U+00E1 LATIN SMALL LETTER A WITH ACUTE. The diff algorithm naturally compared those two as different (since it's doing a basic byte comparison on strings), and therefore a delta update was being submitted which changed á to á. Mongo was therefore writing that field's value in the database, accidentally overwriting any previous edits to that field from a different user.

The answer is to normalize the pristineEntry value (the value that came out of the database) before passing it to the diffing algorithm, so that it's comparing NFC to NFC. With this change, I can no longer reproduce the test case for #1248, which I was consistently reproducing before the change. So although there might be other situations that can also cause edits to be lost, I'm confident that the most mysterious one is fixed.

Type of Change

Only keep lines below that describe this change, then delete the rest.

  • Bug fix (non-breaking change which fixes an issue)

Testing on your branch

Wrote a Playwright test (which I've preserved in draft PR #1307) to help me reproduce the bug reliably. Once the test went from failing every time to passing every time, I knew I'd gotten the fix correct.

I haven't included the Playwright test in this PR, because the test is still quite messy. Once I get the test pared down to something neat and tidy, I'll turn #1307 into a real PR and ask for a review. But I wanted to get the bugfix merged in right away, without waiting for the E2E test to be neat and tidy, so that we can get the fix deployed to users as soon as possible.

Checklist

  • I have performed a self-review of my own code
  • I have reviewed the title/description of this PR which will be used as the squashed PR commit message
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works (tests are in PR Bugfix/inconsistent save behavior #1307)

qa.languageforge.org testing

Reviewers: add/replace your name below and check the box to sign-off/attest the feature works as expected on qa.languageforge.org

  • Robin Munn (2022-03-01 14:31)
  • Reviewer2 (YYYY-MM-DD HH:MM)

The cause of the issue was normalization: U+0061 LATIN SMALL LETTER A
followed by U+0301 COMBINING ACUTE ACCENT was in the Mongo database (for
example), but what got saved into Mongo was U+00E1 LATIN SMALL LETTER A
WITH ACUTE. The diff algorithm naturally compared those two as different
(since it's doing a basic byte comparison on strings), and therefore a
delta update was being submitted which changed á to á. Mongo was
therefore writing that field's value in the database, accidentally
overwriting any previous edits to that field from a different user.

The answer is to normalize the pristineEntry value (the value that came
out of the database) before passing it to the diffing algorithm, so that
it's comparing NFC to NFC. With this change, I can no longer reproduce
the test case for #1248, which I was consistently reproducing before the
change. So although there might be other situations that can also cause
edits to be lost, I'm confident that the most mysterious one is fixed.
@github-actions
Copy link

Unit Test Results

    1 files      1 suites   12s ⏱️
373 tests 373 ✔️ 0 💤 0

Results for commit 81aea30.

Copy link
Collaborator

@megahirt megahirt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small change, with hopefully a big fix. Nice catch @rmunn. Approved.

@rmunn rmunn merged commit 750bde8 into develop Feb 23, 2022
@rmunn rmunn deleted the bugfix/1248 branch February 23, 2022 00:40
@rmunn
Copy link
Collaborator Author

rmunn commented Mar 1, 2022

Instructions for testing:

  • Have a project with data that is in decomposed form (crucial for triggering the bug).
  • Optional but recommended: have a way to reset the project data to be in decomposed form after testing (because testing will replace the data with the composed form)
  • User A loads an entry
  • Wait five seconds
  • User B loads the same entry
  • User A edits a field that had decomposed-form data
  • User B edits a different field
  • User A waits until user B's edits show up on his screen
  • If bug was triggered: user A sees user B's edits show up, but his edited field replaced by its original version at the same time
  • If bug was NOT triggered: user A sees user B's edits show up, and his edited field keeps his edit. Also, user B sees user A's edit to that field show up

@rmunn
Copy link
Collaborator Author

rmunn commented Mar 1, 2022

So far the only project I've found with decomposed-form data is the Thai Food Dictionary, which is a WeSay project and thus not a Send/Receive project. So to load the project data, we'd need to import a LIFT file. Then the reset procedure would be simple: delete the project and recreate it from LIFT.

@rmunn
Copy link
Collaborator Author

rmunn commented Mar 1, 2022

I've been able to test this on my own, and confirm that it does fix the bug. (At least, the instance of the bug that I've found). First I deployed a version without the bugfix and confirmed that I could trigger the data-loss situation by following the testing instructions. Then I deployed the version with this bugfix, followed the same procedure, and could no longer trigger the data-loss bug.

To set up the project data, and reset the project, the easiest way I found was to have a Mongo shell open, and repeatedly paste the following line into the Mongo shell to cause the entry to be updated:

db.lexicon.update({_id: ObjectId("621c3d5084d59854df96527f")}, {'$set': {'lexeme.th.value': 'ข้าวผัดหมู', 'lexeme.fr.value': 'khâaw phàt mǔu', 'dateModified': new Date()}})

In your own testing, you'll need to substitute a different ObjectId, and you'll probably need to change the language codes, e.g. lexeme.th.value might become lexeme.en.value while lexeme.fr.value might become lexeme.en-fonipa.value. Substitute as appropriate for the project you're testing with (do db.lexicon.findOne() to find an appropriate entry to modify, and pull the field writing systems and ObjectId from that entry). The important thing is to do that update in Mongo, so that the field will contain NFD-form data. If you paste 'khâaw phàt mǔu' (which is NFD) into Language Forge, it will be automatically converted to NFC before being saved and you won't have set up the test data correctly. But if you paste that db.lexicon.update line as-is, and just change the ObjectId and possibly the writing system codes, then you'll be properly loading NFD data into the project and you'll trigger the bug (or, conversely, NOT trigger the bug once the bugfix is deployed).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: inconsistent save behavior
2 participants