Fix bug where simultaneous edits could lose data #1313

rmunn · 2022-02-22T09:21:24Z

Description

The cause of the issue was normalization: U+0061 LATIN SMALL LETTER A followed by U+0301 COMBINING ACUTE ACCENT was in the Mongo database (for example), but what got saved into Mongo was U+00E1 LATIN SMALL LETTER A WITH ACUTE. The diff algorithm naturally compared those two as different (since it's doing a basic byte comparison on strings), and therefore a delta update was being submitted which changed á to á. Mongo was therefore writing that field's value in the database, accidentally overwriting any previous edits to that field from a different user.

The answer is to normalize the pristineEntry value (the value that came out of the database) before passing it to the diffing algorithm, so that it's comparing NFC to NFC. With this change, I can no longer reproduce the test case for #1248, which I was consistently reproducing before the change. So although there might be other situations that can also cause edits to be lost, I'm confident that the most mysterious one is fixed.

Type of Change

Only keep lines below that describe this change, then delete the rest.

Bug fix (non-breaking change which fixes an issue)

Testing on your branch

Wrote a Playwright test (which I've preserved in draft PR #1307) to help me reproduce the bug reliably. Once the test went from failing every time to passing every time, I knew I'd gotten the fix correct.

I haven't included the Playwright test in this PR, because the test is still quite messy. Once I get the test pared down to something neat and tidy, I'll turn #1307 into a real PR and ask for a review. But I wanted to get the bugfix merged in right away, without waiting for the E2E test to be neat and tidy, so that we can get the fix deployed to users as soon as possible.

Checklist

I have performed a self-review of my own code
I have reviewed the title/description of this PR which will be used as the squashed PR commit message
I have commented my code, particularly in hard-to-understand areas
I have added tests that prove my fix is effective or that my feature works (tests are in PR Bugfix/inconsistent save behavior #1307)

qa.languageforge.org testing

Reviewers: add/replace your name below and check the box to sign-off/attest the feature works as expected on qa.languageforge.org

Robin Munn (2022-03-01 14:31)
Reviewer2 (YYYY-MM-DD HH:MM)

The cause of the issue was normalization: U+0061 LATIN SMALL LETTER A followed by U+0301 COMBINING ACUTE ACCENT was in the Mongo database (for example), but what got saved into Mongo was U+00E1 LATIN SMALL LETTER A WITH ACUTE. The diff algorithm naturally compared those two as different (since it's doing a basic byte comparison on strings), and therefore a delta update was being submitted which changed á to á. Mongo was therefore writing that field's value in the database, accidentally overwriting any previous edits to that field from a different user. The answer is to normalize the pristineEntry value (the value that came out of the database) before passing it to the diffing algorithm, so that it's comparing NFC to NFC. With this change, I can no longer reproduce the test case for #1248, which I was consistently reproducing before the change. So although there might be other situations that can also cause edits to be lost, I'm confident that the most mysterious one is fixed.

github-actions · 2022-02-22T09:23:36Z

Unit Test Results

    1 files     1 suites 12s ⏱️
373 tests 373 ✔️ 0 💤 0 ❌

Results for commit 81aea30.

megahirt

Small change, with hopefully a big fix. Nice catch @rmunn. Approved.

rmunn · 2022-03-01T02:33:39Z

Instructions for testing:

Have a project with data that is in decomposed form (crucial for triggering the bug).
Optional but recommended: have a way to reset the project data to be in decomposed form after testing (because testing will replace the data with the composed form)
User A loads an entry
Wait five seconds
User B loads the same entry
User A edits a field that had decomposed-form data
User B edits a different field
User A waits until user B's edits show up on his screen
If bug was triggered: user A sees user B's edits show up, but his edited field replaced by its original version at the same time
If bug was NOT triggered: user A sees user B's edits show up, and his edited field keeps his edit. Also, user B sees user A's edit to that field show up

rmunn · 2022-03-01T03:14:11Z

So far the only project I've found with decomposed-form data is the Thai Food Dictionary, which is a WeSay project and thus not a Send/Receive project. So to load the project data, we'd need to import a LIFT file. Then the reset procedure would be simple: delete the project and recreate it from LIFT.

rmunn · 2022-03-01T07:50:44Z

I've been able to test this on my own, and confirm that it does fix the bug. (At least, the instance of the bug that I've found). First I deployed a version without the bugfix and confirmed that I could trigger the data-loss situation by following the testing instructions. Then I deployed the version with this bugfix, followed the same procedure, and could no longer trigger the data-loss bug.

To set up the project data, and reset the project, the easiest way I found was to have a Mongo shell open, and repeatedly paste the following line into the Mongo shell to cause the entry to be updated:

db.lexicon.update({_id: ObjectId("621c3d5084d59854df96527f")}, {'$set': {'lexeme.th.value': 'ข้าวผัดหมู', 'lexeme.fr.value': 'khâaw phàt mǔu', 'dateModified': new Date()}})

In your own testing, you'll need to substitute a different ObjectId, and you'll probably need to change the language codes, e.g. lexeme.th.value might become lexeme.en.value while lexeme.fr.value might become lexeme.en-fonipa.value. Substitute as appropriate for the project you're testing with (do db.lexicon.findOne() to find an appropriate entry to modify, and pull the field writing systems and ObjectId from that entry). The important thing is to do that update in Mongo, so that the field will contain NFD-form data. If you paste 'khâaw phàt mǔu' (which is NFD) into Language Forge, it will be automatically converted to NFC before being saved and you won't have set up the test data correctly. But if you paste that db.lexicon.update line as-is, and just change the ObjectId and possibly the writing system codes, then you'll be properly loading NFD data into the project and you'll trigger the bug (or, conversely, NOT trigger the bug once the bugfix is deployed).

megahirt approved these changes Feb 22, 2022

View reviewed changes

rmunn merged commit 750bde8 into develop Feb 23, 2022

rmunn deleted the bugfix/1248 branch February 23, 2022 00:40

rmunn mentioned this pull request Mar 18, 2022

bug: inconsistent save behavior #1248

Closed

rmunn mentioned this pull request Jun 2, 2023

bug: custom MultiParagraph fields fail after #1313 #1755

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix bug where simultaneous edits could lose data #1313

Fix bug where simultaneous edits could lose data #1313

rmunn commented Feb 22, 2022 •

edited

Loading

github-actions bot commented Feb 22, 2022

megahirt left a comment

rmunn commented Mar 1, 2022

rmunn commented Mar 1, 2022

rmunn commented Mar 1, 2022

Fix bug where simultaneous edits could lose data #1313

Fix bug where simultaneous edits could lose data #1313

Conversation

rmunn commented Feb 22, 2022 • edited Loading

Description

Type of Change

Testing on your branch

Checklist

qa.languageforge.org testing

github-actions bot commented Feb 22, 2022

Unit Test Results

megahirt left a comment

Choose a reason for hiding this comment

rmunn commented Mar 1, 2022

rmunn commented Mar 1, 2022

rmunn commented Mar 1, 2022

rmunn commented Feb 22, 2022 •

edited

Loading