-
-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug where simultaneous edits could lose data #1313
Conversation
The cause of the issue was normalization: U+0061 LATIN SMALL LETTER A followed by U+0301 COMBINING ACUTE ACCENT was in the Mongo database (for example), but what got saved into Mongo was U+00E1 LATIN SMALL LETTER A WITH ACUTE. The diff algorithm naturally compared those two as different (since it's doing a basic byte comparison on strings), and therefore a delta update was being submitted which changed á to á. Mongo was therefore writing that field's value in the database, accidentally overwriting any previous edits to that field from a different user. The answer is to normalize the pristineEntry value (the value that came out of the database) before passing it to the diffing algorithm, so that it's comparing NFC to NFC. With this change, I can no longer reproduce the test case for #1248, which I was consistently reproducing before the change. So although there might be other situations that can also cause edits to be lost, I'm confident that the most mysterious one is fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small change, with hopefully a big fix. Nice catch @rmunn. Approved.
Instructions for testing:
|
So far the only project I've found with decomposed-form data is the Thai Food Dictionary, which is a WeSay project and thus not a Send/Receive project. So to load the project data, we'd need to import a LIFT file. Then the reset procedure would be simple: delete the project and recreate it from LIFT. |
I've been able to test this on my own, and confirm that it does fix the bug. (At least, the instance of the bug that I've found). First I deployed a version without the bugfix and confirmed that I could trigger the data-loss situation by following the testing instructions. Then I deployed the version with this bugfix, followed the same procedure, and could no longer trigger the data-loss bug. To set up the project data, and reset the project, the easiest way I found was to have a Mongo shell open, and repeatedly paste the following line into the Mongo shell to cause the entry to be updated: db.lexicon.update({_id: ObjectId("621c3d5084d59854df96527f")}, {'$set': {'lexeme.th.value': 'ข้าวผัดหมู', 'lexeme.fr.value': 'khâaw phàt mǔu', 'dateModified': new Date()}}) In your own testing, you'll need to substitute a different ObjectId, and you'll probably need to change the language codes, e.g. |
Description
Fixes #1248.
The cause of the issue was normalization: U+0061 LATIN SMALL LETTER A followed by U+0301 COMBINING ACUTE ACCENT was in the Mongo database (for example), but what got saved into Mongo was U+00E1 LATIN SMALL LETTER A WITH ACUTE. The diff algorithm naturally compared those two as different (since it's doing a basic byte comparison on strings), and therefore a delta update was being submitted which changed á to á. Mongo was therefore writing that field's value in the database, accidentally overwriting any previous edits to that field from a different user.
The answer is to normalize the
pristineEntry
value (the value that came out of the database) before passing it to the diffing algorithm, so that it's comparing NFC to NFC. With this change, I can no longer reproduce the test case for #1248, which I was consistently reproducing before the change. So although there might be other situations that can also cause edits to be lost, I'm confident that the most mysterious one is fixed.Type of Change
Only keep lines below that describe this change, then delete the rest.
Testing on your branch
Wrote a Playwright test (which I've preserved in draft PR #1307) to help me reproduce the bug reliably. Once the test went from failing every time to passing every time, I knew I'd gotten the fix correct.
I haven't included the Playwright test in this PR, because the test is still quite messy. Once I get the test pared down to something neat and tidy, I'll turn #1307 into a real PR and ask for a review. But I wanted to get the bugfix merged in right away, without waiting for the E2E test to be neat and tidy, so that we can get the fix deployed to users as soon as possible.
Checklist
qa.languageforge.org testing
Reviewers: add/replace your name below and check the box to sign-off/attest the feature works as expected on qa.languageforge.org