Node script to copy projects from staging or prod #1816

Merged
rmunn merged 27 commits into develop from feat/backup-projects-to-local-mongodb on Jun 14, 2024

Conversation

rmunn
Collaborator

@rmunn rmunn commented May 21, 2024

Fixes #1542

Description

Bash script (on Windows, you'll want to run it with gitbash) to copy projects from staging or production to your local Docker dev environment.

Usage: First, edit the script and make sure the staging_context and prod_context values match the names you've given to your Kubernetes contexts. (If unsure, run kubectl config get-contexts to see what context names you have on your system).

Then run node backup.mjs MongoID. For example, to copy https://staging.languageforge.org/app/lexicon/5dbf805650b51914727e06c4, you'd copy the Mongo ID out of that URL and run node backup.mjs 5dbf805650b51914727e06c4.

Alternately, you can just paste the URL as the command-line argument, at which point the script will automatically extract the project ID. Be careful to quote the URL, as some characters might have special meaning to the shell. (For example, on Linux, the ! character means "Find a previous command that starts with this text". If you don't put quotes around the URL, you'll get an error from Bash saying bash: !/editor/entry/5dbf806cbea602641cc27e61?sortBy=Default: event not found. And they must be single quotes, because double-quotes don't remove the special meaning of !).
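As an illustration of the "ID or URL" argument handling described above, here is a minimal sketch. The function name `extractProjectId` is hypothetical and not taken from backup.mjs, and the assumption that the URL carries the project ID right after `/app/lexicon/` (with any entry ID appearing later, in the `#!` fragment) is inferred from the example URL and the Bash error message quoted above.

```javascript
// Hypothetical helper, not the script's actual code.
// Accepts either a bare 24-character Mongo ObjectId or a project URL.
function extractProjectId(arg) {
  // A bare ObjectId is exactly 24 hexadecimal characters.
  if (/^[0-9a-f]{24}$/i.test(arg)) return arg;
  // Otherwise look for the ID that directly follows /app/lexicon/ so that
  // an entry ID later in the URL fragment is not picked up by mistake.
  const match = /\/app\/lexicon\/([0-9a-f]{24})(?![0-9a-f])/i.exec(arg);
  if (match) return match[1];
  throw new Error(`Could not find a project ID in: ${arg}`);
}

console.log(extractProjectId("5dbf805650b51914727e06c4"));
console.log(
  extractProjectId(
    "https://staging.languageforge.org/app/lexicon/5dbf805650b51914727e06c4#!/editor/entry/5dbf806cbea602641cc27e61?sortBy=Default"
  )
);
```

Both calls print the same project ID. On the command line the URL still has to be single-quoted, since the shell sees it before Node does.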

Dependencies

You need kubectl and docker installed. Also, the script first tries to copy assets using kubectl cp, but if that fails, it falls back to rsync. On Windows, you might need to install a Windows build of rsync from the msys project.
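The "try kubectl cp first, then fall back to rsync" behaviour can be sketched as a small helper. This is only an illustration under stated assumptions: `copyWithFallback` and the `{ name, run }` strategy objects are hypothetical names, and the real script may structure this differently.

```javascript
// Hypothetical sketch: try each copy strategy in order and stop at the
// first one that succeeds. Each strategy is { name, run }, where run()
// throws on failure (as a synchronous child-process call would).
function copyWithFallback(strategies) {
  const failures = [];
  for (const { name, run } of strategies) {
    try {
      run();
      return name; // report which strategy worked
    } catch (err) {
      failures.push(`${name}: ${err.message}`);
      console.warn(`${name} failed: ${err.message}`);
    }
  }
  throw new Error(`All copy methods failed:\n${failures.join("\n")}`);
}

// Usage with stand-in functions; a real script would call kubectl and rsync here.
const used = copyWithFallback([
  { name: "kubectl cp", run: () => { throw new Error("unexpected EOF"); } },
  { name: "rsync", run: () => {} },
]);
console.log(`Assets copied with ${used}`);
```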

Checklist

  • I have labeled my PR with: bug, feature, engineering, security fix or testing
  • I have performed a self-review of my own code
  • I have reviewed the title & description of this PR which I will use as the squashed PR commit message
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have enabled auto-merge (optional)

Testing

Testers, use the following instructions against our staging environment. Post your findings as a comment and include any meaningful screenshots, etc.

Describe how to verify your changes and provide any necessary test data.

  • Go to staging.languageforge.org and pick a project you'd like to copy to your local LF instance
  • Copy its project ID
  • Run node backup.mjs copiedProjectID
  • Load localhost:8080 and make sure the project you just copied is available
  • If backup.mjs did not print any tar errors, then you may have all the project's sound and image assets
  • If it did print tar errors, you may only have some but not all of the assets

Often fails to copy assets because `kubectl cp` or `kubectl exec tar`
get cut off partway through, but that should go away once Kubernetes
version 1.30 is released.
@rmunn rmunn added the engineering Tasks which do not directly relate to a user-facing feature or fix label May 21, 2024
@rmunn rmunn self-assigned this May 21, 2024
@rmunn rmunn requested a review from megahirt May 21, 2024 09:01

github-actions bot commented May 21, 2024

Unit Test Results

362 tests   362 ✅  13s ⏱️
 37 suites    0 💤
  1 files      0 ❌

Results for commit 1e3ced2.

♻️ This comment has been updated with latest results.

@rmunn
Collaborator Author

rmunn commented May 21, 2024

In our discussion from #1542 (comment) we talked about modifying all the userRef values from the lexicon (entries, comments, and so on). The script does not do that yet. So far I haven't encountered errors due to userRef values pointing to non-existent users, but I haven't tested that extensively. That might end up being unnecessary, but more testing is needed to prove that.

@hahn-kev
Collaborator

I'm wondering if it would make more sense to write a script like this in JS, which has a pretty high chance of running cross-platform. As it is, Chris has stopped working on Mac, so the majority of the team will be running this on Windows.

As it is, bash scripts are difficult to maintain, since most of us don't write bash unless we need to. Additionally, with JS we could just use kubectl to open a port to the db and use a Mongo connection directly, which would make it much simpler to write than a line like this: admin_id=$(docker exec lf-db mongosh -u admin -p pass --authenticationDatabase admin scriptureforge --eval "db.users.findOne({username: 'admin'}, {_id: 1})" | cut -d"'" -f 2)
Thoughts?

@megahirt
Collaborator

@hahn-kev you bring up some good points. I didn't think of writing this in NodeJS. I had been planning on writing it in PHP and making something that ran server side. Since I hadn't done it yet I asked Robin to. We both agreed that avoiding PHP would be simpler, since the approach was primarily shelling out to mongo commands. Once we realized we could run the entire thing remotely and not involve the server, it seemed natural to write it in bash. I plan to ensure it runs on Windows.

If we could port it to JS for free now that would be cool, but as long as it works as advertised, I am fine leaving this as bash. If we have issues with cross platform or maintainability friction because not everyone writes bash, then I'd go with a three-strikes-and-we-port-it philosophy.

@megahirt
Collaborator

@rmunn I really wanted to test this, but my Windows machine doesn't have kube contexts or a WireGuard tunnel set up :( I need my existing tunnel and contexts, which are on my Mac at home. So maybe tomorrow I will test on Windows.

rmunn added 6 commits May 22, 2024 12:18
This should ensure that the project assets eventually get copied over to
the local Docker setup even under conditions where `kubectl exec` is
flaky and fails every couple of minutes.
Now that this is working, I can get rid of the `ls -lR` step (which is
effectively redundant anyway as `docker cp` is chatty about what files
it's copying), and enable the final cleanup of the temporary directory.
The docker cp command was preserving the UID/GID of the copied files
even though I didn't pass it the `-a` parameter (whose purpose is to
preserve the UID/GID of the copied files). To work around this issue, we
set the file ownership to 33/33 before copying the files into Docker.
The previous solution was too Linux-y; this one doesn't rely on `sudo`
or `id` working in a Git Bash environment on Windows.
@rmunn
Collaborator Author

rmunn commented May 22, 2024

I'm willing to rewrite in JS if necessary, though I'd like to see how it performs in a Git Bash environment first, to avoid unnecessary work if it does turn out to be unnecessary.

@megahirt
Collaborator

I'm willing to rewrite in JS if necessary, though I'd like to see how it performs in a Git Bash environment first, to avoid unnecessary work if it does turn out to be unnecessary.

I asked chatgpt to port it over to Node Typescript and I thought it did a good first pass: https://chat.team-gpt.com/lt-lexical-tools/664585252bf6048a1b9a3f67

Not done yet, so don't try to run this yet.
@rmunn
Collaborator Author

rmunn commented May 23, 2024

It seems kubectl exec is just not reliable, so the mongodump/mongorestore approach to copying the project database is not going to work. I just added a loop to the Node.js script to keep running mongodump/mongorestore until it succeeds, and my most recent run has now been going for over two hours without mongodump succeeding even once. And that's on a project with just 30 entries, which takes just a few seconds to mongodump when it succeeds.

Edit: Switching to a different Internet connection made no difference. kubectl exec connections still got torn down so fast I couldn't rely on them.

I'm going to rewrite the mongodump/mongorestore step to use a MongoClient connection instead. Since I can't use the .copyDatabase() feature that Mongo removed in version 4.2, I'll list the collections on the remote database, grab all the data from one collection at a time, and on the local database, drop the existing collection before doing an insertMany or bulkWrite operation to load the data.
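The collection-by-collection copy described above could look roughly like the sketch below. It assumes `remoteDb` and `localDb` behave like `Db` objects from the official `mongodb` Node.js driver (for example, the result of `client.db(name)`); the function name `copyCollections` is hypothetical, and skipping `system.*` collections and ignoring drop errors are choices of this sketch rather than something stated in the PR.

```javascript
// Hypothetical sketch of copying one database into another, one collection
// at a time. Both arguments are assumed to be mongodb-driver Db objects.
async function copyCollections(remoteDb, localDb) {
  const collections = await remoteDb.listCollections().toArray();
  for (const { name } of collections) {
    if (name.startsWith("system.")) continue; // sketch choice: skip internal collections
    // Loads the whole collection into memory, which is fine for small projects.
    const docs = await remoteDb.collection(name).find().toArray();
    // Drop the stale local copy first. Some server versions raise an error
    // when the collection does not exist yet, so that error is ignored here.
    await localDb.collection(name).drop().catch(() => {});
    // insertMany rejects an empty array, so only insert when there is data.
    if (docs.length > 0) {
      await localDb.collection(name).insertMany(docs);
    }
    console.log(`Copied ${docs.length} documents from ${name}`);
  }
}
```

Reading everything with `toArray()` keeps the sketch short; for large collections, iterating the cursor and inserting in batches would use less memory.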

rmunn added 4 commits May 23, 2024 16:13
Was used during development as a way to test remote connection, no
longer needed.
Defaults to qa/staging for obvious reasons
Our staging server has a port defined on the db service, but our
production server does not. Switching to port forward to `deploy/db`,
which will automatically select the Mongo pod (which *does* have a port
open to forward to).
This will save a bit of time when kubectl cp is being reliable

Also force languageforge namespace just in case
@rmunn
Collaborator Author

rmunn commented May 23, 2024

NOTE: If you get an error like tar: ./audio: File removed before we read it, then it means you copied a project where one of the two directories (audio and pictures) was a broken symlink. This isn't likely to happen often, so I'm not adding mitigation to the script. The answer is simply to go into deploy/app, delete the broken symlink, and replace it with an empty directory.

If this happens a lot, I'll open a separate issue to track that bugfix.

UPDATE: Nope, this is happening a lot; it's apparently quite common. I'll include a mitigation in this PR rather than a separate issue.

@rmunn rmunn changed the title Bash script to copy projects from staging or prod Node script to copy projects from staging or prod May 23, 2024
Before including pictures and audio in the tarball, make sure they're
really there, and skip them if they are a broken symlink.
Collaborator

@hahn-kev hahn-kev left a comment

nice work, this will be nice to have for debugging issues. Let me know when you would like us to test it out.

Collaborator

@myieye myieye left a comment

This is like crazy white magic to me 🪄 🤓 👏

It seems to work 🚀 , but not flawlessly just yet. I'm getting:

  throw new Error(`Unexpected result from readlink ${name}: ${result}`);
        ^

Error: Unexpected result from readlink /var/www/html/assets/lexicon/sf_test-chris-01/audio: -n no'

Might have something to do with windows.
Changing the occurrences of echo -n to use printf (e.g. printf "no") seems to fix it for me.

@rmunn
Collaborator Author

rmunn commented Jun 10, 2024

Might have something to do with windows. Changing the occurrences of echo -n to use printf (e.g. printf "no") seems to fix it for me.

Grrr. Windows, what are you doing? That echo was clearly inside quotes, it should not have been processed by your shell! It was supposed to be part of the sh -c input.

It's those little subtle differences that get you when trying to write cross-platform scripts. Another thing I could do here is give up on the -n option and instead strip newlines from the command output before comparing it to "yes" or "no". (The -n option to echo means "no newline"; normally echo will automatically add a newline after the text you give it). I think I'll do that, as -n might be a little too much magic for Windows.

It's possible that Windows is doing something strange here that's
causing the `echo` to be handled by the Windows shell instead of as part
of the kubectl input passed to `sh`. Switching to plain echo and then
stripping newlines from the result should produce the same result
without any cross-platform hiccups.
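A minimal sketch of the "plain echo, then strip newlines" approach follows. The function name `parseYesNo` is hypothetical; the error text is modelled on the message shown in the stack trace earlier in this conversation.

```javascript
// Hypothetical helper: interpret the output of a remote
// `... && echo yes || echo no` command without relying on `echo -n`.
function parseYesNo(output) {
  // Remove CR/LF so the result is the same whether the remote side
  // printed "yes\n", "yes\r\n", or just "yes".
  const answer = output.replace(/[\r\n]+/g, "").trim();
  if (answer === "yes") return true;
  if (answer === "no") return false;
  throw new Error(`Unexpected result: ${JSON.stringify(output)}`);
}

console.log(parseYesNo("yes\n"));  // true
console.log(parseYesNo("no\r\n")); // false
```

Output such as `-n no` still raises an error, so a mis-handled `echo -n` would be reported rather than silently treated as "no".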
@rmunn rmunn requested a review from myieye June 10, 2024 08:42
@hahn-kev
Collaborator

hahn-kev commented Jun 10, 2024

I want to say Windows typically uses double quotes (") instead of single to do something like that.

I'll try to take a look at this today and get it working on Windows. Sadly, LF is not working locally for me right now, so I can't try this out.

@rmunn
Collaborator Author

rmunn commented Jun 11, 2024

I want to say Windows typically uses double quotes (") instead of single to do something like that.

That makes things difficult, because Linux assigns different meaning to single quotes vs double quotes; for example, single quotes don't do $variable expansion. There are other differences too, so I use single-quotes routinely when doing sh -c 'some long command' because that works far more often. Plus it allows me to put double-quotes inside the command.

I might be able to rewrite the exec calls to use the version of exec where you pass each parameter as a separate string and let Node take care of appropriate quoting on each OS.
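To illustrate the "pass each parameter as a separate string" idea, here is a sketch that builds the argument array for the readlink check. The function name `readlinkCheckArgs` is hypothetical, and passing the path as a positional parameter (`$1`) to `sh -c` is a technique swapped in for this sketch, not something the PR says the script does.

```javascript
// Hypothetical sketch: build kubectl arguments as an array so that no local
// shell (Bash or cmd.exe) ever has to parse quotes.
function readlinkCheckArgs(context, pod, path) {
  return [
    `--context=${context}`,
    "--namespace=languageforge",
    "exec", "-c", "app", pod, "--",
    "sh", "-c",
    // Script run by the remote sh; "$1" is expanded there, not locally.
    'readlink -eq "$1" >/dev/null && echo yes || echo no',
    "sh",  // becomes $0 inside the sh -c script
    path,  // becomes $1, so spaces or quotes in the path need no escaping
  ];
}

const args = readlinkCheckArgs("my-context", "deploy/app", "/var/www/html/assets/lexicon/my project/audio");
console.log(args);
```

The array would then be handed to something like `execFileSync("kubectl", args, { encoding: "utf8" })` from `node:child_process`, which starts kubectl directly instead of going through a shell.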

@megahirt
Collaborator

That makes things difficult, because Linux assigns different meaning to single quotes vs double quotes;

Yup, that's true. Are you saying you cannot make it work with double quotes?

`kubectl --context=${context} --namespace=languageforge exec -c app deploy/app -- sh -c "readlink -eq ${name} >/dev/null && echo yes || echo no"`,

Does it work on Linux like this?

Windows has issues with single-quotes for quoting command-line params,
but thankfully Linux handles double-quotes correctly in all the places I
used single-quotes, so we'll just switch to double-quotes everywhere.
@rmunn
Collaborator Author

rmunn commented Jun 11, 2024

`kubectl --context=${context} --namespace=languageforge exec -c app deploy/app -- sh -c "readlink -eq ${name} >/dev/null && echo yes || echo no"`,

Does it work on Linux like this?

Just pushed a commit making it work with double quotes, which I started even before your comment. :-) I would have pushed it an hour ago, but DockerHub was giving me just a trickle of bandwidth so it took nearly an hour to download the Docker images and run local LF to test it.

@rmunn
Collaborator Author

rmunn commented Jun 11, 2024

Just found a bug: if the lexicon collection exists but is empty, you get MongoInvalidArgumentError: Invalid BulkOperation, Batch cannot be empty.

I'll fix it, but if it takes too long then I won't spend too much time on it — because a project with no lexical entries at all is not one that we're likely to need to copy to local LF in order to troubleshoot. :-)

Mongo doesn't like it when you call `.insertMany` and pass it an empty
list. You'd think they would handle that case gracefully, but they don't
and Mongo throws an error "Invalid BulkOperation, Batch cannot be empty".
So we will skip calling `.insertMany` if there are no records to insert.
Collaborator

@myieye myieye left a comment

Unfortunately, I'm pretty stuck.
Copying the assets tar to my Windows machine has multiple problems.

  1. I had to normalize the path (actually, I'm not sure if that was necessary, because it didn't help)
  2. It seems that kubectl cp interprets the drive letter in a Windows path as a pod name (after the bug has already been fixed twice, apparently 😉). See kubernetes/kubernetes#101985 (comment), "kubectl cp command fails when passing full Windows path such as c:\Temp\foobar.txt".
  3. I can't use a relative path, because my code is on D and my temp folder on C
  4. When I copied the code to C to try it out I got:
error: unexpected EOF
kubectl cp failed Error: Command failed: kubectl --context="aws-rke" --namespace=languageforge cp app-7659cf6f57-n9cmx:/tmp/assets-sf_test-chris-02_lf.tar ..\..\AppData\Local\Temp\lfbackup-i56u10\assets-sf_test-chris-02_lf.tar 
error: unexpected EOF
. Will try to continue with rsync...
Ensuring rsync exists in target container...
Error from server (Forbidden): deployments.apps "app" is forbidden: User "u-j7l8z" cannot get resource "deployments" in API group "apps" in the namespace "default"
Cleaning up temporary directory C:\Users\tim\AppData\Local\Temp\lfbackup-i56u10...
node:internal/errors:932
  const err = new Error(message);
              ^

Error: Command failed: kubectl exec --context="aws-rke" -c app deploy/app -- bash -c "which rsync || (apt update && apt install rsync -y)"
Error from server (Forbidden): deployments.apps "app" is forbidden: User "u-j7l8z" cannot get resource "deployments" in API group "apps" in the namespace "default"

I have no idea what the error: unexpected EOF is about 🙁.

I don't really want to set up k8s in WSL just to test this 😕.

If it's working now in Linux, then maybe you should just merge it. 🤷

@rmunn
Collaborator Author

rmunn commented Jun 12, 2024

The "unexpected EOF" error is because kubectl cp is unreliable, and will continue to be unreliable until the server is running Kubernetes 1.30 or later and you have a kubectl with version 1.30 or later. That's precisely why I designed the script to fall back to rsync.

The bit about deploy/app not being available to your user account is fixable: I already have the pod name from the earlier kubectl cp step, so I should use the pod name instead of deploy/app. So I'll fix that first.

If it's working now in Linux, then maybe you should just merge it. 🤷

Okay, I'll dismiss your "changes requested" review from earlier so that GitHub will allow me to merge this.

@rmunn
Collaborator Author

rmunn commented Jun 12, 2024

You might be able to work around the "drive letter interpreted as a pod name" error by using the obscure \\localhost\c$\my_dir format for paths. https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#unc-paths
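The path rewrite suggested above can be sketched as a small string transformation. The function name `toLocalhostUncPath` is hypothetical, and lowercasing the drive letter is only done to match the `c$` form used in the comment; this is an illustration, not the code that was eventually committed.

```javascript
// Hypothetical helper: turn C:\some\dir into \\localhost\c$\some\dir so that
// kubectl cp does not mistake the "C:" prefix for a pod name.
function toLocalhostUncPath(windowsPath) {
  const match = /^([A-Za-z]):[\\/](.*)$/.exec(windowsPath);
  // Relative paths and paths that are already UNC are returned unchanged.
  if (!match) return windowsPath;
  const [, drive, rest] = match;
  return "\\\\localhost\\" + drive.toLowerCase() + "$\\" + rest.replace(/\//g, "\\");
}

console.log(toLocalhostUncPath("C:\\Users\\tim\\AppData\\Local\\Temp\\lfbackup-i56u10"));
```

This relies on the drive's administrative share being reachable through `\\localhost`, which may not hold on every machine.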

@rmunn
Collaborator Author

rmunn commented Jun 12, 2024

Feature request from meeting: we want to auto-cleanup the tar file from the server on script exit, so we don't leave a bunch of asset tarballs lying around until the next container restart.
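One way to get this kind of cleanup is a try/finally wrapper around the main work. The sketch below assumes synchronous steps (as with synchronous child-process calls); `withCleanup` is a hypothetical name, and the actual script may handle cleanup differently.

```javascript
// Hypothetical sketch: run `work`, then always run `cleanup`, whether or not
// `work` threw. A failing cleanup is reported but does not mask the result.
function withCleanup(work, cleanup) {
  try {
    return work();
  } finally {
    try {
      cleanup();
    } catch (err) {
      console.warn(`Cleanup failed: ${err.message}`);
    }
  }
}

// Usage with stand-ins; a real script would copy the tarball in `work`
// and remove it from the server in `cleanup`.
withCleanup(
  () => console.log("copying assets tarball..."),
  () => console.log("removing tarball from the server...")
);
```

A `finally` block does not run if the process is killed by a signal such as Ctrl+C, so covering that case would need a separate signal handler.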

Also use pod name instead of deploy/app since not every user account has
access to deploy objects, at least on production
@rmunn
Collaborator Author

rmunn commented Jun 12, 2024

@myieye -

As we discussed, leaving this bit of the work for you. Commit 45d8294 adds a comment in the place where you'd want to make that substitution.

You might be able to work around the "drive letter interpreted as a pod name" error by using the obscure \\localhost\c$\my_dir format for paths. https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#unc-paths

@rmunn rmunn requested a review from myieye June 12, 2024 09:17
Collaborator

@myieye myieye left a comment

I made the path change you suggested (1e3ced2) and it worked! 🥳
image

@rmunn rmunn merged commit 97e3f37 into develop Jun 14, 2024
17 checks passed
@rmunn rmunn deleted the feat/backup-projects-to-local-mongodb branch June 14, 2024 07:04
Labels
engineering Tasks which do not directly relate to a user-facing feature or fix
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

feat: Backup/Restore project as zip file
4 participants