Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: baseline/ignore migrations #348

Closed
depombo opened this issue Dec 17, 2022 · 2 comments
Closed

Feature request: baseline/ignore migrations #348

depombo opened this issue Dec 17, 2022 · 2 comments

Comments

@depombo
Copy link

depombo commented Dec 17, 2022

We perform data migrations in IaSQL which is built on PostgreSQL and we have the need to ignore certain migrations as databases sometimes are recreated and the migration table gets blown away. This can be done either with an explicit blacklist, but ideally by defining a baseline based on a timestamp that ignores migrations before the provided date similar to Flyway. The interface should be as simple as an additional, optional configuration value called BASELINE_TIMESTAMP.

This is a high priority for us and we are considering building our own migration system as we have not found one that fits all of our needs. If guidance can be provided someone from our team can contribute as soon as next week :)

@amacneil
Copy link
Owner

Can you explain the workflow where you would need this?

Generally, when restoring schema from a backup, you should insert rows into the schema_migrations table for any migrations that are already applied. Dbmate will insert these automatically for you when generating the schema.sql file.

I have never heard of IaSQL before, but I read your linked issue:

Data migrations are needed for IaSQL, more so than the traditional schema migrations, and require baselining migrations to avoid duplicate record violations when IaSQL databases are recreated from scratch (e.g. connected and disconnected to the same cloud account)...

I guess this is similar to terraform import? What stops you from populating the schema_migrations table with any migrations you already consider applied, when connecting to an existing cloud account. That doesn't strike me as any more difficult than keeping track of a "baseline date".

...or when changes happen outside of IaSQL and are “synced” from the cloud by the cron job.

That seems to defeat the point of using a migration system.

@depombo
Copy link
Author

depombo commented Dec 19, 2022

Can you explain the workflow where you would need this?

IaSQL maintains a two-way connection between a PostgreSQL database and the cloud. When connecting a database to your cloud account, IaSQL treats the cloud as the source of truth and backfills the database with what is in the cloud account. It is common to disconnect a database from your cloud account and delete the database since it is so easy to create/connect a brand new database to your cloud account again. This mechanism is used by IaSQL internally when you upgrade to the latest schema version of tables in a simple fashion. However, after a database-to-cloud-account disconnect and subsequent connect as used during upgrade/debugging/etc running migrate in the migration system over the data migrations would produce duplicate key errors so for it to work once again you need to do one of the following:

  • set up a migration system anew / delete all the data migrations files
  • baseline migrations to today in the migration system
  • manually populate the schema_migrations table for every single data migration defined

I guess this is similar to terraform import? What stops you from populating the schema_migrations table with any migrations you already consider applied, when connecting to an existing cloud account. That doesn't strike me as any more difficult than keeping track of a "baseline date".

Nothing stops you from populating the schema_migrations table manually but depending on the number of required columns it might be unnecessarily tedious especially if the workflow is being run in CI/CD. I haven't looked at the schema_migrations table for dbmate, but certain migrations systems take care to store the md5 / file signature to make sure that migrations don't get modified once they have been run. Setting a "baseline date" which can be as simple as today's date is much simpler in this particular workflow than understanding the internals of the schema_migrations table and doing as many INSERTs as you have data migrations which could be a high number. This is definitely an atypical workflow since most migration systems are used for schema migrations in the first place and some migration systems are very opinionated about only running schema migrations. Baselining is a pretty common use case for data migrations even if data migrations themselves are rare. dbmate strikes me as less opinionated than other migration systems which is what brought me here. Would you want to support a "baseline date" feature?

Repository owner locked and limited conversation to collaborators Dec 19, 2022
@amacneil amacneil converted this issue into discussion #349 Dec 19, 2022

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants