Migrations (#143)
* Add file extension field, ensure schema type transformation and load works, change size to bigint (bug) and create view tables

* init alembic

* dont hardcode db for alembic

* Re-enable some of the DB unit tests

* WIP

* Remove test that will be re-added in a different branch

* update for pr comments, update tests

* add to documentation, tests

* add to cicd

* Update README.md

* Update README.md

* Update __main__.py
MDunitz committed May 23, 2019
1 parent b6f5bac commit 546a570
Showing 23 changed files with 685 additions and 161 deletions.
1 change: 1 addition & 0 deletions .gitlab-ci.yml
@@ -45,6 +45,7 @@ deploy:
- staging
script:
- make deploy
- make apply-migrations

integration_test:
stage: integration_test
2 changes: 1 addition & 1 deletion .travis.yml
@@ -30,7 +30,7 @@ install:

before_script:
- source environment
- make build-chalice-config init-db load-test-data
- make build-chalice-config init-db apply-migrations load-test-data

script:
- make test
17 changes: 15 additions & 2 deletions Makefile
@@ -89,19 +89,23 @@ lint:
source environment
unset TF_CLI_ARGS_init; cd tests/terraform; terraform init; terraform validate

test: lint docs unit-test
test: lint docs unit-test migration-test

unit-test:
unit-test: load-test-data
coverage run --source $(APP_NAME) -m unittest discover --start-directory tests/unit --top-level-directory . --verbose

integration-test:
python -m unittest discover --start-directory tests/integration --top-level-directory . --verbose

migration-test:
python -m unittest discover --start-directory tests/migration --top-level-directory . --verbose

fetch:
scripts/fetch.py

init-db:
python -m $(APP_NAME).db init
$(MAKE) apply-migrations

drop-db:
python -m $(APP_NAME).db drop
@@ -143,5 +147,14 @@ requirements-dev.txt : requirements.txt.in
docs:
$(MAKE) -C docs html

# create a migration file for changes made to db table definitions inheriting from the SQLAlchemyBase in dcpquery/db
create-migration:
alembic revision --autogenerate

# apply all migration files to the database
apply-migrations:
alembic upgrade head

.PHONY: deploy init-secrets install-webhooks install-secrets build-chalice-config package init-tf init-db destroy
.PHONY: clean lint test fetch init-db load load-test-data update-lambda get-logs refresh-all-requirements docs
.PHONY: apply-migrations create-migration migration-test
32 changes: 32 additions & 0 deletions README.md
@@ -205,3 +205,35 @@ Contributions are welcome; please read [CONTRIBUTING.md](CONTRIBUTING.md).
[![Test Coverage](https://codecov.io/gh/HumanCellAtlas/query-service/branch/master/graph/badge.svg)](https://codecov.io/gh/HumanCellAtlas/query-service)
[![Production Health Check](https://status.data.humancellatlas.org/service/query-service-prod.svg)]()
[![Master Build Status](https://status.dev.data.humancellatlas.org/build/HumanCellAtlas/metrics/master.svg)](https://allspark.dev.data.humancellatlas.org/HumanCellAtlas/query-service/commits/master)


## Migrations
### When to Create a Migration
- Any time you change the database schema (adding a table, changing a field name, creating or updating an enum, etc.)

### Creating Migration Files
- Autogenerate a migration file based on changes made to the ORM. On the command line run
`make create-migration`
- This will create a migration in `dcpquery/alembic/versions`. Review the generated migration code to ensure it represents the changes you wish to make to the database. Potential issues with migration autogeneration are [listed below](#autogenerate-cant-detect).
- If you get the following error, you need to apply the migrations you've already created to the database (or delete them) before you can create a new migration:
```
ERROR [alembic.util.messaging] Target database is not up to date.
FAILED: Target database is not up to date.
```
- Note that this creates a migration even if you have not changed the database schema. In that case the migration file is empty and you should delete it.

- To create a blank migration file run
`alembic revision -m "description of changes"`
- The description is appended to the migration file's name, so keep it short (under 40 characters). Spaces are replaced with underscores.
- You can then edit the newly created migration file (in `dcpquery/alembic/versions`)
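
The naming rule above can be approximated in a few lines. This is a sketch rather than Alembic's own code: the helper name `migration_filename` is hypothetical, the lowercasing is an assumption about Alembic's behavior, and the 40-character cap mirrors the `truncate_slug_length = 40` setting in this commit's `alembic.ini`. Alembic's real truncation may cut at a word boundary, so treat the result as illustrative.

```python
import re


def migration_filename(rev_id: str, message: str, max_slug_len: int = 40) -> str:
    """Approximate the file name derived from a revision message (hypothetical helper)."""
    # Keep runs of word characters, join them with underscores, lowercase the result.
    slug = "_".join(re.findall(r"\w+", message)).lower()
    # Simplification: hard cut at the slug length limit.
    slug = slug[:max_slug_len]
    return f"{rev_id}_{slug}.py"


print(migration_filename("000000000000", "init db"))
```

A long description is simply cut off under this sketch, which is why a short message keeps the file name readable.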

### Applying new migrations to the database
- Ensure you are connected to the correct database (run `python -m dcpquery.db connect` to see the database URL)
- From the command line run `make apply-migrations`
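- To roll back, run `alembic downgrade migration_id`. This reverts the database to the state it had at that migration, undoing every migration applied after it; `alembic downgrade -1` undoes only the most recent one. The migration_id is the string before the underscore in the migration file name: for `000000000000_init_db.py` it is `000000000000`.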
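
As a small illustration of the file-name convention described above, the sketch below pulls the migration id out of a path. The helper name `migration_id` is hypothetical; Alembic itself reads the `revision` variable inside each migration file, so this only reflects the naming convention, not how Alembic resolves revisions.

```python
from pathlib import Path


def migration_id(filename: str) -> str:
    """Return the part of a migration file name before the first underscore."""
    stem = Path(filename).stem  # drop the directory and the ".py" suffix
    return stem.split("_", 1)[0]


print(migration_id("dcpquery/alembic/versions/000000000000_init_db.py"))
```
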

### Autogenerate can't detect
- Changes of table name. These will come out as an add/drop of two different tables, and should be hand-edited into a name change instead.
- Changes of column name. Like table name changes, these are detected as a column add/drop pair, which is not at all the same as a name change.
- Anonymously named constraints. Give your constraints a name, e.g. UniqueConstraint('col1', 'col2', name="my_name").
- Special SQLAlchemy types such as Enum when generated on a backend which doesn’t support ENUM directly. This is because the representation of such a type in the non-supporting database, i.e. a CHAR+CHECK constraint, could be any kind of CHAR+CHECK. For SQLAlchemy to determine that this is actually an ENUM would only be a guess, something that’s generally a bad idea.
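
To see why a rename shows up as an add/drop pair, consider a comparison that only looks at names. The sketch below is not Alembic's comparison code; `diff_columns` and the column names are hypothetical and exist only to illustrate the reasoning.

```python
def diff_columns(old_columns, new_columns):
    """Compare two collections of column names by name only."""
    old, new = set(old_columns), set(new_columns)
    added = sorted(new - old)      # names present only in the new definition
    removed = sorted(old - new)    # names present only in the old definition
    return added, removed


# Hypothetical table: "size" was renamed to "size_bytes" in the ORM.
added, removed = diff_columns(
    ["uuid", "name", "size"],
    ["uuid", "name", "size_bytes"],
)
print(added, removed)
```

Nothing in the result links `size` to `size_bytes`, so a name-based diff can only report one column added and one dropped. In the generated migration you would replace that pair by hand with a rename operation such as `op.alter_column(..., new_column_name=...)` for columns or `op.rename_table(...)` for tables.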

45 changes: 45 additions & 0 deletions alembic.ini
@@ -0,0 +1,45 @@
# A generic, single database configuration for alembic migrations.

[alembic]
# path to migration scripts
script_location = dcpquery/alembic

# max length of characters to apply to the
# "slug" field
truncate_slug_length = 40

# Logging configuration
[loggers]
keys = root,sqlalchemy,alembic

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARN
handlers = console
qualname =

[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine

[logger_alembic]
level = INFO
handlers =
qualname = alembic

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S
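
For readers who want to inspect these settings from Python, here is a minimal sketch using the standard library. The inline `INI_TEXT` is an abbreviated copy of the `[alembic]` section above, not the full file, and reading the file this way is an illustration only; Alembic loads `alembic.ini` through its own configuration object.

```python
import configparser

# Abbreviated copy of the [alembic] section shown above.
INI_TEXT = """
[alembic]
script_location = dcpquery/alembic
truncate_slug_length = 40
"""

# interpolation=None keeps "%(...)s" logging placeholders from being expanded
# if the full file, including [formatter_generic], is read the same way.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(INI_TEXT)

script_location = parser.get("alembic", "script_location")
slug_length = parser.getint("alembic", "truncate_slug_length")
print(script_location, slug_length)
```
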

33 changes: 18 additions & 15 deletions dcpquery/_config.py
@@ -64,23 +64,26 @@ def db(self):
connect_opts = " -c statement_timeout={}s".format(self.db_statement_timeout_seconds)
self._db_engine_params["connect_args"]["options"] += connect_opts
self._db_engine_params["echo"] = self.echo
if self.local_mode:
db_user = getpass.getuser()
db_password = ""
db_host = ""
self._db = sqlalchemy.create_engine(self.db_url, **self._db_engine_params)
return self._db

@property
def db_url(self):
if self.local_mode:
db_user = getpass.getuser()
db_password = ""
db_host = ""
else:
db_user = AwsSecret(f"{self.app_name}/{os.environ['STAGE']}/postgresql/username").value.strip()
db_password = AwsSecret(f"{self.app_name}/{os.environ['STAGE']}/postgresql/password").value.strip()
if self._readonly_db:
db_host_secret_name = f"{self.app_name}/{os.environ['STAGE']}/postgresql/readonly_hostname"
else:
db_user = AwsSecret(f"{self.app_name}/{os.environ['STAGE']}/postgresql/username").value.strip()
db_password = AwsSecret(f"{self.app_name}/{os.environ['STAGE']}/postgresql/password").value.strip()
if self._readonly_db:
db_host_secret_name = f"{self.app_name}/{os.environ['STAGE']}/postgresql/readonly_hostname"
else:
db_host_secret_name = f"{self.app_name}/{os.environ['STAGE']}/postgresql/hostname"
db_host = AwsSecret(db_host_secret_name).value.strip()
db_name = self.app_name
self._db = sqlalchemy.create_engine(f"postgresql+psycopg2://{db_user}:{db_password}@{db_host}/{db_name}",
**self._db_engine_params)
db_host_secret_name = f"{self.app_name}/{os.environ['STAGE']}/postgresql/hostname"
db_host = AwsSecret(db_host_secret_name).value.strip()
db_name = self.app_name

return self._db
return f"postgresql+psycopg2://{db_user}:{db_password}@{db_host}/{db_name}"

@property
def db_session(self):
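
The new `db_url` property boils down to assembling a connection string from four parts. The sketch below shows that assembly in isolation, under assumptions: `build_db_url` and the sample values are hypothetical, and the percent-encoding of credentials is an addition that is not part of this commit. It is included because a password containing characters such as `@` or `/` would otherwise be ambiguous in a URL.

```python
from urllib.parse import quote_plus


def build_db_url(user: str, password: str, host: str, name: str) -> str:
    """Assemble a postgresql+psycopg2 URL from its parts (hypothetical helper)."""
    # Percent-encode credentials so "@" or "/" are not read as URL delimiters.
    return f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}@{host}/{name}"


# Local mode in the diff: current OS user, empty password and empty host.
local_url = build_db_url("alice", "", "", "dcpquery")
# Hypothetical deployed values; the real ones come from AwsSecret lookups.
deployed_url = build_db_url("query", "p@ss/word", "db.example.org", "dcpquery")
print(local_url)
print(deployed_url)
```

Exposing the URL as its own property is what lets `dcpquery/alembic/env.py` reuse the same connection settings instead of hardcoding them in `alembic.ini`.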
11 changes: 11 additions & 0 deletions dcpquery/alembic/README.md
@@ -0,0 +1,11 @@
# `dcpquery/alembic`
This directory contains the configuration files for the [Alembic](https://alembic.sqlalchemy.org/en/latest/)
database migration tool. See the [Query Service Migrations documentation](../../docs/migrations.md) for more
information about creating and applying migrations.


| File/Directory name | Purpose |
|:-----------------------|:------------------|
| versions/ | Contains the migration files for the project |
| env.py | Configuration information for alembic, in particular connecting the SQLAlchemyBase to allow for autogeneration of migrations based on changes made to the SQLAlchemy ORM. |
| script.py.mako | Outline for the generated migration files |
79 changes: 79 additions & 0 deletions dcpquery/alembic/env.py
@@ -0,0 +1,79 @@
from __future__ import with_statement

import os
import sys
from logging.config import fileConfig

from sqlalchemy import engine_from_config
from sqlalchemy import pool

from alembic import context

# this line is necessary to allow for imports from dcpquery
sys.path.insert(0, os.getcwd())

from dcpquery.db import SQLAlchemyBase # noqa
from dcpquery import DCPQueryConfig # noqa

# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.

alembic_config = context.config

# Interpret the config file for Python logging.
fileConfig(alembic_config.config_file_name)

# add your model's MetaData object here for 'autogenerate' support
target_metadata = SQLAlchemyBase.metadata


def run_migrations_offline():
    """Run migrations in 'offline' mode.
    This configures the context with just a URL
    and not an Engine, though an Engine is acceptable
    here as well. By skipping the Engine creation
    we don't even need a DBAPI to be available.
    Calls to context.execute() here emit the given string to the
    script output.
    """
    url = DCPQueryConfig().db_url
    context.configure(
        url=url, target_metadata=target_metadata, literal_binds=True
    )

    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online():
    """Run migrations in 'online' mode.
    In this scenario we need to create an Engine
    and associate a connection with the context.
    """
    config_info = alembic_config.get_section(alembic_config.config_ini_section)
    config_info['sqlalchemy.url'] = DCPQueryConfig().db_url

    connectable = engine_from_config(
        config_info,
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )

    with connectable.connect() as connection:
        context.configure(
            connection=connection, target_metadata=target_metadata
        )

        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
24 changes: 24 additions & 0 deletions dcpquery/alembic/script.py.mako
@@ -0,0 +1,24 @@
"""${message}

Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}

"""
from alembic import op
import sqlalchemy as sa
${imports if imports else ""}

# revision identifiers, used by Alembic.
revision = ${repr(up_revision)}
down_revision = ${repr(down_revision)}
branch_labels = ${repr(branch_labels)}
depends_on = ${repr(depends_on)}


def upgrade():
    ${upgrades if upgrades else "pass"}


def downgrade():
    ${downgrades if downgrades else "pass"}
28 changes: 28 additions & 0 deletions dcpquery/alembic/versions/000000000000_init_db.py
@@ -0,0 +1,28 @@
"""create tables
Revision ID: f86d2cea09a9
Revises:
Create Date: 2019-05-08 21:54:18.415650
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# revision identifiers, used by Alembic.
revision = '000000000000'
down_revision = None
branch_labels = None
depends_on = None


def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    pass
    # ### end Alembic commands ###


def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    pass
    # ### end Alembic commands ###