Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Primary flag helper column type needed #133

Closed
jace opened this issue Aug 8, 2017 · 7 comments
Closed

Primary flag helper column type needed #133

jace opened this issue Aug 8, 2017 · 7 comments

Comments

@jace
Copy link
Member

jace commented Aug 8, 2017

We often have use cases where in a set of objects of the same type, one—and only one—of them needs to be designated primary or default. For example, if a user has multiple email addresses and one of them is the default for them to receive email at.

SQL does not provide an obvious way to do this:

  1. If it's a boolean flag, more than one item in the set can accidentally have that flag turned on.
  2. A CHECK constraint on the column cannot examine the flag on other rows.
  3. An app-side validator will have to load from the database to confirm validity, triggering a flush and risking concurrent conflicting updates.
  4. A reverse foreign key from the parent to one of the child objects is effective, but table construction requires an ALTER TABLE (SQLAlchemy's use_alter flag does this), and importing data into the table requires temporarily suspending foreign key validation or importing in two passes. In particular, this breaks pg_dump's data-only import.

Fortunately, SQL's tendency to treat NULL values as unique and distinct from other NULL values can be exploited to make such a flag:

  • Use a boolean column that is nullable, with the expected values being True and NULL, transparently cast into Python booleans on the way in and out.
  • A CHECK constraint on the column to ensure the value is never False.
  • A unique constraint on the column, typically as a multi-column constraint involving a parent object. The constraint will now ensure no two rows can have a True value, but any number of rows can have a NULL value.

A new column type derived from boolean can automate all of this except the multi-column unique constraint.

@jace
Copy link
Member Author

jace commented Aug 8, 2017

Caveat: using this structure gives us an at most one scenario, but not the corresponding at least one scenario that is also required.

@jace
Copy link
Member Author

jace commented Aug 8, 2017

Additional caveat: Flagging one row as primary does not remove the primary flag from other rows unless all rows are loaded from database (or updated in the database without loading), as in scenario 3 above. An app-side validator on the parent object is still needed. The only difference is that the database will catch errors.

Scenario 4, despite its cumbersomeness, is the most foolproof method.

@jace jace changed the title Unique flag helper column type needed Primary flag helper column type needed Aug 8, 2017
@jace
Copy link
Member Author

jace commented Aug 8, 2017

Using foreign keys from the parent breaks normalisation. The parent may refer to a child that does not refer back to the parent, and SQL does not provide a direct way to validate that.

@jace
Copy link
Member Author

jace commented Aug 9, 2017

From Stack Overflow: the correct approach is to use a secondary join table to record the default. This is worth examining.

Some things remain the same from scenario 4:

  1. Only one default value can exist.
  2. Changing the default does not require examining other rows.
  3. The parent can determine its default by joining with the secondary table. Examining each individual row is unnecessary. Since defaults are usually required from the parent, this is significant.

One improvement:

  1. Data import into SQL no longer requires suspending foreign key validation during the import.

One regression (seemingly):

  1. Rather than a single guaranteed default (if the scenario 4 foreign key was non-nullable), we're back to at most one. However, in practice the foreign key has to be non-nullable as you can't have a default when there's no data. Therefore even that scenario is at most one.

One remaining problem:

  1. Still breaks normalization. The "default" being pointed at may have a different parent.

@jace
Copy link
Member Author

jace commented Aug 9, 2017

Counterpoint: Because of the unique index, having flags on each row still makes it easy to discover which one is set to default.

@jace
Copy link
Member Author

jace commented Aug 9, 2017

Using the foreign key method, (a) an app-side validator and (b) a PostgreSQL trigger can do an assert self.id == target.parent_id. This also avoids redundant flag columns per row and does not require an index.

@jace
Copy link
Member Author

jace commented Aug 11, 2017

Implementation and additional discussion in #134.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant