Remove the dependency on SHOW statements. Use info_schema and pg_catalog more. #59

Merged: 1 commit merged into cockroachdb:master on Jul 16, 2018

Conversation

@knz (Contributor) commented Jul 4, 2018:

Needed to fix cockroachdb/cockroach#26983
and to complete cockroachdb/cockroach#27098
Informs cockroachdb/cockroach#26993

Fixes #57.
Fixes #50.

@knz knz requested a review from bdarnell July 4, 2018 15:59
@knz changed the title from "Remove the dependency on SHOW statement. Use pg_catalog more." to "Remove the dependency on SHOW statements. Use pg_catalog more." on Jul 4, 2018
@knz changed the title from "Remove the dependency on SHOW statements. Use pg_catalog more." to "Remove the dependency on SHOW statements. Use info_schema and pg_catalog more." on Jul 4, 2018
@bdarnell (Contributor) left a comment:

Thanks for all the updates!


    def has_table(self, conn, table, schema=None):
        # Upstream implementation needs pg_table_is_visible().
        return any(t == table for t in self.get_table_names(conn, schema=schema))

    # The upstream implementations of the reflection functions below depend on
    # get_table_oid() which needs pg_table_is_visible().

    # correlated subqueries which are not yet supported.
    def get_columns(self, conn, table_name, schema=None, **kw):
        res = []
        # TODO(bdarnell): escape table name
bdarnell (Contributor):

This TODO can go away now.

knz (Author):

Done

        for row in conn.execute('''
            SELECT column_name, data_type, is_nullable::bool, column_default
            FROM information_schema.columns
            WHERE table_schema = %s AND table_name = %s''', (schema or self.default_schema_name, table_name)):
bdarnell (Contributor):

This looks like a backwards-incompatible change in the handling of schemas. Previously we used schema as a catalog, and now it's being used as a schema. Maybe that's necessary for broader compatibility, but it may be disruptive.

knz (Author):

We discussed this offline. The outcome of that discussion is that we'll be assuming users were not using cross-db SQLAlchemy schemas in prod.
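
For readers following along, here is a minimal sketch of how rows from the information_schema.columns query above could be turned into the column dictionaries SQLAlchemy's reflection API expects. The type mapping below is a hypothetical placeholder, not the mapping used in this PR:

import sqlalchemy.types as sqltypes

# Hypothetical mapping from information_schema data_type strings to SQLAlchemy
# types; the real dialect maintains a much larger table.
_type_map = {
    'INT': sqltypes.Integer,
    'STRING': sqltypes.String,
    'BOOL': sqltypes.Boolean,
}

def get_columns(self, conn, table_name, schema=None, **kw):
    res = []
    for row in conn.execute('''
            SELECT column_name, data_type, is_nullable::bool, column_default
            FROM information_schema.columns
            WHERE table_schema = %s AND table_name = %s''',
            (schema or self.default_schema_name, table_name)):
        name, type_str, nullable, default = row
        # Fall back to NullType for type names not in the mapping.
        coltype = _type_map.get(type_str, sqltypes.NullType)()
        res.append(dict(name=name, type=coltype, nullable=nullable, default=default))
    return res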

        for row in conn.execute('''
            SELECT index_name, column_name, (not non_unique::bool) as unique, implicit::bool as implicit
            FROM information_schema.statistics
            WHERE table_schema = %s AND table_name = %s
bdarnell (Contributor):

Is information_schema.statistics the right place to get this? Surprising (but I don't doubt it).

The TODO about escaping can be removed, as can the following lines about compatibility with ancient versions that don't have the implicit column. Since CI passes, this is all compatible going back to 1.0.6, right?

bdarnell (Contributor):

Oh, CI's not even returning results on this PR, but we didn't have it set as required, so GitHub showed the green button anyway. Looks like a build was run but failed due to an inability to load the right Python version :(

knz (Author):

These queries on information_schema (and the others) will not work in CockroachDB 1.0 anyway.

Simplified the code accordingly.
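
For illustration, here is roughly how the statistics rows above could be folded into the per-index dictionaries that SQLAlchemy's get_indexes() is expected to return, skipping implicit columns. This is an approximation, not the exact code from this PR:

from collections import OrderedDict

def get_indexes(self, conn, table_name, schema=None, **kw):
    # Each row describes one column of one index; group the rows by index name.
    indexes = OrderedDict()
    for row in conn.execute('''
            SELECT index_name, column_name, (not non_unique::bool) as unique, implicit::bool as implicit
            FROM information_schema.statistics
            WHERE table_schema = %s AND table_name = %s''',
            (schema or self.default_schema_name, table_name)):
        index_name, column_name, unique, implicit = row
        if implicit:
            # Skip columns CockroachDB adds implicitly (e.g. primary key columns).
            continue
        entry = indexes.setdefault(index_name, dict(name=index_name, column_names=[], unique=False))
        entry['column_names'].append(column_name)
        entry['unique'] = unique
    return list(indexes.values())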

    def get_foreign_keys(self, conn, table_name, schema=None, **kw):
        fkeys = []
        # This method is the same as the one in SQLAlchemy's pg dialect, with
        # a tweak to the FK regular expressions to tolerate whitespace between
bdarnell (Contributor):

It's unfortunate we have to duplicate so much to change that regex, but I guess there's no way around it now.
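
Purely for illustration (this is not the actual regex from SQLAlchemy or from this PR), tolerating optional whitespace between the referenced table and its column list usually comes down to adding \s* at the right spot:

import re

# Hypothetical, simplified FK pattern; the real one in the pg dialect is more involved.
FK_RE = re.compile(
    r'FOREIGN KEY \((?P<cols>.*?)\) REFERENCES (?P<reftable>[^\s(]+)\s*\((?P<refcols>.*?)\)')

# Matches both with and without a space before the referenced column list.
assert FK_RE.search('FOREIGN KEY (a, b) REFERENCES other_table(a, b)')
assert FK_RE.search('FOREIGN KEY (a, b) REFERENCES other_table (a, b)')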


class CockroachCompiler(PGCompiler_psycopg2):
# Ideally we'd like to inherit most of the code from
# PGIdentifierPreparer however SQLAlchemy does not export it. So we
bdarnell (Contributor):

It looks like it's just as exported as PGDialect is (which we depend on in dialect.py). Can't we import it from sqlalchemy.dialects.postgresql.base.PGIdentifierPreparer?

knz (Author):

Oh nice, that works. Done.
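
The resolution of this thread amounts to importing the preparer from the postgresql dialect module and subclassing it. A minimal sketch (the class name and the extra keywords shown are stand-ins, not taken from the PR):

from sqlalchemy.dialects.postgresql.base import PGIdentifierPreparer

# Stand-in for the CockroachDB-specific keywords extracted from sql.y.
CRDB_EXTRA_RESERVED_WORDS = {'index', 'nothing'}

class CockroachIdentifierPreparer(PGIdentifierPreparer):
    # Identifiers that collide with any of these words will be quoted.
    reserved_words = PGIdentifierPreparer.reserved_words | CRDB_EXTRA_RESERVED_WORDS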


# This is extracted from CockroachDB's `sql.y`. Add keywords here if *NEW* reserved keywords
bdarnell (Contributor):

Alternatively, we could future-proof this by querying the server's list of reserved words in dialect.initialize (this is where some other version check queries are done).

knz (Author):

That won't do -- we need to keep all the "old" keywords here too. If we're to keep a list of every keyword that has ever been reserved in the past, we have to start with the entire current list.

bdarnell (Contributor):

Why do you need all the old keywords? The keyword list only controls quoting, and you only need to quote things that are considered keywords by the server you're talking to. This package needs to work with old servers, so we couldn't have a static list of the current version's keywords, but I don't see why loading the list dynamically wouldn't work.

There is a slight problem with doing this in dialect.initialize, in that this wouldn't refresh the keyword list if the server got rolling-upgraded out from under you, but that's an issue with the static list too. Ideally you'd refresh this on every connection, or maybe cache it based on a query of the server's version.
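
A hedged sketch of what that could look like, assuming the server exposes PostgreSQL's pg_get_keywords() function (whether CockroachDB supports it, and in which versions, is an assumption here, not something established in this thread):

    def initialize(self, connection):
        super(CockroachDBDialect, self).initialize(connection)
        try:
            # catcode 'R' marks fully reserved keywords in PostgreSQL.
            rows = connection.execute(
                "SELECT word FROM pg_catalog.pg_get_keywords() WHERE catcode = 'R'")
            self.identifier_preparer.reserved_words = {row[0] for row in rows}
        except Exception:
            # Fall back to the static list baked into the dialect.
            pass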

knz (Author):

Ok filed as #62.


from sqlalchemy.orm import sessionmaker, load_only
from sqlalchemy import func
from sqlalchemy import distinct

class AcrossSchemaTest(fixtures.TestBase):
    TEST_DATABASE = 'test_sqlalchemy_across_schema'
bdarnell (Contributor):

This constant is no longer used. As a result, this test isn't really testing what it used to. What's important here is references across two user-defined collections of tables (whether you call them catalogs, schemas, or databases), not references between public and information_schema.

knz (Author):

See my other comment about cross-db schemas.

@knz (Author) left a comment:

Thanks for your comments.

As we discussed separately, I will run the tests against a 2.0 server and see what happens (I have not done so yet).

We'll also refrain from announcing backward compat with cross-db queries in 1.0.


@knz (Author) commented Jul 12, 2018:

It took me a while but I succeeded: the code is now simultaneously compatible with CockroachDB 1.1, 2.0 and 2.1. (I tested.)

PTAL.

I am bumping the version because of the significant change in the versioning approach.

CHANGES.md Outdated
@@ -1,3 +1,10 @@
# Version 0.2.0

Released July 13, 2018
bdarnell (Contributor):

Don't forget to update this date when this merges, and ping Justin or me to ship a release (or I can give you permissions if you want to share ownership of this package).

knz (Author):

Yes I will update and ping.

        self._has_native_jsonb = True
        sversion = connection.scalar("select version()")
        self._is_v2plus = " v2" in sversion
bdarnell (Contributor):

Be consistent about the inclusion of the leading space and trailing dot here and on the next line.

knz (Author):

Done.

        sversion = connection.scalar("select version()")
        self._is_v2plus = " v2" in sversion
        self._is_v21plus = self._is_v2plus and ("v2.0." not in sversion)
        self._has_native_json = self._is_v2plus
bdarnell (Contributor):

Shouldn't _has_native_jsonb be set the same way?

knz (Author):

I suppose so. Done.
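
Putting the two review points together, the version gating under discussion amounts to something like this: a sketch of one consistent form (the exact substrings the PR settled on are not shown in this excerpt), with jsonb gated the same way as json:

    def initialize(self, connection):
        super(CockroachDBDialect, self).initialize(connection)
        # Derive feature flags from the server version string, keeping the
        # leading-space/trailing-dot convention consistent across checks.
        sversion = connection.scalar("select version()")
        self._is_v2plus = " v2." in sversion
        self._is_v21plus = self._is_v2plus and (" v2.0." not in sversion)
        self._has_native_json = self._is_v2plus
        self._has_native_jsonb = self._is_v2plus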

        self._is_v21plus = self._is_v2plus and ("v2.0." not in sversion)
        self._has_native_json = self._is_v2plus

    def is_v2plus(self):
bdarnell (Contributor):

This doesn't appear to be used (and I don't think it should be; it's fine to just access the attribute directly)

knz (Author):

Yes, thanks. Removed.

        return fkeys

    def get_pk_constraint(self, conn, table_name, schema=None, **kw):
        if self._is_v21plus:
            return PGDialect.get_pk_constraint(self, conn, table_name, schema, **kw)
bdarnell (Contributor):

Nit: the preferred way to do this in Python is `return super(CockroachDBDialect, self).get_pk_constraint(conn, table_name, schema, **kw)` (although it doesn't really make a difference unless multiple inheritance is used).

knz (Author):

Yes, thanks; I somehow knew that and still did it wrong. Thanks for pointing it out.


class AcrossSchemaTest(fixtures.TestBase):
    TEST_DATABASE = 'test_sqlalchemy_across_schema'
bdarnell (Contributor):

As we discussed, I'm fine with some backwards-incompatibility here (since it's already kind of broken). But we should have some way to make cross-database references going forward (it may not be common, but remember that this capability did originate with a user request in #30). I think the way to do this in SQLAlchemy is to use a dotted schema name that includes the database, e.g. `schema=self.TEST_DATABASE + '.public'`, which means we should recognize this form and adjust our information_schema queries accordingly. But the base postgres dialect doesn't do this, so I'm not certain about it (maybe cross-database references are rare in postgres because people use schemas for this purpose instead; our lack of user-created schemas makes cross-database support more important).

knz (Author):

Yes, as you found out, this does not work because the base dialect issues more queries on pg_catalog tables, using the schema string as a constraint on various pg_catalog columns.

I would like to move this PR forward nevertheless: the feature was broken in 2.0 anyway, and this PR is about ensuring that the code at least runs with 2.0 and 2.1.

I would really prefer to solve this problem by addressing cockroachdb/cockroach#26443

If we don't go in that direction we'll have no choice but to implement the entire sqlalchemy dialect from scratch.
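
For context on the dotted-schema idea above, recognizing it would roughly mean splitting the schema argument before building the information_schema query. A hypothetical helper, not something this PR implements:

def resolve_schema(schema, default_schema_name='public'):
    # Hypothetical: interpret 'db.schema' as a cross-database reference and
    # return (database_or_None, schema_name).
    if schema and '.' in schema:
        db, sch = schema.split('.', 1)
        return db, sch
    return None, schema or default_schema_name

# e.g. resolve_schema('test_sqlalchemy_across_schema.public')
# -> ('test_sqlalchemy_across_schema', 'public')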

This change also ensures the adapter is compatible across v1.1, v2.0 and v2.1.
@knz (Author) commented Jul 16, 2018:

I bumped the date and informed Justin. Merging. Thanks for all the feedback!

@knz knz merged commit 57dea9a into cockroachdb:master Jul 16, 2018
@knz knz deleted the 20180704-bump branch July 17, 2018 07:39