Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stings vs. bites weirdness #4312

Closed
spencerkimball opened this issue Feb 11, 2016 · 10 comments
Closed

Stings vs. bites weirdness #4312

spencerkimball opened this issue Feb 11, 2016 · 10 comments
Assignees

Comments

@spencerkimball
Copy link
Member

I have the following schema:

CREATE TABLE IF NOT EXISTS loadgen.comments (
  photoID   BYTES,
  commentID BYTES DEFAULT EXPERIMENTAL_UUID_V4(),
  userID    INT,
  message   STRING,
  timestamp TIMESTAMP,

  PRIMARY KEY (photoID, commentID)
);

When trying to insert into the table with the following go code, it fails:

const insertSQL = `
INSERT INTO loadgen.comments VALUES ($1::bytea, DEFAULT, $2, $3, NOW());
`
    const minMessageLen = 32
    const maxMessageLen = 1024
    message := randStringBytes(minMessageLen + rand.Intn(maxMessageLen-minMessageLen))
    photoID := []byte{'a', 'b', 'c'}
    if _, err := tx.Exec(insertSQL, photoID, authorID, message); err != nil {
        log.Infof("insert into photos failed: %s", err)
        return err
    }

The failure is: pq: invalid cast: parameter -> BYTEA

It works if I don't add the cast. However, when doing a select:

SELECT commentID, userID, message, timestamp FROM loadgen.comments WHERE photoID = $1::bytea ORDER BY timestamp LIMIT 100

If I DON'T add the ::bytea cast, it fails with: pq: unsupported comparison operator: <bytes> = <string>

@spencerkimball
Copy link
Member Author

Also, even though I can get the insert to work if I leave out the ::bytea cast, it sometimes screws up royally, in particular if the photoID I'm sending starts with a 0x00 byte, which causes an empty photo ID.

@tamird
Copy link
Contributor

tamird commented Feb 11, 2016

I think the correct behaviour is that you wouldn't need the cast on the SELECT.

I dug into this issue a bit, and it's tricky! Basically, we do the type inference correctly during query preparation, but we don't distinguish between BYTES and STRING on the wire (see https://github.com/cockroachdb/cockroach/blob/master/sql/pgwire/types.go#L240). When the client sends back that same Oid with the Bind, we interpret it as a STRING (see https://github.com/cockroachdb/cockroach/blob/master/sql/pgwire/types.go#L234) and type checking (the second time) fails.

I tried to fix this using old.T_bytea instead of old.T_text but I got a fun error from lib/pq: unexpected error: pq: param $1: unsupported OID: 17.

Perhaps this is something our new type system can better contend with (strings need to coerce to byte arrays?). cc @knz @nvanbenschoten @mjibson

@knz
Copy link
Contributor

knz commented Feb 11, 2016

Ok so if we look at #4121 there are 3 things that could/should happen:

  • the AST should be preserved in the connection from PREPARE to EXECUTE.
  • the placeholder would get typed as "bytes" automatically, and this would be annotated in the AST
  • the client should really be informed that $1 is "bytes" during PREPARE. If for whatever reason this cannot be done in pgwire, the type "bytes" should still be preserved in the AST.
  • during EXECUTE, we would know (from the AST) what is the correct type for $1, and know to expect "bytes" from the client. If for whatever reason the client has to provide 'string' instead, we could envision a silent cast there (although my preference goes for an error, and suggest fixing the client and the intermediary libraries/protocols)

@petermattis
Copy link
Collaborator

I don't like the idea of string coercing automatically to bytes. They are distinct types for a reason: strings contain utf8 while bytes contain arbitrary octets. Let's take a whack at fixing this without introducing additional automatic coercions.

@tamird
Copy link
Contributor

tamird commented Feb 11, 2016

I think @knz's suggestion would work.
On Feb 11, 2016 8:25 AM, "Peter Mattis" notifications@github.com wrote:

I don't like the idea of string coercing automatically to bytes. They are
distinct types for a reason: strings contain utf8 while bytes contain
arbitrary octets. Let's take a whack at fixing this without introducing
additional automatic coercions.


Reply to this email directly or view it on GitHub
#4312 (comment)
.

@andreimatei
Copy link
Contributor

What knz said.
With "the new type system", the type "bytes" would be inferred in both the
insert and the select, so you'll no longer have to write any casts. You can
still write them if you want for whatever reason, and btw, an expression
like $1::bytes is not really a cast, but a type hint (so you're forcing $1
to be inferred as bytes, there's no CAST node in the AST). If you actually
want to cast $1 in your queries to, say, "text", you'd have to write
"$1::bytes::text"

On Thu, Feb 11, 2016 at 8:55 AM, Tamir Duberstein notifications@github.com
wrote:

I think @knz's suggestion would work.
On Feb 11, 2016 8:25 AM, "Peter Mattis" notifications@github.com wrote:

I don't like the idea of string coercing automatically to bytes. They are
distinct types for a reason: strings contain utf8 while bytes contain
arbitrary octets. Let's take a whack at fixing this without introducing
additional automatic coercions.


Reply to this email directly or view it on GitHub
<
#4312 (comment)

.


Reply to this email directly or view it on GitHub
#4312 (comment)
.

@JackKrupansky
Copy link

Minor typo in issue title: "Stings" s.b. "Strings". (Or maybe "bytes" s.b. "bites" as in "Stings vs. bites"?!)

@spencerkimball spencerkimball changed the title Stings vs. bytes weirdness Stings vs. bites weirdness Feb 11, 2016
@spencerkimball
Copy link
Member Author

Thanks @JackKrupansky I fixed it. Lol.

@petermattis
Copy link
Collaborator

@tamird The unsupported OID error is from our code. I think if you send back T_bytea you need to be prepared to receive it as well.

@tamird
Copy link
Contributor

tamird commented Feb 11, 2016

Duh. PR incoming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants