RFC: Support full Unicode in lexer #3117

Merged (1 commit) on Jul 1, 2021

leebyron
Contributor

Depends on #3115

Implements RFC at graphql/graphql-spec#849.

  • Replaces isSourceCharacter with isUnicodeScalarValue

  • Adds isSupplementaryCodePoint, used in String, BlockStrings, and Comments to ensure correct lexing of JavaScript's UTF-16 source.

  • Updates printCodePointAt to correctly print supplementary code points.

  • Adds variable-width Unicode escape sequences

  • Adds explicit support for legacy JSON-style fixed-width Unicode escape sequence surrogate pairs.

  • Adds printString to no longer rely on JSON.stringify. Borrows some implementation details from Node.js internals for string printing.

    Implements:

    When producing a {StringValue}, implementations should use escape sequences to
    represent non-printable control characters (U+0000 to U+001F and U+007F to
    U+009F). Other escape sequences are not necessary, however an implementation may
    use escape sequences to represent any other range of code points.
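The quoted rule can be illustrated with a minimal sketch. This is not the `printString` merged in this PR. The function name `printStringSketch` and its regular expression are assumptions made for illustration. It escapes the quote, the backslash, and the two control ranges named above, and leaves every other code point as-is:

```javascript
// Hypothetical sketch of a string printer; NOT the implementation from this PR.
// Escapes '"', '\', and control characters U+0000-U+001F and U+007F-U+009F.
function printStringSketch(str) {
  const escaped = str.replace(/[\x00-\x1f\x22\x5c\x7f-\x9f]/g, (ch) => {
    switch (ch) {
      case '"':
        return '\\"';
      case '\\':
        return '\\\\';
      case '\b':
        return '\\b';
      case '\f':
        return '\\f';
      case '\n':
        return '\\n';
      case '\r':
        return '\\r';
      case '\t':
        return '\\t';
      default:
        // Remaining control characters use the fixed-width \uXXXX form.
        return (
          '\\u' +
          ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0')
        );
    }
  });
  return '"' + escaped + '"';
}
```

Supplementary code points such as U+1F600 pass through unescaped, which the quoted text permits ("Other escape sequences are not necessary").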

Closes #2449

Co-authored-by: Andreas Marek andimarek@fastmail.fm
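For readers unfamiliar with the code-point checks listed in the description, here is a rough sketch of how such helpers could look. The names `isUnicodeScalarValue` and `isSupplementaryCodePoint` come from the description. The signatures, the bodies, and the helpers `decodeSurrogatePair` and `readVariableWidthEscape` are assumptions for illustration, not the merged code:

```javascript
// Hypothetical sketch; signatures and bodies are assumptions, not this PR's code.

// A Unicode scalar value is any code point except the surrogate range.
function isUnicodeScalarValue(code) {
  return (
    (code >= 0x0000 && code <= 0xd7ff) || (code >= 0xe000 && code <= 0x10ffff)
  );
}

function isLeadingSurrogate(code) {
  return code >= 0xd800 && code <= 0xdbff;
}

function isTrailingSurrogate(code) {
  return code >= 0xdc00 && code <= 0xdfff;
}

// True when the two UTF-16 code units starting at `index` form a surrogate
// pair, i.e. one supplementary code point (above U+FFFF).
function isSupplementaryCodePoint(body, index) {
  return (
    isLeadingSurrogate(body.charCodeAt(index)) &&
    isTrailingSurrogate(body.charCodeAt(index + 1))
  );
}

// Combines a leading/trailing surrogate pair (e.g. from two legacy \uXXXX
// escapes) into a single code point.
function decodeSurrogatePair(lead, trail) {
  return 0x10000 + ((lead - 0xd800) << 10) + (trail - 0xdc00);
}

// Decodes the hex digits of a variable-width escape, e.g. "1F600" taken from
// \u{1F600}. Returns null for malformed digits or non-scalar values.
function readVariableWidthEscape(hex) {
  if (!/^[0-9a-fA-F]+$/.test(hex)) {
    return null;
  }
  const code = parseInt(hex, 16);
  return isUnicodeScalarValue(code) ? String.fromCodePoint(code) : null;
}
```

The pair check matters because JavaScript strings are sequences of UTF-16 code units, so a lexer that advances one unit at a time would otherwise split a supplementary code point in half.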

@leebyron force-pushed the unicode-lexer branch 2 times, most recently from 22b5929 to 7b7a338, on May 21, 2021 22:17
@leebyron added the labels "PR: feature 🚀" (requires increase of "minor" version number) and "spec RFC" (Implementation of a proposed change to the GraphQL specification) on May 27, 2021
Member

@IvanGoncharov left a comment

@leebyron Looks good!
Feel free to merge if it is approved at today's WG.
