ACP: Expose rustc_lexer::unescape Functionality in the proc_macro Crate for Standardized Literal Parsing #459

Open
lucarlig opened this issue Oct 8, 2024 · 0 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Proposal


Problem Statement

Currently, proc-macros that handle string literals receive the literal's source text, with escape sequences unprocessed and the surrounding quotes still attached. For example:

#[my_macro]
#[my_attr("\u{78} blabla")]
pub struct B;

In the my_attr proc-macro, the received value is "\u{78} blabla", including the escape sequence and quotes, instead of the unescaped value (x blabla). This makes working with string literals cumbersome, as proc-macro authors need to reimplement unescape logic that already exists within the Rust compiler.
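To make the burden concrete, here is a minimal sketch of the kind of helper a proc-macro author ends up hand-rolling today. The name `unescape_str_literal` is hypothetical and not part of any crate. It covers only a subset of escapes (no raw strings, no line continuations, no underscores inside `\u{...}`) and uses `\u{78}`, the valid spelling of the escape for `x`.

```rust
/// Hypothetical helper (an assumption for illustration, not an existing API):
/// turns the source text of a normal string literal, quotes included,
/// into the value it denotes. Returns `None` for anything it does not handle.
fn unescape_str_literal(src: &str) -> Option<String> {
    // Strip the surrounding double quotes.
    let inner = src.strip_prefix('"')?.strip_suffix('"')?;
    let mut out = String::with_capacity(inner.len());
    let mut chars = inner.chars();
    while let Some(c) = chars.next() {
        if c == '"' {
            // An unescaped quote cannot appear inside the literal.
            return None;
        }
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next()? {
            'n' => out.push('\n'),
            't' => out.push('\t'),
            'r' => out.push('\r'),
            '0' => out.push('\0'),
            '\\' => out.push('\\'),
            '"' => out.push('"'),
            '\'' => out.push('\''),
            'x' => {
                // \xNN: exactly two hex digits, at most 0x7F in a string literal.
                let hi = chars.next()?.to_digit(16)?;
                let lo = chars.next()?.to_digit(16)?;
                let value = hi * 16 + lo;
                if value > 0x7F {
                    return None;
                }
                out.push(char::from_u32(value)?);
            }
            'u' => {
                // \u{...}: one to six hex digits naming a Unicode scalar value.
                if chars.next()? != '{' {
                    return None;
                }
                let mut value: u32 = 0;
                let mut digits = 0;
                loop {
                    let d = chars.next()?;
                    if d == '}' {
                        break;
                    }
                    value = value * 16 + d.to_digit(16)?;
                    digits += 1;
                    if digits > 6 {
                        return None;
                    }
                }
                if digits == 0 {
                    return None;
                }
                out.push(char::from_u32(value)?);
            }
            // Unknown escape: give up rather than guess.
            _ => return None,
        }
    }
    Some(out)
}
```

Calling `unescape_str_literal(r#""\u{78} blabla""#)` on the text a macro receives yields `Some("x blabla")`. Every macro crate that wants the value has to carry some variant of this, and each variant can drift from what the compiler accepts.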

Motivating Examples or Use Cases

  • Simplifying syn: Libraries such as syn must reimplement string literal unescaping themselves. With unescaping available in the proc_macro crate, syn::LitStr::value() could call the standard function directly, which would simplify the code and make it more reliable.

  • Consistency Across Tools: The Rust compiler already provides unescape functionality in rustc_lexer::unescape. Making this available publicly would ensure that tools and proc-macros handle escape sequences consistently.

  • Reducing Code Duplication: Many proc-macro authors currently need to implement their own logic to handle escape sequences, resulting in duplicated code and potential inconsistencies. Exposing the compiler's unescape functionality would reduce redundancy.

Solution Sketch

  • Expose Unescape Functionality in proc_macro Crate: The unescape functionality from rustc_lexer::unescape should be exposed in the proc_macro crate, making it accessible for use in proc-macros.

  • Public API for Literal Processing: A new API can be added to the proc_macro crate that allows developers to parse and unescape string literals in an ergonomic and standardized way. This would significantly simplify the process of handling string literals in attributes and proc-macros.
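One possible shape for such an API, sketched purely as an illustration: a function that walks the text between the quotes and reports each decoded character, or an error, together with the byte range it came from, so a macro could map failures back to a span. Everything here (`EscapeError`, `unescape_with_ranges`, the helper functions) is a hypothetical name chosen for this sketch, not something proc_macro provides, and only a subset of escapes is decoded.

```rust
use std::iter::Peekable;
use std::ops::Range;
use std::str::CharIndices;

/// Hypothetical error type (assumption for this sketch).
#[derive(Debug, PartialEq)]
enum EscapeError {
    LoneBackslash,
    UnknownEscape,
    BadUnicodeEscape,
}

/// Hypothetical entry point: `inner` is the literal text without its quotes.
/// The callback receives the byte range of each unit and its decoded result.
fn unescape_with_ranges(
    inner: &str,
    mut callback: impl FnMut(Range<usize>, Result<char, EscapeError>),
) {
    let mut iter = inner.char_indices().peekable();
    while let Some((start, c)) = iter.next() {
        let res = if c == '\\' { decode_escape(&mut iter) } else { Ok(c) };
        // The unit ends where the next unread character begins.
        let end = iter.peek().map_or(inner.len(), |&(i, _)| i);
        callback(start..end, res);
    }
}

/// Decodes the part of an escape that follows the backslash.
fn decode_escape(iter: &mut Peekable<CharIndices<'_>>) -> Result<char, EscapeError> {
    match iter.next().map(|(_, c)| c) {
        None => Err(EscapeError::LoneBackslash),
        Some('n') => Ok('\n'),
        Some('t') => Ok('\t'),
        Some('r') => Ok('\r'),
        Some('0') => Ok('\0'),
        Some('\\') => Ok('\\'),
        Some('"') => Ok('"'),
        Some('\'') => Ok('\''),
        Some('u') => decode_unicode(iter),
        Some(_) => Err(EscapeError::UnknownEscape),
    }
}

/// Decodes `{...}` after `\u`: one to six hex digits, then a closing brace.
fn decode_unicode(iter: &mut Peekable<CharIndices<'_>>) -> Result<char, EscapeError> {
    if iter.next().map(|(_, c)| c) != Some('{') {
        return Err(EscapeError::BadUnicodeEscape);
    }
    let mut value: u32 = 0;
    let mut digits = 0;
    loop {
        match iter.next().map(|(_, c)| c) {
            Some('}') if digits > 0 => break,
            Some(c) if digits < 6 => match c.to_digit(16) {
                Some(d) => {
                    value = value * 16 + d;
                    digits += 1;
                }
                None => return Err(EscapeError::BadUnicodeEscape),
            },
            _ => return Err(EscapeError::BadUnicodeEscape),
        }
    }
    // Rejects surrogates and values above 0x10FFFF.
    char::from_u32(value).ok_or(EscapeError::BadUnicodeEscape)
}
```

A caller that only wants the value can push each `Ok` character into a `String`, while a caller that wants diagnostics can use the range of each `Err` to point at the offending escape. Whether a callback style, a `Result<String, _>` return, or a method on the literal type fits best is a design question for the actual proposal.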

Alternatives

  • Reimplement in Libraries: The current approach is for libraries like syn to reimplement the unescape logic. This is not ideal due to code duplication, maintenance burdens, and the potential for inconsistencies.

  • External Crate: Instead of adding the unescape functionality to the proc_macro crate, another option would be to create an external crate. However, considering that this functionality is tied to parsing Rust literals, adding it to the standard library seems more suitable.

  • Leave as Is: Another alternative is to continue requiring proc-macro authors to implement their own unescape logic. However, this is not desirable due to the associated complexity and inconsistency.

Additional Considerations

  • Extend to All Literals: The same approach could cover every literal type. Unescaping applies to byte strings, C strings, and character literals, while integers and floats need value parsing instead. Covering all of them would give proc-macro authors one consistent way to get a literal's value.

  • Refactoring to Work Outside Compiler: The proc_macro crate is being refactored to work even when run outside of the compiler. The unescape functionality should therefore not depend on the compiler being available, which means keeping the unescape logic free of compiler-internal dependencies so it can be used independently of the compiler context.

  • Library-First Approach: The unescape function can likely be developed as a standalone library component, loosely coupled to compiler internals. That would let the compiler, proc_macro, and other tools share one implementation and avoid code duplication.

Links and Related Work

@lucarlig lucarlig added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 8, 2024