Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for text blocks in Escapers #7421

Open
3 tasks done
graememorgan opened this issue Oct 8, 2024 · 3 comments
Open
3 tasks done

Support for text blocks in Escapers #7421

graememorgan opened this issue Oct 8, 2024 · 3 comments
Labels
P3 package=escape type=enhancement Make an existing feature better

Comments

@graememorgan
Copy link
Member

API(s)

`com.google.common.escape.SourceCodeEscapers`

How do you want it to be improved?

There are existing Escapers that support escaping a String to include in Java source code as a quoted string literal.

What I'd really like is an Escaper which escapes appropriately for Java 15 text blocks.

Why do we need it to be improved?

"Need" is a strong word! We have a string literal escaper; I think an escaper for text block literals is a natural (if low-priority) extension given the new feature.

Example

The existing escapers turn `"` into `\"`.

A text block escaper would leave `"` unchanged unless it's a _triple_ `"""`. It also wouldn't need to escape newlines, but _would_ need to worry about trailing spaces at the end of lines.

Current Behavior

I could use javaCharEscaper() and put that in a text block.

Desired Behavior

I'd like an escaper that makes better use of the functionality of text blocks to produce more readable source.

Concrete Use Cases

I was writing an ErrorProne check that google-java-formats code written in text blocks.

Checklist

@graememorgan graememorgan added the type=enhancement Make an existing feature better label Oct 8, 2024
@cpovirk
Copy link
Member

cpovirk commented Oct 8, 2024

(previous request in another thread, which I had forgotten about)

A text block escaper would leave " unchanged unless it's a triple """.... would need to worry about trailing spaces at the end of lines.

The literal Escaper class is limited to operating on a per-character basis. (This limitation has prevented us from offering things before, as in internal bug 2176354 about an XML CDATA escaper.) With it, we'd be stuck with escaping ", and, to deal with trailing spaces, we'd have to either escape newlines to \n or else escape all spaces to \s. (Hmm, and I'd have to think about what happens at the very end of the string, too.) Obviously we'd want to go another direction.

And conveniently, none of that prevents us from offering SourceCodeEscapers.escapeAndProbablyAlsoTripleQuoteWhatIPromiseIsTheEntireContentOfAJavaTextBlock. We would have to decide whether to triple-quote the output or not:

  • If we don't produce """ around the output, then I'd worry a small amount that users would get themselves in trouble by concatenating the results of two calls.
  • If we do produce """ around the output, then we may have style decisions to make about the end of the string. Actually, I guess we have style decisions to make no matter what, since we might want to offer indentation? Or maybe that's best left to a later formatter step, especially since users may want to de-indent the text but only if it exceeds the line limit? But maybe we'd want to indent a little bit in case we decide that google-java-format will preserve existing de-indentation even when it's not necessary to stay under the column limit?

I will make the usual comment for the record that typical users will want something more like JavaPoet to perform their source-code generation at a higher level. (We try to steer users away from our HTML escapers on similar grounds. We should do the same for our source-code escapers.) [edit: Not that I can tell whether they have support for text blocks. And of course there is a relatively less pressing need for text blocks in generated code, which may be read by humans less frequently than Error Prone's test data.]

Also, by the way, SourceCodeEscapers currently remains Google-internal, at least partially on those grounds :)

@cpovirk
Copy link
Member

cpovirk commented Oct 8, 2024

The literal Escaper class is limited to operating on a per-character basis.

@graememorgan just pointed out to me that that is not actually true :) But what fun would it have been to type "guava.dev/escaper" into a browser when I could instead have been looking up bugs from 15 years ago? I notice now that that old bug even says only that the requested CDATA escaper "isn't really an Escaper in the CharEscaper sense." Now, it looks like that concern dates to the days when Escaper itself offered Appendable escape(Appendable out), in which case even Escaper had an implicit assumption that it wasn't necessarily receiving "the whole string" at once. (It could perhaps have solved the problem with buffering, but that may have created new problems, especially since Appendable offers no close() operation.) Nowadays, a plain Escaper doesn't have to worry about the Appendable problem. Still, we might worry about muddying the water by introducing escapers that couldn't be implemented in terms of only individual chars / code points. Or maybe the waters have been muddied by other Escaper implementations in the wild. Or maybe no one has been stopping to think about how muddy the waters were before drinking them, anyway. The bigger question might end up being whether we would worry that users would manually concatenate the results of multiple calls or whether we'd want to wrap the result with """ so that we can do something clever. (Wrapping seems at least more clearly conceptually incompatible with our current conception of what an Escaper does. But I've already been wrong once in the past 20 minutes.)

@xc-x
Copy link

xc-x commented Oct 10, 2024

Hi I would like to complete this issue. Can you plz assign it to me?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 package=escape type=enhancement Make an existing feature better
Projects
None yet
Development

No branches or pull requests

4 participants