Add support for Java21 string template #1010

butterunderflow · 2023-12-15T13:27:20Z

eb92b02: Adds support for string templates and corresponding tests. It won't work on its own, because the tokens output by com.sun.tools.javac.parser.Scanner are not sorted by position, which causes the verification here to fail.
86390d0: Reads all tokens from the Scanner and sorts them if the JDK version is >= 21.
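
The sorting step described for the second commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Tok` is a hypothetical stand-in for a javac token, and the out-of-order positions are made up rather than taken from real Scanner output.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TokenSortSketch {
  /** Hypothetical stand-in for a javac token: a kind label plus [pos, endPos) in the source. */
  record Tok(String kind, int pos, int endPos) {}

  /** Returns a copy of the tokens ordered by start position (the sort is stable). */
  static List<Tok> sortByPosition(List<Tok> tokens) {
    List<Tok> sorted = new ArrayList<>(tokens);
    sorted.sort(Comparator.comparingInt(Tok::pos));
    return sorted;
  }

  public static void main(String[] args) {
    // Made-up stream in which both string fragments arrive before the embedded expression.
    List<Tok> scanned =
        List.of(
            new Tok("STRINGFRAGMENT", 0, 2),
            new Tok("STRINGFRAGMENT", 6, 8),
            new Tok("IDENTIFIER", 4, 5));
    for (Tok t : sortByPosition(scanned)) {
      System.out.println(t.kind() + " @ " + t.pos()); // positions print as 0, 4, 6
    }
  }
}
```

Sorting a copy keeps the original scan order available for debugging, and any code that checks that positions never decrease can then run over the sorted list.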

Some links that may be useful:
How the JDK parses string templates from tokens: https://github.com/openjdk/jdk/blob/3d9d353edb64dd364925481d7b7c8822beeaa117/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java#L695-L746
How the tokens of string templates are built: https://github.com/openjdk/jdk/blob/3d9d353edb64dd364925481d7b7c8822beeaa117/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavaTokenizer.java#L342-L420

google-cla · 2023-12-15T13:27:25Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

cushon

Thanks for this!

I had started working on this in parallel before seeing this PR, and had merged a very simple initial handling of string templates in b5feefe. It just passes the entire string template through without formatting it. This approach is more complete, and I would like to merge most of what you have here.

cushon · 2023-12-18T22:46:27Z

core/src/main/java/com/google/googlejavaformat/java/JavacTokens.java

@@ -104,6 +123,8 @@ public static ImmutableList<RawTok> getTokens(
        break;
      }
      if (last < t.pos) {
+        /* If the current token is not immediately following the previous one,
+        treat the gap as a token */


I think the intent here was to only add tokens for these 'gaps' corresponding to whitespace. I think with this change there's also a gap for the \ that escapes { in the string templates, and this logic is adding in that token.

One alternative I was thinking about was to include the \ in the preceding string fragment token. That also solves the problem I mentioned below where an empty token is leading to a crash in JavaInput.buildToks.

I have a demo of that approach here, what do you think? https://github.com/google/google-java-format/compare/master...cushon:google-java-format:stringtemplate?expand=1#diff-a1167950838171d2ee18097c833af4c3155bfededea60870f677cb8a72738aa1R161

butterunderflow · 2023-12-21 (edited)

I think your demo will work without problems.

There's one thing I'm not sure about: would it be too tricky to split \ and { into two tokens? For example, can we guarantee that { will immediately follow \ in the output?

After some testing of your demo, I couldn't find a negative test case that illustrates this. Even for very long string fragments, \ and { stay together.

cushon · 2023-12-21

I think that should be OK, and that \{ is handled as an escape sequence where it wouldn't be valid to have any separation between the two characters.
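
To make the gap behaviour discussed in this thread concrete, here is a small sketch of gap-filling over token spans. It is not the `JavacTokens` code: `Span` is a hypothetical type, and the spans in `main` are an assumed tokenization in which the `\` before `{` is not covered by any token.

```java
import java.util.ArrayList;
import java.util.List;

final class GapTokenSketch {
  /** Hypothetical token span: [pos, endPos) in the source text. */
  record Span(int pos, int endPos) {}

  /** Emits the text of every span, turning any uncovered gap between spans into its own token. */
  static List<String> withGapTokens(String source, List<Span> spans) {
    List<String> out = new ArrayList<>();
    int last = 0;
    for (Span s : spans) {
      if (last < s.pos()) {
        out.add(source.substring(last, s.pos())); // gap: whitespace, or the escaping backslash
      }
      out.add(source.substring(s.pos(), s.endPos()));
      last = s.endPos();
    }
    if (last < source.length()) {
      out.add(source.substring(last));
    }
    return out;
  }

  public static void main(String[] args) {
    String source = "\"a\\{x}b\" + c"; // the source text "a\{x}b" + c
    // Assumed spans: the backslash at index 2 and the two spaces are not covered.
    List<Span> spans =
        List.of(
            new Span(0, 2), new Span(3, 4), new Span(4, 5), new Span(5, 6),
            new Span(6, 8), new Span(9, 10), new Span(11, 12));
    System.out.println(withGapTokens(source, spans));
  }
}
```

Under these assumed spans the result contains three gap tokens: the lone `\` and the two spaces around `+`. Only the spaces are the whitespace gaps the logic was meant to capture.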

cushon · 2023-12-18T22:48:17Z

core/src/main/java/com/google/googlejavaformat/java/JavacTokens.java

-  /** Lex the input and return a list of {@link RawTok}s. */
-  public static ImmutableList<RawTok> getTokens(
-      String source, Context context, Set<TokenKind> stopTokens) {
+  private static ImmutableList<Token> readAllToken(String source, Context context) {


With the test cases that were added for I981, I am seeing an issue where this returns tokens with empty string values for some string templates, which crash JavaInput.buildToks. I think the issue is that for string templates like "foo\{X}\{Y}bar" there are three string fragment tokens, and the middle one is empty.

java.lang.StringIndexOutOfBoundsException: Index 0 out of bounds for length 0
  ...
  at java.base/java.lang.String.charAt(String.java:1555)
  at com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:385)
  at com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:335)
  at com.google.googlejavaformat.java.JavaInput.<init>(JavaInput.java:277)
  at com.google.googlejavaformat.java.Formatter.getFormatReplacements(Formatter.java:270)
  at com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:257)
  at com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:223)
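
The failure mode can be reproduced in isolation. The sketch below assumes the middle fragment of `"foo\{X}\{Y}bar"` is reported as an empty span at index 8, and shows how widening a fragment to absorb an immediately following `\` makes it non-empty. `fragmentText` is a hypothetical helper, not a method of `JavaInput` or `JavacTokens`.

```java
final class EmptyFragmentSketch {
  /** Text of [pos, endPos), optionally widened by one character when a backslash follows. */
  static String fragmentText(String source, int pos, int endPos, boolean absorbBackslash) {
    int end = endPos;
    if (absorbBackslash && end < source.length() && source.charAt(end) == '\\') {
      end++;
    }
    return source.substring(pos, end);
  }

  public static void main(String[] args) {
    String source = "\"foo\\{X}\\{Y}bar\""; // the source text "foo\{X}\{Y}bar"

    // Assumed span of the middle fragment: empty, sitting between } and \{ at index 8.
    String empty = fragmentText(source, 8, 8, false);
    try {
      empty.charAt(0); // what any code that inspects a token's first character would do
    } catch (StringIndexOutOfBoundsException e) {
      System.out.println("empty fragment: " + e.getMessage());
    }

    // With the backslash folded in, the fragment has one character and charAt(0) is safe.
    String widened = fragmentText(source, 8, 8, true);
    System.out.println(widened.length()); // prints 1
  }
}
```

This widening is only meant for string fragment tokens; applying it to other token kinds is not something the sketch considers.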

cushon · 2023-12-18T22:50:12Z

core/src/main/java/com/google/googlejavaformat/java/java21/Java21InputAstVisitor.java

+
+      String frag = fragIt.next();
+      if (!fragIt.hasNext()) {
+        token(frag + "\"");


I don't think this handles Unicode escapes. The approach in the other change was to just pass through the token for string fragments:

google-java-format/core/src/main/java/com/google/googlejavaformat/java/java21/Java21InputAstVisitor.java

Line 89 in b5feefe

token(builder.peekToken().get());

which is similar to how string literals worked previously:

google-java-format/core/src/main/java/com/google/googlejavaformat/java/JavaInputAstVisitor.java

Line 1658 in b5feefe

String sourceForNode = getSourceForNode(node, getCurrentPath());
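
A small sketch of why rebuilding token text from fragment values can diverge from the source when Unicode escapes are involved. It assumes the fragment value handed to the visitor is already decoded, which is what the concern above suggests but is not verified here. Both helpers are hypothetical.

```java
final class UnicodeEscapeSketch {
  /** Rebuilds the final fragment's token text from a decoded value, as in frag + "\"". */
  static String rebuilt(String decodedFragment) {
    return decodedFragment + "\"";
  }

  /** Takes the token text straight from the source by position instead. */
  static String passThrough(String source, int pos, int endPos) {
    return source.substring(pos, endPos);
  }

  public static void main(String[] args) {
    // Source text of a template whose last fragment spells the letter A as a six-character escape.
    String source = "\"\\{x}\\u0041\"";
    String fromSource = passThrough(source, 5, source.length());
    String fromValue = rebuilt("A"); // assumed decoded value of that fragment

    System.out.println(fromSource.length()); // prints 7
    System.out.println(fromValue.length()); // prints 2
    System.out.println(fromSource.equals(fromValue)); // prints false
  }
}
```

Passing the source text through unchanged sidesteps the mismatch, because the formatter never has to re-encode what the lexer decoded.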

butterunderflow · 2023-12-21T03:16:20Z

Thank you for the detailed review!

butterunderflow · 2023-12-21T10:56:16Z

Should I close this PR and wait for yours to merge? There are some details in the original PR that weren't considered, and I found they're already handled in your fork.

cushon · 2023-12-21T18:00:09Z

> Should I close this PR and wait for yours to merge? There are some details in the original PR that weren't considered, and I found they're already handled in your fork.

Thanks for taking a look! If the changes I made look OK I will go ahead and merge that version (and credit you).

The initial implementation passed through the entire string unmodified; this allows formatting the Java expressions inside the `\{...}`.
See #1010
Co-authored-by: butterunderflow <[email protected]>
PiperOrigin-RevId: 592889073

The initial implementation passed through the entire string unmodified; this allows formatting the Java expressions inside the `\{...}`.
See #1010
Co-authored-by: butterunderflow <[email protected]>
PiperOrigin-RevId: 592940163

cushon · 2023-12-21T21:45:49Z

This has been merged in 38de9c4. Thanks again for the contribution!

butterunderflow added 2 commits December 15, 2023 21:00

Add support for string template of Java 21

eb92b02

won't work because the output tokens of `com.sun.tools.javac.parser.Scanner` are not sorted by position

Sort the tokens if jdk version >= 21

86390d0

cushon requested changes Dec 19, 2023

copybara-service bot mentioned this pull request Dec 21, 2023

Improve support for string templates #1019

Merged

cushon closed this Dec 21, 2023
