Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Java21 string template #1010

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
import com.sun.tools.javac.parser.Tokens.TokenKind;
import com.sun.tools.javac.parser.UnicodeReader;
import com.sun.tools.javac.util.Context;
import java.util.Comparator;
import java.util.Set;

/** A wrapper around javac's lexer. */
Expand Down Expand Up @@ -71,22 +72,40 @@ public String stringVal() {
}
}

/** Lex the input and return a list of {@link RawTok}s. */
public static ImmutableList<RawTok> getTokens(
String source, Context context, Set<TokenKind> stopTokens) {
private static ImmutableList<Token> readAllToken(String source, Context context) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the test cases that were added for I981, I am seeing an issue where this returns tokens with empty string values for some string templates, which crash JavaInput.buildToks. I think the issue is that for string templates like "foo\{X}\{Y}bar" there are three string fragment tokens, and the middle one is empty.

java.lang.StringIndexOutOfBoundsException: Index 0 out of bounds for length 0

...
	at java.base/java.lang.String.charAt(String.java:1555)
	at com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:385)
	at com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:335)
	at com.google.googlejavaformat.java.JavaInput.<init>(JavaInput.java:277)
	at com.google.googlejavaformat.java.Formatter.getFormatReplacements(Formatter.java:270)
	at com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:257)
	at com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:223)

if (source == null) {
return ImmutableList.of();
}
ScannerFactory fac = ScannerFactory.instance(context);
char[] buffer = (source + EOF_COMMENT).toCharArray();
Scanner scanner =
new AccessibleScanner(fac, new CommentSavingTokenizer(fac, buffer, buffer.length));
ImmutableList.Builder<Token> tokens = ImmutableList.builder();
do {
scanner.nextToken();
tokens.add(scanner.token());
} while (scanner.token().kind != TokenKind.EOF);
if (Runtime.version().feature() < 21) {
return tokens.build();
} else {
return ImmutableList.sortedCopyOf(Comparator.comparingInt(t -> t.pos), tokens.build());
}
}

/** Lex the input and return a list of {@link RawTok}s. */
public static ImmutableList<RawTok> getTokens(
String source, Context context, Set<TokenKind> stopTokens) {
if (source == null) {
return ImmutableList.of();
}
ImmutableList<Token> javacTokens = readAllToken(source, context);

var javacTokenIt = javacTokens.iterator();
ImmutableList.Builder<RawTok> tokens = ImmutableList.builder();
int end = source.length();
int last = 0;
do {
scanner.nextToken();
Token t = scanner.token();
Token t = javacTokenIt.next();
if (t.comments != null) {
for (Comment c : Lists.reverse(t.comments)) {
if (last < c.getSourcePos(0)) {
Expand All @@ -104,6 +123,8 @@ public static ImmutableList<RawTok> getTokens(
break;
}
if (last < t.pos) {
/* If the current token is not immediately following the previous one,
treat the gap as a token */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intent here was to only add tokens for these 'gaps' corresponding to whitespace. I think with this change there's also a gap for the \ that escapes { in the string templates, and this logic is adding in that token.

One alternative I was thinking about was to include the \ in the preceding string fragment token. That also solves the problem I mentioned below where an empty token is leading to a crash in JavaInput.buildToks.

I have a demo of that approach here, what do you think? https://github.com/google/google-java-format/compare/master...cushon:google-java-format:stringtemplate?expand=1#diff-a1167950838171d2ee18097c833af4c3155bfededea60870f677cb8a72738aa1R161

Copy link
Contributor Author

@butterunderflow butterunderflow Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your demo will work without problem.

There's one thing I'm not very sure, will that too tricky to split \ and { into two tokens? For example, can we guarantee that { will follow \ immediately in output?

After some test of your demo, I can't find a negative test case to illustrate this. Even for very long string fragment, \ and { are stay together perfectly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should be OK, and that \{ is handled as an escape sequence where it wouldn't be valid to have any separate between the two characters.

tokens.add(new RawTok(null, null, last, t.pos));
}
tokens.add(
Expand All @@ -113,7 +134,7 @@ public static ImmutableList<RawTok> getTokens(
t.pos,
t.endPos));
last = t.endPos;
} while (scanner.token().kind != TokenKind.EOF);
} while (javacTokenIt.hasNext());
if (last < end) {
tokens.add(new RawTok(null, null, last, end));
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,8 @@
import com.sun.source.tree.ExpressionTree;
import com.sun.source.tree.PatternCaseLabelTree;
import com.sun.source.tree.PatternTree;
import com.sun.source.tree.StringTemplateTree;
import java.util.Iterator;
import javax.lang.model.element.Name;

/**
Expand Down Expand Up @@ -86,4 +88,38 @@ protected void variableName(Name name) {
visit(name);
}
}

@Override
public Void visitStringTemplate(StringTemplateTree node, Void unused) {

scan(node.getProcessor(), null);

token(".");

Iterator<? extends ExpressionTree> exprIt = node.getExpressions().iterator();

Iterator<String> fragIt = node.getFragments().iterator();

var firstFrag = fragIt.next();

token("\"" + firstFrag);
while (fragIt.hasNext()) {
if (exprIt.hasNext()) {
ExpressionTree expr = exprIt.next();
token("\\");
token("{");
scan(expr, null);
token("}");
}

String frag = fragIt.next();
if (!fragIt.hasNext()) {
token(frag + "\"");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this handles unicode escapes. The approach in the other change was to just pass through the token for string fragments:

which is similar to how string literals worked previously:

String sourceForNode = getSourceForNode(node, getCurrentPath());

} else {
token(frag);
}
}

return null;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,8 @@ public class FormatterIntegrationTest {
"SwitchDouble",
"SwitchUnderscore",
"I880",
"Unnamed")
"Unnamed",
"StringTemplate")
.build();

@Parameters(name = "{index}: {0}")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -492,4 +492,24 @@ public void removeTrailingTabsInComments() throws Exception {
+ " }\n"
+ "}\n");
}

@Test
public void stringTemplateTests() throws Exception {
assertThat(
new Formatter()
.formatSource(
"public class Foo {\n"
+ " String test(){\n"
+ " var simple = STR.\"mytemplate1XXXX \\{exampleXXXX.foo()}yyy\";\n"
+ " var nested = STR.\"template \\{example. foo()+ STR.\"templateInner\\{ example}\"}xxx }\";\n"
+ " }\n"
+ "}\n"))
.isEqualTo(
"public class Foo {\n"
+ " String test() {\n"
+ " var simple = STR.\"mytemplate1XXXX \\{exampleXXXX.foo()}yyy\";\n"
+ " var nested = STR.\"template \\{example.foo() + STR.\"templateInner\\{example}\"}xxx }\";\n"
+ " }\n"
+ "}\n");
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
public class StringTemplates {
void test(){
var m = STR."template \{example}xxx";
var nested = STR."template \{example.foo()+ STR."templateInner\{example}"}xxx }";
var nestNested = STR."template \{example0.
foo() +
STR."templateInner\{example1.test(STR."\{example2
}")}"}xxx }";
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
public class StringTemplates {
void test() {
var m = STR."template \{example}xxx";
var nested = STR."template \{example.foo() + STR."templateInner\{example}"}xxx }";
var nestNested =
STR."template \{example0.foo()
+ STR."templateInner\{example1.test(STR."\{example2}")}"}xxx }";
}
}