Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interactivity API: Move Store's data encoding to the echo call #51974

Merged
merged 3 commits into from
Jul 24, 2023

Conversation

DAreRodz
Copy link
Contributor

What?

Moves wp_json_encode() to the WP_Interactivity_Store's render method (next to echo), thus escaping as late as possible.

Also, removes the unneeded serialize method.

Props to @danielwrobert. 😄

Why?

To conform with WP security guiding principles.

Testing Instructions

PHP unit tests should pass.

luisherranz
luisherranz previously approved these changes Jun 27, 2023
Copy link
Member

@luisherranz luisherranz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

We can fix another potential problem. See Dennis' comment below.

echo "<script id=\"wp-interactivity-store-data\" type=\"application/json\">$store</script>";
echo sprintf(
'<script id="wp-interactivity-store-data" type="application/json">%s</script>',
wp_json_encode( self::$store )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we ought to also escape potentially ambiguous characters

wp_json_encode( self::$store, JSON_HEX_TAG | JSON_HEX_AMP )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context (Dennis explained this to me privately), a user could inject {"content": "this is a </script>"} and it will close the script element unexpectedly. Or a malicious user could inject {"content": "this is a </script><script>malicious();</script>"} and execute malicious code.

Using JSON_HEX_TAG turns that into {"content": "this is a \u003C/script\u003E"}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to add a small PHPUnit test for this, and it turned out a malicious script can't be injected that way because trailing slashes are escaped by default (there is a flag to deactivate this behavior, see JSON_UNESCAPED_SLASHES). 🤷

The example above actually becomes as follows:

{"content":"this is a <\/script><script>malicious();<\/script>"}

As <\/script> is not a valid closing tag, the <script> remains open, and the malicious code cannot be interpreted and executed.

However, this already happens with trailing slashes anywhere! E.g.,

{ "url": "http://mydomain.com/page/123" }

turns into

{"url":"http:\/\/mydomain.com\/page\/123"}

So json_encode seems to be secure by default (in that regard), although it is modifying strings in a way that we are not handling correctly in the Interactivity API runtime.

Now, my questions:

  • Should we stay with the json_encode's default behavior (secure) and, in the client, process the <script> content to convert all \/ to / and then parse the result?
  • Or should we use the above flags, along with JSON_UNESCAPED_SLASHES, to avoid processing the <script> content in the client anyway?

Copy link
Member

@luisherranz luisherranz Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting.

process the <script> content to convert all / to / and then parse the result?

JSON.parse seems to be handling these cases well, including the dangerous script tag:

JSON.parse('{"url":"http:\/\/mydomain.com\/page\/123"}').url
// outputs 'http://mydomain.com/page/123'

JSON.parse('{"content":"this is not a <\/span><span>malicious<\/span> content"}').content
// outputs 'this is not </span><span>malicious</span> content'

JSON.parse('{"content":"this is a <\/script><script>malicious();<\/script> script"}').content
// outputs 'this is a \x3C/script>\x3Cscript>malicious();\x3C/script> script'

So unless we want to allow script tags in the state strings (which we don't), I don't think we need to do anything in terms of client parsing.

Copy link
Contributor Author

@DAreRodz DAreRodz Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Re)learning stuff about JSON after years and years of using it. 😄

Yup, backslash escaping in strings is part of the JSON specs, so JSON.parse already handles them correctly. As stated in the specification (link):

"A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes."

So, you are right; nothing to do in terms of client parsing.

So unless we want to allow script tags in the state strings (which we don't), I don't think we need to do anything in terms of client parsing.

Actually, when you do console.log (i.e., printing the string to users in the client), the output of that string is the expected one:

a </script><script>malicious();</script> script

Again, nothing to do in this regard. I guess the code is secure, after all? I do not want to add any security breaches, although I don't see issues with the current implementation. 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems mostly fine; thanks for testing everything. I've tried myself and can't figure out how to break it the way I thought I could. I've asked for external review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Dennis!

I've asked for external review.

Should we wait?

Copy link
Contributor

@mdawaffe mdawaffe Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's wise to be cautious :) There's no penalty to using JSON_HEX_TAG other than making the raw HTML slightly harder for a human to read.

I know of one way the current output (without the JSON_HEX_TAG flag) can be broken: an opening comment tag.

What content the HTML spec allows inside <script> tags (even those that have a non-executable type attribute value) is really odd.

Try saving the following as wat.php and loading it in a browser:

<!DOCTYPE html>
<html>
JavaScript is weird
<!-- First Script -->
<script type="application/json"><?php echo json_encode( ["one" => "two<!-- <script three"] ); ?></script>
<!-- Second Script -->
<script>document.write('but fun')</script>
right?
</html>

At first glance, it looks like it should generate a DOM that, to a human, would read "JavaScript is weird but fun right?". Instead, you'll see that it reads "JavaScript is weird right?". The problem is the <!-- <script in the first script. When a browser sees a byte sequence like that inside a <script> tag, it will keep that script tag open until it finds a closing comment tag (-->) followed by a closing script tag (</script>) (with anything between the two).

In the example above, that rule means that the browser only "sees" one script tag, and it has the following contents:

{"one":"two<!-- <script three"}</script>
<!-- Second Script -->
<script>document.write('but fun')

(Viewing the page source of wat.php in your browser or inspecting the DOM will make this clear.)

I don't know if it's possible for a byte sequence like <!-- <script to appear in the data being JSON encoded here, but let's do

wp_json_encode( self::$store, JSON_HEX_TAG | JSON_HEX_AMP )

as suggested anyway. JSON_HEX_TAG will prevent any of the above weirdness. JSON_HEX_AMP isn't at all related, but it may prevent some further weirdness if anyone is trying to serve these pages as XHTML (a very very odd thing to do these days :)). There may be a couple other odd edge cases where JSON_HEX_AMP is handy too - I don't remember (though they probably only come up in old browsers).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did confirm that using echo json_encode( [ 'one' => '<!-- <script' ] ) results in the unwanted output to the browser. thanks for remembering that case, @mdawaffe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thank you very much for the explanation, @mdawaffe! 🙏 And also you, @dmsnell, for suggesting that change. 😄

@luisherranz luisherranz self-requested a review June 27, 2023 13:59
@luisherranz luisherranz dismissed their stale review June 27, 2023 14:00

Let's add what Dennis proposed.

Base automatically changed from interactivity to trunk June 28, 2023 16:00
@DAreRodz DAreRodz requested a review from dmsnell July 4, 2023 14:23
@github-actions
Copy link

This pull request has changed or added PHP files. Please confirm whether these changes need to be synced to WordPress Core, and therefore featured in the next release of WordPress.

If so, it is recommended to create a new Trac ticket and submit a pull request to the WordPress Core Github repository soon after this pull request is merged.

If you're unsure, you can always ask for help in the #core-editor channel in WordPress Slack.

Thank you! ❤️

View changed files
❔ lib/experimental/interactivity-api/class-wp-interactivity-store.php
❔ phpunit/experimental/interactivity-api/class-wp-interactivity-store-test.php

@DAreRodz DAreRodz force-pushed the interactivity-store-json-encode-inside-render branch from 3b33a6d to 9176b2f Compare July 21, 2023 18:59
@DAreRodz
Copy link
Contributor Author

I messed up the branch with some merges, so I had to force-push again. Sorry for any trouble. 🙇

@github-actions
Copy link

Flaky tests detected in 9176b2f.
Some tests passed with failed attempts. The failures may not be related to this commit but are still reported for visibility. See the documentation for more information.

🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/5625683399
📝 Reported issues:

@DAreRodz DAreRodz added [Type] Code Quality Issues or PRs that relate to code quality [Feature] Interactivity API API to add frontend interactivity to blocks. [Type] Security Related to security concerns or efforts labels Jul 24, 2023
@DAreRodz DAreRodz merged commit cdd4f8b into trunk Jul 24, 2023
@DAreRodz DAreRodz deleted the interactivity-store-json-encode-inside-render branch July 24, 2023 09:42
@github-actions github-actions bot added this to the Gutenberg 16.4 milestone Jul 24, 2023
@@ -71,7 +61,9 @@ static function render() {
if ( empty( self::$store ) ) {
return;
}
$store = self::serialize();
echo "<script id=\"wp-interactivity-store-data\" type=\"application/json\">$store</script>";
echo sprintf(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered using native WordPress functions like wp_register_script that support inline scripts? They give better control over where the code is going rendered in the document, and plugin authors can use filtering if necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@gziolo gziolo Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would work, you can connect it with the existing interactivity script handle and put it before or after depending on the actual use case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, thanks. Added to the roadmap 🙂👍️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gziolo, do you know if wp_add_inline_script works with application/json scripts? 👀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DAreRodz, most likely, you would need to use WP hooks to inject that custom type.

@adamsilverstein or @westonruter, what would you advise to handle in this special case? Is it something that we should bake into wp_add_inline_script in WP core?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of WordPress 6.3, such inline scripts get printed via \WP_Scripts::get_inline_script_tag() which finally makes use of wp_get_inline_script_tag(). This function fortunately includes a wp_inline_script_attributes filter for manipulating the inline script's attributes, such as adding a nonce or even change the type to application/json. Nevertheless, since this is a low-level function it isn't aware of the script handle specifically, although there is an id attribute that is passed. So if the script handle was interactivity and the script was a before inline script, this should work:

add_filter(
    'wp_inline_script_attributes',
    static function ( $attributes ) {
        if ( isset( $attributes['id'] ) && 'interactivity-js-before' === $attributes['id'] ) {
            $attributes['type'] = 'application/json';
        }
        return $attributes;
    }
);

I'm not sure whether it makes sense to add a more direct API to change the type of such inline scripts without seeing whether there are more such use cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter, thank you so much for the detailed explanation. The id should be enough in this case as it is standardized based on the script handle and always included for a few major WP releases.

I'm not sure whether it makes sense to add a more direct API to change the type of such inline scripts without seeing whether there are more such use cases.

Let's see if wp_add_inline_script needs to be extended. If it's one of case, then the filter is more than enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Interactivity API API to add frontend interactivity to blocks. [Type] Code Quality Issues or PRs that relate to code quality [Type] Security Related to security concerns or efforts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants