Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix multi-line comment bug #3

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

slorber
Copy link

@slorber slorber commented Jun 22, 2023

Fix facebook/docusaurus#9084

The tokenizer had a problem handling comment openings immediately followed by a line break.

Note, I did not update one of the tests (It renders within HTML elements) because I believe it does not seem to support multi-line comments in the first place (starting with line breaks or not): that is probably a separate bug to fix.


Previous text before edit:

WIP: for now it's just a unit test proof that this library has the bug reported here: facebook/docusaurus#9084

No CI, so proof of local failure with a simple test case change:

CleanShot 2023-06-22 at 15 06 33@2x

@slorber slorber marked this pull request as ready for review June 22, 2023 14:29
@slorber
Copy link
Author

slorber commented Jun 22, 2023

@leebyron PR is ready for review, tests are passing

cc @wooorm I'm not super comfortable with this micromark parsing logic so please let me know if you see any possible problem 🤪 that was a good opportunity to learn a bit.


Please let me know if you can review/merge/publish this soon because I need it for Docusaurus (facebook/docusaurus#9084).

Otherwise, I can publish my fork.
Or you can also give me npm/github permissions (slorber on both) in case you want another maintainer.

I tested the change on Docusaurus and it fixes our issue:

CleanShot 2023-06-22 at 16 28 42

Comment on lines +112 to +115
if (markdownLineEnding(code)) {
return atLineEnding(code);
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fix.

Not sure what I'm doing 🤪
But I assume we shouldn't effects.enter(types.data) if there's no data to consume on that line

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank lines (only occurs in the block/flow version) probably are broken from a quick glance. Otherwise looks good. See also https://github.com/micromark/micromark/blob/e10f892185d5616db6a9efad3a557ca1845d1843/packages/micromark-core-commonmark/dev/lib/html-text.js#L129

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 👍

Not sure what you mean by "blank lines" 🤪 do you have an example I could use in a test?

Copy link

@wooorm wooorm Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<!--\n\n-->, <!--\na\n\nb\n-->

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried these samples in tests and not sure to understand what you mean by "probably are broken". What do you think it the bad behavior these samples produce?

According to my local tests:

  • Those comments are removed when they should
  • Those comments are serialized back in their original form when they should

Just added those to the test in case you want to take a look.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might have to run the tests with --conditions development to get instrumented code that checks if the extension works (so node test.mjs -> node --conditions development test.mjs). That’s not loaded normally because it would slow everyone down and increase the bundle size. See also: https://github.com/micromark/micromark#size--debug

My hunch is that there are empty tokens, which should not exist.

If that works, it’s all good.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'm running DEBUG="*" node --conditions development test/test.mjs and seeing the debug statements now 👍


The current code (even before my changes) has this assertion error at the comment end step:

Assertion: expected last token to be open. code=62

It looks like it does not like to consume just after an exit:

if (code === codes.greaterThan) {
  effects.exit(types.data);
  effects.consume(code);
  effects.exit("comment");
  return ok(code);
}

I'm not sure to understand the issue, that looks fine to me 🤔

Anyway, if I do this (which kind of feel useless?), now both tests and assertions are all passing:

if (code === codes.greaterThan) {
  effects.exit(types.data);
  effects.enter("commentEnd"); // NEW
  effects.consume(code);
  effects.exit("commentEnd"); // NEW
  effects.exit("comment");
  return ok(code);
}

Does it make sense to add the code above?


My hunch is that there are empty tokens, which should not exist.

Not sure how I can see those empty tokens. If there are no more assertion failures, does it means there are no empty tokens?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this project we‘re discussing on has exactly one commit. It’s very likely that it doesn’t work well, is not used a lot in practise, and might be abandoned.

I'm not sure to understand the issue, that looks fine to me 🤔

It’s not. That’s why there’s an error: every byte has to be in something specific.

Anyway, if I do this (which kind of feel useless?), now both tests and assertions are all passing:

Yep, that’s good! That’s the important part: putting every byte into something. For remark, which has ASTs, that’s indeed useless. But micromark can be used to make CSTs, where every character is present.

Not sure how I can see those empty tokens. If there are no more assertion failures, does it means there are no empty tokens?

If there are no more errors, including for those blank line fixtures (<!--\n\n\n-->), it’s good! 👍

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are no more errors, including for those blank line fixtures (), it’s good! 👍

Thanks!

FYI this project we‘re discussing on has exactly one commit. It’s very likely that it doesn’t work well, is not used a lot in practise, and might be abandoned.

Yes I understand that, and as I wasn't sure Lee would answer/merge fast I just published @slorber/remark-comment with these PR fixes.

We'll use it on Docusaurus mostly to make the migration easier, according to what I see it seems good enough as a transitory measure. We have a flag for users to opt-out of this plugin once they have fully migrated to MDX comments.

Comment on lines +79 to +85
<!--
has a multi-line comment
-->

<!-- another
multi-line
comment -->
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart the refactor to a template literal, only those lines were added and the rest remains unchanged

test/test.mjs Outdated
and a paragraph
`,
{ ast: true }
),
'<h1>This document</h1>\n\n<p>and a paragraph</p>'
'<h1>This document</h1>\n\n\n\n<p>and a paragraph</p>'
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See your own test comment: the extra line breaks are expected

Comment on lines +50 to +52
<!--\\n\\n-->

<!--\\na\\n\\nb\\n-->
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those are successfully removed @wooorm

Comment on lines +95 to +97
<!--\\n\\n-->

<!--\\na\\n\\nb\\n-->
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those are successfully printed back @wooorm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiline HTML comments throw a MDX compilation error in canary (3.0.0-alpha.0)
2 participants