-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix multi-line comment bug #3
base: main
Are you sure you want to change the base?
Conversation
@leebyron PR is ready for review, tests are passing cc @wooorm I'm not super comfortable with this micromark parsing logic so please let me know if you see any possible problem 🤪 that was a good opportunity to learn a bit. Please let me know if you can review/merge/publish this soon because I need it for Docusaurus (facebook/docusaurus#9084). Otherwise, I can publish my fork. I tested the change on Docusaurus and it fixes our issue: |
if (markdownLineEnding(code)) { | ||
return atLineEnding(code); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the fix.
Not sure what I'm doing 🤪
But I assume we shouldn't effects.enter(types.data)
if there's no data to consume on that line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blank lines (only occurs in the block/flow version) probably are broken from a quick glance. Otherwise looks good. See also https://github.com/micromark/micromark/blob/e10f892185d5616db6a9efad3a557ca1845d1843/packages/micromark-core-commonmark/dev/lib/html-text.js#L129
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks 👍
Not sure what you mean by "blank lines" 🤪 do you have an example I could use in a test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<!--\n\n-->
, <!--\na\n\nb\n-->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried these samples in tests and not sure to understand what you mean by "probably are broken". What do you think it the bad behavior these samples produce?
According to my local tests:
- Those comments are removed when they should
- Those comments are serialized back in their original form when they should
Just added those to the test in case you want to take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might have to run the tests with --conditions development
to get instrumented code that checks if the extension works (so node test.mjs
-> node --conditions development test.mjs
). That’s not loaded normally because it would slow everyone down and increase the bundle size. See also: https://github.com/micromark/micromark#size--debug
My hunch is that there are empty tokens, which should not exist.
If that works, it’s all good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'm running DEBUG="*" node --conditions development test/test.mjs
and seeing the debug statements now 👍
The current code (even before my changes) has this assertion error at the comment end step:
Assertion: expected last token to be open. code=62
It looks like it does not like to consume just after an exit:
if (code === codes.greaterThan) {
effects.exit(types.data);
effects.consume(code);
effects.exit("comment");
return ok(code);
}
I'm not sure to understand the issue, that looks fine to me 🤔
Anyway, if I do this (which kind of feel useless?), now both tests and assertions are all passing:
if (code === codes.greaterThan) {
effects.exit(types.data);
effects.enter("commentEnd"); // NEW
effects.consume(code);
effects.exit("commentEnd"); // NEW
effects.exit("comment");
return ok(code);
}
Does it make sense to add the code above?
My hunch is that there are empty tokens, which should not exist.
Not sure how I can see those empty tokens. If there are no more assertion failures, does it means there are no empty tokens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI this project we‘re discussing on has exactly one commit. It’s very likely that it doesn’t work well, is not used a lot in practise, and might be abandoned.
I'm not sure to understand the issue, that looks fine to me 🤔
It’s not. That’s why there’s an error: every byte has to be in something specific.
Anyway, if I do this (which kind of feel useless?), now both tests and assertions are all passing:
Yep, that’s good! That’s the important part: putting every byte into something. For remark, which has ASTs, that’s indeed useless. But micromark can be used to make CSTs, where every character is present.
Not sure how I can see those empty tokens. If there are no more assertion failures, does it means there are no empty tokens?
If there are no more errors, including for those blank line fixtures (<!--\n\n\n-->
), it’s good! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are no more errors, including for those blank line fixtures (), it’s good! 👍
Thanks!
FYI this project we‘re discussing on has exactly one commit. It’s very likely that it doesn’t work well, is not used a lot in practise, and might be abandoned.
Yes I understand that, and as I wasn't sure Lee would answer/merge fast I just published @slorber/remark-comment
with these PR fixes.
We'll use it on Docusaurus mostly to make the migration easier, according to what I see it seems good enough as a transitory measure. We have a flag for users to opt-out of this plugin once they have fully migrated to MDX comments.
<!-- | ||
has a multi-line comment | ||
--> | ||
|
||
<!-- another | ||
multi-line | ||
comment --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart the refactor to a template literal, only those lines were added and the rest remains unchanged
test/test.mjs
Outdated
and a paragraph | ||
`, | ||
{ ast: true } | ||
), | ||
'<h1>This document</h1>\n\n<p>and a paragraph</p>' | ||
'<h1>This document</h1>\n\n\n\n<p>and a paragraph</p>' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See your own test comment: the extra line breaks are expected
<!--\\n\\n--> | ||
|
||
<!--\\na\\n\\nb\\n--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those are successfully removed @wooorm
<!--\\n\\n--> | ||
|
||
<!--\\na\\n\\nb\\n--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those are successfully printed back @wooorm
Fix facebook/docusaurus#9084
The tokenizer had a problem handling comment openings immediately followed by a line break.
Note, I did not update one of the tests (
It renders within HTML elements
) because I believe it does not seem to support multi-line comments in the first place (starting with line breaks or not): that is probably a separate bug to fix.Previous text before edit:
WIP: for now it's just a unit test proof that this library has the bug reported here: facebook/docusaurus#9084
No CI, so proof of local failure with a simple test case change: