Investigate into the performance issues of the bracket pair colorizer extension #128465

Closed
egamma opened this issue Jul 12, 2021 · 24 comments
Labels
bracket-pair-colorization plan-item VS Code - planned item for upcoming

@egamma
Member

egamma commented Jul 12, 2021

Bracket Pair Colorizer is a very popular VS Code extension. However, users are reporting performance problems as the following tweet illustrates.

We should investigate how the performance can be improved.

image

@egamma egamma added the plan-item VS Code - planned item for upcoming label Jul 12, 2021
@egamma egamma added this to the July 2021 milestone Jul 12, 2021
@IllusionMH
Contributor

Please note that there is also a separate version, Bracket Pair Colorizer 2, which should be faster but lost some features around custom pairs.

@hediet
Member

hediet commented Jul 13, 2021

@alexdima and I did some brainstorming.

The fundamental problem is to offer correct bracket highlighting while staying performant.

Consider the following TypeScript snippet:

function test() { [1]
    const str = `
        ${(() => { if (true) {  } })()}
    } [2]`;
} [3]

Does the bracket at [1] match with the bracket at [2] or at [3]?
To answer this question, the template literal expression before must be parsed correctly.
Unfortunately, a single parsing inaccuracy might lead to erroneous matching of brackets for all later lines, degrading the entire experience.

Thus, perfect bracket pair colorization requires syntax tree information.

However, we think that the textmate tokens should be sufficient to identify matching bracket pairs, assuming the tokens are usually correct and that bracket pairs match if they have the same token type and the same level within all brackets of that token type. Simple matching is already implemented in the editor core. However, tokens are not always accurate, especially for long lines and for documents with many lines.
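
The matching rule described above (same token type, same level within all brackets of that type) can be sketched as follows. This is only a minimal illustration, not the editor core's implementation: `Kind`, `BracketToken`, `BracketPair`, and `matchBrackets` are hypothetical names, and the input is assumed to list brackets in document order.

```typescript
// Hypothetical coarse token kinds (an assumption made for this sketch).
type Kind = "default" | "string" | "comment" | "regex";

interface BracketToken {
    offset: number; // position of the bracket character in the document
    char: string;   // one of ( ) [ ] { }
    kind: Kind;     // token type the bracket occurs in
}

interface BracketPair { open: number; close: number; level: number; }

const CLOSING_TO_OPENING: Record<string, string> = { ")": "(", "]": "[", "}": "{" };

function matchBrackets(brackets: BracketToken[]): BracketPair[] {
    // One stack per token kind, so a `}` inside a string never closes a `{` in code.
    const stacks: Record<Kind, BracketToken[]> = { default: [], string: [], comment: [], regex: [] };
    const pairs: BracketPair[] = [];
    for (const b of brackets) {
        const stack = stacks[b.kind];
        const expectedOpen: string | undefined = CLOSING_TO_OPENING[b.char];
        if (expectedOpen === undefined) {
            stack.push(b); // opening bracket
            continue;
        }
        const top = stack[stack.length - 1];
        if (top !== undefined && top.char === expectedOpen) {
            stack.pop();
            // The level is the nesting depth within brackets of the same kind.
            pairs.push({ open: top.offset, close: b.offset, level: stack.length });
        }
        // Otherwise the closing bracket is unmatched and is ignored in this sketch.
    }
    return pairs;
}
```

With such a rule, a `}` tokenized as part of a string is never paired with a `{` tokenized as code, provided the tokens themselves are correct.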

The goal should be to avoid iterating over the entire document on each keystroke to compute bracket colors.
This must also be avoided when rendering.

We identified the following possible solutions for implementing bracket pair colorizing:

1. Do Bracket Pair Colorizing Inside An Extension

  • Con: The extension host must have access to tokens of a given document.
    Since VS Code textmate grammars can be extended by extensions and these extended grammars cannot be queried by extensions, the extension API needs to be extended for full support.
  • Con: Exposing textmate tokens might cause a textmate lock-in. Moving away from textmate would then cause a breaking change.
  • Con: If a document has thousands of bracket-pairs, there needs to be a way to efficiently update decorations. Alternatively, the extension only sends decorations for the current view port. It is a no go to update the decorations for all matching brackets on each keystroke.
  • Pro: Extensions such as Blockman (13k installs) by @leodevbro, Rainbow Brackets by @Logicult (discontinued, 950k installs) or Rainbow Brackets 2 (2k installs) by @tejasvi might also benefit from this. It would be interesting to hear their opinions on this.
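
The viewport-only alternative mentioned in the list above could be sketched like this. It assumes the extension already keeps its brackets in an array sorted by line; `Bracket`, `lowerBound`, and `bracketsInViewport` are hypothetical names, and how the result would be turned into decorations is left out.

```typescript
interface Bracket { line: number; column: number; level: number; }

// Index of the first bracket whose line is >= `line` (binary search on a sorted array).
function lowerBound(brackets: readonly Bracket[], line: number): number {
    let lo = 0;
    let hi = brackets.length;
    while (lo < hi) {
        const mid = (lo + hi) >>> 1;
        if (brackets[mid].line < line) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return lo;
}

// Brackets on lines firstLine..lastLine (inclusive), found in O(log n) plus the output size.
function bracketsInViewport(brackets: readonly Bracket[], firstLine: number, lastLine: number): Bracket[] {
    return brackets.slice(lowerBound(brackets, firstLine), lowerBound(brackets, lastLine + 1));
}
```

Only the returned slice would need to be sent as decorations, so the cost per update depends on the viewport size rather than on the document size.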

1.1. Expose Tokens To Extensions

  • Pro: Extensions don't need to reimplement the textmate tokenizer.
  • Con: Extensions expect the tokens to be correct. However, with the current state of textmate and due to some performance optimizations, we cannot guarantee correctness. We could, however, restrict the token information to a few very basic properties that are mostly correct.
  • Con: We also cannot offer a fully synchronous way to access tokens, as this could block the extension host.

API usage could look like this:

const doc = vscode.workspace.textDocuments[0];

// If a non-undefined value is returned, these tokens must be in sync with `doc`.
// We intentionally don't return a promise here to avoid synchronization issues.
// This API allows tokens to be computed in batches to avoid freezing the extension host.
const tokens: { isComputing: false, tokens: ILineToken[] } | { isComputing: true } = doc.getTokensForLine(myLine);

vscode.workspace.onDidChangeTextDocumentTokens((e: { changedRanges: Range[] }) => {
    // Is fired immediately after the text document got changed or computation started/finished.
    // Calls to `getTokensForLine` can still return `{ isComputing: true }`.
    const tokens = doc.getTokensForLine(myLine);
});

// A token itself could look like this:
interface ILineToken {
    kind: TokenKind,
    languageId: string, // e.g. 'typescript' or 'html'
    startOffset: number,
    endOffset: number
}

// Only provide a very limited amount of token kinds so that
// we don't expose too much textmate flavor.
// We could easily provide the same token information with tree-sitter or monarch.
enum TokenKind {
    default,
    string,
    comment,
    regex
}
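
To illustrate how an extension might consume the result type proposed above, here is a small sketch. The API does not exist; the types are copied from the proposal, `codeBracketOffsets` is a hypothetical helper, and treating `endOffset` as exclusive is an assumption.

```typescript
enum TokenKind { default, string, comment, regex }

interface ILineToken {
    kind: TokenKind;
    languageId: string;
    startOffset: number;
    endOffset: number; // assumed to be exclusive in this sketch
}

type LineTokens = { isComputing: false; tokens: ILineToken[] } | { isComputing: true };

// Returns the offsets of brackets that occur in code (not in strings, comments, or regexes),
// or undefined while tokens are still being computed.
function codeBracketOffsets(lineText: string, result: LineTokens): number[] | undefined {
    if (result.isComputing) {
        return undefined; // try again when the change event fires
    }
    const offsets: number[] = [];
    for (const token of result.tokens) {
        if (token.kind !== TokenKind.default) {
            continue;
        }
        for (let i = token.startOffset; i < token.endOffset; i++) {
            if ("()[]{}".indexOf(lineText.charAt(i)) !== -1) {
                offsets.push(i);
            }
        }
    }
    return offsets;
}
```

The discriminated union forces the caller to handle the `isComputing: true` case before touching `tokens`.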

There are various ways such an API could be implemented:

1.1.1. Send the Tokens Of The Renderer To The Extension Host

  • Pro: The document only needs to be tokenized once.
  • Con: Sending the tokens to the extension host could be very slow. At all costs, slowing down the renderer must be prevented. This is probably a blocker for this option.

1.1.2. Compute the Tokens In The Extension Host

  • Con: Documents are tokenized twice.
  • Con: Might slow down the extension host.

1.2. Compute the Tokens In The Extension

  • Con: The extension needs to ship with recent grammars.
  • Con: The extension needs to bundle oniguruma (by using WebAssembly).
  • Con: The extension tokens and VS Code tokens might not align (an API would be needed for extensions to get a list of all registered textmate grammars, as extensions can contribute grammar extensions).
  • Con: If implemented incorrectly, the extension could freeze the extension host (as Bracket Pair Colorizer does).

1.a) Fix either Bracket Pair Colorizer 1 or 2

  • Con: Both extensions by @CoenraadS are deprecated & the repositories are archived. There has not been a significant update for at least one year.
  • Pro: No special steps required by the community to upgrade.

1.b) Create a New Extension

  • Pro: A fresh start has the most opportunities to optimize for performance.
  • Con: The community is important to us. Replacing popular community-extensions might be harmful and could discourage extension authors.

1.c) Wait for a New Extension by the Community + Provide Guidance

  • Pro: We enable the community to support advanced customization options.
  • Pro: With our guidance and potentially new API, we can help make this extension very performant.

We could also start with 1.b) and encourage authors to fork our extension.

2. Implement Bracket Pair Colorizing Inside The Editor Core

  • Pro: We can use the existing tokens without performance implications.
  • Pro: We don't need to use decorations but can modify the rendering logic directly.
  • Pro: We don't need to expose tokens to extensions, saving a lot of headaches.
  • Con: We might exclude the community from customizing bracket pair colorization beyond what we offer.

Personally, I would love to do 1.1.2 + 1.c) (with 1.b as fallback and encouraging authors to fork).
However, I think the fastest and least risky way to get performant Bracket Pair Colorization in VS Code would be option 2.

@CoenraadS
Contributor

CoenraadS commented Jul 13, 2021

Author of Bracket Pair Colorizer here.

I am following this thread with interest. My extension is something that grew a bit out of control, and I grew tired of maintaining it.

If there are some quick wins, I can still apply them, but I think my extension is so hacky it is easier to do 1.b or 1.c

I coded this thing when I was still in college and it shows 👎
Maybe the first step would be just to disable BC1, it's so bad yet people don't upgrade to BC2 (which still has its own perf issues..)

@leodevbro

Author of Blockman here.

1.1. Expose Tokens To Extensions

This was the first thing I thought of when I started working on Blockman, and I think it's a very natural idea, because why on earth would an extension re-parse/re-tokenize the file if the host already does it? Internal access to tokens would be super useful, if not all tokens, then at least the tokens which represent the opening and closing locations of each nested block.

Extensions expect the tokens to be correct. However, with the current state of textmate and due to some performance optimizations, we cannot guarantee correctness.

Well, maybe, if it's possible, the extension API should have access to the parsing process of host tokens with some option to choose between strict (maximally correct) mode and fast mode. I think maximum correctness must be always ON at least for the tokens which represent the opening and closing locations of each nested block.

We also cannot offer a fully synchronous way to access tokens, as this could block the extension host.

I think that's fine. Blockman gets brackets asynchronously and it seems to work fine for most of the users.

@hediet
Member

hediet commented Jul 14, 2021

Thanks so much for reaching out! Your extensions not only make VS Code an even more amazing editor, but are also a source for inspiration!

If there are some quick wins, I can still apply them, but I think my extension is so hacky it is easier to do 1.b or 1.c

I had a quick look at your source, but I think a much more incremental data structure needs to be adopted to get the desired performance. I doubt this is low-hanging fruit.
In particular, a single keystroke should not take more than O(log(Number of Lines)) steps to update this data structure (and must not require retokenizing the entire file just for bracket colorization).
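
As a rough illustration of what a logarithmic update can look like, the sketch below keeps the net bracket count of each line in a Fenwick tree (binary indexed tree), so that changing one line and querying the nesting depth at the start of any line both take O(log n) steps. This is a swapped-in, much simpler technique than what a real colorizer needs: it ignores tokens (strings and comments), does not pair brackets, and does not support inserting or deleting lines. `LineDepthIndex` and `netBracketDelta` are hypothetical names.

```typescript
// Opening brackets minus closing brackets on one line (tokens are ignored in this sketch).
function netBracketDelta(text: string): number {
    let delta = 0;
    for (let i = 0; i < text.length; i++) {
        const c = text.charAt(i);
        if (c === "(" || c === "[" || c === "{") {
            delta++;
        } else if (c === ")" || c === "]" || c === "}") {
            delta--;
        }
    }
    return delta;
}

class LineDepthIndex {
    private readonly tree: number[] = [0]; // 1-based Fenwick tree over per-line deltas
    private readonly deltas: number[] = []; // current net delta of each line

    constructor(lines: string[]) {
        for (let i = 0; i < lines.length; i++) {
            this.tree.push(0);
            this.deltas.push(0);
        }
        // Building the index costs O(n log n); this happens once, not per keystroke.
        for (let i = 0; i < lines.length; i++) {
            this.setLine(i, lines[i]);
        }
    }

    // Called when the text of one line changes: O(log n).
    setLine(line: number, text: string): void {
        const delta = netBracketDelta(text);
        const diff = delta - this.deltas[line];
        this.deltas[line] = delta;
        for (let i = line + 1; i < this.tree.length; i += i & -i) {
            this.tree[i] += diff;
        }
    }

    // Nesting depth at the start of `line`, i.e. the sum of deltas of all earlier lines: O(log n).
    depthAtLineStart(line: number): number {
        let sum = 0;
        for (let i = line; i > 0; i -= i & -i) {
            sum += this.tree[i];
        }
        return sum;
    }
}
```

Prepending a `{` to the first line then changes the depth reported for every later line without touching those lines.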

Maybe the first step would be just to disable BC1, it's so bad yet people don't upgrade to BC2 (which still has its own perf issues..)

Did you experiment with notifications to inform users of BC2? When we have a more performant solution in place (however it may look), would you be open to helping us get users who struggle with BC2's performance to adopt that new solution?

because why on earth would an extension re-parse/re-tokenize the file if the host already does it?

The problem here is that at all costs we must prevent the renderer (who currently computes the tokens) from being slow. Synchronizing the tokens with the extension host could be slow, as the extension host is a separate process.

the tokens which represent the opening and closing locations of each nested block.

Unfortunately, the tokens produced by textmate grammars don't reflect blocks.

@leodevbro

The problem here is that at all costs we must prevent the renderer (who currently computes the tokens) from being slow. Synchronizing the tokens with the extension host could be slow, as the extension host is a separate process.

Why not give extension developers the option to access host tokens or not, so they can choose? They could test it for their specific case, and for some situations this approach may not be a deal breaker in terms of speed (even with async access).

Unfortunately, the tokens produced by textmate grammars don't reflect blocks.

What do you mean? Blockman uses the source code of Bracket Pair Colorizer 2 to find the locations of all brackets (which are by themselves the locations of block start and block end), and I believe BPC2 itself uses textmate, and so it provides all the locations (positions) of each bracket with this kind of array:

{
    char: string;
    type: number; // opening or closing
    inLineIndexZero: number;
    lineZero: number;
}[]

@hediet
Member

hediet commented Jul 14, 2021

I believe BPC2 itself uses textmate, and so it provides all the locations (positions) of each bracket with this kind of array:

BC2 seems to postprocess the tokens.

and maybe for some situations this approach will not be a deal breaker in terms of speed

As an extension author myself, I can fully understand you. However, from VS Code's perspective, slow extensions degrade not only the experience of the extension itself but that of the entire product. This is already the case for BC2: not only is bracket coloring slow for huge documents, but so are all the TypeScript features.
This is why we don't want to give out API that can easily cause performance issues.

@jpcastberg

I forked the extension and made a minor edit about a month ago to debounce the tokenizations. It was a major improvement for me with no noticeable change in responsiveness. https://github.com/jpcastberg/Performant-Bracket-Pair-Colorizer-2
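
For readers unfamiliar with the technique, debouncing can be sketched as below. This is a generic illustration and not the code from the fork; `debounce`, `Scheduler`, and `timeoutScheduler` are hypothetical names, and the scheduler is injectable only to make the sketch easy to test.

```typescript
type Cancel = () => void;
type Scheduler = (callback: () => void, delayMs: number) => Cancel;

// Default scheduler based on setTimeout.
const timeoutScheduler: Scheduler = (callback, delayMs) => {
    const handle = setTimeout(callback, delayMs);
    return () => clearTimeout(handle);
};

// Returns a wrapper that runs `fn` only after `delayMs` have passed without another call.
function debounce<A extends unknown[]>(
    fn: (...args: A) => void,
    delayMs: number,
    schedule: Scheduler = timeoutScheduler
): (...args: A) => void {
    let cancel: Cancel | undefined;
    return (...args: A) => {
        if (cancel) {
            cancel(); // drop the previously scheduled run
        }
        cancel = schedule(() => {
            cancel = undefined;
            fn(...args);
        }, delayMs);
    };
}
```

Wrapping an expensive retokenization in such a function means a burst of keystrokes triggers one run instead of one run per keystroke, at the cost of colors updating slightly later.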

@hediet
Member

hediet commented Aug 17, 2021

With the merge of #129231, today's Insiders build of VS Code now comes with native bracket pair colorization! We would love to hear your feedback and discuss potential migration paths!

Setting "editor.bracketPairColorization.enabled": true will enable it.

As announced, we chose to go with option 2 and the outcome shows that this was the right thing to do.

For checker.ts, this new colorizer is up to 10,000 times faster (it takes less than a millisecond to recolorize brackets when you prepend { to a ~40kLoC document, while the bracket pair colorizer extension needs more than 10 seconds, blocking the extension host in the meantime).

Key components of this implementation are:

  • A separate decoration tree only for bracket pairs that can be updated incrementally. This data structure is queried synchronously by the editor to get bracket pair decorations in the viewport. Find and update requests are usually handled in less than a millisecond, even for files like checker.ts.
    While this data structure could have been implemented in an extension as well, the extension host communication overhead would have made it impossible to get both sub-millisecond find and sub-millisecond update performance. Either you send all bracket decorations to the renderer process as soon as they change, which gives fast find times but means resending all brackets when a single { is prepended to the document (which would be too slow), or you only send bracket decorations to the renderer when they are requested, which would cause noticeable flickering whenever the document is scrolled, as renderer <-> extension host communication is async.
  • Lazily reusing the tokens of the renderer process. This new implementation does not force tokenization and updates as soon as token information becomes available. The initial scan for bracket pairs ignores tokens.

@CoenraadS
Contributor

CoenraadS commented Aug 17, 2021

@hediet Will VSCode also support colored line scopes?

e.g.
image

If it could that would be great, and my extension could be deprecated completely.

@hediet
Member

hediet commented Aug 17, 2021

Will VSCode also support colored line scopes?

See here: #131001

If it could that would be great, and my extension could be deprecated completely.

It would be awesome if you could show a one-time notification to users prompting them to enable the native bracket pair colorization!
If you are open to that, I can create a PR for your extension.

@CoenraadS
Contributor

I un-archived https://github.com/CoenraadS/BracketPair & https://github.com/CoenraadS/Bracket-Pair-Colorizer-2/ if you want to make a PR.

@leodevbro

leodevbro commented Aug 17, 2021

today's Insiders build of VS Code now comes with native bracket pair colorization!

Question: I know exposing tokens to extensions is a headache, but is it now possible and non-headache to expose only the positions of those brackets colored by native bracket pair colorization?
like this:
{type: "{" | "}" | "(" | ")" | "[" | "]"; line: number; column: number; }[]
or like this:
{type: "{" | "}" | "(" | ")" | "[" | "]"; globalStringIndex: number; }[]

No tree structure is needed, just a simple array of positions. It would be a huge speed upgrade for Blockman (already 19k installs).
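
To show why a flat array would already be enough for such a consumer, here is a sketch that derives nesting levels and color indices from the first shape proposed above. No such API exists; `colorize` and `ColoredBracket` are hypothetical names, and the array is assumed to be in document order and to contain only brackets that belong to code.

```typescript
interface BracketPosition {
    type: "{" | "}" | "(" | ")" | "[" | "]";
    line: number;
    column: number;
}

interface ColoredBracket extends BracketPosition {
    level: number;      // nesting depth of the pair this bracket belongs to
    colorIndex: number; // index into a color palette of size `paletteSize`
}

function colorize(brackets: BracketPosition[], paletteSize: number): ColoredBracket[] {
    let depth = 0;
    const result: ColoredBracket[] = [];
    for (const b of brackets) {
        const isOpening = b.type === "{" || b.type === "(" || b.type === "[";
        if (!isOpening && depth > 0) {
            depth--; // a closing bracket shares the level of its opening bracket
        }
        result.push({
            type: b.type,
            line: b.line,
            column: b.column,
            level: depth,
            colorIndex: depth % paletteSize,
        });
        if (isOpening) {
            depth++;
        }
    }
    return result;
}
```

This sketch does not check that bracket types match, so mismatched input such as `( ]` would still be colored as a pair.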

@hediet hediet modified the milestones: January 2022, February 2022 Jan 30, 2022
@hediet
Member

hediet commented Feb 21, 2022

Filed PRs here and here to improve the transition from the extension to the native feature.
Nothing left to do, thus closing this issue.

@hediet hediet closed this as completed Feb 21, 2022
@hediet
Member

hediet commented Mar 2, 2022

Reopening, as hundreds of thousands of users still seem to have the extension installed and new users are continuing to install both the old and the new version of the extension.

@CoenraadS what do you think about our PRs here and here for your bracket pair colorizer 2 extension to more prominently offer a way to migrate to the native feature?

@hediet hediet reopened this Mar 2, 2022
@hediet hediet modified the milestones: February 2022, March 2022 Mar 2, 2022
@Stanzilla

I guess the main problem is that the native implementation does not cover all the features yet. At least for me, that's the reason why I continue using the extensions.

@hediet
Member

hediet commented Mar 7, 2022

I guess the main problem is that the native implementation does not cover all the features yet. At least for me, that's the reason why I continue using the extensions.

What features do you miss? Our goal is not to have feature-parity, but to have just enough to make 80% of users of the old extension switch.

@Stanzilla

I guess the main problem is that the native implementation does not cover all the features yet. At least for me, that's the reason why I continue using the extensions.

What features do you miss? Our goal is not to have feature-parity, but to have just enough to make 80% of users of the old extension switch.

These two https://github.com/microsoft/vscode/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc+author%3AStanzilla

@hediet hediet modified the milestones: March 2022, April 2022 Mar 24, 2022
@CoenraadS
Contributor

I haven't been really active around this since for my work I'm using regular Visual Studio, and don't really touch VSCode anymore.

But yesterday I finally got around to merging the PRs against my extension to make migration easier. Then I tried to see how I could make the native functionality match the default settings of BPC2.

I would like to see:

#136475
#146030

Then people who migrate from my extension to native would not experience any different behavior, assuming most people simply use the default settings.

@hediet
Member

hediet commented Mar 25, 2022

Thanks for looking at the PRs! I will prioritize #136475 for next release!

@Kevin-Hamilton

This is the main blocker for me and my team to switch away from the extension: #143484

@hediet hediet modified the milestones: April 2022, May 2022 Apr 29, 2022
@ghost

ghost commented May 7, 2022

Thanks for mentioning the need for a token data query, @TheFanatr. I have written vscode-textmate-languageservice - a service that replicates Atom's ability to glean language provider data from token heuristics. These ideas are really powerful and in high demand, but the API to expose this data in a performant, platform-agnostic way isn't there.

Admittedly, I semi-regret the service, as there have been complaints that it causes painful slowdowns in the Matlab extension. 😦

@hediet hediet modified the milestones: May 2022, June 2022 Jun 2, 2022
@hediet hediet modified the milestones: June 2022, July 2022 Jul 1, 2022
@hediet hediet modified the milestones: July 2022, August 2022 Jul 29, 2022
@hediet hediet modified the milestones: August 2022, On Deck Aug 23, 2022
@hediet
Member

hediet commented Dec 2, 2022

Last week, only ~80k users used Bracket Pair Colorizer 1 (and the number is decreasing).
The same week, ~300k users used Bracket Pair Colorizer 2 (also decreasing). More than 99% of them have native colorization enabled though (which effectively disables the bracket pair colorizer 2 extension).

Thus I would say these extensions are no longer used and we can finally close this issue.

@hediet hediet closed this as completed Dec 2, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Jan 17, 2023
10 participants