Planning for Table of Content Block Functionality and Heading IDs #22874
Comments
If we create some kind of document outline API, we should probably include page break ( |
I'm looking forward to having it live.
I strongly recommend starting with just Heading blocks. This greatly simplifies things in both product and implementation terms, and removes hurdles to getting started. The question of whether there is an opportunity for supporting other blocks (perhaps via an API at the level of the block type or of the block proper) or for supporting HTML-level indexing of heading tags (in my opinion, something to avoid) can then be explored separately and on top of a finished base.
There are many parallels with the optional HTML anchor feature in core blocks. Recently, #23197 extended this feature to all static core blocks, and it's notable how everything hinges on block types adhering to the feature with a simple "supports": {
"tableOfContents": true
} Any block type declaring the above would be picked up by a ToC hook. This could then mean that such blocks automatically sport a control to include it in the ToC, or could mean a more subtle experience (e.g. adding an HTML anchor to a block that has |
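To make the opt-in idea concrete, here is a minimal sketch of how a ToC hook might pick up block types that declare the flag. It assumes a simplified registry of block types; the `tableOfContents` supports key is the hypothetical flag proposed above, not an existing Gutenberg feature, and `TocBlockType`, `supportsTableOfContents`, `tocCapableBlockNames`, and the `acme/advanced-heading` block are illustrative names only.

```typescript
// Simplified stand-in for a registered block type, reduced to the
// fields needed here. `tableOfContents` is the hypothetical opt-in
// flag from the discussion, not an existing supports key.
interface TocBlockType {
  blockTypeName: string;
  supports?: { anchor?: boolean; tableOfContents?: boolean };
}

// True only when the block type explicitly opts in.
function supportsTableOfContents(blockType: TocBlockType): boolean {
  return blockType.supports !== undefined && blockType.supports.tableOfContents === true;
}

// Names of all block types a ToC hook would pick up, in registry order.
function tocCapableBlockNames(registry: TocBlockType[]): string[] {
  return registry.filter(supportsTableOfContents).map((t) => t.blockTypeName);
}

const exampleRegistry: TocBlockType[] = [
  { blockTypeName: "core/heading", supports: { anchor: true, tableOfContents: true } },
  { blockTypeName: "core/paragraph", supports: { anchor: true } },
  // Hypothetical third-party block that opts in.
  { blockTypeName: "acme/advanced-heading", supports: { tableOfContents: true } },
  { blockTypeName: "core/html" },
];

console.log(tocCapableBlockNames(exampleRegistry));
```

The point of the design is that the ToC never needs a hard-coded list of heading-like blocks: anything that declares the flag participates, and everything else is ignored.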
@mcsf There's a bit of a problem with "just supporting Heading blocks" in the case of the Table of Contents block. That's easy to do in the editor, but on the front-end, it's a lot more difficult because the JS APIs are not available there. There's no awareness of blocks in the PHP file dynamically rendering the front-end output. So the front-end implementation ends up having to parse HTML, which results in inconsistency between it and the editor implementation. The Table of Contents block also needs to support paginated posts properly, and this also currently has to be done two different ways depending on if you're in the editor or the front-end. Right now, the Table of Contents block works perfectly on the front-end, but relies entirely on HTML parsing (which definitely isn't a performant way to handle it). I can't even change the PHP implementation to only work with core Heading blocks, because there's no concept of blocks anymore at that point. The only way to get the necessary data would be through something kinda like the block context system, and no such API relating to headings and page breaks currently exists. So as far as I can tell, it's not possible to provide a shippable Table of Contents block right now. There is no clean, simple solution, because what the block tries to do requires data that is currently only available by creating temporary clones of the post in memory to parse and scan for specific HTML tags and comment strings. As far as I can tell, the Table of Contents block needs a table of contents API. Specifically, here's what the Table of Contents block needs to know in both the editor and the front-end:
To provide this data, Heading blocks will likely have to provide this data to the API:
Page Break blocks will likely have to tell the API that they mark the start of a new page, and therefore all blocks following them should be considered to be on page 2 (or 3, and so on). All of the data requirements I have just listed are absolutely necessary to make the Table of Contents block work. If any one of these is not provided by some sort of API, then the block has to resort to messy HTML parsing. (Remember, you can't just provide a list of Heading block |
I don't follow; why is the ToC back end not consuming the output of the PHP block parser? Even if the server can't parse as fully as the block editor (stage I is block demarcation and explicit attribute parsing; stage II is full attribute sourcing, validation, migration, and is JS-only), there should be enough to get us started, and it will be much faster and safer than ad-hoc parsing of HTML. Things like pagination support are not necessarily trivial, but would fall into place as soon as we can use the proper parser on the server to clearly identify — always relying on blocks, not HTML — what is a heading, what is a page boundary, and what else is heading-like.
This might be something that the (environment-agnostic) block context API nicely solves. |
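As a rough illustration of relying on blocks rather than HTML, the sketch below walks a parsed block tree, bumps a page counter at each Page Break block, and records each core Heading with the page it lands on. It is written in TypeScript for illustration only; a real back-end version would operate on the PHP parser's output. The `ParsedBlockNode` shape is a reduced approximation of that output, and `OutlineEntry`, `stripTags`, and `collectOutline` are hypothetical names. Because the Heading block's text lives in its saved markup rather than in its delimiter attributes, the sketch still strips tags from that one block's own HTML.

```typescript
// Rough shape of one entry in the block parser's output, reduced to
// the fields used here (an approximation, not the full structure).
interface ParsedBlockNode {
  blockName: string | null; // null for freeform content between blocks
  attrs: { [key: string]: unknown };
  innerHTML: string;
  innerBlocks: ParsedBlockNode[];
}

interface OutlineEntry {
  level: number;
  text: string;
  page: number;
}

// Strip tags to get heading text. Tolerable here only because the input
// is the narrow, known markup of a single Heading block.
function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, "").trim();
}

// Depth-first walk in document order; a Page Break block means that
// everything after it belongs to the next page.
function collectOutline(blocks: ParsedBlockNode[]): OutlineEntry[] {
  const entries: OutlineEntry[] = [];
  let currentPage = 1;

  const visit = (nodes: ParsedBlockNode[]): void => {
    for (const node of nodes) {
      if (node.blockName === "core/nextpage") {
        currentPage += 1;
      } else if (node.blockName === "core/heading") {
        const rawLevel = node.attrs["level"];
        entries.push({
          level: typeof rawLevel === "number" ? rawLevel : 2, // Heading defaults to h2
          text: stripTags(node.innerHTML),
          page: currentPage,
        });
      }
      visit(node.innerBlocks);
    }
  };

  visit(blocks);
  return entries;
}

const sampleBlocks: ParsedBlockNode[] = [
  { blockName: "core/heading", attrs: {}, innerHTML: "<h2>Intro</h2>", innerBlocks: [] },
  { blockName: "core/paragraph", attrs: {}, innerHTML: "<p>Text</p>", innerBlocks: [] },
  { blockName: "core/nextpage", attrs: {}, innerHTML: "<!--nextpage-->", innerBlocks: [] },
  {
    blockName: "core/group",
    attrs: {},
    innerHTML: "",
    innerBlocks: [
      { blockName: "core/heading", attrs: { level: 3 }, innerHTML: "<h3>Details <em>here</em></h3>", innerBlocks: [] },
    ],
  },
];

console.log(collectOutline(sampleBlocks));
```

Third-party heading blocks and headings inside Custom HTML are simply invisible to this walk, which is exactly the trade-off under discussion.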
Hmm... I'd forgotten about the PHP block parser. Thanks for reminding me. You're right that I could use that on the PHP implementation. I'm currently not using it because my current implementation is still trying to support 3rd party heading blocks. If I switch to sourcing the data from block attributes, I have to drop support for all headings outside of the core Heading block. It's also worth noting that even headings in our own Custom HTML block will be ignored by a Table of Contents implementation that only checks Heading block attributes. My thinking was that if we had a table of contents API, we could at least update the Custom HTML block to provide data to the API so they would work as expected. Would a Table of Contents block that only supports core Heading and Next Page blocks be acceptable? It feels kind of wrong to ship it without 3rd party block support. But if desired, I can update my PR to work that way. Still, though, it seems less than ideal to parse the whole post for block data whenever it encounters a Table of Contents block. |
Also, I'm not certain that post pagination info can be provided through the block context API. If a whole post is considered a single source of data, how can it provide different answers to "what page am I on?"... it seems like you'd have to use "Page" blocks to divide up the post, rather than marker points like the current Next Page block. But maybe the block context API is more powerful than I think? |
Having thought about this for a while, it's clear to me now that block context can't solve this. Block context provides data from a parent to its children, but in the case of page breaks, there's no parent to provide this info. If we were to redesign WordPress from scratch, paginated posts could have been implemented via a "Page" block that would contain all the content that goes on that page. However, that's not how things are. Page breaks are determined at the seam between one and the other via the I don't want to prematurely abandon a potential path forward, though... so here's a question: would it be feasible to deprecate the This still doesn't solve the headings issue, however. As far as I can tell, we have to support 3rd-party heading blocks. Even within core, the Heading block isn't the only reasonable place to put an |
I think we have to accept that trade-offs will be made, and make a choice we can be happy with. Otherwise, this feature will crumble under the weight of its requirements. My own opinion is that we should optimise for:
and that this can come at the expense of:
The choices above are in order of preference. So I think it's better to ditch premature APIs than to ditch support for dynamic content. This makes it easier to let the editor itself generate a static ToC, but I think we can still leverage existing hooks in the WP back end and make sure the ToC is present at the top of each page. For example, in class-wp-query.php: $pages = apply_filters( 'content_pagination', $pages, $post );
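One way to picture the hook-based idea is as a pure function over the array of pages: it receives the page strings and returns the same number of pages, each with a ToC prepended. The sketch below models that in TypeScript for illustration; an actual `content_pagination` callback would be written in PHP. `PagedHeading`, `renderTocHtml`, and `prependTocToPages` are hypothetical names, the `?page=N` link format is a placeholder rather than WordPress's real permalink logic, and heading titles are assumed to be escaped already.

```typescript
// Hypothetical heading record: where it lives and how to link to it.
interface PagedHeading {
  title: string; // assumed to be HTML-escaped already
  pageNumber: number; // 1-based page the heading appears on
  anchorId: string;
}

// Build the ToC markup as seen from one particular page.
function renderTocHtml(headings: PagedHeading[], viewingPage: number): string {
  const items = headings.map((h) => {
    // Same-page entries need only the fragment; entries on other pages
    // also need a page reference (placeholder URL format).
    const href =
      h.pageNumber === viewingPage
        ? "#" + h.anchorId
        : "?page=" + h.pageNumber + "#" + h.anchorId;
    return '<li><a href="' + href + '">' + h.title + "</a></li>";
  });
  return "<ol>" + items.join("") + "</ol>";
}

// Filter-style function: same number of pages out as in, each with its
// own ToC at the top.
function prependTocToPages(pages: string[], headings: PagedHeading[]): string[] {
  return pages.map((pageHtml, index) => renderTocHtml(headings, index + 1) + pageHtml);
}

const examplePages: string[] = ["<p>A</p>", "<p>B</p>"];
const exampleHeadings: PagedHeading[] = [
  { title: "Intro", pageNumber: 1, anchorId: "intro" },
  { title: "More", pageNumber: 2, anchorId: "more" },
];

console.log(prependTocToPages(examplePages, exampleHeadings));
```

Note that this only covers a ToC fixed at the top of each page; it does not address a ToC block placed at an arbitrary position or used several times in one post.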
Just to be clear, do you think we should support 3rd-party heading blocks or not? There are already many plugins that add some variation of an "advanced heading" block, including:
And this isn't taking into account any other blocks that use headings like accordion blocks.
I don't think I understand what you're trying to say here. My Table of Contents block can be placed anywhere from the start of the page to the very end, and there can be multiple instances of it. (This is useful for allowing each page of a paginated post to have its own table of contents.) It's also worth pointing out that the reason my Table of Contents block is dynamic is that altering the static output every time a heading changed resulted in two undo steps being created rather than just one. Pressing undo just once would change the table of contents, but not the heading. So unless someone can come up with an alternative solution there, the Table of Contents block has to be completely dynamic. I do agree it would be best to try and solve this problem without introducing new APIs if possible. To that end, I've tried my best to complete the Table of Contents block in #21234, and at the moment the implementation certainly works in all likely situations, but I am concerned about the performance of the block, and there are a few edge cases that I can't handle without adding even more performance overhead. If you have any suggestions on how to proceed there, let me know.
I think it's fine to keep it dynamic, as long as the block in the editor still accurately represents the final output. That said, just to touch on the undo question — in case you aren't familiar with it yet —
Thanks for the work you're doing there. I've been meaning to review, and I think it's a great feature, but haven't found enough time yet.
In the long run, the editor should understand that, beyond Other efforts out there, such as semantic template parts (#27337), deal with a similar ontological problem. Even if the domain is very different — templates and template parts — it's something to keep an eye on and learn from. As always, the duty and luxury with Gutenberg is that we're building for the long run. So we can afford to take time to get some of these things right. I mean, just look at how many times we've visited footnotes (#1890) over nearly four years! So, to distill my original message: let's start by building a good ToC block in that it works well, feels right, and treats user data well. Only then should we worry about widening the reach of that feature. |
This issue continues a discussion started during a Core Editor chat about the functionality of a Table of Contents (TOC) block. Currently, there are several PRs/issues that provide possible solutions.
Add Table of Contents block (dynamic rendering + hooks version) PR #21234
"Table of Contents" Block PR #11047
#15426 (Closed) PR #15426
From a technical standpoint, when working with a TOC block, how are items that aren't in blocks, like headings and next page tags, counted, and how is it determined whether those items precede the current block? Counting Heading blocks is relatively easy, but counting all headings in the HTML is more difficult, and counting all headings in the HTML preceding the current block seems impossible in some situations. The challenge is compounded when the headings are inside a dynamic block.
Resolving these questions impacts:
Specific challenges that need feedback are:
Possible solutions include:
@ZebulanStanphill @mtias @youknowriad @MichaelArestad contributed to the original conversation. Additional feedback here is welcome.