GitHub code navigation helps you to read, navigate, and understand code by linking definitions of named symbols (like a class or method) to references to that symbol, as well as linking references to the symbol's definition. GitHub has developed two code navigation approaches:
- Search-based: searches all definitions and references across a repository to find symbols with a given name
- Precise: resolves definitions and references based on the set of classes, functions, and imported definitions at a given point in your code
Search-based code navigation is implemented using the Tree-sitter parser ecosystem. A few languages support precise code navigation, built with stack graphs.
For more information, see "Navigating code on GitHub."
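To make the difference between the two approaches concrete, here is a toy Python sketch. It is an illustration only: the index contents, the import table, and the function names `search_based` and `precise` are hypothetical, and it does not reflect how Tree-sitter tagging or stack graphs are actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    name: str
    path: str
    line: int


# Hypothetical index of every definition found in a repository.
DEFINITIONS = [
    Definition("parse", "json_utils.py", 10),
    Definition("parse", "xml_utils.py", 4),
    Definition("render", "views.py", 22),
]

# Hypothetical record of which file each imported name comes from,
# keyed by the file that contains the reference.
IMPORTS = {
    "app.py": {"parse": "json_utils.py"},
}


def search_based(name):
    """Return every definition in the repository with a matching name."""
    return [d for d in DEFINITIONS if d.name == name]


def precise(name, referencing_file):
    """Return only the definition that `name` is bound to in `referencing_file`."""
    target = IMPORTS.get(referencing_file, {}).get(name)
    return [d for d in DEFINITIONS if d.name == name and d.path == target]
```

In this sketch, `search_based("parse")` returns both `parse` definitions, while `precise("parse", "app.py")` returns only the one that `app.py` imports.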
Code navigation is supported for the following languages:
Language | Search-based | Precise |
---|---|---|
Bash | ✔️ | ✖️ |
C# | ✔️ | ✖️ |
C++ | ✔️ | ✖️ |
CodeQL | ✔️ | ✖️ |
Elixir | ✔️ | ✖️ |
Go | ✔️ | ✖️ |
JSX | ✔️ | ✖️ |
Java | ✔️ | ✖️ |
JavaScript | ✔️ | ✖️ |
Lua | ✔️ | ✖️ |
PHP | ✔️ | ✖️ |
Protocol Buffers | ✔️ | ✖️ |
Python | ✔️ | ✔️ |
R | ✔️ | ✖️ |
Ruby | ✔️ | ✖️ |
Rust | ✔️ | ✖️ |
Scala | ✔️ | ✖️ |
Starlark | ✔️ | ✖️ |
Swift | ✔️ | ✖️ |
TypeScript | ✔️ | ✔️ |
If your programming language is not listed above, you can help us add it.
To add code navigation for a new language, you must follow these steps:
- Add the language to Linguist.
- Define a Tree-sitter parser for the language.
- Write tags queries.
- Write fully-qualified name queries (if applicable).
- Open an issue in this repo.
For details, see below.
Note
Adding a language is at the discretion of GitHub. We may not add every language. Common reasons to reject language support include an immature Tree-sitter parser, excessive resources required to parse, or low use on GitHub.
First, the language must be added to Linguist. Linguist is the source of truth for all languages on GitHub.
You can check to see if the language exists in Linguist by searching the languages.yml file. If your language is not included in Linguist, follow the contribution guidelines to get it added.
Next, we require a mature, well-maintained Tree-sitter parser for the language. The parser must be published as a Rust crate on crates.io.
Most popular programming languages already have a Tree-sitter grammar, but if you need to create one, you can review the documentation for creating a new parser.
Once the language has a Tree-sitter parser, you need to write tag queries to extract the structure of the code for navigation. A tag query is a Scheme-like expression that navigates the Abstract Syntax Tree generated by the Tree-sitter parser to extract a symbol. You can look at existing Tree-sitter parsers for inspiration. Parsers usually contain a file called tags.scm with tag queries (for example, see the JavaScript tag queries). Additionally, Tree-sitter has documentation about using tags queries for code navigation.
GitHub code navigation supports extracting definitions for these types of symbols:
Category | Tag |
---|---|
Class | @definition.class |
Constant | @definition.constant |
Enum | @definition.enum |
Enum variant | @definition.enum_variant |
Field | @definition.field |
Function | @definition.function |
Implementation | @definition.implementation |
Interface | @definition.interface |
Macro | @definition.macro |
Method | @definition.method |
Module | @definition.module |
Struct | @definition.struct |
Trait | @definition.trait |
Type | @definition.type |
Union | @definition.union |
Additionally, references to function or method calls can be extracted as @reference.call.
Not all programming languages support all of these symbol types. The tag queries should contain only those that make sense for your programming language.
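When drafting a tags.scm file, it can help to check that every capture name is one of the tags named in this document. The following Python sketch does that with a regular expression. It is a hypothetical helper, not part of any GitHub or Tree-sitter tooling, and its comment handling is deliberately naive (it does not account for `;` or `@` inside string literals).

```python
import re

# Tag names taken from this document. "definition.method" is included
# because the Java and Go examples in this document use it.
SUPPORTED_TAGS = {
    "definition.class", "definition.constant", "definition.enum",
    "definition.enum_variant", "definition.field", "definition.function",
    "definition.implementation", "definition.interface", "definition.macro",
    "definition.method", "definition.module", "definition.struct",
    "definition.trait", "definition.type", "definition.union",
    "reference.call",
}

# Auxiliary captures that this document's example queries also use.
HELPER_CAPTURES = {"name", "scope"}

# Matches capture names such as @name or @definition.class.
CAPTURE_RE = re.compile(r"@([A-Za-z_]\w*(?:\.\w+)*)")


def unsupported_captures(query_source):
    """Return capture names in the query that this document does not mention."""
    found = set()
    for line in query_source.splitlines():
        code = line.split(";", 1)[0]  # drop trailing ';' comments (naive)
        found.update(CAPTURE_RE.findall(code))
    return found - SUPPORTED_TAGS - HELPER_CAPTURES
```

An empty result means every capture in the file is one of the names listed here. A non-empty result is only a hint to double-check, since real tags.scm files may use other captures that this sketch does not know about.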
For languages that support defining functions, methods, or other entities within another structure, GitHub code navigation supports extracting fully-qualified names. Fully-qualified names are used to improve code navigation as well as the relevance of search results.
Here is an example from our Java extractor. The following Java code defines a class named Cat that has a method named noise:
    public class Cat {
        public String noise() {
            return "meow";
        }
    }
Our tag queries extract @definition.class and @definition.method and tag the identifiers with @name:
    (class_declaration name: (identifier) @name) @definition.class
    (method_declaration name: (identifier) @name) @definition.method
The extracted identifier names are used to prefix the method name (noise) with its container's name (Cat), resulting in the fully-qualified name Cat::noise.
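The prefixing step can be sketched in Python as follows. This is a simplified illustration, not the extractor's actual implementation: the `Tag` record, the use of line ranges to detect nesting, and the `qualified_names` function are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    kind: str    # e.g. "class" or "method"
    name: str    # text captured by @name
    start: int   # first line of the definition
    end: int     # last line of the definition


def qualified_names(tags, separator="::"):
    """Prefix each tag's name with the names of the definitions that enclose it."""
    names = []
    for tag in tags:
        # A definition encloses `tag` if its range strictly contains tag's range.
        enclosing = [
            other for other in tags
            if other.start <= tag.start and tag.end <= other.end
            and (other.start, other.end) != (tag.start, tag.end)
        ]
        # Order the containers from outermost to innermost.
        enclosing.sort(key=lambda t: (t.start, -t.end))
        names.append(separator.join([t.name for t in enclosing] + [tag.name]))
    return names


# Line ranges for the Java example: Cat spans lines 1-5, noise spans lines 2-4.
java_tags = [Tag("class", "Cat", 1, 5), Tag("method", "noise", 2, 4)]
```

With these inputs, `qualified_names(java_tags)` yields `Cat` for the class and `Cat::noise` for the method.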
However, not all languages define nested items within the container. For example, Go has methods, but they are defined separately from the struct they belong to:
    type Cat struct {}
    func (c Cat) Noise() string {
        return "meow"
    }
To implement fully-qualified names for languages like Go, GitHub code navigation adds a @scope capture name:
    (method_declaration
      receiver: (parameter_list (parameter_declaration type: (type_identifier) @scope))
      name: (field_identifier) @name
    ) @definition.method
Our extractor uses the @scope capture to create the fully-qualified name Cat.Noise.
If your language supports nested entities that are defined separately, include a @scope capture for best results with GitHub code navigation.
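As a rough Python illustration of how a @scope capture could feed into a fully-qualified name, the sketch below joins the two captured strings. The `captures` dictionary shape, the `qualified_name` function, and the default separator are assumptions made for this example and are not the extractor's real interface.

```python
def qualified_name(captures, separator="."):
    """Join the optional @scope capture with the @name capture."""
    name = captures["name"]
    scope = captures.get("scope")  # absent when the query has no @scope capture
    return f"{scope}{separator}{name}" if scope else name


# Captured text for the Go example: @scope is the receiver type, @name the method.
go_method = {"scope": "Cat", "name": "Noise"}
```

Here `qualified_name(go_method)` produces `Cat.Noise`, and a match without a @scope capture falls back to the bare name.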
Finally, create an issue in this repository. We will evaluate adding the parser to the code search indexing system.
This project is licensed under the terms of the MIT open source license. Please refer to the license for the full terms.
This project is maintained by members of the GitHub code search team.
Please file an issue for support. See SUPPORT.md for details.
GitHub code navigation is made possible by the Tree-sitter ecosystem and all the Tree-sitter parser maintainers. Thank you!