-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
go/ast: add func IsGenerated(*File) bool #28089
Comments
Is the only concern about the regular expression its speed? If so, I'd rather improve the speed of regexp instead. There are some performance issues for the package already, so one of them may already cover this particular expression: #11646 #26623 #21463 I realise that the regex package may take a while to be fast even for this one case, but I don't think there's a sense of urgency to have a faster implementation of it in |
Can the predicate be expressed as a function of the AST? Very often the tools that need this predicate have already parsed the file.
It seems like a good fit for the x/tools/go/astutil package since it doesn't add any dependencies. |
Regular expressions are great for experimenting with complicated patterns, but you don't need the regexp package for an expression this trivial:
|
This POSIX-compliant grep command has worked fine for my needs in shell scripts:
In my case, I'm not concerned about a false positive of that line occurring in a block comment or in a raw string literal. |
No. I wanted to avoid the The
Good question. I'd be curious to try. I expect it should be possible, but I'm not sure if it'd be faster. Note that not only comments would have to be checked in the AST, but raw string literals as well. A .go file with the following code needs to be reported as a positive match for the pattern: package p
import "fmt"
func Foo() {
const s = `this is a raw string literal. it happens to contain
some text in column 1, including a string that matches
the "Code generated ... DO NOT EDIT." comment, like so:
// Code generated DO NOT EDIT.
to product a correct result, it needs to be detected`
fmt.Println(len(s))
} Since it does in fact contain a line of text that matches the regular expression
Indeed. It's very quick and easy to implement a parser for https://golang.org/s/generatedcode format in any language that has support for regular expressions. This package is meant to be available for use by tools that are written in Go, and prefer to avoid incurring the cost of spawning a process or importing the
Those are not false positives. According to the spec, a file is considered generated if a matching line of text appears anywhere in the file, which can include block comments and raw string literals. |
Good to know that grepping this way shouldn't hit any false positives then :)
Anecdotally: my most frequent case for needing to check whether a file is auto-generated has been to manually filter through |
This would be a 1-declaration 1-line package. We try to avoid those. Please propose a new method or top-level function in an existing package, like probably go/ast. |
Ping @dmitshur; see previous comment. |
I agree with your comment @rsc that this would be a small package, and it's better to avoid that in x/tools. I've been thinking about where this functionality could fit into existing packages, and I have some initial thoughts I can share. Please note this isn't a final proposal; just an iteration I want to share. I'm not completely happy with it and wouldn't want it implemented as is. WIP Counter-Proposal Revision 1Out of the existing packages, this functionality seemed to belong most in // A Mode value is a set of flags (or 0).
// They control the amount of source code parsed and other optional
// parser functionality.
//
type Mode uint
const (
PackageClauseOnly Mode = 1 << iota // stop parsing after package clause
ImportsOnly // stop parsing after import declarations
ParseComments // parse comments and add them to AST
+ GeneratedComment // determine whether the file contains a "// Code generated … DO NOT EDIT." line comment
Trace // print a trace of parsed productions
DeclarationErrors // report declaration errors
SpuriousErrors // same as AllErrors, for backward-compatibility
AllErrors = SpuriousErrors // report all errors (not just the first 10 on different lines)
) When the mode is turned on, (There is some precedent to this in the In order for // A File node represents a Go source file.
//
// [...]
type File struct {
Doc *CommentGroup // associated documentation; or nil
Package token.Pos // position of "package" keyword
Name *Ident // package name
Decls []Decl // top-level declarations; or nil
Scope *Scope // package scope (this file only)
Imports []*ImportSpec // imports in this file
Unresolved []*Ident // unresolved identifiers in this file
Comments []*CommentGroup // list of all comments in the source file
+ Generated *Comment // a "// Code generated … DO NOT EDIT." line comment; or nil
} At first, this seemed like the most appropriate place in existing packages to add this functionality. However, after considering the details more closely, it doesn't seem like a good fit. The "// Code generated … DO NOT EDIT." comment lives in parallel to the actual AST structure of a file. According to the spec, such a comment can appear inside a block comment, and it would still count. It can also appear within a raw string literal, and match. The work that must be done to determine if a .go file has such a comment has very little to do with parsing the actual AST of a .go file. WIP Counter-Proposal Revision 2An alternative proposal to consider is to add this functionality to // If ImportComment is set, parse import comments on package statements.
// Import returns an error if it finds a comment it cannot understand
// or finds conflicting comments in multiple source files.
// See golang.org/s/go14customimport for more information.
ImportComment
+
+// If GeneratedComment is set, determine which .go files contain
+// a "// Code generated … DO NOT EDIT." line comment,
+// and populate GeneratedGoFiles with the results.
+// See golang.org/s/generatedcode for more information.
+GeneratedComment The result would be populated in a new type Package struct {
// ...
// Source files
GoFiles []string // .go source files (excluding CgoFiles, TestGoFiles, XTestGoFiles)
CgoFiles []string // .go source files that import "C"
IgnoredGoFiles []string // .go source files ignored for this build
InvalidGoFiles []string // .go source files with detected problems (parse error, wrong package name, and so on)
+ GeneratedGoFiles []string // .go source files containing a contains a "// Code generated … DO NOT EDIT." line comment; populated only if GeneratedComment mode is set
CFiles []string // .c source files
// ... This approach seems to be cleaner than adding something to There's also the factor that A "WIP Counter-Proposal Revision 3" section can/should be added where a similar approach as in revision 2 can be evaluated for the ConclusionI would lean towards either putting this proposal on hold (or waiting to iterate on it further), or outright declining it. I made the proposal because I was planning to move the existing package, and wanted to check if there'd be a more fitting place for it in Additionally, given how easy it is to parse the generated comment with just regexp, there isn't a high demand for this functionality as far as I can tell (e.g., |
@dmitshur Let's discuss this in person. Per @golang/proposal-review we're ok with adding a utility function iff we can decide on a good place for it and a good API. |
Change https://golang.org/cl/158817 mentions this issue: |
This CL attempts to restore the clarity of the original specification at https://golang.org/s/generatedcode that the line may appear anywhere. It is preferable (for human readability), and most common for it to be early in the file, but that is merely a convention, not a strict well-specified requirement. Document it as so. Background Issue #13560 was a proposal define a standard for marking files as generated, one that is suitable to be recognized both by humans and machine tools. It was accepted, and the final specification was documented at https://golang.org/s/generatedcode. Its text, copied exactly: Generated files are marked by a line of text that matches the regular expression, in Go syntax: ^// Code generated .* DO NOT EDIT\.$ The .* means the tool can put whatever folderol it wants in there, but the comment must be a single line and must start with Code generated and end with DO NOT EDIT., with a period. The text may appear anywhere in the file. The https://golang.org/s/generatedcode link points to a comment in a very large GitHub issue. That makes it harder to find. Issue #25433 was opened about moving that information somewhere else. It was resolved via CL 118756, which added text to cmd/go documentation at https://golang.org/cmd/go/#hdr-Generate_Go_files_by_processing_source: To convey to humans and machine tools that code is generated, generated source should have a line early in the file that matches the following regular expression (in Go syntax): ^// Code generated .* DO NOT EDIT\.$ The CL description noted that "This change merely moves that information to a more visible place." The intention was to preserve the specification unmodified. The original specification was very clear that "The text may appear anywhere in the file." The new text in cmd/go documentation wasn't very clear. "A line early in the file" is not a precise enough criteria to be recognized by a machine tool, because there isn't a precise definition of what lines are "early in the file". Updates #13560 Updates #25433 Updates #28089 Change-Id: I4e374163b16c3f972f9591ec2647fd3d5a2dd5ae Reviewed-on: https://go-review.googlesource.com/c/158817 Reviewed-by: Rob Pike <[email protected]>
This package has been moved to a new home. Its new import path is dmitri.shuralyov.com/go/generated. Deprecate this old copy in favor of the new one. Replace the implementation with a forwarding stub. Updates golang/go#28089
I'm reluctant to add more fields to structs - it seems like a bit overkill. It seems like we could use the current parser with PackageClauseOnly | ParseComments
What do you think? |
This proposal has been added to the active column of the proposals project |
The suggestion in #28089 (comment) seems good to me so far. I'll elaborate on some minor details below. Since there's no I think using I wrote this package primarily as an exercise to test that it would be possible to implement a parser for the spec proposed in #13560, since that was an important goal. I haven't yet used it anywhere. pkg.go.dev reports one known importer in |
Based on the discussion above, this proposal seems like a likely accept. |
No change in consensus, so accepted. 🎉 |
Change https://go.dev/cl/487935 mentions this issue: |
Given that #13560 has been accepted, resolved, and by now, widely accepted by the Go community, I think it can be helpful to have a Go parser for it that tools written in Go could use (if desired).
It's relatively easy to write an ad hoc parser using the
regexp
package, but it's also possible to write a more specialized one that has less overhead.I already wrote one a while ago, and it currently lives at
github.com/shurcooL/go/generated
.I want to move it out of the repository it's currently in, which contains many miscellaneous Go packages of lower utility and quality. I was originally planning to move it out into a standalone repository on my personal site, but then I thought it might be a good fit under
x/tools
subrepo, specifically, in thex/tools/go
directory, since it deals with Go code. The proposed import path would be:Hence this proposal. If accepted, I'm happy to maintain it/be the owner. The scope is very narrow, so it should be very low volume of work.
Not sure how this intersects with #17244.
If not accepted, I would likely move it here instead:
(The code is currently MIT licensed, but in either case, I'd relicense it under the Go license.)
/cc @andybons @bradfitz @alandonovan @matloob @ianthehat
The text was updated successfully, but these errors were encountered: