| 1 | // Copyright 2018 The Go Authors. All rights reserved. |
|---|---|
| 2 | // Use of this source code is governed by a BSD-style |
| 3 | // license that can be found in the LICENSE file. |
| 4 | |
| 5 | /* |
| 6 | Package analysis defines the interface between a modular static |
| 7 | analysis and an analysis driver program. |
| 8 | |
| 9 | # Background |
| 10 | |
| 11 | A static analysis is a function that inspects a package of Go code and |
| 12 | reports a set of diagnostics (typically mistakes in the code), and |
| 13 | perhaps produces other results as well, such as suggested refactorings |
| 14 | or other facts. An analysis that reports mistakes is informally called a |
| 15 | "checker". For example, the printf checker reports mistakes in |
| 16 | fmt.Printf format strings. |
| 17 | |
| 18 | A "modular" analysis is one that inspects one package at a time but can |
| 19 | save information from a lower-level package and use it when inspecting a |
| 20 | higher-level package, analogous to separate compilation in a toolchain. |
| 21 | The printf checker is modular: when it discovers that a function such as |
| 22 | log.Fatalf delegates to fmt.Printf, it records this fact, and checks |
| 23 | calls to that function too, including calls made from another package. |
| 24 | |
| 25 | By implementing a common interface, checkers from a variety of sources |
| 26 | can be easily selected, incorporated, and reused in a wide range of |
| 27 | driver programs including command-line tools (such as vet), text editors and |
| 28 | IDEs, build and test systems (such as go build, Bazel, or Buck), test |
| 29 | frameworks, code review tools, code-base indexers (such as SourceGraph), |
| 30 | documentation viewers (such as godoc), batch pipelines for large code |
| 31 | bases, and so on. |
| 32 | |
| 33 | # Analyzer |
| 34 | |
| 35 | The primary type in the API is Analyzer. An Analyzer statically |
| 36 | describes an analysis function: its name, documentation, flags, |
| 37 | relationship to other analyzers, and of course, its logic. |
| 38 | |
| 39 | To define an analysis, a user declares a (logically constant) variable |
| 40 | of type Analyzer. Here is a typical example from one of the analyzers in |
| 41 | the go/analysis/passes/ subdirectory: |
| 42 | |
| 43 | package unusedresult |
| 44 | |
| 45 | var Analyzer = &analysis.Analyzer{ |
| 46 | Name: "unusedresult", |
| 47 | Doc: "check for unused results of calls to some functions", |
| 48 | Run: run, |
| 49 | ... |
| 50 | } |
| 51 | |
| 52 | func run(pass *analysis.Pass) (interface{}, error) { |
| 53 | ... |
| 54 | } |
| 55 | |
| 56 | An analysis driver is a program such as vet that runs a set of |
| 57 | analyses and prints the diagnostics that they report. |
| 58 | The driver program must import the list of Analyzers it needs. |
| 59 | Typically each Analyzer resides in a separate package. |
| 60 | To add a new Analyzer to an existing driver, add another item to the list: |
| 61 | |
| 62 | import ( "unusedresult"; "nilness"; "printf" ) |
| 63 | |
| 64 | var analyses = []*analysis.Analyzer{ |
| 65 | unusedresult.Analyzer, |
| 66 | nilness.Analyzer, |
| 67 | printf.Analyzer, |
| 68 | } |
| 69 | |
| 70 | A driver may use the name, flags, and documentation to provide on-line |
| 71 | help that describes the analyses it performs. |
| 72 | The Doc field contains a brief one-line summary, |
| 73 | optionally followed by paragraphs of explanation. |
| 74 | |
| 75 | The Analyzer type has more fields besides those shown above: |
| 76 | |
| 77 | type Analyzer struct { |
| 78 | Name string |
| 79 | Doc string |
| 80 | Flags flag.FlagSet |
| 81 | Run func(*Pass) (interface{}, error) |
| 82 | RunDespiteErrors bool |
| 83 | ResultType reflect.Type |
| 84 | Requires []*Analyzer |
| 85 | FactTypes []Fact |
| 86 | } |
| 87 | |
| 88 | The Flags field declares a set of named (global) flag variables that |
| 89 | control analysis behavior. Unlike in vet, analysis flags are not declared |
| 90 | directly in the command-line FlagSet; it is up to the driver to set the |
| 91 | flag variables. A driver for a single analysis, a, might expose its flag |
| 92 | f directly on the command line as -f, whereas a driver for multiple |
| 93 | analyses might prefix the flag name by the analysis name (-a.f) to avoid |
| 94 | ambiguity. An IDE might expose the flags through a graphical interface, |
| 95 | and a batch pipeline might configure them from a config file. |
| 96 | See the "findcall" analyzer for an example of flags in action. |
| 97 | |
| 98 | The RunDespiteErrors field indicates whether the analysis is equipped to |
| 99 | handle ill-typed code. If not, the driver will skip the analysis if |
| 100 | there are parse or type errors. |
| 101 | The optional ResultType field specifies the type of the result value |
| 102 | computed by this analysis and made available to other analyses. |
| 103 | The Requires field specifies a list of analyses upon which |
| 104 | this one depends and whose results it may access, and it constrains the |
| 105 | order in which a driver may run analyses. |
| 106 | The FactTypes field is discussed in the section on Modularity. |
| 107 | The analysis package provides a Validate function to perform basic |
| 108 | sanity checks on an Analyzer, such as that its Requires graph is |
| 109 | acyclic, its fact and result types are unique, and so on. |
| 110 | |
| 111 | Finally, the Run field contains a function to be called by the driver to |
| 112 | execute the analysis on a single package. The driver passes it an |
| 113 | instance of the Pass type. |
| 114 | |
| 115 | # Pass |
| 116 | |
| 117 | A Pass describes a single unit of work: the application of a particular |
| 118 | Analyzer to a particular package of Go code. |
| 119 | The Pass provides information to the Analyzer's Run function about the |
| 120 | package being analyzed, and provides operations to the Run function for |
| 121 | reporting diagnostics and other information back to the driver. |
| 122 | |
| 123 | type Pass struct { |
| 124 | Fset *token.FileSet |
| 125 | Files []*ast.File |
| 126 | OtherFiles []string |
| 127 | IgnoredFiles []string |
| 128 | Pkg *types.Package |
| 129 | TypesInfo *types.Info |
| 130 | ResultOf map[*Analyzer]interface{} |
| 131 | Report func(Diagnostic) |
| 132 | ... |
| 133 | } |
| 134 | |
| 135 | The Fset, Files, Pkg, and TypesInfo fields provide the syntax trees, |
| 136 | type information, and source positions for a single package of Go code. |
| 137 | |
| 138 | The OtherFiles field provides the names, but not the contents, of non-Go |
| 139 | files such as assembly that are part of this package. See the "asmdecl" |
| 140 | or "buildtag" analyzers for examples of loading non-Go files and reporting |
| 141 | diagnostics against them. |
| 142 | |
| 143 | The IgnoredFiles field provides the names, but not the contents, |
| 144 | of ignored Go and non-Go source files that are not part of this package |
| 145 | with the current build configuration but may be part of other build |
| 146 | configurations. See the "buildtag" analyzer for an example of loading |
| 147 | and checking IgnoredFiles. |
| 148 | |
| 149 | The ResultOf field provides the results computed by the analyzers |
| 150 | required by this one, as expressed in its Analyzer.Requires field. The |
| 151 | driver runs the required analyzers first and makes their results |
| 152 | available in this map. Each Analyzer must return a value of the type |
| 153 | described in its Analyzer.ResultType field. |
| 154 | For example, the "ctrlflow" analyzer returns a *ctrlflow.CFGs, which |
| 155 | provides a control-flow graph for each function in the package (see |
| 156 | golang.org/x/tools/go/cfg); the "inspect" analyzer returns a value that |
| 157 | enables other Analyzers to traverse the syntax trees of the package more |
| 158 | efficiently; and the "buildssa" analyzer constructs an SSA-form |
| 159 | intermediate representation. |
| 160 | Each of these Analyzers extends the capabilities of later Analyzers |
| 161 | without adding a dependency to the core API, so an analysis tool pays |
| 162 | only for the extensions it needs. |
| 163 | |
| 164 | The Report function emits a diagnostic, a message associated with a |
| 165 | source position. For most analyses, diagnostics are their primary |
| 166 | result. |
| 167 | For convenience, Pass provides a helper method, Reportf, to report a new |
| 168 | diagnostic by formatting a string. |
| 169 | Diagnostic is defined as: |
| 170 | |
| 171 | type Diagnostic struct { |
| 172 | Pos token.Pos |
| 173 | Category string // optional |
| 174 | Message string |
| 175 | } |
| 176 | |
| 177 | The optional Category field is a short identifier that classifies the |
| 178 | kind of message when an analysis produces several kinds of diagnostic. |
| 179 | |
| 180 | The Diagnostic struct does not have a field to indicate its severity |
| 181 | because opinions about the relative importance of Analyzers and their |
| 182 | diagnostics vary widely among users. The design of this framework does |
| 183 | not hold each Analyzer responsible for identifying the severity of its |
| 184 | diagnostics. Instead, we expect that drivers will allow the user to |
| 185 | customize the filtering and prioritization of diagnostics based on the |
| 186 | producing Analyzer and optional Category, according to the user's |
| 187 | preferences. |
| 188 | |
| 189 | Most Analyzers inspect typed Go syntax trees, but a few, such as asmdecl |
| 190 | and buildtag, inspect the raw text of Go source files or even non-Go |
| 191 | files such as assembly. To report a diagnostic against a line of a |
| 192 | raw text file, use the following sequence: |
| 193 | |
| 194 | content, err := os.ReadFile(filename) |
| 195 | if err != nil { ... } |
| 196 | tf := pass.Fset.AddFile(filename, -1, len(content)) |
| 197 | tf.SetLinesForContent(content) |
| 198 | ... |
| 199 | pass.Reportf(tf.LineStart(line), "oops") |
| 200 | |
| 201 | # Modular analysis with Facts |
| 202 | |
| 203 | To improve efficiency and scalability, large programs are routinely |
| 204 | built using separate compilation: units of the program are compiled |
| 205 | separately, and recompiled only when one of their dependencies changes; |
| 206 | independent modules may be compiled in parallel. The same technique may |
| 207 | be applied to static analyses, for the same benefits. Such analyses are |
| 208 | described as "modular". |
| 209 | |
| 210 | A compiler’s type checker is an example of a modular static analysis. |
| 211 | Many other checkers we would like to apply to Go programs can be |
| 212 | understood as alternative or non-standard type systems. For example, |
| 213 | vet's printf checker infers whether a function has the "printf wrapper" |
| 214 | type, and it applies stricter checks to calls of such functions. In |
| 215 | addition, it records which functions are printf wrappers for use by |
| 216 | later analysis passes to identify other printf wrappers by induction. |
| 217 | A result such as “f is a printf wrapper” that is not interesting by |
| 218 | itself but serves as a stepping stone to an interesting result (such as |
| 219 | a diagnostic) is called a "fact". |
| 220 | |
| 221 | The analysis API allows an analysis to define new types of facts, to |
| 222 | associate facts of these types with objects (named entities) declared |
| 223 | within the current package, or with the package as a whole, and to query |
| 224 | for an existing fact of a given type associated with an object or |
| 225 | package. |
| 226 | |
| 227 | An Analyzer that uses facts must declare their types: |
| 228 | |
| 229 | var Analyzer = &analysis.Analyzer{ |
| 230 | Name: "printf", |
| 231 | FactTypes: []analysis.Fact{new(isWrapper)}, |
| 232 | ... |
| 233 | } |
| 234 | |
| 235 | type isWrapper struct{} // => *types.Func f “is a printf wrapper” |
| 236 | |
| 237 | The driver program ensures that facts for a pass’s dependencies are |
| 238 | generated before analyzing the package and is responsible for propagating |
| 239 | facts from one package to another, possibly across address spaces. |
| 240 | Consequently, Facts must be serializable. The API requires that drivers |
| 241 | use the gob encoding, an efficient, robust, self-describing binary |
| 242 | protocol. A fact type may implement the GobEncoder/GobDecoder interfaces |
| 243 | if the default encoding is unsuitable. Facts should be stateless. |
| 244 | Because serialized facts may appear within build outputs, the gob encoding |
| 245 | of a fact must be deterministic, to avoid spurious cache misses in |
| 246 | build systems that use content-addressable caches. |
| 247 | The driver makes a single call to the gob encoder for all facts |
| 248 | exported by a given analysis pass, so that the topology of |
| 249 | shared data structures referenced by multiple facts is preserved. |
| 250 | |
| 251 | The Pass type has functions to import and export facts, |
| 252 | associated either with an object or with a package: |
| 253 | |
| 254 | type Pass struct { |
| 255 | ... |
| 256 | ExportObjectFact func(types.Object, Fact) |
| 257 | ImportObjectFact func(types.Object, Fact) bool |
| 258 | |
| 259 | ExportPackageFact func(fact Fact) |
| 260 | ImportPackageFact func(*types.Package, Fact) bool |
| 261 | } |
| 262 | |
| 263 | An Analyzer may only export facts associated with the current package or |
| 264 | its objects, though it may import facts from any package or object that |
| 265 | is an import dependency of the current package. |
| 266 | |
| 267 | Conceptually, ExportObjectFact(obj, fact) inserts fact into a hidden map keyed by |
| 268 | the pair (obj, TypeOf(fact)), and the ImportObjectFact function |
| 269 | retrieves the entry from this map and copies its value into the variable |
| 270 | pointed to by fact. This scheme assumes that the concrete type of fact |
| 271 | is a pointer; this assumption is checked by the Validate function. |
| 272 | See the "printf" analyzer for an example of object facts in action. |
| 273 | |
| 274 | Some driver implementations (such as those based on Bazel and Blaze) do |
| 275 | not currently apply analyzers to packages of the standard library. |
| 276 | Therefore, for best results, analyzer authors should not rely on |
| 277 | analysis facts being available for standard packages. |
| 278 | For example, although the printf checker is capable of deducing during |
| 279 | analysis of the log package that log.Printf is a printf wrapper, |
| 280 | this fact is built into the analyzer so that it correctly checks |
| 281 | calls to log.Printf even when run in a driver that does not apply |
| 282 | it to standard packages. We would like to remove this limitation in the future. |
| 283 | |
| 284 | # Testing an Analyzer |
| 285 | |
| 286 | The analysistest subpackage provides utilities for testing an Analyzer. |
| 287 | In a few lines of code, it is possible to run an analyzer on a package |
| 288 | of testdata files and check that it reported all the expected |
| 289 | diagnostics and facts (and no more). Expectations are expressed using |
| 290 | "// want ..." comments in the input code. |
| 291 | |
| 292 | # Standalone commands |
| 293 | |
| 294 | Analyzers are provided in the form of packages that a driver program is |
| 295 | expected to import. The vet command imports a set of several analyzers, |
| 296 | but users may wish to define their own analysis commands that perform |
| 297 | additional checks. To simplify the task of creating an analysis command, |
| 298 | either for a single analyzer or for a whole suite, we provide the |
| 299 | singlechecker and multichecker subpackages. |
| 300 | |
| 301 | The singlechecker package provides the main function for a command that |
| 302 | runs one analyzer. By convention, each analyzer such as |
| 303 | go/analysis/passes/findcall should be accompanied by a singlechecker-based |
| 304 | command such as go/analysis/passes/findcall/cmd/findcall, defined in its |
| 305 | entirety as: |
| 306 | |
| 307 | package main |
| 308 | |
| 309 | import ( |
| 310 | "golang.org/x/tools/go/analysis/passes/findcall" |
| 311 | "golang.org/x/tools/go/analysis/singlechecker" |
| 312 | ) |
| 313 | |
| 314 | func main() { singlechecker.Main(findcall.Analyzer) } |
| 315 | |
| 316 | A tool that provides multiple analyzers can use multichecker in a |
| 317 | similar way, giving it the list of Analyzers. |
| 318 | */ |
| 319 | package analysis |
| 320 |