ShellCheck Dev Guide

Want to write a new test? (as opposed to an Integration with an editor or CI system)

Some familiarity with Haskell helps. Most checks just use pattern matching and function calls. Grokking monads is generally not required, but do notation may come in handy.

ShellCheck wiki policy

The ShellCheck wiki can be edited by anyone with a GitHub account. Feel free to update it with special cases and additional information. If you are making a significant edit and would like someone to double check it, you can file an issue with the title [Wiki] Updated SC1234 to ... (and point to this paragraph since this suggestion is still new).

ShellCheck theory

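ShellCheck works in several stages, including the parsing, AST analysis, and formatting stages described below. Of these, AST analysis is the most relevant, and it is where most of the interesting checks happen.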

Parsing

The parser can emit two kinds of messages, notes and problems. Notes are only emitted when parsing succeeds (they are stored in the Parsec user state). For example, a note is emitted for spaces around = in assignments, because if the parser later fails (i.e. it's not actually an assignment), we want to discard the suggestion:

when (hasLeftSpace || hasRightSpace) $
    parseNoteAt pos ErrorC 1068 "Don't put spaces around the = in assignments."

On the other hand, problems are always emitted, even when parsing fails (they are stored in a StateT higher than Parsec in the transformer stack). For example, a problem is emitted if there's an unescaped linefeed in a [ .. ] expression, because the statement is likely malformed or unterminated, and we want to show this warning even if we're unable to parse the whole thing:

when (single && '\n' `elem` space) $
    parseProblemAt pos ErrorC 1080 "When breaking lines in [ ], you need \\ before the linefeed."

So basically, notes are for warnings about input that still parses, while problems are for issues that are likely to make parsing fail.

The distinction exists because you can often emit useful information even when parsing fails (suggestions for how to fix it). Likewise, there are often issues that only make sense in context, and shouldn't be emitted if the result does not end up being used. There are probably better solutions for this.

--                            v-- Read real/mocked files  v-- Stores parse problems
type SCBase m = Mr.ReaderT (SystemInterface m) (Ms.StateT SystemState m)
type SCParser m v = ParsecT String UserState (SCBase m) v
--                                 ^-- Stores parse notes and token offsets
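
The difference between the two storage locations can be tried out in isolation. The following is a minimal sketch, not ShellCheck's parser: Toy, note, problem, assignment and runToy are made-up names, the Reader layer is left out, and it assumes the parsec, mtl and transformers packages that ship with GHC.

```haskell
import Control.Monad (unless)
import qualified Control.Monad.State as Ms
import Control.Monad.Trans.Class (lift)
import Text.Parsec
    ( ParseError, ParsecT, alphaNum, char, eof, getState, letter
    , many, many1, modifyState, runParserT )

-- Toy stand-in for SCParser: notes live in the Parsec user state,
-- problems live in a State monad underneath Parsec.
type Toy a = ParsecT String [String] (Ms.State [String]) a

-- Stored in the Parsec user state, so it is only returned on success.
note :: String -> Toy ()
note msg = modifyState (msg :)

-- Stored in the underlying State monad, so it survives a parse failure.
problem :: String -> Toy ()
problem msg = lift (Ms.modify (msg :))

-- Parses "name=value", tolerating (and reporting) spaces before the '='.
assignment :: Toy (String, [String])
assignment = do
    name <- many1 letter
    gap <- many (char ' ')
    unless (null gap) $ do
        note "note: space before ="
        problem "problem: space after name"
    _ <- char '='
    value <- many1 alphaNum
    eof
    notes <- getState
    return (name ++ "=" ++ value, notes)

-- Returns (parse result including notes, problems).
runToy :: String -> (Either ParseError (String, [String]), [String])
runToy input = Ms.runState (runParserT assignment [] "toy" input) []
```

In GHCi, runToy "foo =bar" succeeds and returns both the note and the problem, while runToy "foo bar" fails with a parse error: the note is lost together with the rest of the parser state, but the problem is still in the outer state.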

AST analysis

AST analysis comes in two primary flavors: checks that run on the root node (sometimes called "tree checks"), and checks that run on every node (sometimes called "node checks"). Due to poor planning, these can't be distinguished by type because they both just take a Token parameter.

Here's a simple check designed to run on each node, using pattern matching to find backticks:

checkBackticks _ (T_Backticked id list) | not (null list) =
    style id 2006 "Use $(..) instead of legacy `..`."
checkBackticks _ _ = return ()

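A lot of checks are just like this, though usually with a bit more matching logic. Checks like this are accompanied by unit tests that state which scripts should and should not trigger them: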

prop_checkBackticks1 = verify checkBackticks "echo `foo`"
prop_checkBackticks2 = verifyNot checkBackticks "echo $(foo)"
prop_checkBackticks3 = verifyNot checkBackticks "echo `#inlined comment` foo"
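
To see how tree checks and node checks relate, here is a minimal sketch on a toy AST. It is not ShellCheck's code: the Token type is cut down to five constructors with Int ids, comments are plain tuples instead of Writer output, and checkEmptyScript (with its code 9999) and runChecks are hypothetical. Note how both kinds of check end up with the same type.

```haskell
-- Toy AST: every node carries an Int id.
data Token
    = T_Script Int [Token]
    | T_SimpleCommand Int [Token]
    | T_Backticked Int [Token]
    | T_DollarExpansion Int [Token]
    | T_Literal Int String
    deriving (Show)

-- A comment attached to a node: (node id, code, message).
type Comment = (Int, Int, String)

children :: Token -> [Token]
children token = case token of
    T_Script _ ts -> ts
    T_SimpleCommand _ ts -> ts
    T_Backticked _ ts -> ts
    T_DollarExpansion _ ts -> ts
    T_Literal _ _ -> []

-- All nodes in the tree, parent before children.
allNodes :: Token -> [Token]
allNodes token = token : concatMap allNodes (children token)

-- Node check: looks at one node at a time.
checkBackticks :: Token -> [Comment]
checkBackticks (T_Backticked i list) | not (null list) =
    [(i, 2006, "Use $(..) instead of legacy `..`.")]
checkBackticks _ = []

-- Tree check (hypothetical): only meaningful when given the root.
checkEmptyScript :: Token -> [Comment]
checkEmptyScript (T_Script i []) = [(i, 9999, "Script is empty.")]
checkEmptyScript _ = []

-- Tree checks see the root once; node checks see every node.
runChecks :: [Token -> [Comment]] -> [Token -> [Comment]] -> Token -> [Comment]
runChecks treeChecks nodeChecks root =
    concatMap ($ root) treeChecks
        ++ concat [check node | node <- allNodes root, check <- nodeChecks]

-- Roughly: echo `foo`
example :: Token
example =
    T_Script 1
        [ T_SimpleCommand 2
            [ T_Literal 3 "echo"
            , T_Backticked 4 [T_SimpleCommand 5 [T_Literal 6 "foo"]]
            ]
        ]
```

Evaluating runChecks [checkEmptyScript] [checkBackticks] example reports a single comment, on node 4.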

Some checks are registered in more specialized ways for efficiency. For example, many checks trigger only for certain commands. This could be done by N checks like the above, each matching command nodes and checking that the command name applies (N node matches, N command name extractions, N comparisons). It's more efficient to have 1 node match, 1 name extraction, and then a map lookup to find one or more command handlers. Such checks just register to handle a command name, and can be found in Checks/Commands.hs.
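
The dispatch idea can be sketched as follows. This is an illustration only, assuming the containers package: Command, Handler, checkGrep, checkRm and their messages are made up and do not correspond to the real handlers or their registration mechanism.

```haskell
import qualified Data.Map.Strict as Map

-- Toy command node: just a name and its arguments.
data Command = Command
    { commandName :: String
    , commandArgs :: [String]
    }

type Handler = Command -> [String]

-- Hypothetical handler: complain if the first argument contains a '*'.
checkGrep :: Handler
checkGrep cmd =
    ["grep: pattern looks like a glob" | any (elem '*') (take 1 (commandArgs cmd))]

-- Hypothetical handler: complain if "/" is passed as an argument.
checkRm :: Handler
checkRm cmd = ["rm: this would delete /" | "/" `elem` commandArgs cmd]

-- Handlers register under the command name they care about.
handlers :: Map.Map String [Handler]
handlers = Map.fromListWith (++)
    [ ("grep", [checkGrep])
    , ("rm", [checkRm])
    ]

-- One name extraction and one map lookup per command node,
-- no matter how many handlers are registered.
checkCommand :: Command -> [String]
checkCommand cmd =
    concatMap ($ cmd) (Map.findWithDefault [] (commandName cmd) handlers)
```

A command such as echo, which has no registered handler, costs a single failed lookup instead of one comparison per check.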

Similarly, some checks only trigger for a certain shell. This could be done by N tree checks that optionally iterate the tree, or N node checks that match a node and skip emitting for certain shells, but it's more efficient to iterate the tree once with all applicable checks. Such checks just register to handle nodes for a certain shell, and can be found in Checks/ShellSupport.hs.
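
The same idea for shells, as a rough sketch under the same caveats: Shell, Node, flagLabel and the registered checks are hypothetical stand-ins, and the point is only that the applicable checks are selected once and the tree is then walked a single time.

```haskell
data Shell = Sh | Bash | Dash | Ksh deriving (Eq, Show)

-- Toy tree: a node label plus children.
data Node = Node String [Node]

type NodeCheck = Node -> [String]

-- Hypothetical check builder: emit a message for nodes with a given label.
flagLabel :: String -> String -> NodeCheck
flagLabel label msg (Node l _) = [msg | l == label]

-- Each check is registered together with the shells it applies to.
registered :: [([Shell], NodeCheck)]
registered =
    [ ([Sh, Dash], flagLabel "array" "arrays are not supported in this shell")
    , ([Bash, Ksh], flagLabel "let" "prefer (( )) over let")
    ]

checksFor :: Shell -> [NodeCheck]
checksFor shell = [check | (shells, check) <- registered, shell `elem` shells]

-- Select the checks once, then iterate the tree a single time.
analyze :: Shell -> Node -> [String]
analyze shell = go
  where
    checks = checksFor shell
    go node@(Node _ kids) = concatMap ($ node) checks ++ concatMap go kids
```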

Formatting

ShellCheck has multiple output formatters. These take the results and output them as JSON, XML or human-readable text. They rarely need tweaking. Anyone looking for a different output format should consider transforming one of the existing ones (with XSLT, Python, etc) instead of writing a new formatter.

ShellCheck in practice

Let's say that we have a pet peeve: people who use tmp as a temporary filename. We want to warn about statements like sort file > tmp && mv tmp file, and suggest using mktemp instead.

To get started, clone the ShellCheck repository and run cabal repl followed by :load ShellCheck.Debug. This is a development module that offers access to a number of convenient methods, helpfully listed in Debug.hs:

*ShellCheck.Debug> stringToAst "sort file > tmp"
OuterToken (Id 1) (Inner_T_Annotation [] (OuterToken (Id 15) (Inner_T_Script (OuterToken (Id 0) (Inner_T_Literal "")) [OuterToken (Id 14) (Inner_T_Pipeline [] [OuterToken (Id 12) (Inner_T_Redirecting [OuterToken (Id 11) (Inner_T_FdRedirect "" (OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")])))))] (OuterToken (Id 13) (Inner_T_SimpleCommand [] [OuterToken (Id 4) (Inner_T_NormalWord [OuterToken (Id 3) (Inner_T_Literal "sort")]),OuterToken (Id 6) (Inner_T_NormalWord [OuterToken (Id 5) (Inner_T_Literal "file")])])))])])))

The AST node T_Literal id str is an alias for OuterToken (Id id) (Inner_T_Literal str). GHC outputs the latter, which unfortunately makes it a bit difficult to read. However, with some effort we can see the part we're interested in:

(OuterToken (Id 10) (Inner_T_IoFile (OuterToken (Id 7) Inner_T_Greater) (OuterToken (Id 9) (Inner_T_NormalWord [OuterToken (Id 8) (Inner_T_Literal "tmp")]))))

This is equivalent to the following (TODO: find a way to format it this way automatically):

(T_IoFile (Id 10) (T_Greater (Id 7)) (T_NormalWord (Id 9) [T_Literal (Id 8) "tmp"]))

The definition of T_IoFile in AST.hs shows what its two child tokens are:

--                v-- Redirection operator (T_Greater)
    | T_IoFile Id Token Token
--                      ^-- Filename (T_NormalWord)
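
The aliasing between the short T_ names and the OuterToken form printed by GHC can be reproduced with GHC's pattern synonyms. The following is a minimal sketch on a toy subset of the AST, not ShellCheck's actual definitions: only four node types are modeled, and redirect and redirectTarget are hypothetical names.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- Toy subset of the AST, modeled on the GHC output shown earlier.
newtype Id = Id Int deriving (Show, Eq)

data InnerToken t
    = Inner_T_Literal String
    | Inner_T_NormalWord [t]
    | Inner_T_Greater
    | Inner_T_IoFile t t
    deriving (Show, Eq)

data Token = OuterToken Id (InnerToken Token) deriving (Show, Eq)

-- Bidirectional pattern synonyms: usable both to build and to match tokens.
pattern T_Literal :: Id -> String -> Token
pattern T_Literal i str = OuterToken i (Inner_T_Literal str)

pattern T_NormalWord :: Id -> [Token] -> Token
pattern T_NormalWord i parts = OuterToken i (Inner_T_NormalWord parts)

pattern T_Greater :: Id -> Token
pattern T_Greater i = OuterToken i Inner_T_Greater

pattern T_IoFile :: Id -> Token -> Token -> Token
pattern T_IoFile i op file = OuterToken i (Inner_T_IoFile op file)

-- Built with the short names, but shown by GHC in the OuterToken form.
redirect :: Token
redirect =
    T_IoFile (Id 10) (T_Greater (Id 7)) (T_NormalWord (Id 9) [T_Literal (Id 8) "tmp"])

-- Hypothetical helper: the literal target of a "> file" redirection, if any.
redirectTarget :: Token -> Maybe String
redirectTarget (T_IoFile _ (T_Greater _) (T_NormalWord _ [T_Literal _ name])) = Just name
redirectTarget _ = Nothing
```

Because the derived Show instance knows nothing about the synonyms, show redirect prints the long OuterToken form, which is why the output of stringToAst looks the way it does.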

Let's add a first version of the check to Analytics.hs that just reports every such node:

  checkTmpFilename _ token =
      case token of
        T_IoFile id operator filename  ->
          warn id 9999 $ "We found this node: " ++ show token
        _ -> return ()

and then append checkTmpFilename to the list of node checks at the top of the file:

  nodeChecks :: [Parameters -> Token -> Writer [TokenComment] ()]
  nodeChecks = [
      checkUuoc
      ,checkPipePitfalls
      ,checkForInQuoted
      ...
      ,checkTmpFilename  -- Here
    ]

We can now quick-reload the files with :r, and use ShellCheck.Debug's shellcheckString to run all of ShellCheck (minus output formatters) on a script, e.g. shellcheckString "sort file > tmp".

Alternatively, we can build and run ShellCheck to see the check applied as it would be when invoking shellcheck:

cabal run shellcheck - <<< "sort file > tmp"

We can also run it in interpreted mode, which is almost as quick as :r:

./quickrun - <<< "sort file > tmp"

Now we can flesh out the check. See ASTLib.hs and AnalyzerLib.hs for convenient functions to work with AST nodes, such as getting the name of an invoked command, getting a list of flags using canonical flag parsing rules, or in this case, getting the literal string of a T_NormalWord so that it doesn't matter if we use > 'tmp', > "tmp" or > "t"'m'p:

  checkTmpFilename _ token =
      case token of
        T_IoFile id operator filename  ->
          when (getLiteralString filename == Just "tmp") $
            warn (getId filename) 9999 $ "Please use mktemp instead of the filename 'tmp'."
        _ -> return ()
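
To illustrate why the quoting style doesn't matter, here is a toy version of the idea behind getLiteralString. It is a sketch under simplifying assumptions, not the real function from ASTLib.hs: Part and literalString are made-up names, and only a handful of word parts are modeled.

```haskell
-- Toy model of the parts a shell word can consist of.
data Part
    = Literal String        -- tmp
    | SingleQuoted String   -- 'tmp'
    | DoubleQuoted [Part]   -- "tmp"
    | Variable String       -- $tmp
    | NormalWord [Part]     -- a whole word, made of adjacent parts

-- Just the string if the word is entirely literal, Nothing if any
-- part of it depends on runtime state such as a variable.
literalString :: Part -> Maybe String
literalString part = case part of
    Literal s -> Just s
    SingleQuoted s -> Just s
    DoubleQuoted parts -> concat <$> mapM literalString parts
    NormalWord parts -> concat <$> mapM literalString parts
    Variable _ -> Nothing
```

With this, the word "t"'m'p, modeled as NormalWord [DoubleQuoted [Literal "t"], SingleQuoted "m", Literal "p"], yields Just "tmp", while NormalWord [Variable "tmp"] yields Nothing, so a comparison against Just "tmp" matches the former and not the latter.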

We can also prepend a few unit tests that will automatically be picked up if they start with prop_:

prop_checkTmpFilename1 = verify checkTmpFilename "sort file > tmp"
prop_checkTmpFilename2 = verifyNot checkTmpFilename "sort file > $tmp"

We can run these tests with cabal test, or in interpreted mode with ./quicktest. If the command exits with success, it's good to go.

If we wanted to submit this test, we could run ./nextnumber which will output the next unused SC2xxx code, e.g. 2213 as of writing.

For any questions of the form "How do I turn an X into a Y?", such as "shell string into an AST", "AST into a CFG" or "AST/CFG/DFA into a GraphViz representation", see Debug.hs. It's very readable, and includes additional useful development information.

You can also find the ShellCheck author (me) on IRC as koala_man in #haskell@libera.chat
