Command and run()

The Command object simply takes an external command and runs it, logging stdout and stderr as each message arrives. The main benefits of using Command are logging and timeouts. Command takes two timeouts: output_timeout, which is how long the command can go without outputting anything before timing out, and max_timeout, which is the total amount of time that can elapse from the start of the command.

(The command is run via subprocess.Popen and timeouts are monitored via the multiprocessing module.)

After the command is run, it runs the detect_error_cb callback function to determine whether the command was run successfully.

The process of creating and running a Command is twofold: Command.__init__() and As a shortcut, there is a run() function that will do both steps for you.

ParsedCommand and parse()

Ideally, external command output would be for humans only, and the exit code would be meaningful. In practice, this is not always the case. Exit codes aren’t always helpful or even meaningful, and sometimes critical information is buried in a flood of output.

ParsedCommand takes the output of a command and parses it for matching substrings or regular expressions, using ErrorLists and OutputParser to determine the log level of a line of output. Because it subclasses Command, ParsedCommand also has built-in output_timeout and max_timeout support.

As with Command and run(), ParsedCommand has a shortcut function, parse().

ErrorLists and OutputParser

The ErrorList object describes which lines of output are of special interest. It’s a class for better validation.

An example error_list:

        "regex": re.compile("^Error: not actually an error!"),
    }, {
        "regex": re.compile("^Error:"),
        "level": logging.ERROR,
        "pre_context_lines": 5,
        "post_context_lines": 5
    }, {
        "substr": "Obscure error #94382",
            "This is a fatal program error."
        "exception": ScriptHarnessFatal

Any output line that matches the first regex will be ignored (discarded), because level is negative. Because the list is matched in order, the more specific regex is placed before the more general 2nd regex. If the order were reversed, the more specific regex would never match anything. The second regex sets the level to logging.ERROR for this line, and 5 lines above and 5 lines below this message. (See OutputBuffer and context lines.)

The final substring has an explanation that will be logged immediately after the matching line, to explain vague error messages. Because it has a defined exception, it will raise.

ParsedCommand sends its output to the OutputParser object, which passes it on to the ErrorList. It keeps track of the number of errors and warnings, as well as handling any context line buffering through the OutputBuffer.

OutputBuffer and context lines

Sometimes there’s an obvious error message line, like make: *** [all] Error 2, but it’s not very helpful without the log context around the line. For those ErrorLists, we can use pre_context_lines and post_context_lines for the number of lines before and after the matching line, respectively. So if we wanted to mark the 10 lines above the make: *** [all] Error 2 as errors, as well, then we can do so.

(Long long ago, I would buffer all the output of certain commands, notably Visual Studio output, when I either wanted to

  • separate threaded logs into easier-to-read unthreaded logs-per-component, or
  • search back up above some line, like the first make line above make: *** [all] Error 2, so we wouldn’t have to hardcode some number of pre_context_lines and guess how much context is needed.

For the moment, however, we only have pre_context_lines and post_context_lines.)

The OutputBuffer holds the buffered output for pre_context_lines, and keeps track of how many lines in the future will need to be marked at which level for post_context_lines.

If multiple lines match, and a line of output is marked as multiple levels, the highest level will win. E.g., logging.CRITICAL will beat logging.ERROR, which will beat logging.WARNING, etc.

Output, get_output(), and get_text_output()

Sometimes you need to manipulate the output from a command, not just log it or perform general error parsing. There’s subprocess.check_output(), but that doesn’t log or have full timeout support.

Enter Output. This also inherits Command, but because is a completely different method than, it has its own timeout implementation. (It does still support both output_timeout and max_timeout.) It redirects STDOUT and STDERR to temp files.

Much like Command has its helper run() function, Output has two helper functions: get_output() and get_text_output(). The former yields the Output object, and the caller can either access the NamedTemporaryFile Output.stdout and Output.stderr objects, or use the Output.get_output() method. Because of this, it is suitable for binary or lengthy output. get_text_output() will get the STDOUT contents for you, log them, and return them to you.