Why Regex-Based Linters Fall Short for SystemVerilog/UVM — A Case for Parser-Based Tools

Linting SystemVerilog and UVM testbench code is crucial for maintaining quality and compatibility.
Many teams start by writing quick regex or string-search scripts to catch problematic patterns—like deprecated variables, disallowed constructs, or outdated API usage.

While regex-based linting can be tempting due to its simplicity, it often leads to false failures and missed issues because SystemVerilog’s syntax and preprocessing are too complex for simple text matching.

This series explores common linting scenarios. An earlier post introduced the topic: https://asfigo.blogspot.com/2025/02/linting-systemverilog-testbench-code.html

This post, the next in the series, uses a concrete example, detecting deprecated UVM constructs such as uvm_top, to demonstrate why a proper parser-based approach using tools like Google’s Verible is superior.

A Quick Regex-Style Lint Example

When maintaining UVM code for compatibility, one common check is to detect the use of uvm_top.
This variable was removed in UVM 1.2, so any appearance should be flagged.
At first glance, a simple Python script and string matching looks attractive:

# Naive check: flag every line whose text contains "uvm_top".
with open("file.sv") as f:
    for lineno, line in enumerate(f, 1):
        if "uvm_top" in line:
            print(f"{lineno}: uvm_top usage")

This is quick to write and run. Unfortunately, it is also fragile and, in some cases, wrong. The problem is that SystemVerilog is not a regular language: syntax, comments, strings, macros, and preprocessing can all fool naïve text scans.

Where Regex Fails

Below are real scenarios where regex-based matching either reports a false failure or misses an actual issue.

1. Inside Comments

// TODO: remove uvm_top usage before release
  • Regex: flags — false failure
  • Real Need: ignore — AST node is a comment.

2. Inside String Literals

`uvm_info("CFG", "Legacy path: uvm_top used here", UVM_LOW)
  • Regex: flags — false failure
  • Real Need: ignore — AST node is a string literal.

3. In Macro Definitions

`define LEGACY_TOP uvm_top
  • Regex: flags — false failure
  • Real Need: ignore — rule can target only symbol usage in procedural/structural code.

4. Inactive Preprocessor Branches

`ifdef UVM_11D
  uvm_top.do_something();
`endif

If UVM_11D is not defined when building:

  • Regex: flags — false failure
  • Real Need: code is absent from AST after preprocessing.
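
The four cases above can be reproduced with a short script. This is a minimal sketch: the snippets are copied from the scenarios above, and naive_hits is a hypothetical helper name that mirrors the line-by-line check shown earlier.

```python
# Hypothetical helper mirroring the line-by-line substring check shown earlier.
def naive_hits(source: str, needle: str = "uvm_top") -> list[int]:
    """Return 1-based line numbers whose text contains the needle."""
    return [n for n, line in enumerate(source.splitlines(), 1) if needle in line]


# The four snippets from the scenarios above. None of them should be reported
# (the last one assumes UVM_11D is not defined for this build).
SCENARIOS = {
    "comment": "// TODO: remove uvm_top usage before release",
    "string": '`uvm_info("CFG", "Legacy path: uvm_top used here", UVM_LOW)',
    "macro definition": "`define LEGACY_TOP uvm_top",
    "inactive ifdef": "`ifdef UVM_11D\n  uvm_top.do_something();\n`endif",
}

for name, snippet in SCENARIOS.items():
    print(f"{name}: flagged lines {naive_hits(snippet)}")
```

Every scenario produces at least one flagged line, even though none of them is a live use of uvm_top.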

Introducing Verible

Verible is an open-source SystemVerilog parser and tooling framework, originally developed at Google and now hosted under the CHIPS Alliance.
It provides a fully compliant parser that produces an Abstract Syntax Tree (AST) representing the true syntactic structure of the code after preprocessing.

This means Verible can differentiate between comments, strings, macro expansions, and active code.
Its robustness and active maintenance make it a reliable foundation for building linting tools.


Why Parsing Wins

Requirement                      Regex                Verible Parser
Ignore comments/strings          ❌ false failures    ✅ correct
Honour preprocessor defines      ❌ false failures    ✅ correct
Match only valid identifiers     ❌ false failures    ✅ correct
Maintainable for complex rules   ❌ brittle           ✅ extensible

Regex linting is brittle because it works purely at the text level.
In SystemVerilog/UVM, syntactic and preprocessor context matters — without a parser, you cannot reliably distinguish a real symbol usage from a harmless mention.
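
To make the role of token context concrete, here is a toy tokenizer in Python. It is only a sketch and not how Verible works internally: it uses a regular expression as a lexer (consuming comments and strings whole) rather than as a search pattern, and the names TOKEN_RE, identifier_hits, and my_uvm_top_cfg are made up for this illustration.

```python
import re

# One alternation, tried left to right at each position: comments and strings
# are consumed whole, so an identifier inside them is never seen on its own.
TOKEN_RE = re.compile(
    r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)            # line or block comment
  | (?P<string>"(?:\\.|[^"\\])*")              # double-quoted string literal
  | (?P<identifier>[A-Za-z_][A-Za-z0-9_$]*)    # simple identifier
    """,
    re.VERBOSE | re.DOTALL,
)


def identifier_hits(source: str, name: str = "uvm_top") -> list[int]:
    """Return 1-based line numbers where `name` appears as a whole identifier."""
    hits = []
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup == "identifier" and match.group() == name:
            hits.append(source.count("\n", 0, match.start()) + 1)
    return hits


example = '''\
// TODO: remove uvm_top usage before release
`uvm_info("CFG", "Legacy path: uvm_top used here", UVM_LOW)
my_uvm_top_cfg = 1;
uvm_top.print_topology();
'''
print(identifier_hits(example))  # [4]
```

Only the live reference on line 4 is reported. The comment, the string, and the longer identifier my_uvm_top_cfg are skipped. This toy still knows nothing about the preprocessor, so it would still flag the `define line and the inactive `ifdef branch from the scenarios above, which is where a real parser and preprocessing flow become necessary.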


Verible-Based Approach

A Verible rule can be implemented to traverse the parsed syntax tree and match only relevant identifiers in the correct context. Such a rule:

  • Operates on the actual compiled source, respecting preprocessor defines.
  • Has clear access to token type (comment, string, identifier, etc.).
  • Avoids accidental matches in inactive code, macros, or comments.
  • Can be extended to match specific contexts (e.g. assignments only).
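
One way to picture such a rule is as a walk over a syntax tree in which every leaf carries its token type. The sketch below is not Verible's API: the nested-dict layout (tag, children, text, line) and the tag names are assumptions chosen for illustration, loosely modeled on the kind of JSON tree a parser can export, and the tree itself is built by hand.

```python
from typing import Iterator


def iter_leaves(node: dict) -> Iterator[dict]:
    """Yield leaf tokens of a nested-dict syntax tree in source order."""
    children = node.get("children")
    if children is None:
        yield node
        return
    for child in children:
        if child is not None:  # tolerate empty child slots
            yield from iter_leaves(child)


def find_identifier(tree: dict, name: str = "uvm_top") -> list[int]:
    """Return line numbers of identifier leaves whose text equals `name`."""
    return [
        leaf["line"]
        for leaf in iter_leaves(tree)
        if leaf.get("tag") == "SymbolIdentifier" and leaf.get("text") == name
    ]


# Hand-built (hypothetical) tree for two lines of source:
#   uvm_top.print_topology();
#   `uvm_info("CFG", "uvm_top used", UVM_LOW)
tree = {
    "tag": "kStatementList",
    "children": [
        {"tag": "kStatement", "children": [
            {"tag": "SymbolIdentifier", "text": "uvm_top", "line": 1},
            {"tag": ".", "text": ".", "line": 1},
            {"tag": "SymbolIdentifier", "text": "print_topology", "line": 1},
        ]},
        {"tag": "kMacroCall", "children": [
            {"tag": "MacroCallId", "text": "`uvm_info", "line": 2},
            {"tag": "TK_StringLiteral", "text": '"uvm_top used"', "line": 2},
        ]},
    ],
}
print(find_identifier(tree))  # [1]
```

Because the check compares token type as well as text, the string literal on line 2 is never considered. Narrowing the rule to a specific context, such as assignments only, would amount to also checking the tag of the enclosing node during the walk.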

Leveraging Verible with AsFigo BYOL

While Verible provides the core parsing and AST infrastructure, building and maintaining lint rules at scale requires more tooling support.

AsFigo’s Build Your Own Linter (BYOL) framework accelerates this process by offering a modular environment to develop, test, and deploy custom Verible-based lint rules tailored to your codebase.

BYOL handles AST traversal, rule management, diagnostics reporting, and CI integration out of the box — letting engineers focus on the lint logic itself rather than infrastructure.

Using BYOL with Verible means you avoid reinventing the wheel and reduce the ongoing maintenance burden of home-grown scripts, resulting in faster and more reliable lint coverage.


Recommendation

For one-off local greps, regex may be fine.
For production lint rules that run in CI and gate merges, use a proper parser such as Verible — ideally within a framework like AsFigo BYOL.
It will save you from false failures, missed detections, and the constant maintenance cost of chasing corner cases.
