Timing Sign-Off for the Open Silicon Era — and Why We Wrote It in Rust
The static timing sign-off the open-source EDA world never had — commercial-grade, CI-native, and built in Rust so it runs natively from a MacBook to an Arm64 cloud.
June 25, 2026 • By Shivaram Mysore
The gap: open EDA had timing analysis, not timing sign-off
The open-source EDA ecosystem is genuinely good now. Yosys synthesizes. OpenROAD floorplans, places, and routes. Magic and KLayout do DRC. ngspice solves circuits. OpenSTA does static timing analysis. You can take a design from RTL toward GDSII without a single proprietary license.
And then you hit sign-off — and the floor gives way.
Sign-off is the moment you answer, with your name on it: does this chip actually meet timing? Not "did the router think so," but the real question — worst negative slack across every path, total negative slack across the design, the single worst path, with crosstalk between neighboring nets changing the answer, and IR-drop in the power grid quietly slowing every cell it touches. That answer is what a foundry expects, what a tapeout depends on, and what — until now — you could only get from tools behind six- and seven-figure licenses.
That's the gap Vyges™ Loom closes: the sign-off math you used to pay a vendor for, open, scriptable, and CI-native, under Apache-2.0.
This post is about the timing spine of that stack — vyges-sta-si, static timing analysis with signal integrity — and the two decisions behind it that we think matter more than any single feature: why we built it, and why we built it in Rust.
What timing sign-off actually is: deep tree and graph processing
To understand the engineering choices, it helps to see what these tools are computationally. Strip away the domain language and timing sign-off is a small family of large traversals:
- Static timing analysis builds a timing graph — a directed acyclic graph over every pin and net in the design — and propagates arrival times forward and required times backward through it. On a real block that's millions of nodes and edges. The slack at every endpoint falls out of that propagation.
- Parasitic extraction turns interconnect geometry into RC trees — every net becomes a tree of resistors and capacitors that the timing solve walks to get real delays.
- Characterization is thousands of independent SPICE solves — one per cell, per arc, per corner — each producing a slice of a Liberty timing model.
- Power integrity solves the power-distribution network as a giant sparse mesh to find where the grid sags.
Two things are true of all of them. They are dominated by traversing big trees and graphs, and most of that traversal is independent — different timing endpoints, different RC trees, different cells, different corners don't need to wait on each other. In other words: this is an embarrassingly parallel workload trapped inside data structures that have to be designed, very carefully, to let the parallelism out.
That last clause is the whole game. Parallelism helps enormously here — but only if the data structures, the memory ownership, and the traversal are designed for it from the first line. Bolt threads onto a graph traversal that wasn't built for them and you get data races, nondeterminism, and the kind of heisenbugs that have haunted multithreaded EDA code for decades.
Which is exactly why the language mattered.
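To make those traversals concrete, here is a minimal sketch of the forward and backward propagation in plain Rust. It is an illustration under stated assumptions, not the vyges-sta-si implementation: the names `TimingArc`, `Timing`, and `analyze` are hypothetical, nodes are assumed to be numbered in topological order, delays are non-negative constants, every start point launches at time zero, and every endpoint shares one clock period. WNS here is simply the minimum endpoint slack, so a positive value means timing is met.

```rust
/// One timing arc: a signal takes `delay` time units to travel from `from` to `to`.
/// (Hypothetical type for illustration only.)
#[derive(Clone, Copy)]
pub struct TimingArc {
    pub from: usize,
    pub to: usize,
    pub delay: f64,
}

/// Per-node results of one late-mode (setup-style) pass.
pub struct Timing {
    pub arrival: Vec<f64>,
    pub required: Vec<f64>,
    pub slack: Vec<f64>,
    pub wns: f64, // minimum endpoint slack
    pub tns: f64, // sum of the negative endpoint slacks
}

/// Assumes every arc satisfies `from < to` (nodes are topologically numbered)
/// and that all delays are non-negative.
pub fn analyze(
    num_nodes: usize,
    arcs: &[TimingArc],
    endpoints: &[usize],
    clock_period: f64,
) -> Timing {
    // Sorted by source, every arc into a node is visited before any arc out of it.
    let mut sorted = arcs.to_vec();
    sorted.sort_by_key(|a| a.from);

    // Forward pass: latest arrival time at each node.
    let mut arrival = vec![0.0_f64; num_nodes];
    for a in &sorted {
        arrival[a.to] = arrival[a.to].max(arrival[a.from] + a.delay);
    }

    // Backward pass: latest time a node may switch and still meet every endpoint.
    let mut required = vec![f64::INFINITY; num_nodes];
    for &e in endpoints {
        required[e] = clock_period;
    }
    for a in sorted.iter().rev() {
        required[a.from] = required[a.from].min(required[a.to] - a.delay);
    }

    // Slack is required minus arrival; WNS and TNS are taken over endpoints only.
    let slack: Vec<f64> = arrival.iter().zip(&required).map(|(a, r)| r - a).collect();
    let wns = endpoints.iter().map(|&e| slack[e]).fold(f64::INFINITY, f64::min);
    let tns: f64 = endpoints.iter().map(|&e| slack[e].min(0.0)).sum();
    Timing { arrival, required, slack, wns, tns }
}
```

With arcs 0→1 (1.0), 0→2 (2.0), 1→3 (2.5), 2→3 (1.0), 3→4 (1.0), endpoint 4, and a clock period of 4.0, the longest path arrives at 4.5, so WNS and TNS both come out at -0.5.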
Why Rust
We wrote the engines in Rust, and not for fashion. For a parallel, long-running, correctness-critical tree-processing workload, Rust buys three things that are otherwise expensive:
1. Fearless concurrency. In safe Rust, data races are a compile error: the ownership rules and borrow checker prove, at compile time, that two threads can't mutate the same data without synchronization. For an engine whose performance comes from fanning a timing graph across every core on the machine, this is transformative. You parallelize aggressively, and the data-race class of bug that makes multithreaded EDA tools flaky and nondeterministic simply doesn't compile. Safe data-parallelism stops being a feat of heroics and becomes the default.
2. C/C++-class performance with no garbage collector. Zero-cost abstractions, no managed runtime, no GC pauses stuttering through a multi-hour sign-off run. Run time is predictable, with no collector to stall the process mid-run, which is exactly what you want from a tool that gates a pipeline.
3. Memory safety without giving any of that up. No use-after-free, no buffer overruns, on traversals that chase pointers through millions of nodes. Correctness you'd otherwise pay for in sanitizer runs and crash triage, delivered by the compiler.
Here is the thesis in one line: by choosing Rust, correctness-under-parallelism comes from the compiler, not from us. We get to spend our effort on the timing math instead of on thread-safety archaeology. We get it, in effect, for free.
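As a small illustration of that claim, the sketch below fans independent RC trees across threads using only the standard library. It is not the vyges engines' code: `RcTree`, `elmore`, and `elmore_all` are hypothetical names, the delay model is the textbook Elmore approximation, and each tree is assumed to store its nodes parent-before-child. The point is the shape: the trees are shared read-only, each thread gets its own disjoint slice of the output, and handing the same mutable slice to two threads would be rejected by the compiler.

```rust
use std::thread;

/// One net's parasitics as an RC tree (hypothetical layout for illustration).
/// Node 0 is the driver; for every other node i, `parent[i] < i` and `res[i]`
/// is the resistance of the segment from `parent[i]` to `i`.
pub struct RcTree {
    pub parent: Vec<usize>,
    pub res: Vec<f64>,
    pub cap: Vec<f64>,
}

/// Elmore delay from the driver to every node of one tree.
pub fn elmore(tree: &RcTree) -> Vec<f64> {
    let n = tree.cap.len();
    // Pass 1, leaves to root: capacitance at or below each node.
    let mut downstream = tree.cap.clone();
    for i in (1..n).rev() {
        let c = downstream[i];
        downstream[tree.parent[i]] += c;
    }
    // Pass 2, root to leaves: each segment adds R times the capacitance it charges.
    let mut delay = vec![0.0; n];
    for i in 1..n {
        delay[i] = delay[tree.parent[i]] + tree.res[i] * downstream[i];
    }
    delay
}

/// Computes delays for all trees on up to `workers` threads.
pub fn elmore_all(trees: &[RcTree], workers: usize) -> Vec<Vec<f64>> {
    let mut out: Vec<Vec<f64>> = vec![Vec::new(); trees.len()];
    if trees.is_empty() {
        return out;
    }
    let w = workers.max(1);
    let chunk = (trees.len() + w - 1) / w; // ceiling division, always >= 1 here
    thread::scope(|s| {
        // Pair each read-only chunk of trees with its own mutable chunk of results.
        for (tree_chunk, out_chunk) in trees.chunks(chunk).zip(out.chunks_mut(chunk)) {
            s.spawn(move || {
                for (tree, slot) in tree_chunk.iter().zip(out_chunk.iter_mut()) {
                    *slot = elmore(tree);
                }
            });
        }
    }); // the scope joins every thread before returning
    out
}
```

Because results land in fixed slots rather than being collected in completion order, the output order matches the input order regardless of how the threads are scheduled.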
Portability: new platforms, for free
The same choice pays a second dividend that we think is underappreciated — where these tools can run.
Rust's LLVM-backed toolchain cross-compiles cleanly, and the engines are pure, dependency-light Rust. That means the same code builds and runs natively on:
- x86_64 — the datacenter default.
- Arm64 servers — Graviton, Ampere, and the rest of the dense, cheap-core cloud. For an embarrassingly-parallel sign-off workload, more cores per dollar is the entire ballgame, and Arm64 is where that math is best right now.
- Apple Silicon (Darwin arm64) — a designer on a MacBook runs the exact engine the CI cluster runs, natively, at full speed. No "Linux x86 only" wall, no emulation tax, no second-class local experience.
This is not a port we had to grind out per platform. It's a property of the language and a dependency-light design: write it once, run it native everywhere. Portability, like concurrency-correctness, comes largely for free.
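One way to picture "write it once, run it native everywhere": the snippet below contains no per-platform branches, yet reports the right target and core count wherever it is compiled. It is only a sketch using the standard library; `platform_label` and `worker_count` are hypothetical helper names, not part of any Vyges tool.

```rust
use std::env::consts::{ARCH, OS};
use std::thread;

/// Architecture and OS this binary was compiled for,
/// e.g. "x86_64-linux", "aarch64-linux", or "aarch64-macos".
pub fn platform_label() -> String {
    format!("{}-{}", ARCH, OS)
}

/// Number of worker threads to use: whatever parallelism the host reports,
/// falling back to a single worker if the query fails.
pub fn worker_count() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}
```

A driver could pass `worker_count()` straight to its thread pool, so the same source scales from a laptop to a many-core Arm64 server without a configuration change.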
The GPU horizon
And there's a further door this leaves open. The very structure that makes this workload map onto many CPU cores — independent traversals over regular tree and graph data — is the structure that maps onto GPUs and other accelerators. We designed the engines so the parallel kernels are isolable: the hot, data-parallel inner loops are the kind of thing that can be offloaded.
We are not claiming GPU-accelerated timing sign-off today — we're claiming something more durable. Rust's accelerator story is real and growing — GPU kernels in plain Rust via rust-gpu, compute shaders via wgpu — and because we built the engines parallel-by-design in a language that reaches those backends, the path to the GPU doesn't require a rewrite in another language. We bet on an architecture that doesn't foreclose tomorrow's hardware — and a language that makes following it cheap.
That's the quiet payoff of the language choice: it lets a small team take advantage of the industry's hardware trajectory — Arm64 today, accelerators next — without re-tooling each time.
Commercial-grade, for the open world
"Commercial-grade" isn't a slogan here; it's a checklist, and the timing engine meets it:
- Foundry-standard formats. It reads Liberty (.lib) and SPEF, the same files a sign-off team and a foundry already expect. No proprietary formats, no hidden transforms.
- The real metrics. WNS and TNS, and the actual worst path, not a proxy.
- Signal integrity. Crosstalk-induced delay and noise from coupling between nets, accounted for — the difference between "the router thought it passed" and "it passes."
- A CI gate. Standard CLI (--json, --quiet, --verbose) plus --fail-on-violation, which returns a distinct exit code so a timing violation fails your pipeline automatically. Sign-off becomes a check on every commit, not a late, manual review.
It's built for the teams the proprietary model priced out: startups without seven-figure EDA budgets, platform teams standardizing CI flows, and foundries enabling reproducible PDK validation.
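The CI-gate idea fits in a few lines. The sketch below shows how a tool might map its worst slack to a process exit code when a fail-on-violation option is set. The specific code values and the names `gate_code` and `gate` are assumptions made for illustration; the only thing stated above is that a violation returns a distinct exit code.

```rust
use std::process::ExitCode;

/// Hypothetical exit codes, chosen for this sketch only.
pub const EXIT_OK: u8 = 0;
pub const EXIT_TIMING_VIOLATION: u8 = 2;

/// Decide the exit code from the worst slack. A violation only fails the run
/// when the caller opted in, mirroring a flag such as --fail-on-violation.
pub fn gate_code(wns: f64, fail_on_violation: bool) -> u8 {
    if fail_on_violation && wns < 0.0 {
        EXIT_TIMING_VIOLATION
    } else {
        EXIT_OK
    }
}

/// Same decision, wrapped in the type a Rust `main` can return directly.
pub fn gate(wns: f64, fail_on_violation: bool) -> ExitCode {
    ExitCode::from(gate_code(wns, fail_on_violation))
}
```

Keeping the violation code separate from the generic failure code lets a pipeline tell "timing failed" apart from "the tool crashed or the inputs were missing".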
One spine, not a bundle
vyges-sta-si doesn't stand alone — it's the timing decision point on a cross-engine data spine, a software architecture rather than an EDA bundle:
.ext  ─► vyges-extract ─► .spef ──┐
.char ─► vyges-char    ─► .lib  ──┼─► vyges-sta-si ─► WNS / TNS ─► resize · vt-swap · buffer-insert ─► fixed netlist
                                  └─► vyges-power ─► activity ─┬─► vyges-em-ir   ─► IR-drop / EM
                                                               └─► vyges-thermal ─► temp / hotspot
.gds + rules ─► vyges-lvs ─► MATCH / MISMATCH
Each engine takes one declarative job file and emits one standard artifact, and the next engine reads it directly — no glue scripts, no silent copy-drift. vyges-char is the parallel Liberty characterizer the open community never had; vyges-extract turns layout into the SPEF parasitics that make the timing real; vyges-power computes the switching activity that drives both a power number and vyges-em-ir's IR-drop; and vyges-lvs proves the layout actually implements the schematic.
What makes that spine real isn't magic — it's vyges-loom: one streamlined set of parsers and a shared design model (Liberty, SPEF, Verilog, SPICE) that every engine reads and writes. No engine ships its own half-correct Liberty reader; they all weave on the same threads. That's what kills the glue and the copy-drift — and it's what makes a new engine cheap, because it reuses the parsers and the timing graph instead of reinventing them.
Which is how sign-off learned to act. The same spine that says what's wrong now fixes it — a second family of engines reads the timer's verdict and edits the netlist to close it: vyges-resize picks a better drive strength per cell, vyges-vt-swap trades threshold voltage to cut leakage while timing holds, and vyges-buffer-insert splits over-loaded nets. Each rides the same vyges-loom spine and scores every candidate on the same vyges-sta-si timer — they exist because the parsers and the timer already did. Analysis and optimization, one architecture, over the vyges-layout geometry kernel — all Apache-2.0 and runnable today.
Why this matters
Open silicon has been climbing the stack for a decade — open cores, open buses, open PDKs, an open IP registry, open place-and-route. Sign-off was the last rung where the open path ran out and the only answer was a vendor invoice.
Making commercial-grade timing sign-off open, foundry-format, CI-native — and portable and parallel-by-design so it runs natively from a laptop to an Arm64 fleet — is part of letting anyone take a design all the way from RTL to a foundry without asking permission. The language choice is how a small team delivers that and keeps pace with where the hardware is going.
Try it
$ vyges-sta-si demo # instant — analyze a built-in design → WNS/TNS
$ vyges-sta-si run top.sta -o top.rpt # analyze → timing report
$ vyges-sta-si run top.sta --fail-on-violation # exit non-zero if WNS < 0 (CI gate)
Already on OpenROAD, LibreLane, or OpenLane 2?
vyges-sta-si slots in right next to OpenSTA, zero rewrite. Call the binary, point it at your existing OpenSTA script, or drop in the LibreLane step. Integration guide → docs.vyges.com.
See Vyges™ Loom and the data spine, or go straight to vyges-sta-si.
The sign-off math you used to pay a vendor for — open, in Rust, everywhere.