goo rendering benchmarks
what rebuild, diff, and apply actually cost in goo, measured two ways.
in goo, every UI class extends GooPanel<TRoot>. you implement Build(), which returns a tree of blobs describing the UI you want. goo runs Build(), compares the new tree against the prior one (the diff step), and mutates the engine Panel tree to match (the apply step). Rebuild() schedules the next Build() run. see build method for the full explanation.
the benchmarks run in two distinct environments, each covering a different slice of the pipeline:
- out of editor (`Code.Tests/`, vanilla .NET test host). diff-only: Build plus the reconciler's diff pass. allocation is reported via `GC.GetAllocatedBytesForCurrentThread()`, and wall-clock via `Stopwatch`. the apply step (mutating real engine panels) is excluded because there is no engine in the test host.
- in editor (s&box runtime). end-to-end wall-clock: each benchmark component exposes a `Tick` method that calls `Rebuild()`, which triggers `Build()`, the diff pass, and the apply pass against a real `Sandbox.UI.Panel` tree. the panels are hosted inside an s&box `ScreenPanel` component (the engine's built-in host for 2D world UI, not a goo type).
the diff-only numbers are roughly the floor on goo's per-frame cost. the in-editor numbers are the end-to-end truth, since the apply pass mutates real engine panels.
out of editor
allocations (bytes)
Init is the one-time cost when the screen first appears. Per-iter is the steady-state cost on every redraw after that.
| Surface | Pattern | Init | Per-iter |
|---|---|---|---|
| Flat | IdenticalTree | 268,040 | 0 |
| Flat | SingleTextEdit | 268,040 | 0 |
| ShallowGrid | IdenticalTree | 276,000 | 0 |
| ShallowGrid | SingleTextEdit | 276,000 | 0 |
| DeepNest | IdenticalTree | 40,480 | 0 |
| DeepNest | SingleTextEdit | 40,480 | 0 |
| RealisticUI | IdenticalTree | 59,576 | 0 |
| RealisticUI | SingleTextEdit | 59,576 | 0 |
| Counter | IdenticalTree | 6,832 | 0 |
| Counter | SingleTextEdit | 6,832 | 0 |
| AnimatedShape | IdenticalTree | 4,488 | 0 |
| AnimatedShape | SingleTextEdit | 4,488 | 0 |
| DiffKeyed-500 | Stable | 298,328 | 0 |
| DiffKeyed-500 | Reverse | 298,328 | 0 |
wall-clock per iteration (microseconds)
| Surface | Pattern | Goo (us) |
|---|---|---|
| Flat | IdenticalTree | 47.48 |
| Flat | SingleTextEdit | 46.34 |
| ShallowGrid | IdenticalTree | 53.84 |
| ShallowGrid | SingleTextEdit | 53.99 |
| DeepNest | IdenticalTree | 7.80 |
| DeepNest | SingleTextEdit | 8.73 |
| RealisticUI | IdenticalTree | 15.08 |
| RealisticUI | SingleTextEdit | 15.60 |
| Counter | IdenticalTree | 2.34 |
| Counter | SingleTextEdit | 2.51 |
| AnimatedShape | IdenticalTree | 1.62 |
| AnimatedShape | SingleTextEdit | 1.72 |
| DiffKeyed-500 | Stable | 478.27 |
| DiffKeyed-500 | Reverse | 879.03 |
what this means
- every cell allocates zero bytes per redraw.
- this is achieved with memory pools that keep containers alive between redraws. on each redraw they are cleared, refilled, read, then handed back to the pool.
- first mount allocates once. bigger screens cost more.
- DeepNest is the fastest of the larger surfaces, around 8 microseconds. the small Counter and AnimatedShape surfaces are faster still, at roughly 2 microseconds.
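the pooling pattern described above can be sketched in plain C#. this is a minimal illustration, not goo's implementation: the `pool` stack and the `Rent`, `Return`, and `Redraw` helpers are hypothetical names, and a `List<int>` stands in for whatever containers the reconciler actually pools.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool: a stack of reusable lists. Not goo's actual types.
var pool = new Stack<List<int>>();

// Take a list from the pool, or allocate one if the pool is empty.
List<int> Rent() => pool.Count > 0 ? pool.Pop() : new List<int>(capacity: 512);

// Clear keeps the backing array, so the next refill does not allocate.
void Return(List<int> list)
{
    list.Clear();
    pool.Push(list);
}

// One simulated redraw: rent, refill, read, hand back.
int Redraw(int childCount)
{
    var scratch = Rent();
    for (int i = 0; i < childCount; i++) scratch.Add(i);
    int sum = 0;
    for (int i = 0; i < scratch.Count; i++) sum += scratch[i];
    Return(scratch);
    return sum;
}

// The first redraw allocates the list (the "Init" cost).
Redraw(300);

// Later redraws reuse it, so the allocation delta should stay flat.
long before = GC.GetAllocatedBytesForCurrentThread();
for (int i = 0; i < 100; i++) Redraw(300);
long after = GC.GetAllocatedBytesForCurrentThread();

Console.WriteLine($"steady-state bytes per redraw: {(after - before) / 100}");
```

the list is sized once on first use and only cleared afterwards, which is the same shape as the Init versus Per-iter split in the table.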
in editor
the in-editor benchmark harness was removed from the source tree. the numbers below were captured from an earlier run and are preserved here as a reference. re-running them requires re-implementing a ConCmd harness in the s&box runtime.
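as a starting point for re-implementing that harness, the timing core could look roughly like the sketch below. it uses only `DateTime.UtcNow.Ticks`, which the harness section names as the in-editor timer. `TimeTicks` and `TicksToMicroseconds` are hypothetical helper names, the `Action` is a stand-in for a benchmark component's `Tick` method, and the ConCmd wiring is left out because its exact shape is not shown here. note that `DateTime.UtcNow` is wall-clock time, so a system clock adjustment during a run would skew the result.

```csharp
using System;

// Hypothetical stand-in for the in-editor timing loop.
// Times `iterations` calls and returns microseconds per iteration.
static double TimeTicks(Action tick, int iterations)
{
    long start = DateTime.UtcNow.Ticks;
    for (int i = 0; i < iterations; i++) tick();
    long end = DateTime.UtcNow.Ticks;
    return TicksToMicroseconds(end - start) / iterations;
}

// One tick is 100 ns, so 10 ticks make one microsecond.
static double TicksToMicroseconds(long ticks) => ticks / 10.0;

// Usage: the lambda plays the role of a component's Tick method.
int calls = 0;
double perIter = TimeTicks(() => calls++, 100);
Console.WriteLine($"{perIter:F2} us per iteration over {calls} calls");
```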
wall-clock per iteration (microseconds)
| Surface | Pattern | Goo (us) | delta vs out-of-editor (us) |
|---|---|---|---|
| Flat | IdenticalTree | 48.82 | +1.3 |
| Flat | SingleTextEdit | 49.96 | +3.6 |
| ShallowGrid | IdenticalTree | 58.75 | +4.9 |
| ShallowGrid | SingleTextEdit | 58.87 | +4.9 |
| DeepNest | IdenticalTree | 7.23 | -0.6 |
| DeepNest | SingleTextEdit | 8.18 | -0.5 |
| RealisticUI | IdenticalTree | 17.22 | +2.1 |
| RealisticUI | SingleTextEdit | 15.89 | +0.3 |
| Counter | IdenticalTree | 2.32 | -0.0 |
| Counter | SingleTextEdit | 2.70 | +0.2 |
| AnimatedShape | IdenticalTree | 1.81 | +0.2 |
| AnimatedShape | SingleTextEdit | 5.49 | +3.8 |
| DiffKeyed-500 | Stable | 855.59 | +377 |
| DiffKeyed-500 | Reverse | 3139.34 | +2260 |
in-editor allocations (qualitative)
captured from the s&box editor's allocations panel during the pre-pooling run. every dominant goo type from that baseline is now pool-backed:
| Type | Pre-pool count/iter | Status |
|---|---|---|
| `Entry[String,Fiber][]` | 333 | eliminated (Dictionary pooled on BuildContext) |
| `Entry[String][]` | 296 | eliminated (HashSet pooled on BuildContext) |
| `System.Int32[]` | 220 | eliminated (HostPath int[] pooled by length on BuildContext) |
| `Goo.Op[]` | 74 | eliminated (Op is a value type, List |
BuildContext is the internal per-build scratch context that the reconciler uses to track the current path and keyed sets. the out-of-editor alloc table above is the source of truth for diff-side cost. the in-editor panel reflects the same pooling, qualitatively.
what this means
- simple screens match in and out of editor within about 5 microseconds.
- AnimatedShape SingleTextEdit is slower in the engine (5.49 vs 1.72 us). the engine sets the size every redraw.
- DiffKeyed-500 Reverse is the slowest case (3139.34 us in editor vs 879.03 us diff-only). the engine moves 500 panels.
harness
- Build: Release (`dotnet test --configuration Release` for out-of-editor. in-editor benchmark numbers above are from Release-loaded addons).
- Trial structure: 5 trials per cell, each = 10 warmup iterations + 100 measured iterations. Reported value = median of the 5 trial means.
- Out-of-editor timer: `Stopwatch.Elapsed.TotalMicroseconds` divided by iteration count.
- Out-of-editor allocation: `GC.GetAllocatedBytesForCurrentThread()` delta divided by iteration count.
- In-editor timer: `DateTime.UtcNow.Ticks` (100ns tick units; wall-clock, so not strictly monotonic). `Time.Now` is frame-locked and stays constant during a synchronous ConCmd loop. `Stopwatch` and `GC.*` are not on the addon whitelist.
- Out-of-editor source: `Code.Tests/AllocationProfileTests.cs`, `Code.Tests/DiffKeyedPerfTests.cs`.
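the trial structure above can be sketched as follows. this is an illustration of the stated procedure, not the code in `Code.Tests/`: `MeasureCell` and `Median` are hypothetical names, and the `Action` is a stand-in for one Build-plus-diff iteration. `TimeSpan.TotalMicroseconds` assumes .NET 7 or later.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper: runs one benchmark cell and returns the median of the
// per-trial means for microseconds and allocated bytes per iteration.
static (double Micros, double Bytes) MeasureCell(
    Action iteration, int trials = 5, int warmup = 10, int measured = 100)
{
    var micros = new double[trials];
    var bytes = new double[trials];

    for (int t = 0; t < trials; t++)
    {
        // Warmup iterations run but are not measured.
        for (int i = 0; i < warmup; i++) iteration();

        // Create the stopwatch before sampling allocations so that
        // its own allocation is not counted against the cell.
        var sw = new Stopwatch();
        long allocBefore = GC.GetAllocatedBytesForCurrentThread();
        sw.Start();
        for (int i = 0; i < measured; i++) iteration();
        sw.Stop();
        long allocAfter = GC.GetAllocatedBytesForCurrentThread();

        // Trial mean = total over the measured loop divided by iteration count.
        micros[t] = sw.Elapsed.TotalMicroseconds / measured;
        bytes[t] = (allocAfter - allocBefore) / (double)measured;
    }

    return (Median(micros), Median(bytes));
}

static double Median(double[] values)
{
    var sorted = values.OrderBy(v => v).ToArray();
    int mid = sorted.Length / 2;
    return sorted.Length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
}

// Usage: the lambda stands in for one benchmark iteration.
int counter = 0;
var result = MeasureCell(() => counter++);
Console.WriteLine($"{result.Micros:F2} us/iter, {result.Bytes:F0} bytes/iter");
```

taking the median of trial means, rather than one mean over all 500 iterations, keeps a single trial hit by a collection or a scheduler hiccup from shifting the reported value.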
caveats
- No GC-pause attribution. Wall-clock includes any gen-0 collections that fire during the measured loop. DiffKeyed-500 cells (90-131KB alloc) plausibly trigger collections that inflate wall-clock somewhat, but the share is not separable from useful work.
- Single hardware sample. Numbers are valid for one Windows 11 machine. cross-machine comparison requires re-running the harness.
- In-editor numbers assume the bench surface is the only thing rendering. A real frame would mix this work with other engine work. the per-iter numbers here are isolated cost, not frame-budget share.
surfaces
- Flat: 1 container, 300 text children. SingleTextEdit swaps child 0 per iter.
- ShallowGrid: 1 container, 15 column containers, 20 text children each (316 nodes). SingleTextEdit swaps cell (0,0) per iter.
- DeepNest: 16 nested containers, 1 leaf text. SingleTextEdit swaps the leaf per iter.
- RealisticUI: 9 styled containers, 26 text children, depth 5, 12-field style profile per container. Typed init-properties. SingleTextEdit swaps card child 0 per iter.
- Counter: 1 styled container, 1 text label, 1 child button container with click handler and 1 text child. SingleTextEdit swaps the label per iter.
- AnimatedShape: 200x16 progress bar, child filled width driven by a float sampled from a damper in the runner. IdenticalTree holds the fill at 1. SingleTextEdit toggles fill between 0 and 1 per iter.
- DiffKeyed-500: 1 container, 500 keyed text children. Stable rebuilds same order. Reverse rebuilds reversed order on alternating iterations.
see also
- build method - `Rebuild()` mechanics, the structural diff, and the apply pass in detail.
- animations - damper types including the float dampers used to drive the AnimatedShape surface.
- shapes - the shape primitives (`Sector`, `Arc`, `Polygon`) available as blob types.