goo rendering benchmarks

what rebuild, diff, and apply actually cost in goo, measured two ways.

in goo, every UI class extends GooPanel<TRoot>. you implement Build(), which returns a tree of blobs describing the UI you want. goo runs Build(), compares the new tree against the prior one (the diff step), and mutates the engine Panel tree to match (the apply step). Rebuild() schedules the next Build() run. see build method for the full explanation.
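the build, diff, apply cycle can be pictured with a toy model. this is a sketch only: Blob, Op, ToyPanel, Diff, and Apply below are hypothetical stand-ins written for illustration, not goo's types or its reconciler, and the tree is flattened to a list to keep it short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; none of these are goo's real types.
public readonly record struct Blob(string Text);          // what a Build() call would describe
public readonly record struct Op(int Index, string Text); // one pending mutation
public sealed class ToyPanel { public string Text = ""; } // stands in for an engine panel

public static class ToyReconciler
{
    // Diff step: compare the prior and next descriptions and emit ops only where they differ.
    // Assumes both lists have the same length to keep the sketch short.
    public static List<Op> Diff(IReadOnlyList<Blob> prev, IReadOnlyList<Blob> next)
    {
        var ops = new List<Op>();
        for (int i = 0; i < next.Count; i++)
        {
            if (prev[i].Text != next[i].Text)
                ops.Add(new Op(i, next[i].Text));
        }
        return ops;
    }

    // Apply step: mutate the live panels so they match the new description.
    public static void Apply(IReadOnlyList<ToyPanel> panels, List<Op> ops)
    {
        foreach (var op in ops)
            panels[op.Index].Text = op.Text;
    }
}
```

in this toy, an unchanged description yields zero ops and a single changed text yields one op, so the apply step only touches what changed.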

the benchmarks run in two environments, each covering a different part of that pipeline:

  1. out of editor (Code.Tests/, vanilla .NET test host). diff-only: Build plus the reconciler's diff pass. allocation is reported via GC.GetAllocatedBytesForCurrentThread(), and wall-clock via Stopwatch. the apply step (mutating real engine panels) is excluded because there is no engine in the test host.
  2. in editor (s&box runtime). end-to-end wall-clock: each benchmark component exposes a Tick method that calls Rebuild(), which triggers Build(), the diff pass, and the apply pass against a real Sandbox.UI.Panel tree. the panels are hosted inside an s&box ScreenPanel component (the engine's built-in host for screen-space UI, not a goo type).
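the out-of-editor measurement can be sketched as follows. this is a minimal sketch, not the harness in Code.Tests/: DiffBench, Measure, and the step delegate (standing in for "Build plus diff" on one surface) are hypothetical names, and treating the first call as Init is an assumption. GC.GetAllocatedBytesForCurrentThread() and Stopwatch are the real .NET APIs named above.

```csharp
using System;
using System.Diagnostics;

public static class DiffBench
{
    // `step` is a hypothetical stand-in for one Build-plus-diff pass; it is not goo API.
    public static (long InitBytes, long PerIterBytes, double MicrosPerIter) Measure(
        Action step, int iterations)
    {
        if (iterations <= 0) throw new ArgumentOutOfRangeException(nameof(iterations));

        // Init: the first call pays for one-time setup.
        long before = GC.GetAllocatedBytesForCurrentThread();
        step();
        long initBytes = GC.GetAllocatedBytesForCurrentThread() - before;

        // Steady state: allocation and wall-clock over the measured loop.
        before = GC.GetAllocatedBytesForCurrentThread();
        long start = Stopwatch.GetTimestamp();
        for (int i = 0; i < iterations; i++) step();
        long ticks = Stopwatch.GetTimestamp() - start;
        long loopBytes = GC.GetAllocatedBytesForCurrentThread() - before;

        // Convert Stopwatch ticks to microseconds.
        double micros = ticks * 1_000_000.0 / Stopwatch.Frequency;

        // Integer division: a loop total smaller than `iterations` bytes reports 0.
        return (initBytes, loopBytes / iterations, micros / iterations);
    }
}
```

the allocation counter is per-thread, so the sketch assumes the measured work runs on the calling thread.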

the diff-only numbers are a lower bound on goo's per-rebuild cost, since they leave out the apply pass. the in-editor numbers are the end-to-end figure, since they include the apply pass mutating real engine panels.

out of editor

allocations (bytes)

Init is the one-time cost when the screen first appears. Per-iter is the steady-state cost on every redraw after that.

Surface        Pattern         Init (bytes)  Per-iter (bytes)
Flat           IdenticalTree   268,040       0
Flat           SingleTextEdit  268,040       0
ShallowGrid    IdenticalTree   276,000       0
ShallowGrid    SingleTextEdit  276,000       0
DeepNest       IdenticalTree   40,480        0
DeepNest       SingleTextEdit  40,480        0
RealisticUI    IdenticalTree   59,576        0
RealisticUI    SingleTextEdit  59,576        0
Counter        IdenticalTree   6,832         0
Counter        SingleTextEdit  6,832         0
AnimatedShape  IdenticalTree   4,488         0
AnimatedShape  SingleTextEdit  4,488         0
DiffKeyed-500  Stable          298,328       0
DiffKeyed-500  Reverse         298,328       0

wall-clock per iteration (microseconds)

Surface        Pattern         Goo (us)
Flat           IdenticalTree   47.48
Flat           SingleTextEdit  46.34
ShallowGrid    IdenticalTree   53.84
ShallowGrid    SingleTextEdit  53.99
DeepNest       IdenticalTree   7.80
DeepNest       SingleTextEdit  8.73
RealisticUI    IdenticalTree   15.08
RealisticUI    SingleTextEdit  15.60
Counter        IdenticalTree   2.34
Counter        SingleTextEdit  2.51
AnimatedShape  IdenticalTree   1.62
AnimatedShape  SingleTextEdit  1.72
DiffKeyed-500  Stable          478.27
DiffKeyed-500  Reverse         879.03

what this means

in editor

the in-editor benchmark harness was removed from the source tree. the numbers below were captured from an earlier run and are preserved here as a reference. re-running them requires re-implementing a ConCmd harness in the s&box runtime.

wall-clock per iteration (microseconds)

Surface        Pattern         Goo (us)  delta vs out-of-editor (us)
Flat           IdenticalTree   48.82     +1.3
Flat           SingleTextEdit  49.96     +3.6
ShallowGrid    IdenticalTree   58.75     +4.9
ShallowGrid    SingleTextEdit  58.87     +4.9
DeepNest       IdenticalTree   7.23      -0.6
DeepNest       SingleTextEdit  8.18      -0.5
RealisticUI    IdenticalTree   17.22     +2.1
RealisticUI    SingleTextEdit  15.89     +0.3
Counter        IdenticalTree   2.32      -0.0
Counter        SingleTextEdit  2.70      +0.2
AnimatedShape  IdenticalTree   1.81      +0.2
AnimatedShape  SingleTextEdit  5.49      +3.8
DiffKeyed-500  Stable          855.59    +377
DiffKeyed-500  Reverse         3139.34   +2260
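the delta column is the in-editor figure minus the out-of-editor figure for the same surface and pattern. a minimal sketch of that arithmetic, for anyone re-deriving the column from a fresh run; DeltaCheck and its two helpers are illustrative names only.

```csharp
using System;

public static class DeltaCheck
{
    // delta = in-editor microseconds minus out-of-editor microseconds for the same row.
    public static double Delta(double inEditorUs, double outOfEditorUs)
        => inEditorUs - outOfEditorUs;

    // ratio of end-to-end cost to diff-only cost for the same row.
    public static double Ratio(double inEditorUs, double outOfEditorUs)
        => inEditorUs / outOfEditorUs;
}
```

for example, Delta(48.82, 47.48) is about 1.34 (shown as +1.3) and Delta(3139.34, 879.03) is about 2260.31 (shown as +2260), with Ratio(3139.34, 879.03) coming to roughly 3.6.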

in-editor allocations (qualitative)

captured from the s&box editor's allocations panel during the pre-pooling run. every dominant goo type from that baseline is now pool-backed:

Type                   Pre-pool count/iter  Status
Entry[String,Fiber][]  333                  eliminated (Dictionary pooled on BuildContext)
Entry[String][]        296                  eliminated (HashSet pooled on BuildContext)
System.Int32[]         220                  eliminated (HostPath int[] pooled by length on BuildContext)
Goo.Op[]               74                   eliminated (Op is a value type, List._items pooled across Diffs)

BuildContext is the internal per-build scratch context that the reconciler uses to track the current path and keyed sets. the out-of-editor alloc table above is the source of truth for diff-side cost. the in-editor panel reflects the same pooling, qualitatively.
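"pooled on BuildContext" means the scratch collections are kept alive and cleared between builds instead of being allocated each time. a minimal sketch of that pattern under simplifying assumptions: ScratchContext and its members are hypothetical and do not mirror goo's internal BuildContext.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical scratch context; goo's real BuildContext is internal and may differ.
public sealed class ScratchContext
{
    // Allocated once, then reused by every build.
    private readonly HashSet<string> _seenKeys = new HashSet<string>();
    private readonly Dictionary<int, int[]> _pathsByLength = new Dictionary<int, int[]>();

    // Clear() empties the set but keeps its backing storage,
    // so steady-state builds do not allocate new buckets.
    public HashSet<string> RentSeenKeys()
    {
        _seenKeys.Clear();
        return _seenKeys;
    }

    // One int[] per requested length, handed out again on later builds.
    // The caller must be finished with the previous array of that length.
    public int[] RentPath(int length)
    {
        if (!_pathsByLength.TryGetValue(length, out var path))
        {
            path = new int[length];
            _pathsByLength[length] = path;
        }
        return path;
    }
}
```

the trade-off is that pooled buffers hold stale data between uses, so callers must overwrite or clear them before reading.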

what this means

harness

caveats

  1. No GC-pause attribution. Wall-clock includes any gen-0 collections that fire during the measured loop. DiffKeyed-500 cells (90-131KB alloc) plausibly trigger collections that inflate wall-clock somewhat, but the share is not separable from useful work.
  2. Single hardware sample. Numbers are valid for one Windows 11 machine. cross-machine comparison requires re-running the harness.
  3. In-editor numbers assume the bench surface is the only thing rendering. A real frame would mix this work with other engine work. the per-iter numbers here are isolated cost, not frame-budget share.
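for caveat 1, one partial check is to count gen-0 collections around the measured loop. that shows whether a collection fired, though not how long it paused. a sketch using the real GC.CollectionCount API; GcProbe, Gen0During, and the step delegate are hypothetical names, and the count is process-wide rather than per-thread.

```csharp
using System;

public static class GcProbe
{
    // Returns how many gen-0 collections occurred while `iterations` calls to `step` ran.
    // `step` is a hypothetical stand-in for one benchmark iteration.
    public static int Gen0During(Action step, int iterations)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < iterations; i++) step();
        return GC.CollectionCount(0) - before;
    }
}
```

a result of zero for a cell means its wall-clock figure contains no gen-0 collection from that run. a non-zero result only flags the cell as affected.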

surfaces

see also