3124 Commits

Author SHA1 Message Date
chenmo.gl
8eeb0e80e4 refactor(shader): rename batch-tracking helpers to reflect their actual role
The list and helpers track renderer-group samplers and arrays — values that
GPU instancing forces every instance in a draw call to share, *not* values
that go into the per-instance UBO. The original `_instanceBatchFields` /
`_updateRendererInstanceBatchFields` / `_matchesRendererInstanceBatch` read
as "fields packed into the instance batch", which is the opposite of what
they store. Rename to `_batchSharedFields` etc. so the names match the
"must be shared across the batch" semantics, and tighten the field to
`private` (only used by helpers on the same class).
2026-05-28 00:56:33 +08:00
chenmo.gl
21cb793040 refactor(shader): extract renderer-group batch field tracking into a helper
`_setPropertyValue` was carrying ~25 lines of `_instanceBatchFields` maintenance
inline. Lift it into a dedicated `_updateRendererInstanceBatchFields` helper so
the main setter reads as "convert/validate property → maintain batch index →
store value" and the sorted-insert / unsorted-remove details stay localized.
2026-05-28 00:43:23 +08:00
chenmo.gl
d75a144815 fix(instancing): refuse instanced batches when renderer-group samplers/arrays differ
GPU instancing requires every instance in a draw call to share the same sampler
bindings and array uniforms — GLSL forbids those types inside per-instance UBOs.
`MeshRenderer._canBatch` previously ignored this and would group renderers with
different `renderer_*` textures or array uniforms into one instanced draw, where
the leader's bindings silently won and the followers rendered with the wrong
data. The matching instanced path in `RenderQueue.render` also skipped uploading
the renderer-block plain uniforms entirely, so even a single-renderer batch had
its samplers/arrays unbound.

Track each renderer-group sampler/array property as it's set and keep the ids in
a sorted list on `ShaderData`. `_canBatch` now compares those lists index by
index — equal lists mean every instance agrees on what to bind, so the batch is
safe; any divergence falls back to separate draw calls. `RenderQueue.render`
uploads the renderer block on the instanced path too, so the leader's plain
uniforms reach the GPU.
2026-05-27 18:33:01 +08:00
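
The sorted-list comparison described in d75a144815 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the engine's code: `BatchSharedBindings`, `set`, and `canShareInstancedDraw` are hypothetical names, property ids are plain numbers, and bound values are compared by reference.

```typescript
// Hypothetical stand-in for the per-ShaderData tracking of renderer-group
// samplers/arrays. Names and shapes are assumptions for illustration only.
class BatchSharedBindings {
  readonly ids: number[] = []; // tracked property ids, kept in ascending order
  readonly values: { [id: number]: object | undefined } = {};

  set(id: number, value: object | null): void {
    if (value === null) {
      // Unset: drop the id from the tracked list.
      const index = this.ids.indexOf(id);
      if (index >= 0) this.ids.splice(index, 1);
      delete this.values[id];
      return;
    }
    if (this.values[id] === undefined) {
      // Binary search for the insertion point so the list stays sorted.
      let lo = 0;
      let hi = this.ids.length;
      while (lo < hi) {
        const mid = (lo + hi) >>> 1;
        if (this.ids[mid] < id) lo = mid + 1;
        else hi = mid;
      }
      this.ids.splice(lo, 0, id);
    }
    this.values[id] = value;
  }
}

// Two renderers may share an instanced draw only if they track the same ids
// and bind the same object for every id.
function canShareInstancedDraw(a: BatchSharedBindings, b: BatchSharedBindings): boolean {
  if (a.ids.length !== b.ids.length) return false;
  for (let i = 0; i < a.ids.length; i++) {
    const id = a.ids[i];
    if (id !== b.ids[i]) return false;
    if (a.values[id] !== b.values[id]) return false;
  }
  return true;
}
```

Because both lists are sorted, the comparison is a single linear walk and the order in which properties were set does not matter.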
chenmo.gl
c46c15a5fb fix(ui): derive grid cell size from union AABB instead of referenceResolution
UIBatchSorter used `Math.max(refX, refY)` as `canvasLongestEdge` to derive
the grid cell size. `_referenceResolution` is a screen-space concept and
has no relation to world-space canvas-local coordinates, so on world-space canvases
(and on screen-space canvases whose pivot pushes elements into the negative half)
the grid degenerated — elements collapsed onto edge cells or stretched
beyond the grid, both inflating the per-cell occupancy that the spatial
hash was meant to keep flat.

Walk the elements once at the start of `sort`, accumulating the union
AABB while we already do the world-to-canvas-local transform, then use
`unionRange / gridDim` for the grid footprint. The grid now auto-fits
the actual element distribution regardless of canvas mode, pivot, or
world scale, and the accumulation rides on the existing transform pass
so the added cost is just a few minmax compares per element.
2026-05-27 00:42:43 +08:00
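
A minimal sketch of the union-AABB fit from c46c15a5fb, assuming 2D canvas-local rectangles. `Rect`, `fitGridToElements`, and `cellIndex` are hypothetical names, and taking the longer axis of the union as `unionRange` is an assumption; the commit only states `unionRange / gridDim`.

```typescript
interface Rect { minX: number; minY: number; maxX: number; maxY: number; }
interface GridFit { originX: number; originY: number; cellSize: number; }

// Accumulate the union AABB in one pass, then derive a square cell size.
function fitGridToElements(rects: Rect[], gridDim: number): GridFit {
  if (rects.length === 0) return { originX: 0, originY: 0, cellSize: 1 };
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (let i = 0; i < rects.length; i++) {
    const r = rects[i];
    minX = Math.min(minX, r.minX);
    minY = Math.min(minY, r.minY);
    maxX = Math.max(maxX, r.maxX);
    maxY = Math.max(maxY, r.maxY);
  }
  const unionRange = Math.max(maxX - minX, maxY - minY);
  // Guard against a degenerate union (all elements collapsed to one point).
  const cellSize = unionRange > 0 ? unionRange / gridDim : 1;
  return { originX: minX, originY: minY, cellSize };
}

// Map a coordinate to a cell, measured from the union's minimum corner.
function cellIndex(coord: number, origin: number, cellSize: number, gridDim: number): number {
  const cell = Math.floor((coord - origin) / cellSize);
  return Math.min(gridDim - 1, Math.max(0, cell));
}
```

Measuring from the union's minimum corner is what keeps negative canvas-local coordinates inside the grid in this sketch.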
chenmo.gl
5b0416368f fix(pipeline): isolate per-queue batch state when a RenderElement crosses queues
A multi-pass material whose passes target different queues (e.g. Opaque +
Transparent) pushed the same `RenderElement` object into both queues.
`RenderElement` carries the batch result (`instancedRenderers`,
`_isBatched`), so the queue that batches first stamped those fields on
the shared object and the next queue's `BatcherManager.batch` then
short-circuited the polluted leader straight into its output — the
opaque instance list got drawn under the transparent pass, and the
remaining members were re-batched into a second leader and drawn again
(producing the "1 olive + 5 greener" signature on N-cube repros).

Clone the element once per additional queue: the first target keeps the
original (zero overhead), subsequent ones get a pool-allocated copy with
its own batch state via `_cloneFrom`. The pool is `ClearableObjectPool`
so no GC pressure. Identity fields are copied; batch state stays
per-queue.
2026-05-25 16:46:52 +08:00
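
The clone-per-additional-queue idea in 5b0416368f might look roughly like this. All types here are simplified, hypothetical stand-ins (`Element`, `SimplePool`, `pushToQueues`); the real `RenderElement` and `ClearableObjectPool` carry more state than shown.

```typescript
// Simplified stand-in: one identity field plus the per-queue batch state.
class Element {
  rendererId = 0;                 // identity, copied into clones
  isBatched = false;              // batch state, must stay per-queue
  instancedRenderers: number[] = [];

  cloneFrom(source: Element): void {
    this.rendererId = source.rendererId;
    // Batch state is reset rather than copied, so each queue starts clean.
    this.isBatched = false;
    this.instancedRenderers.length = 0;
  }
}

// Minimal reusable pool: clear() rewinds the cursor instead of freeing.
class SimplePool {
  private items: Element[] = [];
  private used = 0;
  get(): Element {
    if (this.used === this.items.length) this.items.push(new Element());
    return this.items[this.used++];
  }
  clear(): void {
    this.used = 0;
  }
}

// The first target queue keeps the original; later queues get pooled clones.
function pushToQueues(element: Element, queues: Element[][], pool: SimplePool): void {
  for (let i = 0; i < queues.length; i++) {
    let target = element;
    if (i > 0) {
      target = pool.get();
      target.cloneFrom(element);
    }
    queues[i].push(target);
  }
}
```

With this shape, a batcher that stamps `isBatched` on the opaque queue's element no longer affects what the transparent queue sees.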
chenmo.gl
716e2f9529 fix(ui): clamp grid cell indices on both sides for negative canvas-local coords
The grid addressing in UIBatchSorter assumed canvas-local coords spanned
`[0, canvasLongestEdge]`, but `UITransform.pivot` defaults to `(0.5, 0.5)`,
making canvas-local `[-W/2, +W/2]`. For elements wholly in the negative
half (e.g. a button in the bottom half of a centered canvas), `floor()`
produced a negative `maxCell`, the `for (cellY <= maxCellY)` loop never
entered, and the element was both unregistered from the grid and skipped
during overlap queries. Missed overlaps left elements at `depth = 0`
together with the full-screen background, and `_compareEntries`'s
`textureId` tiebreaker reordered them across hierarchy, so the background
ended up drawn on top of buttons it should have sat behind.

Clamp every cell index on both sides. Negative-half elements now collapse
into edge cells; the precise `_rectOverlap` check still uses the real
bounds, so collapsing is safe and depth bumping behaves correctly.
2026-05-25 11:51:39 +08:00
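
To illustrate the two-sided clamp from 716e2f9529, here is a small sketch with hypothetical helpers (`clampCell`, `cellRange`); the grid origin at 0 mirrors the original `[0, canvasLongestEdge]` assumption described in the commit.

```typescript
// Convert a coordinate to a cell index and clamp it into [0, gridDim - 1].
function clampCell(coord: number, cellSize: number, gridDim: number): number {
  const cell = Math.floor(coord / cellSize);
  return Math.min(gridDim - 1, Math.max(0, cell));
}

// Inclusive cell range covered by the interval [min, max] along one axis.
function cellRange(min: number, max: number, cellSize: number, gridDim: number): [number, number] {
  return [clampCell(min, cellSize, gridDim), clampCell(max, cellSize, gridDim)];
}

// Count how many cells a registration loop would visit for that range.
function visitedCells(range: [number, number]): number {
  let count = 0;
  for (let cell = range[0]; cell <= range[1]; cell++) count++;
  return count;
}
```

Without the lower clamp, an interval such as `[-300, -100]` with `cellSize = 100` yields a maximum cell of `-1`, so a loop of the form `for (cell = 0; cell <= maxCell; cell++)` never runs; with both clamps the element lands in cell 0.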
chenmo.gl
0805f2d932 fix(instancing): use cofactor-form normal matrix to survive singular models
`renderer_NormalMat` was emitted as `mat4(transpose(inverse(mat3(renderer_ModelMat))))`.
GLSL `inverse()` is undefined for singular matrices, and any zero scale axis
(e.g. an animation hiding an entity via `scale = (1, 0, 1)`) makes the 3x3
model matrix singular — drivers typically return NaN/Inf, which then
contaminates the normal, the lighting, and finally the whole fragment.

The plain uniform path is protected by `Matrix.invert`'s `if (!det) return null`
early-out, but instancing recomputes the matrix on the GPU each draw with no
such guard, so the same scene that renders (with stale-but-finite normals)
without instancing went all-black with instancing.

Cofactor (cross-product) form equals `det(M) · transpose(inverse(M))`, so it
matches the classic formula in direction after `normalize()` but avoids the
divide-by-det entirely. Aligned with Filament / Unreal.
2026-05-19 21:36:39 +08:00
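
The cofactor form in 0805f2d932 can be checked on the CPU with a short sketch. For a 3x3 matrix with columns `c0`, `c1`, `c2`, the cofactor matrix has columns `c1 × c2`, `c2 × c0`, `c0 × c1`, and no division by the determinant occurs. The helper names below are hypothetical and the code is illustrative TypeScript, not the emitted GLSL.

```typescript
type Vec3 = [number, number, number];

function cross(a: Vec3, b: Vec3): Vec3 {
  return [
    a[1] * b[2] - a[2] * b[1],
    a[2] * b[0] - a[0] * b[2],
    a[0] * b[1] - a[1] * b[0]
  ];
}

// Columns of the cofactor matrix of M = [c0 c1 c2].
// This equals det(M) * transpose(inverse(M)) whenever M is invertible.
function cofactorColumns(c0: Vec3, c1: Vec3, c2: Vec3): [Vec3, Vec3, Vec3] {
  return [cross(c1, c2), cross(c2, c0), cross(c0, c1)];
}

// Column-major matrix times vector.
function mulColumns(cols: [Vec3, Vec3, Vec3], v: Vec3): Vec3 {
  return [
    cols[0][0] * v[0] + cols[1][0] * v[1] + cols[2][0] * v[2],
    cols[0][1] * v[0] + cols[1][1] * v[1] + cols[2][1] * v[2],
    cols[0][2] * v[0] + cols[1][2] * v[1] + cols[2][2] * v[2]
  ];
}
```

For the `scale = (1, 0, 1)` case from the commit, the model matrix is singular, yet the cofactor matrix is `diag(0, 1, 0)`: a normal along Y stays `(0, 1, 0)` and every component remains finite.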
chenmo.gl
6f9436b52d Merge remote-tracking branch 'origin/dev/2.0' into feat/gpu-instancing 2026-05-19 21:05:24 +08:00
ChenMo
17e88c8966 feat(loader): recognize .aac and .flac extensions for audioClip loader (#3009)
add "aac" and "flac" to AudioLoader's resourceLoader extension list so
2026-05-19 19:07:56 +08:00
ChenMo
cc78efd981 feat(loader): recognize .m4a extension for audioClip loader (#3008)
2026-05-19 15:57:18 +08:00
zhuxudong
2563fc5911 fix(shader-compiler, loader): silence verbose-mode noise + drop ShaderLoader dead code (#3007)
fix: silence verbose-mode noise + drop ShaderLoader dead code
2026-05-18 19:29:51 +08:00
luzhuang
f728209255 fix(animation, loader): extract animation and GLTF loader fixes from #2983 (#2999)
* fix(animation): add per-instance speed to AnimatorStatePlayData
2026-05-18 16:54:15 +08:00
cptbtptpbcptdtptp
9246f5ea97 chore: release v2.0.0-alpha.33 v2.0.0-alpha.33 2026-05-14 16:27:48 +08:00
zhuxudong
8a24469dad fix(shader-compiler): silence spurious form-param warnings + createGSError release-mode regression (#3005)
* fix(shader-compiler): silence spurious form-param "undeclared" warnings
2026-05-14 14:48:15 +08:00
zhuxudong
66ea595d0a fix(shader-compiler): preserve args for builtin-alias and user-fn-alias macro calls (#3001)
* fix(shader-compiler): enforce shift for known LALR conflicts
2026-05-14 11:51:07 +08:00
chenmo.gl
10b9265663 fix(instancing): reject unsupported uniform types at scan time
`_scanInstanceUniforms` stripped renderer-group uniform declarations
before `_buildLayout` decided whether they could fit the std140 layout.
For types like `mat3` (which std140's row-padded layout doesn't support
in the current `_std140TypeInfoMap`), the declaration was removed but
no UBO field or `#define` replaced it — every later reference became an
undeclared identifier and the whole pass silently failed to compile.

Check the storage type against the std140 map up front. Unsupported
types stay declared in the source and emit a clear log; supported types
follow the existing strip-and-collect path.
2026-05-13 21:21:00 +08:00
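
A rough sketch of the "check the type before stripping" order from 10b9265663. The table below is a hypothetical subset (the sizes and alignments follow std140, and `mat3` is left out to mirror the commit); `scanRendererUniforms` and its regex are illustrative assumptions, not the engine's scanner.

```typescript
interface Std140Info { size: number; align: number; }

// Hypothetical subset of a std140 type table; mat3 is deliberately absent.
const std140TypeInfo: { [type: string]: Std140Info } = {
  float: { size: 4, align: 4 },
  vec2: { size: 8, align: 8 },
  vec3: { size: 12, align: 16 },
  vec4: { size: 16, align: 16 },
  mat4: { size: 64, align: 16 }
};

interface ScanResult {
  source: string;                           // source with supported uniforms stripped
  fields: { type: string; name: string }[]; // uniforms that move into the UBO
  unsupported: string[];                    // uniforms left declared as-is
}

function scanRendererUniforms(source: string): ScanResult {
  const fields: { type: string; name: string }[] = [];
  const unsupported: string[] = [];
  const pattern = /^[ \t]*uniform\s+(\w+)\s+(renderer_\w+)\s*;[ \t]*\n?/gm;
  const stripped = source.replace(pattern, (match: string, type: string, name: string) => {
    if (Object.prototype.hasOwnProperty.call(std140TypeInfo, type)) {
      fields.push({ type, name });
      return ""; // strip: a UBO field or #define will replace it later
    }
    unsupported.push(name); // keep the declaration so references still compile
    return match;
  });
  return { source: stripped, fields, unsupported };
}
```

A caller could log each entry of `unsupported` to surface the fallback instead of failing silently.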
chenmo.gl
3b3d091f59 Merge remote-tracking branch 'origin/dev/2.0' into feat/gpu-instancing
# Conflicts:
#	packages/shader/compiledShaders/PBR.shaderc
#	packages/shader/compiledShaders/Pipeline/DepthOnly.shaderc
#	packages/shader/compiledShaders/Pipeline/ShadowCaster.shaderc
2026-05-13 20:46:40 +08:00
chenmo.gl
5b82d694e3 chore(batcher): drop dead leader guard in VertexMergeBatcher
`else if (curElement._isBatched) return` in `VertexMergeBatcher.batch`
was defensive: it caught the case where the main pipeline fed an
already-batched leader back into `_batch(null, leader)`. After commit 9606bf3de
moved that check to `BatcherManager.batch`'s entry, the leader never
reaches this branch. Refresh `RenderElement._isBatched`'s doc to match
its current meaning ("batch leader — must not be merged again").
2026-05-13 20:42:23 +08:00
chenmo.gl
a25abab76e refactor(shader): drop dead macro-aware uniform scan in instance UBO inject
`_scanInstanceUniformsWithMacros` and `injectInstanceUBO`'s `activeMacros`
parameter only served the raw GLSL path, which was removed in e08af33d9
("refactor(shader): remove raw GLSL shader path"). The 4 preprocessor
regexes are only used by this function. All inputs are now ShaderLab
preprocessor-evaluated before reaching the injector.

Cleanup that should have landed alongside e08af33d9. Also reported by
reviewer as a dormant `#if`/`#elif` branch-stack bug — moot once the
function is gone.
2026-05-13 20:42:11 +08:00
chenmo.gl
9606bf3de8 fix(ui): skip re-batching of canvas-internal leaders in main pipeline
UICanvas pre-batches its children into leaders that carry a self-contained
draw range. When two canvases sharing the same atlas push their leaders
into the transparent queue, the main-pipeline batcher previously fed those
leaders back into `_canBatch`/`_batch` as if they were single sub-elements.
That corrupted `subMesh.count` and re-appended already-written indices,
dropping draws or overlapping ranges.

The batch boundary is the canvas — matches Unity uGUI's behavior — so
detect already-batched leaders at the batcher entry and pass them through.
2026-05-13 20:30:45 +08:00
zhuxudong
bba09be10e fix(shader-compiler): track every identifier in #define value (#2996)
* fix(shader-compiler): scan every identifier in `#define` value
2026-05-13 19:42:24 +08:00
luzhuang
373f559835 fix(physics-physx): skip initial overlap in raycast/sweep + reuse query callbacks (#2998)
* fix(physics-physx): skip initial overlap in raycast/sweep + reuse query callbacks
2026-05-13 19:40:59 +08:00
chenmo.gl
04ac8b3b7a fix(instancing): always run inject path so derived built-ins keep compiling
Shaders that only declare derived built-ins (e.g. `renderer_MVPMat`) had
their declarations stripped by `_scanInstanceUniforms` but skipped the
`#define` rewrite because `fieldMap` was empty, leaving dangling
`renderer_ModelMat` references that fail to compile.

This also fixed a latent bug where transform-free shaders with
`RENDERER_INSTANCING` enabled would silently lose `N-1` instances —
`_canBatch` returned true and only the leader was drawn.
2026-05-13 17:41:38 +08:00
zhuxudong
5fe4e64c74 fix(shader-compiler): move bundler output out of dist/ (#3000)
fix(shader-compiler): move bundler output out of dist
2026-05-13 16:57:31 +08:00
chenmo.gl
00b4462123 feat(instancing): support bool / bvec uniform types in instance UBO
ShaderData.setInt documents bool support and setVector* documents
bvec support, but the instance UBO layout table didn't list them —
declaring a `bool` or `bvec` renderer uniform would silently drop the
declaration without producing a #define, causing the shader to fail
with an undeclared identifier. Add std140 size/align entries, reuse
existing packScalar/packVec pack functions, and recognize the `b`
prefix for the intView dispatch.
2026-05-12 22:01:04 +08:00
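
As an illustration of the layout side of 00b4462123: under std140 a `bool` occupies 4 bytes, and `bvec2` / `bvec3` / `bvec4` share the size and alignment of the matching `ivec` types, so boolean values can be written through an integer view. The helper names below are hypothetical; only the std140 numbers are standard.

```typescript
// std140 size and alignment, in bytes, for the boolean types.
const boolStd140: { [type: string]: { size: number; align: number } } = {
  bool: { size: 4, align: 4 },
  bvec2: { size: 8, align: 8 },
  bvec3: { size: 12, align: 16 },
  bvec4: { size: 16, align: 16 }
};

// Round an offset up to the next multiple of the required alignment.
function alignUp(offset: number, align: number): number {
  return Math.ceil(offset / align) * align;
}

// Write booleans as 0/1 integers starting at a byte offset.
function packBools(intView: Int32Array, byteOffset: number, values: boolean[]): void {
  const base = byteOffset >> 2; // 4 bytes per component
  for (let i = 0; i < values.length; i++) {
    intView[base + i] = values[i] ? 1 : 0;
  }
}
```

For example, a `bvec3` placed after a single `float` at offset 0 starts at byte 16 rather than byte 4, because its alignment is 16.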
chenmo.gl
bd96c42919 chore(ui): type UICanvas render element arrays and release them on disable
Replace `any[]` with `RenderElement[]` on `_renderElements` /
`_batchedRenderElements`, and clear both in `_onDisable` so the
pooled elements they hold don't surface stale references when other
renderers reuse the same pool slots while the canvas is disabled.
2026-05-12 21:44:47 +08:00
chenmo.gl
884cb97a9a fix(instancing): pack the renderer's transform-source matrix, not entity's
InstanceBuffer read renderer.entity.transform.worldMatrix, while
Renderer._updateTransformShaderData uses _transformEntity.transform.
They diverge whenever a subclass remaps _transformEntity (e.g.
SkinnedMeshRenderer points it at the root bone); switch InstanceBuffer
to _transformEntity to align with the shader-data path.
2026-05-12 21:44:26 +08:00
luzhuang
25ba6eb1cd chore: release v2.0.0-alpha.32 v2.0.0-alpha.32 2026-05-12 19:59:16 +08:00
luzhuang
56bfd1b3c4 refactor(loader): rename v2 scene.entities to scene.rootEntities (#2997)
* refactor(loader): rename v2 scene.entities to scene.roots
2026-05-12 19:55:36 +08:00
cptbtptpbcptdtptp
71efdd81e3 chore: release v2.0.0-alpha.31 v2.0.0-alpha.31 2026-05-12 17:45:50 +08:00
chenmo.gl
7ac1913bad refactor(ui): move UIBatchSorter from core to ui
UIBatchSorter is only used by UICanvas; keeping it in core forced a
cross-package export plus a ts-ignore at the UICanvas import site for
an @internal symbol. Move it next to its sole consumer in the ui
package and add RenderElement to core's RenderPipeline barrel so ui
can type the sort input. Utils._quickSort stays @internal; the new
call site carries a single ts-ignore acknowledging the reuse.
2026-05-12 17:40:20 +08:00
luzhuang
6732c76a1b feat(loader): support $class refs and numeric SpecularMode in v2 scene (#2994)
* feat(loader): support class refs in v2 values
2026-05-12 16:47:58 +08:00
chenmo.gl
41bca94452 chore(examples): wire Shader.create after WebGLEngine.create and add Stats
Calling Shader.create at module scope throws because the shader
compiler is only installed on Shader._shaderCompiler during engine
init. Move the call into the engine-create then callback. Re-add the
Stats overlay from @galacean/engine-toolkit-stats for dev observability.
2026-05-12 00:29:51 +08:00
chenmo.gl
50f9281355 fix(shader): force-inject renderer_ModelMat into instance UBO and declare camera matrices on demand
When a shader enables RENDERER_GPU_INSTANCE but never declares
`renderer_ModelMat` itself (e.g. it only references `renderer_MVPMat`),
the derived defines we inject — which expand to expressions using
`renderer_ModelMat` and `camera_ViewMat` / `camera_VPMat` — would
resolve to undeclared identifiers under WebGL.

Two complementary fixes:

- `_buildLayout` force-injects `renderer_ModelMat` (as mat3x4 affine
  pack) into the UBO whenever it's missing from `fieldMap`. Layout
  ordering is now an explicit priority skip rather than the
  `addField + delete` mutation pattern, which read like "add then
  delete" at a glance.
- `_buildMissingCameraDecls` scans the post-evaluate GLSL for an
  existing `uniform mat4 camera_*;` declaration and emits one only
  when shader-compiler DCE stripped it from Transform.glsl. Sources
  that legitimately pulled the camera matrix in (e.g. PBR fragment
  using camera_ViewMat for refraction) keep their single declaration.
2026-05-12 00:29:23 +08:00
chenmo.gl
a87acc8289 test(e2e,examples): rewrite gpu-instancing cases as ShaderLab
The 3-arg `Shader.create(name, vertexSource, fragmentSource)` overload no
longer exists after PR #2961 — passing GLSL strings drops into the SubShader
branch and crashes with `Cannot read properties of undefined (reading
'length')` from BasicRenderPipeline. Convert the two custom-instancing
cases to ShaderLab syntax with an explicit `ShaderCompiler` wired through
`WebGLEngine.create`.

Also pick up the upstream LFS baselines (particle/physx jpgs) — the merge
left their pointers at our pre-merge oid even though the working tree had
been resolved to upstream.
2026-05-11 23:24:41 +08:00
chenmo.gl
086dff0164 fix(shader): declare camera matrices when instance UBO rewrites derived defines
`injectInstanceUBO` rewrites `renderer_MVMat` / `renderer_MVPMat` to
`(camera_ViewMat * renderer_ModelMat)` / `(camera_VPMat * renderer_ModelMat)`,
introducing references the shader source itself may not have. shader-compiler
performs DCE on Transform.glsl declarations, so any vertex shader that only
reads `renderer_MVPMat` as a plain uniform ends up without `camera_VPMat`
visible to the rewritten GLSL — WebGL then fails to compile with
`'camera_VPMat' : undeclared identifier`.

Make the injector self-contained: scan the post-evaluate GLSL and emit
`uniform mat4 camera_*` declarations only for matrices not already present.
Sources that explicitly include them (e.g. PBR fragment using camera_ViewMat
for refraction) keep their single declaration intact.
2026-05-11 23:10:50 +08:00
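
The "declare only what is missing" step in 086dff0164 could be sketched like this. The function name and regex are assumptions made for illustration; the real injector may detect existing declarations differently.

```typescript
// Emit `uniform mat4 <name>;` only for matrices the GLSL does not declare yet.
function buildMissingCameraDecls(glsl: string, names: string[]): string {
  let decls = "";
  for (let i = 0; i < names.length; i++) {
    const name = names[i];
    // Names come from a fixed list of identifiers, so no regex escaping is needed.
    const declared = new RegExp("\\buniform\\s+mat4\\s+" + name + "\\s*;").test(glsl);
    if (!declared) decls += "uniform mat4 " + name + ";\n";
  }
  return decls;
}
```

A source that already declares `camera_ViewMat` would then receive only the `camera_VPMat` declaration, which avoids a duplicate-declaration compile error.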
chenmo.gl
bc4493f717 fix(shader): drop duplicate camera_ProjectionParams from Transform.glsl
Common.glsl already declares camera_ProjectionParams. Re-declaring it
in Transform.glsl makes every shader that includes both fail to
compile under WebGL (the entire PBR family among them). Regenerate
the affected compiled .shaderc artifacts (PBR, ShadowCaster,
DepthOnly).
2026-05-11 22:28:36 +08:00
chenmo.gl
1b4dd6b174 Merge remote-tracking branch 'origin/dev/2.0' into feat/gpu-instancing 2026-05-11 22:27:17 +08:00
zhuxudong
1bc2b102ad refactor(shader): migrate GLSL shaders to ShaderLab and clean up shader infrastructure(#2961)
* refactor(core): migrate shaders from core/shaderlib to shader package and clean up old files
2026-05-11 17:55:17 +08:00
AZhan
e19b764e1c fix(text): propagate WorldPosition dirty in _onRootCanvasModify when ReferenceResolutionPerUnit changes (#2981)
* fix(text): mark WorldPosition dirty after slot reallocation in _updateLocalData

Both Text (UI) and TextRenderer share a `bounds` getter that runs
`_updateLocalData` then checks `WorldPosition` dirty. `_updateLocalData`
internally calls `_freeTextChunks` and `_buildChunk → allocateSubChunk`, which
under PrimitiveChunk's first-fit + free-list-merge allocator can return
a slot previously owned by another renderer. `_buildChunk` writes UV
and color but never pos (pos is `_updatePosition`'s job), so the new
slot retains the previous owner's pos floats as residue.

Before this fix, when a path sets only `LocalPositionBounds` dirty
(e.g. `Text._onRootCanvasModify(ReferenceResolutionPerUnit)` in UI
Text), the bounds getter would:
  1. see LocalPositionBounds → run _updateLocalData (slot may swap)
  2. see WorldPosition not dirty → skip _updatePosition
  3. _setDirtyFlagFalse(Font) clears all dirty bits at once
The next _render also sees clean dirty bits and uploads the residue
pos to GPU — the renderer ends up rendering at someone else's old
world position. In practice this manifested as text glyphs jumping
to the wrong spot or appearing missing after UI tab switches that
free + reallocate chunk slots in the same frame.

Fix: force WorldPosition dirty at the end of _updateLocalData so the
contract "after this call, pos must be rewritten" is unconditionally
honored regardless of which caller invoked it.

Tests cover three layers:
  - dirty-flag invariant: _updateLocalData must leave WorldPosition
    dirty on exit
  - corrupted-slot: bounds getter with only LocalPositionBounds dirty
    rewrites pos even when the slot memory is poisoned
  - full slot-reuse repro: destroy a sibling renderer occupying a
    lower offset, then trigger bounds getter on the survivor — its
    pos must remain correct after the slot moves

Without the fix, all three regression tests fail with the survivor
rendering at the destroyed sibling's old position.

* chore: drop Chinese commentary from text dirty-flag fix

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(ui): destroy engine after regression describe block

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(text): move dirty propagation to input side

Previous fix added _setDirtyFlagTrue(WorldPosition) at the end of
_updateLocalData in both TextRenderer and UI Text. That treats the
output side as the place to declare invalidation, which conflates
two concerns: dirty flags should declare staleness from input
semantics, and update methods should be pure compute units that
don't propagate flags themselves.

Root cause is on the input side: _onRootCanvasModify(ReferenceResolutionPerUnit)
declared LocalPositionBounds dirty but not WorldPosition, even though
ReferenceResolutionPerUnit affects both local layout and the world
positions derived from it. Fix the declaration where the input
semantic event lives.

TextRenderer needs no change — it has no entry point that dirties
LocalPositionBounds without also dirtying WorldPosition (all setters
use DirtyFlag.Position which includes both).

Tests rewritten from white-box (poking private _dirtyFlag, hardcoded
enum values) to public-API integration tests that drive the bug
through uiCanvas.referenceResolutionPerUnit and assert observable
vertex position changes. The new tests fail without the fix
(maxDelta = 0, positions don't update) and pass with it.

* fix(text): include WorldVolume in dirty flag for ReferenceResolutionPerUnit change

Use DirtyFlag.Position (= LocalPositionBounds | WorldPosition | WorldVolume)
instead of the manual two-flag combination. ReferenceResolutionPerUnit
also affects world bounding volume; without the WorldVolume bit,
_updateBounds is skipped in the bounds getter and stale BoundingBox
leaks into frustum culling and raycasting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: chenmo.gl <chenmo.gl@antgroup.com>
2026-05-07 17:23:26 +08:00
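
The flag composition discussed in e19b764e1c can be illustrated with a small bitmask sketch. The bit values are hypothetical; only the relation `Position = LocalPositionBounds | WorldPosition | WorldVolume` comes from the commit message, and `DirtyState` is an illustrative class, not the engine's.

```typescript
// Hypothetical bit values; only the composition of Position is from the commit.
const DirtyFlag = {
  LocalPositionBounds: 0x1,
  WorldPosition: 0x2,
  WorldVolume: 0x4,
  Position: 0x1 | 0x2 | 0x4
};

class DirtyState {
  private flags = 0;
  setTrue(mask: number): void {
    this.flags |= mask;
  }
  setFalse(mask: number): void {
    this.flags &= ~mask;
  }
  isDirty(mask: number): boolean {
    return (this.flags & mask) !== 0;
  }
}
```

Marking only `LocalPositionBounds | WorldPosition` leaves `WorldVolume` clean, which is the gap the last sub-commit closes by using `DirtyFlag.Position`.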
ChenMo
a6f05043d1 feat(shader-lab): make #define values first-class AST nodes (#2974)
* feat(shader-lab): make #define values first-class AST nodes
2026-04-28 17:12:18 +08:00
chenmo.gl
1c2b7f22a7 test(examples): bump ui-batch-massive to 9216 buttons (18432 sub-elements)
Stress level tuned to push DrawCall contrast against dev/2.0 (no batch
clustering) without overwhelming CPU. UI is typically static, so no
animation — keeps the test focused on batching algorithm cost.
2026-04-26 17:08:25 +08:00
chenmo.gl
25bef75d00 test(examples): upgrade ui-batch-massive with atlas icons & gradient panels
Programmatic textures are now closer to real game UI:
- gradient panel bg (electric-blue gradient + double-tone border)
- 64×64 icon atlas with 4 icons (sword/heart/bolt/gem); buttons cycle
  through atlas regions via Sprite Rect

Demonstrates the realistic batching scenario: many sprites sharing one
atlas texture still batch into a single draw call.
2026-04-26 16:55:58 +08:00
chenmo.gl
fb9561e41d test(e2e): switch ui-batch-order to ScreenSpaceCamera and add baseline
ScreenSpaceOverlay UI renders in a separate overlay pass that bypasses
camera.render(); initScreenshot's off-screen render-target therefore couldn't
capture it (downloaded image was empty grey). Switch the case to
ScreenSpaceCamera so the canvas goes through the main camera path and the
screenshot captures the full 4×3 button grid.

With the new pipeline + canvas-internal batching, the layout is now
deterministic across runs, so diffPercentage is set to 0.
2026-04-26 16:47:32 +08:00
chenmo.gl
72e079e731 test(e2e): update particle shape-transform baseline for gpu-instancing pipeline
The render pipeline rewrite for GPU instancing on feat/gpu-instancing produces
slightly different particle output than dev/2.0 (semi-transparent particles are
sensitive to draw-order microchanges). Visual is correct; baseline regenerated
to match the new pipeline output. Tolerance stays at 0.334%.
2026-04-26 16:35:52 +08:00
chenmo.gl
6619612503 Merge remote-tracking branch 'origin/dev/2.0' into feat/gpu-instancing 2026-04-26 16:18:12 +08:00
chenmo.gl
e30dd3e815 docs(e2e): clarify ui-batch-order is a regression guard
The case verifies the canvas hierarchy-order bug stays fixed: when
SubRenderElement was flattened, canvas sub-elements could be shuffled in
the main transparent queue under equal (priority, distance). Fix is in
place (canvas-internal batching + subDistancePriority); this e2e prevents
silent regression.
2026-04-26 16:04:25 +08:00
chenmo.gl
d4fb81f3ae perf(ui): canvas-internal batching with visual-layering driven sort
Adds UIBatchSorter that runs inside each canvas to cluster sub-elements by
(depth, material, texture, hierarchy) before they enter the main render queue.
Combined with VertexMergeBatcher, multi-material UI buttons collapse from
N draw calls to ~M (M = visual layer count), dramatically improving fps on
dense layouts (6000 buttons: 40fps → 67fps).

- UIBatchSorter: spatial-hash accelerated BatchSorting algorithm; cell size
  derives from canvas referenceResolution to stay optimal across designs.
- RenderElement.subDistancePriority: tiebreaker so canvas leaders keep their
  relative order when interleaved with 3D under unstable quicksort.
- RenderElement._isBatched: protects already-batched leaders from main-pipeline
  _batch(null, leader) re-init that would corrupt subMesh.start/count.
- Image/Text always populate renderElement.subShader at production (was
  conditional on overlay mode); needed because canvas-internal batching now
  runs before pushRenderElement which used to backfill it.
- Tests: 12 unit cases for sort correctness; e2e + example for batch demo.
2026-04-26 15:57:08 +08:00
chenmo.gl
c40602441b fix(ui): update UIRenderer to use renamed VertexMergeBatcher
Missed in earlier rename of BatchUtils -> VertexMergeBatcher / batchFor2D -> batch,
which broke type-check for the ui package and blocked CI.
2026-04-25 16:39:24 +08:00
chenmo.gl
29567302c4 fix(shader): scan instance uniforms with macro awareness for raw GLSL
_scanInstanceUniforms regex-matches uniform declarations without understanding
#ifdef blocks. For raw GLSL paths the source still contains preprocessor
directives at scan time, so uniforms inside inactive branches (e.g. renderer_JointMatrix
under #ifdef RENDERER_HAS_SKIN) get matched even when they won't compile.

This caused "GPU Instancing does not support array uniform" errors for plain
MeshRenderer batching whenever a SkinnedMeshRenderer had previously registered
renderer_JointMatrix under ShaderDataGroup.Renderer.

Add _scanInstanceUniformsWithMacros that walks the source line-by-line with
a branch stack for #ifdef/#ifndef/#else/#endif, delegating active lines to the
original scanner. compilePlatformSource passes its active macro set; the
ShaderLab path keeps using the plain scanner since ShaderMacroProcessor.evaluate
already expands directives there.

Also change the array-uniform fallback from deletion to keeping the declaration
as a regular uniform, so stray matches never directly fail shader compilation.
2026-04-23 20:37:08 +08:00
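
A minimal sketch of the branch-stack walk described in 29567302c4, limited to the four directives the message names (`#ifdef`, `#ifndef`, `#else`, `#endif`). `activeLines` is a hypothetical helper; `#if` / `#elif` are intentionally not handled here.

```typescript
interface Branch { parentActive: boolean; condition: boolean; }

// Return only the source lines that sit in active preprocessor branches.
function activeLines(source: string, macros: string[]): string[] {
  const stack: Branch[] = [];
  const out: string[] = [];
  let active = true;
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const trimmed = line.trim();
    const open = trimmed.match(/^#\s*(ifdef|ifndef)\s+(\w+)/);
    if (open) {
      const defined = macros.indexOf(open[2]) >= 0;
      const condition = open[1] === "ifdef" ? defined : !defined;
      stack.push({ parentActive: active, condition });
      active = active && condition;
    } else if (/^#\s*else\b/.test(trimmed)) {
      const top = stack[stack.length - 1];
      // The else branch is active only if the parent is and the if-branch was not.
      if (top) active = top.parentActive && !top.condition;
    } else if (/^#\s*endif\b/.test(trimmed)) {
      const top = stack.pop();
      if (top) active = top.parentActive;
    } else if (active) {
      out.push(line);
    }
  }
  return out;
}
```

Feeding only the active lines to a plain uniform scanner keeps declarations such as `renderer_JointMatrix` under an inactive `#ifdef RENDERER_HAS_SKIN` out of the result.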