- expand core and UI regression coverage for analysis, result, and compare evaluation flows
- refresh text analysis and evaluation VCR fixtures after the workspace/result semantics change
- cover stale-state and analyze-created workspace behaviors in integration and e2e tests