Files
rustfs/scripts/run_object_batch_bench_enhanced.sh
houseme 8d24d9133b perf(put): comprehensive PUT performance optimization (#3514)
* perf(put): add eager path metrics and isolation tooling

* fix(decommission): persist progress adaptively (#3497)

Persist decommission progress after either the existing time interval or a migrated-item threshold, and flush progress baselines after bucket and terminal-state saves.

Also stabilize the OIDC discovery mock used by the pre-commit gate.

* refactor: move bucket operations contract (#3507)

* fix(s3): handle multipart flexible checksums (#3508)

* fix(io-core): avoid blocking on pooled buffer return

* perf(put): add slow inflight diagnostics

* perf(put): fix 16KiB regression with threshold and pool bypass

- Lower SMALL_EAGER_PUT_MAX_SIZE from 256KB to 8KB so objects >8KiB
  use the streaming BufReader path (matches baseline behavior)
- Add POOL_BYPASE_MAX_SIZE (16KiB) to bypass BytesPool for very small
  objects, avoiding Small-tier Mutex contention under high concurrency
- Add read_small_put_body_exact_direct() for direct Vec<u8> allocation
- Fix stale test assertions to match new 8KB threshold

Root cause analysis: the 16KiB regression was primarily caused by
instrumentation overhead in set_disk.rs (4x Instant::now() + metrics
per PUT), not BytesPool contention. Lowering the threshold eliminates
the eager-path overhead for 16KiB+ objects.

* perf(put): gate stage metrics behind observability flag

Add put_stage_metrics_enabled() AtomicBool switch in io-metrics crate.
When disabled (default), record_put_object_path() and
record_put_object_stage_duration() are no-ops, avoiding unnecessary
histogram/counter macro overhead in the PUT hot path.

The flag is set to true during startup when OTEL metric export is
enabled (rustfs_obs::observability_metric_enabled() == true).

This eliminates the per-request metrics overhead that contributed
to the 16KiB PUT regression when metrics collection is not active.

* perf(put): comprehensive optimization - restore eager path, cache env, remove UUID

Change 1: Restore SMALL_EAGER_PUT_MAX_SIZE from 8KB to 1MB
- The try_lock() fix (d13a189e3) eliminates the blocking that caused
  service health timeouts under 512KiB c64 load
- Eager path with BytesPool is now safe for objects up to 1MB
- Recovers the eager path benefit for 32KiB-256KiB objects

Change 2: Adjust POOL_BYPASE_MAX_SIZE from 16KB to 4KB
- With eager path restored to 1MB, objects 4KB-1MB benefit from pool reuse
- Only ≤4KB objects bypass the pool (allocation cost negligible)

Change 3: Cache RUSTFS_ERASURE_ENCODE_MAX_INFLIGHT_BYTES via OnceLock
- Eliminates per-encode std::env::var() syscall
- Env var still works (read once at first use)

Change 4: Replace Uuid::new_v4() with Uuid::nil() in Erasure construction
- _id field is unused in hot paths (documented in code)
- Eliminates CSPRNG syscall per PUT request

Change 5: Add concurrency-aware buffer sizing to PUT path
- Reuses get_concurrency_aware_buffer_size() from GET path
- Reduces buffer size under high concurrency (0.4x at >8 concurrent)
- Lowers memory pressure for >1MB streaming PUTs

* chore: add pyroscope feature flag and clean up imports

- Add pyroscope feature flag forwarding to rustfs-obs
- Remove unused allow(non_upper_case_globals) in globals.rs
- Sort imports and fix Cargo.toml formatting consistency

* style: fix import ordering and code formatting

- Sort imports alphabetically in globals.rs, encode.rs
- Fix indentation in erasure_coding encode/erasure
- Clean up HashReader formatting in object_usecase.rs

* fix(test): use tokio::test for request_logging_layer tests

The tests call tokio::spawn via RequestContextLayer, which requires a
Tokio runtime. Changed from #[test] + futures::executor::block_on to
#[tokio::test] + .await, and replaced tracing::subscriber::with_default
with tracing::subscriber::set_default to support async.

* fix(bench): normalize no-space throughput/latency parsing in to_bps/to_ms

When a benchmark tool prints throughput without a separator (e.g. 123MiB/s),
awk '{print $2}' returns empty because the whole string is one field,
causing to_bps to return N/A and losing valid measurements in CSV output.

Insert a space between number and unit via sed before awk field splitting.
Same fix applied to to_ms for latency values like '50ms'.

Also add TODO comment on PUT path noting that get_concurrency_aware_buffer_size
reads ACTIVE_GET_REQUESTS instead of PUT concurrency (PR #3514 review).

Refs: PR #3514 review comments by chatgpt-codex-connector

* fix(metrics): correct POOL_BYPASS comments and separate PUT vs generic stage metrics

- Fix 3 comment-code mismatches: POOL_BYPASS_MAX_SIZE is 4KiB, not 16KiB
- Add generic record_stage_duration() with separate histogram
  (rustfs_internal_stage_duration_ms) for non-PUT paths
- Replace record_put_object_stage_duration with record_stage_duration in
  metacache_set, store_list_objects, and bucket_lifecycle_ops to avoid
  polluting PUT-specific dashboards with listing/lifecycle timings
- Fix flaky test: serialize tests mutating PUT_STAGE_METRICS_ENABLED with
  METRICS_FLAG_LOCK mutex and explicitly set desired state at test start

Refs: PR #3514 review comments by chatgpt-codex-connector

* style: apply cargo fmt to metacache_set.rs

---------

Co-authored-by: cxymds <cxymds@gmail.com>
Co-authored-by: 安正超 <anzhengchao@gmail.com>
2026-06-17 21:19:11 +08:00

15 KiB
Executable File