rustfs

lywsvip/rustfs


mirror of https://github.com/rustfs/rustfs.git synced 2026-07-02 03:34:18 +08:00

Commit Graph


Author

SHA1

Message

Date

houseme

8d24d9133b

perf(put): comprehensive PUT performance optimization (#3514)

* perf(put): add eager path metrics and isolation tooling

* fix(decommission): persist progress adaptively (#3497)

Persist decommission progress after either the existing time interval or a migrated-item threshold, and flush progress baselines after bucket and terminal-state saves.

Also stabilize the OIDC discovery mock used by the pre-commit gate.
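The "time interval or migrated-item threshold" rule can be pictured with a small tracker. This is only a sketch under assumptions: `ProgressPersister`, its fields, and its methods are hypothetical names invented for illustration and are not taken from the RustFS code.

```rust
use std::time::{Duration, Instant};

/// Hypothetical tracker that decides when decommission progress should be persisted.
pub struct ProgressPersister {
    interval: Duration,
    item_threshold: u64,
    last_saved_at: Instant,
    items_since_save: u64,
}

impl ProgressPersister {
    pub fn new(interval: Duration, item_threshold: u64, now: Instant) -> Self {
        Self { interval, item_threshold, last_saved_at: now, items_since_save: 0 }
    }

    /// Record one migrated item. Returns true when either trigger has fired:
    /// the time interval elapsed, or enough items migrated since the last save.
    pub fn record_item(&mut self, now: Instant) -> bool {
        self.items_since_save += 1;
        let due_by_time = now.duration_since(self.last_saved_at) >= self.interval;
        let due_by_count = self.items_since_save >= self.item_threshold;
        due_by_time || due_by_count
    }

    /// Reset both baselines after progress has been written out.
    pub fn mark_saved(&mut self, now: Instant) {
        self.last_saved_at = now;
        self.items_since_save = 0;
    }
}
```

A caller would invoke `mark_saved` after every successful save, including the bucket and terminal-state saves the message mentions, so that both baselines restart together.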

* refactor: move bucket operations contract (#3507)

* fix(s3): handle multipart flexible checksums (#3508)

* fix(io-core): avoid blocking on pooled buffer return
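One way to avoid blocking when a buffer goes back to a pool is to use `Mutex::try_lock` and simply drop the buffer when the lock is contended. The sketch below assumes that approach; `BufferPool`, `give_back`, and `pooled` are hypothetical names, and the real io-core pool may be structured differently.

```rust
use std::sync::Mutex;

/// Hypothetical buffer pool guarded by a single mutex.
pub struct BufferPool {
    free: Mutex<Vec<Vec<u8>>>,
    max_pooled: usize,
}

impl BufferPool {
    pub fn new(max_pooled: usize) -> Self {
        Self { free: Mutex::new(Vec::new()), max_pooled }
    }

    /// Return a buffer without ever waiting on the lock.
    /// Yields true if the buffer was pooled, false if it was dropped because
    /// the lock was contended (or poisoned) or the pool was already full.
    pub fn give_back(&self, mut buf: Vec<u8>) -> bool {
        match self.free.try_lock() {
            Ok(mut free) if free.len() < self.max_pooled => {
                buf.clear(); // keep the capacity, discard the contents
                free.push(buf);
                true
            }
            // Dropping the buffer is cheaper than stalling the request path.
            _ => false,
        }
    }

    /// Number of buffers currently held by the pool.
    pub fn pooled(&self) -> usize {
        self.free.lock().map(|f| f.len()).unwrap_or(0)
    }
}
```

The trade-off is that a contended return costs one later allocation instead of a stalled request.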

* perf(put): add slow inflight diagnostics

* perf(put): fix 16KiB regression with threshold and pool bypass

- Lower SMALL_EAGER_PUT_MAX_SIZE from 256KB to 8KB so objects >8KiB
  use the streaming BufReader path (matches baseline behavior)
- Add POOL_BYPASS_MAX_SIZE (16KiB) to bypass BytesPool for very small
  objects, avoiding Small-tier Mutex contention under high concurrency
- Add read_small_put_body_exact_direct() for direct Vec<u8> allocation
- Fix stale test assertions to match new 8KB threshold

Root cause analysis: the 16KiB regression was primarily caused by
instrumentation overhead in set_disk.rs (4x Instant::now() + metrics
per PUT), not BytesPool contention. Lowering the threshold eliminates
the eager-path overhead for 16KiB+ objects.

* perf(put): gate stage metrics behind observability flag

Add put_stage_metrics_enabled() AtomicBool switch in io-metrics crate.
When disabled (default), record_put_object_path() and
record_put_object_stage_duration() are no-ops, avoiding unnecessary
histogram/counter macro overhead in the PUT hot path.

The flag is set to true during startup when OTEL metric export is
enabled (rustfs_obs::observability_metric_enabled() == true).

This eliminates the per-request metrics overhead that contributed
to the 16KiB PUT regression when metrics collection is not active.
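The gating described above can be sketched as an `AtomicBool` checked at the top of the recording function. The names `put_stage_metrics_enabled`, `record_put_object_stage_duration`, and `PUT_STAGE_METRICS_ENABLED` come from the commit message, but the signatures here are assumptions, and `set_put_stage_metrics_enabled`, `RECORDED_SAMPLES`, and `recorded_samples` are hypothetical stand-ins for the real setter and histogram.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

/// Disabled by default, matching the behavior described in the commit message.
static PUT_STAGE_METRICS_ENABLED: AtomicBool = AtomicBool::new(false);

/// Stand-in for a real histogram: counts how many samples were recorded.
static RECORDED_SAMPLES: AtomicU64 = AtomicU64::new(0);

/// Hypothetical setter, called once at startup when metric export is enabled.
pub fn set_put_stage_metrics_enabled(enabled: bool) {
    PUT_STAGE_METRICS_ENABLED.store(enabled, Ordering::Relaxed);
}

pub fn put_stage_metrics_enabled() -> bool {
    PUT_STAGE_METRICS_ENABLED.load(Ordering::Relaxed)
}

/// No-op when the flag is off, so the hot path pays for one relaxed load only.
pub fn record_put_object_stage_duration(_stage: &str, _millis: f64) {
    if !put_stage_metrics_enabled() {
        return;
    }
    RECORDED_SAMPLES.fetch_add(1, Ordering::Relaxed);
}

pub fn recorded_samples() -> u64 {
    RECORDED_SAMPLES.load(Ordering::Relaxed)
}
```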

* perf(put): comprehensive optimization - restore eager path, cache env, remove UUID

Change 1: Restore SMALL_EAGER_PUT_MAX_SIZE from 8KB to 1MB
- The try_lock() fix (d13a189e3) eliminates the blocking that caused
  service health timeouts under 512KiB c64 load
- Eager path with BytesPool is now safe for objects up to 1MB
- Recovers the eager path benefit for 32KiB-256KiB objects

Change 2: Adjust POOL_BYPASS_MAX_SIZE from 16KB to 4KB
- With eager path restored to 1MB, objects 4KB-1MB benefit from pool reuse
- Only ≤4KB objects bypass the pool (allocation cost negligible)
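Taken together, Changes 1 and 2 imply three size bands for a PUT body. A rough sketch follows, assuming the thresholds are inclusive and expressed in binary units (4 KiB and 1 MiB); `PutPath` and `select_put_path` are hypothetical names, and the real code may choose the path differently.

```rust
/// Assumed to be 4 KiB; the message writes "4KB" here and "4KiB" elsewhere.
pub const POOL_BYPASS_MAX_SIZE: usize = 4 * 1024;
/// Assumed to be 1 MiB; the message writes "1MB".
pub const SMALL_EAGER_PUT_MAX_SIZE: usize = 1024 * 1024;

/// Hypothetical classification of how a PUT body is read.
#[derive(Debug, PartialEq, Eq)]
pub enum PutPath {
    /// Read eagerly into a freshly allocated Vec<u8>, skipping the pool.
    EagerDirect,
    /// Read eagerly into a pooled buffer.
    EagerPooled,
    /// Stream through a buffered reader.
    Streaming,
}

pub fn select_put_path(content_length: usize) -> PutPath {
    if content_length <= POOL_BYPASS_MAX_SIZE {
        PutPath::EagerDirect
    } else if content_length <= SMALL_EAGER_PUT_MAX_SIZE {
        PutPath::EagerPooled
    } else {
        PutPath::Streaming
    }
}
```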

Change 3: Cache RUSTFS_ERASURE_ENCODE_MAX_INFLIGHT_BYTES via OnceLock
- Eliminates per-encode std::env::var() syscall
- Env var still works (read once at first use)
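Caching an environment variable with `std::sync::OnceLock` might look like the sketch below. The variable name is from the commit message; `DEFAULT_MAX_INFLIGHT_BYTES`, `parse_or_default`, and `erasure_encode_max_inflight_bytes` are hypothetical, and the default value and the rejection of zero are assumptions made for illustration.

```rust
use std::sync::OnceLock;

/// Hypothetical fallback; the real default is not stated in the commit message.
pub const DEFAULT_MAX_INFLIGHT_BYTES: usize = 64 * 1024 * 1024;

/// Parse a raw setting, falling back to `default` when it is absent,
/// malformed, or zero.
pub fn parse_or_default(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(default)
}

/// Reads the environment once on first use; later calls return the cached value.
pub fn erasure_encode_max_inflight_bytes() -> usize {
    static CACHED: OnceLock<usize> = OnceLock::new();
    *CACHED.get_or_init(|| {
        let raw = std::env::var("RUSTFS_ERASURE_ENCODE_MAX_INFLIGHT_BYTES").ok();
        parse_or_default(raw.as_deref(), DEFAULT_MAX_INFLIGHT_BYTES)
    })
}
```

One consequence of this design is that changing the variable after the first encode has no effect until the process restarts.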

Change 4: Replace Uuid::new_v4() with Uuid::nil() in Erasure construction
- _id field is unused in hot paths (documented in code)
- Eliminates CSPRNG syscall per PUT request

Change 5: Add concurrency-aware buffer sizing to PUT path
- Reuses get_concurrency_aware_buffer_size() from GET path
- Reduces buffer size under high concurrency (0.4x at >8 concurrent)
- Lowers memory pressure for >1MB streaming PUTs
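The only scaling tier the message states is 0.4x above 8 concurrent requests. A minimal sketch of that single rule is shown below; `concurrency_aware_buffer_size` with an explicit `active` argument is a hypothetical signature, not the actual `get_concurrency_aware_buffer_size()` API, and any other tiers it may have are not modeled.

```rust
/// Concurrency level above which the buffer is shrunk (from the commit message).
pub const HIGH_CONCURRENCY_THRESHOLD: usize = 8;

/// Scale `base` down to 0.4x when more than 8 requests are active.
/// Integer arithmetic is used, so the result is rounded down.
pub fn concurrency_aware_buffer_size(base: usize, active: usize) -> usize {
    if active > HIGH_CONCURRENCY_THRESHOLD {
        base / 5 * 2
    } else {
        base
    }
}
```

For example, a 1,000,000-byte base buffer would shrink to 400,000 bytes once a ninth request is in flight.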

* chore: add pyroscope feature flag and clean up imports

- Add pyroscope feature flag forwarding to rustfs-obs
- Remove unused allow(non_upper_case_globals) in globals.rs
- Sort imports and fix Cargo.toml formatting consistency

* style: fix import ordering and code formatting

- Sort imports alphabetically in globals.rs, encode.rs
- Fix indentation in erasure_coding encode/erasure
- Clean up HashReader formatting in object_usecase.rs

* fix(test): use tokio::test for request_logging_layer tests

The tests call tokio::spawn via RequestContextLayer, which requires a
Tokio runtime. Changed from #[test] + futures::executor::block_on to
#[tokio::test] + .await, and replaced tracing::subscriber::with_default
with tracing::subscriber::set_default to support async.

* fix(bench): normalize no-space throughput/latency parsing in to_bps/to_ms

When a benchmark tool prints throughput without a separator (e.g. 123MiB/s),
awk '{print $2}' returns empty because the whole string is one field,
causing to_bps to return N/A and losing valid measurements in CSV output.

Insert a space between number and unit via sed before awk field splitting.
Same fix applied to to_ms for latency values like '50ms'.
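The actual fix lives in shell (sed before awk). The same idea, splitting a value from its unit whether or not a space separates them, is shown here in Rust purely as an illustration; `split_value_unit` and `to_bytes_per_sec` are hypothetical names, and the supported unit table is an assumption.

```rust
/// Split "123MiB/s" or "123 MiB/s" into (123.0, "MiB/s").
/// Returns None when the input does not start with a number.
pub fn split_value_unit(raw: &str) -> Option<(f64, &str)> {
    let s = raw.trim();
    // The numeric prefix ends at the first character that is not a digit or '.'.
    let idx = s
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(s.len());
    let (num, unit) = s.split_at(idx);
    let value = num.parse::<f64>().ok()?;
    Some((value, unit.trim()))
}

/// Convert a throughput string to bytes per second (binary units assumed).
pub fn to_bytes_per_sec(raw: &str) -> Option<f64> {
    let (value, unit) = split_value_unit(raw)?;
    let factor = match unit {
        "B/s" => 1.0,
        "KiB/s" => 1024.0,
        "MiB/s" => 1024.0 * 1024.0,
        "GiB/s" => 1024.0 * 1024.0 * 1024.0,
        _ => return None,
    };
    Some(value * factor)
}
```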

Also add TODO comment on PUT path noting that get_concurrency_aware_buffer_size
reads ACTIVE_GET_REQUESTS instead of PUT concurrency (PR #3514 review).

Refs: PR #3514 review comments by chatgpt-codex-connector

* fix(metrics): correct POOL_BYPASS comments and separate PUT vs generic stage metrics

- Fix 3 comment-code mismatches: POOL_BYPASS_MAX_SIZE is 4KiB, not 16KiB
- Add generic record_stage_duration() with separate histogram
  (rustfs_internal_stage_duration_ms) for non-PUT paths
- Replace record_put_object_stage_duration with record_stage_duration in
  metacache_set, store_list_objects, and bucket_lifecycle_ops to avoid
  polluting PUT-specific dashboards with listing/lifecycle timings
- Fix flaky test: serialize tests mutating PUT_STAGE_METRICS_ENABLED with
  METRICS_FLAG_LOCK mutex and explicitly set desired state at test start
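Serializing tests that mutate a process-wide flag can be done with a shared mutex whose guard is held for the length of each test. The sketch below uses the `PUT_STAGE_METRICS_ENABLED` and `METRICS_FLAG_LOCK` names from the message; the helper `lock_flag_with` is hypothetical, as is the choice to recover from a poisoned lock.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Mutex, MutexGuard};

pub static PUT_STAGE_METRICS_ENABLED: AtomicBool = AtomicBool::new(false);

/// Held for the whole test body so flag-mutating tests never interleave.
pub static METRICS_FLAG_LOCK: Mutex<()> = Mutex::new(());

/// Take the lock, then force the flag into the state this test expects.
/// A poisoned lock (an earlier test panicked) is recovered rather than
/// propagated, so one failure does not cascade into the others.
pub fn lock_flag_with(state: bool) -> MutexGuard<'static, ()> {
    let guard = METRICS_FLAG_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());
    PUT_STAGE_METRICS_ENABLED.store(state, Ordering::SeqCst);
    guard
}
```

Each test would start with `let _guard = lock_flag_with(true);` (or `false`) and rely on the guard dropping at the end of the test.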

Refs: PR #3514 review comments by chatgpt-codex-connector

* style: apply cargo fmt to metacache_set.rs

---------

Co-authored-by: cxymds <cxymds@gmail.com>
Co-authored-by: 安正超 <anzhengchao@gmail.com>

2026-06-17 21:19:11 +08:00

houseme

2953558f41

fix(lifecycle): prevent eager date-expiry deletion on config update (#2708)

2026-04-28 10:26:14 +00:00

houseme

960c13a34b

feat(storage): wire capacity/object perf tuning and add batch benchmark runners (#2628)

2026-04-21 07:20:57 +00:00

3 Commits