RustFS — CLAUDE.md
S3-compatible object store in Rust, derived from MinIO. Erasure-coded, multi-pool, supports ILM tiering/lifecycle.
Commands
cargo build --release --bin rustfs # production binary
cargo build # dev build
cargo check -p <crate> # fast type-check one crate
cargo test -p <crate> # test one crate
cargo fmt --all # format (required before PR)
make pre-commit # full pre-PR gate (fmt + clippy + test)
make build-docker BUILD_OS=ubuntu22.04 # Docker cross-build
Docker build note:
buildx build without --load keeps the image in the buildx cache only, so docker run will use a stale local image. The Makefile already includes --load; if you suspect a stale binary, add --no-cache to the buildx build invocation inside .config/make/build-docker.mak.
Agent/PR rules: see .github/copilot-instructions.md. Crate membership: Cargo.toml [workspace].members. CI gates: .github/workflows/ci.yml.
Workspace layout
rustfs/src/main.rs # binary entry point
crates/ecstore/src/
set_disk.rs # ErasureSet: transition_object, restore_transitioned_object
store.rs / store_api/ # ECStore trait + ObjectInfo / TransitionedObject types
bucket/lifecycle/
bucket_lifecycle_ops.rs # ILM actions: transition_object, expire_transitioned_object,
# get_transitioned_object_reader, gen_transition_objname
tier_sweeper.rs # background sweep: delete_object_from_remote_tier
tier/
warm_backend.rs # WarmBackend trait (put/get/remove/in_use)
warm_backend_s3.rs # HTTP-client based (TransitionClient) — used for S3/MinIO
warm_backend_s3sdk.rs # aws-sdk-s3 based — alternative S3 backend
warm_backend_minio.rs / _rustfs.rs / … # per-provider wrappers (all delegate to _s3 or _s3sdk)
tier.rs # TierConfigMgr, new_warm_backend dispatch
client/transition_api.rs # TransitionClient HTTP plumbing; UploadInfo, to_object_info
client/api_put_object_streaming.rs # put_object_do → UploadInfo (version_id from x-amz-version-id)
crates/filemeta/src/
filemeta.rs # FileMeta (xl.meta top-level), is_skip_meta_key
filemeta/version.rs # FileMetaVersion, MetaObject, MetaDeleteMarker
# → to_fileinfo() reads transition_version_id
# → set_transition() writes raw UUID bytes
# → From<FileInfo> for MetaObject writes all meta
fileinfo.rs # FileInfo struct (transition_version_id: Option<Uuid>)
examples/
dump_fileinfo.rs # CLI: parse xl.meta, print transition fields + metadata
dump_versions.rs # CLI: list all versions in xl.meta
crates/utils/src/http/metadata_compat.rs # SUFFIX_* constants, insert_bytes/get_bytes (dual RustFS+MinIO keys)
Metadata key conventions
Internal metadata is stored under both x-rustfs-internal-<suffix> and x-minio-internal-<suffix> for MinIO interoperability. get_bytes prefers the RustFS key with MinIO fallback.
Key suffixes (from metadata_compat.rs):
| Suffix | Meaning |
|---|---|
| transition-status | "complete" when tiered |
| transitioned-object | tier key path (without prefix) |
| transitioned-versionID | S3 version_id returned by tier PUT (16 raw UUID bytes, or absent) |
| transition-tier | tier name |
| tier-free-versionID | delete-marker version for free-version sweep |
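
The dual-key convention can be illustrated with a minimal sketch. It assumes the metadata map is a plain HashMap of byte values; the real insert_bytes/get_bytes in metadata_compat.rs may use different types and signatures, so the helpers below are hypothetical stand-ins that only mirror the documented behaviour (write both keys, read the RustFS key first, fall back to the MinIO key).

```rust
use std::collections::HashMap;

// Prefixes taken from the x-rustfs-internal-<suffix> / x-minio-internal-<suffix>
// convention described above.
const RUSTFS_PREFIX: &str = "x-rustfs-internal-";
const MINIO_PREFIX: &str = "x-minio-internal-";

/// Hypothetical stand-in for insert_bytes: stores the value under both keys.
fn insert_bytes(meta: &mut HashMap<String, Vec<u8>>, suffix: &str, value: Vec<u8>) {
    meta.insert(format!("{}{}", RUSTFS_PREFIX, suffix), value.clone());
    meta.insert(format!("{}{}", MINIO_PREFIX, suffix), value);
}

/// Hypothetical stand-in for get_bytes: RustFS key first, MinIO key as fallback.
fn get_bytes(meta: &HashMap<String, Vec<u8>>, suffix: &str) -> Option<Vec<u8>> {
    meta.get(&format!("{}{}", RUSTFS_PREFIX, suffix))
        .or_else(|| meta.get(&format!("{}{}", MINIO_PREFIX, suffix)))
        .cloned()
}
```

With this shape, metadata written by MinIO (only the x-minio-internal-* key present) is still readable through the same call.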
Tier / ILM transition architecture
Transition flow (hot → cold)
- transition_object (lifecycle_ops) → ECStore::transition_object → set_disk.rs
- gen_transition_objname(bucket) → {sha256_hash[0..16]}/{uuid[0..2]}/{uuid[2..4]}/{uuid} (unique per object version)
- tgt_client.put_with_meta(dest_obj, …) → returns rv: String (remote S3 version_id, or "")
- fi.transition_version_id = if rv.is_empty() { None } else { Some(Uuid::parse_str(&rv)?) }
- fi.transitioned_objname = dest_obj (without tier prefix)
- Written to xl.meta via MetaObject::from(FileInfo) → insert_bytes(SUFFIX_TRANSITIONED_VERSION_ID, uuid.as_bytes()) (16 raw bytes)
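
As a rough illustration of two steps in this flow, the sketch below builds a name with the documented layout and maps an empty rv to None. It is not the gen_transition_objname implementation: the function names are hypothetical, the hash and UUID are passed in as strings rather than computed, the slices are assumed to be taken over the textual (hex) form, and the real code additionally parses rv with Uuid::parse_str.

```rust
/// Hypothetical helper: builds {hash[0..16]}/{uuid[0..2]}/{uuid[2..4]}/{uuid}.
/// Assumption: slicing is done on the textual form of the hash and UUID.
/// Returns None when either input is too short to slice.
fn transition_objname(bucket_hash_hex: &str, uuid: &str) -> Option<String> {
    let hash_prefix = bucket_hash_hex.get(0..16)?;
    let a = uuid.get(0..2)?;
    let b = uuid.get(2..4)?;
    Some(format!("{}/{}/{}/{}", hash_prefix, a, b, uuid))
}

/// Mirrors the rv handling in the flow: an empty remote version means
/// "the tier bucket returned no version", which is stored as None.
fn remote_version(rv: &str) -> Option<&str> {
    if rv.is_empty() {
        None
    } else {
        Some(rv)
    }
}
```

Storing None rather than an empty or nil value is what keeps a bogus versionId from being sent later on the read path.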
Tier GET flow (restore/read)
get_transitioned_object_reader (lifecycle_ops):
- reads oi.transitioned_object.name (= fi.transitioned_objname)
- reads oi.transitioned_object.version_id (= fi.transition_version_id.to_string() or "")
- calls warm_backend.get(name, version_id, opts)
- warm_backend_s3.rs::get: adds ?versionId=… only when rv != ""
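
The last step can be sketched as a small path builder. This is an assumption-laden illustration, not the code in warm_backend_s3.rs: tier_get_path is a hypothetical name, and prefix handling, URL encoding, and signing are left out.

```rust
/// Hypothetical helper: request path for a tier GET.
/// The versionId query parameter is appended only for a non-empty rv.
fn tier_get_path(object: &str, rv: &str) -> String {
    if rv.is_empty() {
        format!("/{}", object)
    } else {
        format!("/{}?versionId={}", object, rv)
    }
}
```

For example, tier_get_path("ab/cd/obj", "") yields a plain object path, which is what a non-versioned tier bucket expects.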
Tier prefix handling
WarmBackendS3::get_dest(object) prepends self.prefix to the object name.
transitioned_objname is stored without the prefix — get_dest adds it on every call.
xl.meta on disk
Path: {disk}/{bucket}/{object}/xl.meta — one per erasure shard disk.
All shards should be identical for a healthy object.
Known bugs & fixes
Bug 1: NoSuchVersion on tier GET — nil UUID sent as versionId
Root cause: the transitioned-versionID metadata key exists with an empty-string value (0 bytes). Old reading code:
// OLD — unwrap_or_default() converts 0-byte or wrong-length slice to Uuid::nil()
get_bytes(…).map(|v| Uuid::from_slice(v.as_slice()).unwrap_or_default())
// → Some(Uuid::nil()) → sends ?versionId=00000000-… → NoSuchVersion
Fix (version.rs, MetaObject::to_fileinfo + MetaDeleteMarker::to_fileinfo):
get_bytes(…)
.and_then(|v| Uuid::from_slice(v.as_slice()).ok()) // None for wrong-length bytes
.filter(|u| !u.is_nil()) // None for nil UUID (old write-back)
Regression tests (crates/filemeta/src/filemeta/version.rs mod tests): 6 tests cover absent key, empty bytes, nil UUID, and valid UUID round-trip for both MetaObject and MetaDeleteMarker paths.
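
The difference between the old and new parsing can be reproduced without the uuid crate. The sketch below uses a plain [u8; 16] as a stand-in for Uuid (an assumption made to keep it std-only); parse_old and parse_new are hypothetical names that mirror the two snippets above.

```rust
use std::convert::TryFrom;

/// Stand-in for Uuid: 16 raw bytes, where all zeroes plays the role of Uuid::nil().
type RawUuid = [u8; 16];

/// Old behaviour: any present value yields Some, and wrong-length input
/// silently becomes the all-zero (nil) value.
fn parse_old(bytes: Option<&[u8]>) -> Option<RawUuid> {
    bytes.map(|v| RawUuid::try_from(v).unwrap_or_default())
}

/// New behaviour: wrong-length input and the nil value both become None.
fn parse_new(bytes: Option<&[u8]>) -> Option<RawUuid> {
    bytes
        .and_then(|v| RawUuid::try_from(v).ok())
        .filter(|u| *u != [0u8; 16])
}
```

With an empty slice as input, parse_old returns Some of the nil value while parse_new returns None, which is the case that produced ?versionId=00000000-… before the fix.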
Bug 2: warm_backend_s3sdk.rs ignored rv and range opts
Fix: added req.version_id(rv) and req.range(…) to GET; req.version_id(rv) to DELETE.
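
A std-only sketch of what "forwarding rv and range" means is shown below. It does not use aws-sdk-s3 and is not the code in warm_backend_s3sdk.rs: TierGetRequest, range_header, and build_get are hypothetical, the offset/length option shape is an assumption, and skipping an empty rv is assumed by analogy with warm_backend_s3.rs. The Range value follows the standard HTTP inclusive byte-range form.

```rust
/// Hypothetical request description; field names are illustrative only.
#[derive(Debug, PartialEq, Default)]
struct TierGetRequest {
    key: String,
    version_id: Option<String>,
    range: Option<String>,
}

/// Inclusive HTTP byte range covering `length` bytes starting at `start`.
/// A zero length is treated as "whole object", so no Range header is produced.
fn range_header(start: u64, length: u64) -> Option<String> {
    if length == 0 {
        return None;
    }
    Some(format!("bytes={}-{}", start, start + (length - 1)))
}

/// Builds the request, forwarding rv and the byte range only when they are set.
fn build_get(key: &str, rv: &str, start: u64, length: u64) -> TierGetRequest {
    TierGetRequest {
        key: key.to_string(),
        version_id: if rv.is_empty() { None } else { Some(rv.to_string()) },
        range: range_header(start, length),
    }
}
```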
Bug 4: set_disk::copy_object returns 501 for tiered objects (storage class restore)
Root cause: set_disk::copy_object immediately returns StorageError::NotImplemented when src_info.metadata_only = false. For tiered objects, metadata_only is never set to true (guarded by transitioned_object.tier.is_empty()). So mc cp --storage-class STANDARD obj obj on a tiered object always returns 501.
Fix (crates/ecstore/src/set_disk.rs, copy_object):
if !src_info.metadata_only {
if path_join_buf(&[src_bucket, src_object]) == path_join_buf(&[dst_bucket, dst_object]) {
if let Some(mut put_reader) = src_info.put_object_reader.take() {
return self.put_object(dst_bucket, dst_object, &mut put_reader, dst_opts).await;
}
}
return Err(StorageError::NotImplemented);
}
When a self-copy carries a put_object_reader (data already fetched from the tier in execute_copy_object), copy_object now writes it back locally via put_object, effectively de-tiering the object.
How mc cp --storage-class STANDARD flows:
- mc sends PUT /bucket/key with x-amz-copy-source, x-amz-metadata-directive: REPLACE, x-amz-storage-class: STANDARD
- execute_copy_object → get_object_reader fetches data from tier backend → stores in src_info.put_object_reader
- store.copy_object(...) → now calls put_object with tier data and STANDARD storage class in dst_opts
- New xl.meta written locally with STANDARD class, no tier metadata → object de-tiered
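
The branch structure of the copy_object guard can be modelled as a pure decision function. This is a simplified sketch: CopyOutcome and copy_outcome are hypothetical, the three booleans stand in for src_info.metadata_only, the source/destination path comparison, and the presence of put_object_reader, and what the metadata-only path does is not covered here.

```rust
#[derive(Debug, PartialEq)]
enum CopyOutcome {
    /// metadata_only is true: the pre-existing path, not detailed in this document.
    MetadataOnlyPath,
    /// Self-copy with tier data already fetched: rewrite locally via put_object.
    RewriteViaPutObject,
    /// Everything else still returns StorageError::NotImplemented (HTTP 501).
    NotImplemented,
}

/// Mirrors the nesting of the guard shown in the fix above.
fn copy_outcome(metadata_only: bool, same_path: bool, has_put_reader: bool) -> CopyOutcome {
    if !metadata_only {
        if same_path && has_put_reader {
            return CopyOutcome::RewriteViaPutObject;
        }
        return CopyOutcome::NotImplemented;
    }
    CopyOutcome::MetadataOnlyPath
}
```

Note that a copy to a different key, or a self-copy without a fetched reader, still ends in NotImplemented; only the de-tiering self-copy gained a new path.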
Bug 3 (open): race in expire_transitioned_object
Order is: delete remote tier version → delete local object.
A concurrent GET between those two steps fetches a valid stored version_id but the tier version is already gone → NoSuchVersion.
The proper fix is to delete local metadata first (making the object unreachable) before deleting the remote tier version.
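
The effect of the two orderings can be shown with a toy model. It is only an illustration of the reasoning above, not RustFS code: State, ReadResult, and both functions are hypothetical, and concurrency is reduced to "a reader arrives after the first step and before the second".

```rust
#[derive(Debug, PartialEq)]
enum ReadResult {
    Data,
    /// Local metadata is gone: the object is simply unreachable.
    NotFound,
    /// Local metadata still points at a tier version that no longer exists.
    NoSuchVersion,
}

struct State {
    local_meta: bool,
    remote_version: bool,
}

/// A GET consults local metadata first, then the tier.
fn read(s: &State) -> ReadResult {
    if !s.local_meta {
        return ReadResult::NotFound;
    }
    if !s.remote_version {
        return ReadResult::NoSuchVersion;
    }
    ReadResult::Data
}

/// Performs only the first expiry step and reports what a reader sees
/// before the second step runs.
fn read_between_steps(delete_local_first: bool) -> ReadResult {
    let mut s = State { local_meta: true, remote_version: true };
    if delete_local_first {
        s.local_meta = false; // proposed order
    } else {
        s.remote_version = false; // current order
    }
    read(&s)
}
```

In this model the current order exposes NoSuchVersion to the reader, while the proposed order yields an ordinary not-found result.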
Debugging tier issues
Inspect xl.meta directly
cargo build -p rustfs-filemeta --example dump_fileinfo
./target/debug/examples/dump_fileinfo /srv/rustfs/data/disk0/{bucket}/{object}/xl.meta
# Shows: transition_status, transition_tier, transitioned_obj, transition_ver_id
transition_ver_id: <none> → no versionId will be sent to tier (correct for non-versioned tier bucket).
transition_ver_id: <uuid> → that UUID will be sent as ?versionId=<uuid>.
Check what versionId is being sent at runtime
Enable debug logging:
RUST_LOG=rustfs_ecstore::bucket::lifecycle=debug rustfs …
Log line: fetching transitioned object from tier (DEBUG before request).
Log line: tier GET failed (ERROR on failure, includes tier_version_id).
Metadata key to watch
x-minio-internal-transitioned-versionID= ← empty string = will cause NoSuchVersion with old code
x-rustfs-internal-transitioned-versionID= ← same
If both are empty string, the object was transitioned to a non-versioned tier bucket. The versionId should NOT be sent — fixed by Bug 1 above.
Common patterns
Writing internal metadata (binary values)
insert_bytes(&mut meta_sys, SUFFIX_TRANSITIONED_VERSION_ID, uuid.as_bytes().to_vec());
// stores under both x-rustfs-internal-* and x-minio-internal-* keys
Reading internal metadata (binary values)
get_bytes(&self.meta_sys, SUFFIX_TRANSITIONED_VERSION_ID)
.and_then(|v| Uuid::from_slice(v.as_slice()).ok())
.filter(|u| !u.is_nil())
// Returns None for: absent, wrong-length bytes, nil UUID
WarmBackend trait
put_with_meta(object, reader, length, meta) -> Result<String> // returns S3 version_id or ""
put(object, reader, length) -> Result<String>
get(object, rv, opts) -> Result<ReadCloser> // rv="" means no versionId
remove(object, rv) -> Result<()>
in_use() -> Result<bool>
rv = remote version. Always pass an empty string when transition_version_id is None.
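
To make the rv convention concrete, here is a simplified, synchronous stand-in for the trait with an in-memory, non-versioned backend. Everything in it is hypothetical (WarmBackendSketch, MemBackend, rv_for, the String error type); the real trait has more methods and different signatures, and is presumably async. The NoSuchVersion behaviour for a non-empty rv is modelled on the failure described under Bug 1.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the WarmBackend trait listed above.
trait WarmBackendSketch {
    fn put(&mut self, object: &str, data: Vec<u8>) -> Result<String, String>;
    fn get(&self, object: &str, rv: &str) -> Result<Vec<u8>, String>;
    fn remove(&mut self, object: &str, rv: &str) -> Result<(), String>;
}

/// In-memory, non-versioned backend: put returns "" as the remote version.
#[derive(Default)]
struct MemBackend {
    objects: HashMap<String, Vec<u8>>,
}

impl WarmBackendSketch for MemBackend {
    fn put(&mut self, object: &str, data: Vec<u8>) -> Result<String, String> {
        self.objects.insert(object.to_string(), data);
        Ok(String::new())
    }

    fn get(&self, object: &str, rv: &str) -> Result<Vec<u8>, String> {
        if !rv.is_empty() {
            // No versions exist here, so any explicit version is unknown.
            return Err("NoSuchVersion".to_string());
        }
        self.objects.get(object).cloned().ok_or_else(|| "NoSuchKey".to_string())
    }

    fn remove(&mut self, object: &str, rv: &str) -> Result<(), String> {
        if !rv.is_empty() {
            return Err("NoSuchVersion".to_string());
        }
        self.objects.remove(object);
        Ok(())
    }
}

/// Maps a stored optional version to the rv argument: None becomes "".
fn rv_for(version_id: Option<&str>) -> &str {
    version_id.unwrap_or("")
}
```

Passing rv_for(None) to get succeeds against this backend, whereas passing a nil UUID string fails, which matches the symptom the Bug 1 fix removes.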