@robbat2
Contributor

@robbat2 robbat2 commented Oct 3, 2024

Expose which component devices are part of an MD RAID device, as well as the most common per-component flags. This will enable a future node_exporter metric showing which component of a RAID array has failed.

This PR builds on top of PR #505, implementing the requested changes to detect the known RAID types on a given system.

Reference: #505
Signed-off-by: Robin Johnson <robbat2@orbis-terrarum.net>

@robbat2 robbat2 force-pushed the rjohnson/mdstat-devices branch 5 times, most recently from 1ad0a36 to dca5b72 Compare October 3, 2024 23:13
@robbat2
Contributor Author

robbat2 commented Oct 5, 2024

@SuperQ can you have a look at this please?

@dswarbrick
Contributor

I'm pretty sure this is already at least partially implemented by the mdraid sysfs parsing: https://pkg.go.dev/github.com/prometheus/procfs@v0.15.1/sysfs#Mdraid

Drawing attention once again to prometheus/node_exporter#1085: I really think it's time to mark the old /proc/mdstat parser as deprecated, before any more technical debt gets added to it.

@SuperQ
Member

SuperQ commented Oct 9, 2024

I agree, I think we should focus the effort on the sysfs /sys/block/md* parsing.

@robbat2
Contributor Author

robbat2 commented Oct 12, 2024

@SuperQ can you please merge prometheus/node_exporter#3031 then, and I'll build on top of that to expose per-device state?

@SuperQ
Member

SuperQ commented Oct 12, 2024

@robbat2 Sure, sounds good.

Faulty: strings.Contains(match[3], "(F)"),
Spare: strings.Contains(match[3], "(S)"),
Journal: strings.Contains(match[3], "(J)"),
Replacement: strings.Contains(match[3], "(R)"),


Oh! R stands for Replacement!
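
For readers following the flag discussion, here is a minimal, self-contained sketch of how the per-component suffixes in a /proc/mdstat device token (for example `sdb1[1](F)`) could be turned into booleans with `strings.Contains`, as the reviewed lines do. The `Component` type, the `parseComponent` helper, and the regular expression are assumptions made for illustration and are not taken from the PR; only the four flag checks mirror the snippet above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Component is a hypothetical type for this sketch; it is not the PR's struct.
type Component struct {
	Name        string
	Index       int
	Faulty      bool
	Spare       bool
	Journal     bool
	Replacement bool
}

// componentRE is an assumed pattern: device name, [index], then zero or more
// single-letter flags in parentheses, e.g. "sdb1[1](F)" or "sdc1[2](S)(R)".
var componentRE = regexp.MustCompile(`^([^\[]+)\[(\d+)\]((?:\([A-Z]\))*)$`)

// parseComponent parses one device token from an mdstat array line.
// It returns false when the token does not look like a component device.
func parseComponent(token string) (Component, bool) {
	match := componentRE.FindStringSubmatch(token)
	if match == nil {
		return Component{}, false
	}
	idx, err := strconv.Atoi(match[2])
	if err != nil {
		return Component{}, false
	}
	return Component{
		Name:        match[1],
		Index:       idx,
		Faulty:      strings.Contains(match[3], "(F)"),
		Spare:       strings.Contains(match[3], "(S)"),
		Journal:     strings.Contains(match[3], "(J)"),
		Replacement: strings.Contains(match[3], "(R)"),
	}, true
}

func main() {
	// Sample array line in the usual /proc/mdstat layout.
	line := "md0 : active raid1 sdb1[1](F) sda1[0]"
	for _, tok := range strings.Fields(line) {
		if c, ok := parseComponent(tok); ok {
			fmt.Printf("%s index=%d faulty=%t\n", c.Name, c.Index, c.Faulty)
		}
	}
}
```

Tokens such as `md0`, `:`, `active`, and `raid1` do not match the assumed pattern and are skipped, so only the component devices are reported.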

@Finomosec Finomosec mentioned this pull request Oct 16, 2024
@SuperQ
Member

SuperQ commented Jul 3, 2025

This needs a rebase.

ImSingee and others added 5 commits July 4, 2025 12:53
Note: rebased on top of master for reformatting

Signed-off-by: Singee <git@singee.me>
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
Note: rebased on top of master for reformatting

Signed-off-by: Singee <git@singee.me>
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
Note: rebased on top of master for reformatting

Signed-off-by: Singee <git@singee.me>
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
Signed-off-by: Singee <git@singee.me>
Expose which component devices are part of an MD RAID device, as well as the most
common per-component flags. This will enable a future node_exporter metric
showing which component of a RAID array has failed.

Signed-off-by: Robin H. Johnson <robbat2@orbis-terrarum.net>
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
@robbat2 robbat2 force-pushed the rjohnson/mdstat-devices branch from dca5b72 to 22c345a Compare July 4, 2025 19:57
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
@robbat2
Contributor Author

robbat2 commented Jul 4, 2025

> This needs a rebase.

Done

Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Comment on lines +46 to +49
// Some additional flags that are NOT exposed in procfs today; they may
// be available via sysfs.
// In_sync, Bitmap_sync, Blocked, WriteErrorSeen, FaultRecorded,
// BlockedBadBlocks, WantReplacement, Candidate, ...

Good to know
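
As a rough illustration of the sysfs route mentioned in that code comment: the kernel's md documentation describes a per-component `state` file (under `/sys/block/mdN/md/dev-XXX/`) holding a comma-separated list of flag names such as `in_sync`, `faulty`, or `write_error`. The sketch below only parses such a string; the `parseDevState` helper is hypothetical, it is not part of procfs or of this PR, and reading the file itself is left to the caller.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDevState is a hypothetical helper: it splits the comma-separated
// contents of a component "state" file into a set of flag names.
func parseDevState(raw string) map[string]bool {
	flags := make(map[string]bool)
	for _, f := range strings.Split(strings.TrimSpace(raw), ",") {
		f = strings.TrimSpace(f)
		if f != "" {
			flags[f] = true
		}
	}
	return flags
}

func main() {
	// Example contents, assumed for illustration, including the trailing newline
	// a sysfs read would typically return.
	flags := parseDevState("in_sync,write_error\n")
	fmt.Println(flags["in_sync"], flags["faulty"]) // prints: true false
}
```

Keeping the result as a set of strings avoids hard-coding the flag list, which may differ between kernel versions.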

@SuperQ SuperQ merged commit 638b55c into prometheus:master Oct 25, 2025
6 checks passed

7 participants