Darrick Wong has been working on XFS online repair for a number of years,
and things are getting to the point where most of the filesystem-internal work
has been completed and is under review. The work remaining mostly concerns
the user-space side needed
to set up a periodic scan and repair cycle, so he wanted to discuss what
user space needs from this kind of feature in a filesystem session that he
led remotely at the 2023 Linux Storage, Filesystem,
Memory-Management and BPF Summit. The session may
not have gone quite as he hoped, as it got somewhat derailed by topics that
spilled over from the earlier session on
unprivileged image mounts.
His current patch set for XFS online repair is “out for review on Dave
Chinner’s laptop right now”, so it is time to start talking about the
missing pieces. That means that he will be talking more about user space
than he would normally; there is a user-space driver program that controls
how often the online fsck mechanism runs. There is nothing yet
for notifying user space of problems that were found by an online fsck
pass, nor is there a daemon monitoring for notifications to do anything
about them, such as to issue repair requests. There is no good
infrastructure in the kernel for handling and dispatching such things, he
said.
He said that the earlier discussion in the unprivileged-mounts session
on using fsck to decide that an image was sound enough to mount
made him think that it was a good time to discuss these kinds of issues.
As he noted, there is a command-line program, xfs_scrub,
which opens the block device and root directory, then starts issuing the right
ioctl() commands, but the real use case is not for running a tool
in that fashion. Instead, the idea is that it would do a background check
and repair periodically from
a systemd service; he is struggling a bit with setting that up, but has
something working. It is not, however, much different from the age-old
periodic cron job that reports its results to the system log and hopes an
administrator is paying attention.
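
As a rough illustration of that cron-style approach (not Wong's actual
service setup), a periodic job could wrap xfs_scrub along the following
lines. The helper names build_scrub_command() and run_scrub() are
hypothetical; the -n (check only, make no changes) and -b (background mode)
options are taken from the xfs_scrub manual page. This is a minimal sketch
that only runs the tool and hands back its exit status.

```python
import shutil
import subprocess


def build_scrub_command(mountpoint, dry_run=True, background=True):
    """Build an xfs_scrub command line for one mounted XFS filesystem.

    Hypothetical helper for illustration only.
    """
    cmd = ["xfs_scrub"]
    if dry_run:
        cmd.append("-n")  # check metadata only; do not repair or optimize
    if background:
        cmd.append("-b")  # background mode: limit the scrubbing load
    cmd.append(mountpoint)
    return cmd


def run_scrub(mountpoint, dry_run=True):
    """Run xfs_scrub once; return its exit status, or None if not installed."""
    if shutil.which("xfs_scrub") is None:
        return None
    result = subprocess.run(build_scrub_command(mountpoint, dry_run), check=False)
    return result.returncode
```

A timer or cron entry could call run_scrub() for each mounted filesystem and
log the return value; nothing in this sketch reacts to the result, which is
the gap described above.
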
He would like to create a notification
system that would allow the system to respond dynamically to the events
that get reported by the periodic scrubbing. He would also like there to
be a way for programs to initiate scrubbing for various reasons; a
container manager that notices relatively low activity, for example, could
kick off scrubbing on the mounted filesystems. Maybe that could mesh with the
unprivileged-mounting use case in some fashion as well, Wong said.
So he wondered if any user-space developers had thoughts on ho