[TUHS] The evolution of Unix facilities and architecture
Theodore Ts’o
tytso at mit.edu
Sun May 14 14:30:24 AEST 2017
- Previous message (by thread): [TUHS] The evolution of Unix facilities and architecture
- Next message (by thread): [TUHS] The evolution of Unix facilities and architecture
On Thu, May 11, 2017 at 03:25:47PM -0700, Larry McVoy wrote:
> This is one place where I think Linux kicked Unix's ass. And I am not
> really sure how they did it, I have an idea but am not positive. Unix
> file systems up through UFS as shipped by Sun, were all vulnerable to
> what I call the power out test. Untar some big tarball and power off
> the machine in the middle of it. Reboot. Hilarity ensues (not).
>
> You were dropped into some stand alone shell after fsck threw up its
> hands and it was up to you to fix it. Dozens and dozens of errors.
> It was almost always faster to go to backups because figuring that
> stuff out, file by file (which I have done more than once), gets you
> to the point that you run "fsck -y" and go poke at lost+found when
> fsck is done, realize that there is no hope, and reach for backups.
>
> Try the same thing with Linux. The file system will come back, starting
> with, I believe, ext2.
>
> My belief is that Linux orders writes such that while you may lose data
> (as in, a process created a file, the OS said it was OK, but that file
> will not be in the file system after a crash), the rest of the file
> system will be consistent. I think it's as if you powered off the
> machine a few seconds earlier than you actually did; some stuff is in
> flight, and until it can be written out in the proper order you may
> lose data on a hard reset.

So the story is a bit complicated here, and may be an example of "worse
is better" --- which, ironically, is one of the explanations offered for
why BSD/Unix won even though Lisp was technically superior[1] --- but in
this case, it's Linux that did something "dirty", and BSD that did
something that was supposed to be the "better" solution.

[1] https://www.jwz.org/doc/worse-is-better.html

So first let's talk about ext2 (which indeed does not have file system
journalling; that came in ext3). The BSD Fast File System goes to a huge
amount of effort to make sure that writes are sent to the disk in
exactly the right order so that fsck can actually fix things. This
requires that the disk not reorder writes (e.g., write caching is
disabled or in write-through mode).

Linux, in ext2, didn't bother with trying to get the write order correct
at all. None. Nada. Zip. Writes would go out in whatever order the
elevator scheduler dictated, so on a power failure or a kernel crash,
the order in which metadata writes reached the disk was completely
unconstrained.

Sounds horrible, right? In many ways, it was. And I lost count of how
often NetBSD and FreeBSD users would talk about how primitive and
horrible ext2 was in comparison to FFS, which had all of this excellent
engineering work to make sure writes happened in the correct order such
that fsck was guaranteed to always be able to fix things.

So why did Linux get away with it? When I wrote the fsck for ext2, I
knew that anything could and would happen, so I implemented it to be
extremely paranoid about never losing any data. If there was a chance
that an expert could recover the data, e2fsck would stop and ask the
system administrator to take a look. If the user ran with fsck -y, the
default was to drop files into lost+found, whereas the FFS fsck, in some
cases, "knew" from the order in which writes were staged out that the
right thing to do was to let the unlink complete, so it would let the
refcount go to zero, or stay at zero.

The other thing that we did in Linux is that I made sure we h
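A minimal sketch of the window Larry describes above ("the OS said it
was OK, but that file will not be in the file system after a crash"):
on a POSIX system, write() returning success only means the data reached
the page cache, so an application that wants a newly created file to
survive the power-out test has to fsync() the file and then fsync() the
containing directory so the new name is durable too. The file name and
paths below are illustrative, not taken from the discussion above.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char buf[] = "data we cannot afford to lose\n";

        /* Create and write the file; at this point a crash can still
         * throw all of it away, even though every call "said it was OK". */
        int fd = open("important.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        if (write(fd, buf, sizeof(buf) - 1) != (ssize_t)(sizeof(buf) - 1)) {
            perror("write"); return 1;
        }

        /* Step 1: force the file's data and inode out to stable storage. */
        if (fsync(fd) < 0) { perror("fsync file"); return 1; }
        if (close(fd) < 0) { perror("close"); return 1; }

        /* Step 2: fsync the containing directory as well, so the new
         * directory entry (the *name*) is also on disk after a crash. */
        int dfd = open(".", O_RDONLY);
        if (dfd < 0) { perror("open dir"); return 1; }
        if (fsync(dfd) < 0) { perror("fsync dir"); return 1; }
        close(dfd);

        return 0;
    }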
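And a minimal sketch of the "go poke at lost+found" step: e2fsck
reconnects orphaned inodes under names like "#12345" (the inode number),
so one quick way to see whether anything worth salvaging survived an
"fsck -y" is to walk that directory and print sizes. The default mount
point below is an assumption for the example; pass the real lost+found
path as the first argument.

    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>

    int main(int argc, char **argv)
    {
        /* Default path is an assumption for the example. */
        const char *dir = (argc > 1) ? argv[1] : "/mnt/lost+found";
        char path[4096];
        struct stat st;
        struct dirent *de;

        DIR *d = opendir(dir);
        if (!d) { perror(dir); return 1; }

        /* List every recovered entry with its size, skipping . and .. */
        while ((de = readdir(d)) != NULL) {
            if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
                continue;
            snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
            if (stat(path, &st) == 0)
                printf("%-20s %10lld bytes%s\n", de->d_name,
                       (long long) st.st_size,
                       S_ISDIR(st.st_mode) ? " (directory)" : "");
        }
        closedir(d);
        return 0;
    }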