Bug

Data-Corrupting ext3 Bug In Linux 2.4.20

linuxjack55 writes "Kerneltrap is reporting a data-corrupting bug in the ext3 code of kernel 2.4.20. The scope of the problem (and workarounds) are described in the article, which also includes a link to an interesting interview with kernel hacker Andrew Morton. In it, he states that the '2.4.x core has only stabilized very recently' and the 2.4.x kernel is 'even now...in a late beta state.' He was also asked when the 2.4 kernel could be considered stable. His reply: 'Six months, perhaps?' If that prediction is accurate, 2.6.x could arrive before a 'stable' version of 2.4.x does." (The interview with Morton is from last February -- how stable you consider 2.4 right now is up to you.)
This discussion has been archived. No new comments can be posted.

  • Duplicate. (Score:3, Insightful)

    by Naikrovek ( 667 ) <jjohnson@p[ ]com ['sg.' in gap]> on Monday December 02, 2002 @11:18PM (#4798985)
    they get paid to *not* do this. Why do they not read their own site?

    Mod me down if you must, but I have a good point here.

    http://developers.slashdot.org/article.pl?sid=02/12/02/0128206&mode=nested&tid=106
    • Re:Duplicate. (Score:4, Insightful)

      by Naikrovek ( 667 ) <jjohnson@p[ ]com ['sg.' in gap]> on Monday December 02, 2002 @11:57PM (#4799164)
      To be fair, I suppose these guys see *thousands* of submissions per week and easily forget what's been posted and what hasn't.

      But writing a simple filter to check past stories for the same hyperlinks shouldn't be too hard, I wouldn't think.

      Ah well.
      • As I said in a previous post [slashdot.org] on another dupe story, a mechanism that filters on the links attached to a story would surely eliminate a great many dupes...
      • To be fair, I suppose these guys see *thousands* of submissions per week and easily forget what's been posted and what hasn't.

        There are thousands of submissions, but much much much fewer make it as stories. Surely people that get paid to do this can keep track of just the stories.
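The link filter suggested in this thread could look roughly like the Python sketch below. It is only an illustration of the idea: the function names, the `past_stories` mapping, and the URL normalization are all assumptions, not anything Slashdot is known to run.

```python
import re

# Matches http/https URLs up to whitespace or a common closing delimiter.
URL_RE = re.compile(r"https?://[^\s\"'<>\])]+")


def extract_links(text):
    """Return the set of URLs found in a story, lightly normalized."""
    links = set()
    for url in URL_RE.findall(text):
        # Drop sentence punctuation and a trailing slash so that
        # "http://host/page/." and "http://host/page" compare equal.
        links.add(url.rstrip(".,;:!?").rstrip("/"))
    return links


def find_dupes(submission, past_stories):
    """Return titles of past stories that share a link with the submission.

    past_stories is a hypothetical mapping of story title -> story text.
    """
    new_links = extract_links(submission)
    return [title for title, text in past_stories.items()
            if new_links & extract_links(text)]
```

A real filter would probably need to ignore links that appear in almost every story (the site's own front page, for instance), otherwise everything would be flagged as a dupe.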
  • Dupe (Score:2, Redundant)

    by FattMattP ( 86246 )
    It's a dupe [slashdot.org]. Please move along.
  • by Anonymous Coward on Monday December 02, 2002 @11:18PM (#4798990)
    this is a first post
    the editors are on crack again
    let's beat the dead horse
  • If my code is buggy, will the data corrupting bug fix it? Kind of one of those reverse deals, like "I don't not hate you!"

    Well, it would be nice....
  • by mhesseltine ( 541806 ) on Monday December 02, 2002 @11:41PM (#4799106) Homepage Journal

    Because, the corruption of the databases would explain the duplicate story postings. Come on guys, ride this one for all it's worth.

  • by MattCohn.com ( 555899 ) on Monday December 02, 2002 @11:45PM (#4799129)
    Andrew Morton then continued... 'One popular company to be affected by the bug was the OSDN. Their technology news site Slashdot.org [slashdot.org] updated their database server with the new 2.4.20 kernel.'

    It was a disaster. 'The problem was with syncing during unmount. When the people at Slashdot upgraded, their databases became corrupted. Suddenly, duplicate stories began to appear!'

    Added Andrew, 'Wow, duplicate stories. There's a shocker'

    The full interview will be available soon on The Onion [theonion.com]
  • Wasn't the big benefit that came with ext3 the journaling capacity? Is there some type of journaling that the normal, lay user, isn't aware of?

    How can you verify that this option is not enabled? What, if it is enabled, can be done about it now - can you change the filesystem type (e.g. revert to ext2) or is all hope lost?

    • If ext3 is selected as the filesystem type, then journaling is enabled automatically. You can however boot an ext3 filesystem as ext2 since it is backwards compatible, so no, all hope is not lost.
    • Re:Journaling (Score:4, Informative)

      by 0x0d0a ( 568518 ) on Tuesday December 03, 2002 @01:42AM (#4799525) Journal
      The default mode is ordered. Basically, this journals only metadata, preventing your filesystem from becoming corrupted. This is the big worry for most people -- losing everything on your partition because of a power loss at a bad time. This may not sound so great, but it's what most other journalling filesystems do: they only worry about metadata.

      Journalled mode journals everything, including file data and metadata. This is the uber-reliable (well, when it doesn't have corruption-causing bugs) mode that most filesystems don't bother with because of the speed hit.

      How can you verify that this option is not enabled

      You can look for options in /etc/fstab...it's ordered by default, but if there's an option data=journal, then it's journalled.

      If you're using 2.4.20 right now, I think I'd reboot into your older kernel right now. :-)
    • As others have stated, the default is data=ordered. If you haven't explicitly specified data=journal in /etc/fstab or via mount, then you're not affected (i.e. safe). /sbin/mount will spit out the parameters you used.

      This bug is in 2.4.20-pre5+, and Stephen and Andrew have both proposed workarounds. Ultimately, if you've been using data=journal, it might just be more worthwhile to switch to data=ordered: sync, drop to single user, and remount your ext3 partitions as data=ordered.

      Or you could just back out the offending diff.
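For anyone who would rather not eyeball /etc/fstab, the check described in this thread can be sketched in Python. This is a rough sketch under assumptions: `journaled_ext3_mounts` is a made-up name, and it only looks at the standard fstab column layout (device, mount point, type, options), so it will miss a data=journal passed by hand to mount.

```python
def journaled_ext3_mounts(fstab_text):
    """Return mount points of ext3 entries that request data=journal.

    fstab_text uses the usual /etc/fstab columns:
    device, mount point, fs type, comma-separated options, ...
    """
    affected = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point, fs_type = fields[1], fields[2]
        options = fields[3].split(",")
        if fs_type == "ext3" and "data=journal" in options:
            affected.append(mount_point)
    return affected
```

Feeding it the contents of /etc/fstab (for example `journaled_ext3_mounts(open("/etc/fstab").read())`) gives the list of mount points to worry about; an empty list means nothing in fstab asks for data=journal.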
  • The symptoms are that any file data which was written within the thirty seconds prior to the unmount may not make it to disk. A workaround is to run `sync' before unmounting
    So I guess you posted this story twice just in case the last one didn't get synced to disk properly, eh? *ducks*
    • Amazing thing is that my netbsd box syncs disks as part of the shutdown script.

      Interestingly enough so does shutdown on Linux. So what's the problem?

      • The `sync` program simply executes the sync [google.com] system call, which is what does the actual work (tells the kernel to flush its filesystem buffers to disk). The thing is, `umount` is also supposed to flush those buffers, but apparently in this ext3 configuration it isn't doing so properly.

        Running `sync` before a `umount` should be redundant -- if `umount` is working correctly.
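The "sync, then unmount" workaround quoted at the top of this thread is easy to wrap up if you unmount from a script. A minimal Python sketch, assuming Python 3.3+ on a Unix system; `safe_umount` and its `run` parameter are hypothetical names used only for illustration:

```python
import os
import subprocess


def safe_umount(mount_point, run=subprocess.run):
    """Flush dirty buffers to disk, then unmount mount_point.

    run defaults to subprocess.run; it is a parameter only so the
    command can be swapped out or inspected without really unmounting.
    """
    os.sync()  # wraps the sync(2) system call
    # check=True raises CalledProcessError if umount exits non-zero.
    return run(["umount", mount_point], check=True)
```

With a working `umount` the explicit sync is redundant, as noted above, but it costs very little and covers the case where the unmount path skips the flush.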
  • by tpv ( 155309 )
    Come on, everyone has known [slashdot.org] about this bug for more than 24 hours.

    You think they'd have fixed it by now.
