Linux Software

2.4.20 ext3 Data Corrupting Bug Fixed

An anonymous reader writes "The ext3 data corrupting bug found in the latest stable Linux kernel and reported by Slashdot here and here has been fixed. In this interesting KernelTrap story Andrew Morton describes the problem and offers a working patch. Evidently the bug has its roots in a much bigger design issue, something that won't likely be fixed in the current 2.4 kernel series. In any case, with Morton's patch applied your data will not be corrupted."
This discussion has been archived. No new comments can be posted.

  • QA test cases. (Score:3, Insightful)

    by Trusty Penfold ( 615679 ) <jon_edwards@spanners4us.com> on Saturday December 07, 2002 @01:00PM (#4833245) Journal

    Where can I find the QA documentation, test cases and scripts for ext3? I would like to verify that this bug, and variations thereof, will be caught before release in the future. Thanks.

    They don't seem to be at the ext3 home (linked to in the story).

    Open Source is useless without Open Procedures, Open Documentation and Open Quality Control.

  • by jericho4.0 ( 565125 ) on Saturday December 07, 2002 @01:09PM (#4833294)
    Why did /. have to cover this three times in the dev section? I know many non-dev types who jump on point releases as soon as they come out. They should know about this.

    I hate to say it, but maybe /. doesn't like stories that make Linux look bad.

    • because anyone who runs the brand new shiniest version of anything on their machine should expect that it's not perfect, and should go look at resources for developers.

      slashdot:
      news for whiners, stuff for people who need things explained to them in very small words
      • Hmm, yup, 'ignorant newbie' just about sums it up. This is a 2.4 problem. You know, 2.4, the current STABLE, safe release of the Linux kernel?

        Sometime in the future, 2.4 will go down in history as one serious cluster-fuck of a kernel.

    • "I hate to say it, but maybe /. doesn't like stories that make Linux look bad."

      The fact that you are modded up to +4 proves that that is untrue.
  • vs 3.0 (Score:3, Interesting)

    by Lord Bitman ( 95493 ) on Saturday December 07, 2002 @03:16PM (#4833977)
    And here we're talking about calling the next major release "3.0" while things as important as /the file system/ need to be majorly reworked. Perhaps we shouldn't jump the gun on this. 3.0 should not have things lying around in it that need to be completely reworked if they're going to work right. It doesn't count as a culmination of significant changes since 2.0 if those changes won't actually be working until 3.0.1.
    • I don't know where you've been for the last couple of months, but the next release is going to be 2.6, according to Linus.

      I can only assume that the moderators that moderated this up are similarly misinformed, which is why I chose to reply rather than moderate this "Overrated" like it should be.

      • And if I said we shouldn't go to the moon, I suppose you'd assume I meant "at all" instead of "again" and correct me there, too.
      • From what I've read on lkml, Linus hasn't yet decided what the release will be, whether it's 2.6 or 3.0. In the meantime, he's calling it 2.6.
        • From what I've read in interviews since the recent cruise, it's definitely 2.6, and Linus never seriously considered calling it 3.0.

    • And here we're talking about calling the next major release "3.0" while things as important as /the file system/ need to be majorly reworked.

      2.4.x is the "stable" kernel. That means it's not supposed to incorporate radical changes to its infrastructure. Apparently, the maintainer thought they could add some "safe" changes from the 2.5.x kernel research to add functionality. The team was wrong. The ideal correction would include a radical change, so it's going to be a kludge fix instead.

      The file system is getting major rework, IN the development kernel (2.5.x). 2.4 is not 3.0. 2.5 is not 3.0. 3.0 will be out when it's ready. Stop judging 3.0 (actually 2.6) based on what's going on in 2.4.

      Besides, only an incompetent would use ext3 in a production machine.

      • So I'm incompetent. I had another major problem with a nic that was fixed in 2.4.20 and moved up this morning after having waited a week for bug reports. My root partition is ext3 (most of the others are Reiserfs) but I do not use journalling in the way that causes the problem.

        Actually, the server is an emergency backup / mirror server and it has been pretty unreliable for ages. I am not allowed to replace the nic and a kernel that does not require me to pull the power cable every time things go wrong is a big plus. Maybe there was another solution, but anything more than a day per month on that project is seen as lost time for me.

        As to your other point, I hope that Linus's feature freeze does not preclude fixes for problems like this making the next stable set of kernels. Whatever they are called.
        • The problem with using ext3 in a production system is that it is "new". That means it's subject to "bugs". Some bugs don't get picked up until many months after it's in use. On a filesystem, that means you can get data corruption and lose files/data for months before you realize there is a problem. (And the corruption would be handed down to your backups.) Also, with ext3 being new, it won't have many diagnostic tools or other utilities.

          I have heard BAD things about reiserfs. It's a fact that they don't journal the data, just the filesystem structures (the metadata). In certain crashes, you can lose some data while rapidly bringing up the system. But there are other people who swear by it, and perhaps it's better than nothing.

          Myself, I use XFS. There are people who will grouse endlessly about it, but I've never encountered a problem with it. In any case, the whole point of a journaling filesystem is quick restart of the filesystems (no fsck) AND integrity of the data. Competent sysadmins don't use flaky filesystems or new kernels on PRODUCTION machines.

          Actually, the server is an emergency backup / mirror server and it has been pretty unreliable for ages.

          Aiieeee... How can it be an emergency backup/mirror server if it's unreliable? Mind you, it's child's play to use the machine for prototyping and backup merely by adding a hard drive to it, and doing your prototyping work on the second drive. How the heck can they refuse to replace the NIC if it's a clunker? It's a lousy $20. You can probably cannibalize an old machine's NIC for free.

          Maybe there was another solution, but anything more than a day per month on that project is seen as lost time for me.

          Screwing around for a day because the company is too cheap to spend $20 on a good NIC is ridiculous as well. It's about 1 hour of your salary. I've worked for cheap companies, but that's plain stupid. So is having you mess around with kernels released days ago.

          As to your other point, I hope that Linus's feature freeze does not preclude fixes for problems like this making the next stable set of kernels.

          The whole point of the feature freeze is to stop incorporating NEW features. Bugfixes are the only thing allowed in a frozen development kernel until release. It's a mistake to think of a stable kernel (2.4) as being bug-free for each release. There were shops that still ran 2.2 kernels, because they didn't like the "instability" of the 2.4 kernels.

  • by Anonymous Coward
    Which distributions (Red Hat, Mandrake, Debian, etc.) are affected by this in their default ISO images? I.e., which ones do I have to update just to get around this fatal error?
    • by Cecil ( 37810 ) on Saturday December 07, 2002 @05:57PM (#4834667) Homepage
      Which distributions ship using ext3 filesystems by default and set them to journalled mode in their default ISO images? Um, none?

      Did you mean that you run your ext3 filesystems in full-journal mode, and would like to know if you have to update? Yes. Regardless of distro.

      In either case, please remember that journalled mode is NOT the default. The default is ordered. Unless you're explicitly setting your filesystem to full journalling, you aren't affected by this problem.

      HTH.
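
      For anyone who wants to check their own box, here is a minimal sketch. It assumes the journalling mode is chosen with the ext3 `data=journal` mount option in /etc/fstab; it will not see a mode set some other way (for example through kernel boot parameters or defaults stored in the filesystem itself). The function name and the sample table are made up for illustration.

```python
def ext3_full_journal_mounts(fstab_text):
    """Return mount points of ext3 entries whose options include data=journal."""
    hits = []
    for line in fstab_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type == "ext3" and "data=journal" in options.split(","):
            hits.append(mount_point)
    return hits


# Hypothetical fstab contents, for illustration only.
SAMPLE = """\
# device   mount  type      options                dump pass
/dev/hda1  /      ext3      defaults               1 1
/dev/hda2  /home  ext3      defaults,data=journal  1 2
/dev/hda3  /var   reiserfs  defaults               0 0
"""

print(ext3_full_journal_mounts(SAMPLE))  # ['/home']
```

      To run it against a real system, pass in `open("/etc/fstab").read()` instead of `SAMPLE`. An empty list means no ext3 entry asks for full journalling there, which matches the default (ordered) case described above.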
  • ...which is a lot more mature and thoroughly tested than the ext series. Here's how to install Red Hat on XFS:

    Install redhat on ext3,
    configure redhat, esp the networking
    get online, get the latest 2.4 kernel
    get XFS patch and xfsprogs and install
    recompile a new kernel with XFS in it and boot.
    mkfs.xfs /dev/, mount /dev/(xfs) to /mnt
    cd /
    cp -a {bin,usr,etc,... except tmp,mnt,proc} /mnt
    fix /mnt/etc/fstab to point at new partition for older redhats.
    reboot.

    This still gives some obscure errors on bootup, but maybe that's because of redundant scripts. It works very fast and is stable for me. If you get around to fixing those errors, please roll out a HOWTO, since no one can take filesystem instability on production servers, yet everyone wants to use 2.4.
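
    The copy step in the list above ("everything except tmp, mnt, proc") can be sketched like this. It is only an illustration of picking what to copy, not a tested migration script; the function name is made up, and the exclusion set is just the three directories named in the steps.

```python
import os

# Directories the steps above say to skip when copying the old root.
EXCLUDED = {"tmp", "mnt", "proc"}


def entries_to_copy(names, excluded=EXCLUDED):
    """Return the top-level entries to copy, sorted, minus the excluded ones."""
    return sorted(name for name in names if name not in excluded)


# Show what would be copied from this machine's root directory.
print(entries_to_copy(os.listdir("/")))
```

    Note that the skipped directories still have to exist, empty, on the new root so that /proc can be mounted and /tmp is usable after the reboot.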
