Code Cleanup Culls LibreOffice Cruft 317

Posted by timothy
from the let's-just-call-them-introns dept.
mikejuk writes with an interesting look at what coders can get around to after a few years of creating a free office suite: dealing with many thousands of lines of deprecated code: "Thanks to the efforts of its volunteer taskforce, over half the unused code in LibreOffice has been removed over the past six months. It's good to see this clean-up operation but it does raise questions about the amount of dead code lurking out there in the wild. The scale of the dead code in LibreOffice is shocking, and it probably isn't because the code base is especially bad. Can you imagine this in any other engineering discipline? Oh yes, we built the bridge but there are a few hundred unnecessary iron girders that we forgot to remove... Oh yes, we implemented the new chip but that area over there is just a few thousand transistors we no longer use... and so on." Well, that last one doesn't sound too surprising at all. Exciting to think that LibreOffice (which has worked well for me over the past several years, including under the OpenOffice.org name) has quite so much room for improvement.

  • I'd bet there is. (Score:5, Informative)

    by JGuru42 (140509) on Friday January 13, 2012 @11:01PM (#38694698)

    It would not be very surprising to see a lot of dead code.

    I maintain the code for MoreTerra, a Terraria map editor program and I'm pretty sure I've got dead code in there and that's a pretty small project.

    With a large number of people working on the code it likely ends up slowly clogging up as no one quite knows what the others are doing.

    Dare I ask what type of dead code exists in something extra huge, but closed source, like the Windows code base or MS Office? But I'd bet for all MS's faults that the code for Norton Antivirus is 10x worse.

  • Re:Worked Well? (Score:5, Informative)

    by theguyfromsaturn (802938) on Friday January 13, 2012 @11:19PM (#38694790)

    I never get crashes with LibreOffice. Whenever I try Word on some documents (docx) I get a crash. I was completely unable to edit some documents in Word (sent to me by colleagues) until I opened them in LibreOffice, saved them in doc format, then reopened them in Word. It happens with distressing regularity. Personally, I find LibreOffice much more stable than Word. The worst part was when I edited a doc in Word, saved it, and later hit a similar problem when I tried to open it again. I am not sure what document elements cause this, but it's a sad state of affairs when LibreOffice is not only more stable (for me) but also handles MS's own file format better (even though there are still big deficiencies in the docx file handling in LibreOffice). So, stability issues? I guess it depends on your computer.

  • Re:It doesn't matter (Score:5, Informative)

    by Mr2cents (323101) on Friday January 13, 2012 @11:23PM (#38694814)

    I do think it matters. Yes, a compiler can throw out dead code, but not in all cases. E.g. if you have an enum where some values aren't used, and you then call a function if a variable has that unused value, how is the compiler going to find out? It's not only functions; there could be unused tests in code, etc. All this clogs up the code and can make reading the code a living hell. It can turn an elegant part into a mess. Not to mention the time developers waste trying to find out what a function does, only to discover it's not used. The article doesn't deal with the results in terms of code size or performance, but I'm very interested to find out.

    Anyway: you can either have clean code or maintainable code, but not both at once in my experience.
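
The enum case described in this comment can be sketched as follows. This is a minimal illustration in Python rather than a compiled language, and `Format`, `export`, `export_legacy` and `parse_format` are hypothetical names, not anything from LibreOffice. The only point is that the `LEGACY` branch looks reachable to any tool, because the value arrives from run-time input.

```python
from enum import Enum


class Format(Enum):
    ODT = "odt"
    DOC = "doc"
    LEGACY = "legacy"  # hypothetical value that no caller produces any more


def export_legacy(name: str) -> str:
    # Only reachable through the LEGACY branch in export().
    return f"{name}.legacy"


def export(name: str, fmt: Format) -> str:
    if fmt is Format.LEGACY:
        # This is a perfectly valid call as far as a tool can see; whether
        # it is dead depends on which values reach `fmt` at run time.
        return export_legacy(name)
    return f"{name}.{fmt.value}"


def parse_format(text: str) -> Format:
    # The value comes from outside the program (a file, a command line),
    # so static analysis alone cannot rule LEGACY out.
    return Format(text)
```

If nobody ever supplies `"legacy"` in practice, `export_legacy` is dead in the sense the comment means, yet nothing in the code itself says so.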

  • Re:Worked Well? (Score:3, Informative)

    by hedwards (940851) on Friday January 13, 2012 @11:32PM (#38694844)

    LibreOffice hasn't been OO in well over a year. But nice try with the trolling.

  • Re:It doesn't matter (Score:2, Informative)

    by Renegrade (698801) on Friday January 13, 2012 @11:34PM (#38694858)

    And this, boys and girls, is how we end up with Windows 7/64 guzzling two gigs of memory after start-up.

    Not by this one isolated idea, but the very concept of "meh it doesn't cause a problem" snowballing until it IS a problem.

    I drafted up a mini-essay assuming it was C-style code, but the article is talking about methods. Clearing out half of the methods means that those virtual method tables are now half the size, which will result in much snappier execution. Fewer cache misses, less trash in the cache lines, shorter hash collision tables, it's all good stuff!

    Never mind all of the benefits of faster loading times, less address exhaustion, etc. that apply to ANY language.

  • Re:oooh yes (Score:5, Informative)

    by zidium (2550286) on Saturday January 14, 2012 @12:17AM (#38695026) Homepage

    Mr2Cents,

    Your actions are indicative of a person who is not yet truly a craftsman of the software engineering trade.

    Speaking from personal experience dealing with huge, complex, unmaintainable PHP legacy systems for the last ten years, let me tell you a far better path:

    1. Search the code base for what may be directly calling the code.
    2. Set debug breakpoints at the start of each piece of cruft code and rigorously test the app.
    3. Create a custom exception (e.g. CrapCodeHitException) and throw it at the beginning of each code segment you want to remove. If you don't hit any of the exceptions after, say, a week of normal browsing doing other things, plus testing, then proceed to step 4.
    4. Catch the CrapCodeHitExceptions at the highest level you dare, and log each one into a separate log file you will have permission to read. Commit the code into a releasable branch so that it ends up on your QA and staging servers.
    5. Get approval to have the logging code be pushed to staging. Add comments above each cruft piece of code stating a) the level of risk you think removing it carries, b) when one should feel free to remove it (pick increments like 3 mo, 6 mo, 1 yr, based on risk), c) your name. If shit hits the fan cuz of removal, you want to man up and accept responsibility so your peers don't waste precious cycles needlessly troubleshooting why this "perfectly fine" code was seemingly arbitrarily removed.
    6. After each time period in your comments has elapsed, if the code was never triggered (parse the logs!), feel free to remove it. Please leave a note behind that you removed such and such, tho, and stick your name on it. Remove these notes after a year.
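
Steps 3, 4 and 6 above can be sketched roughly as below. The comment is about PHP; this is a Python rendering under assumptions, and everything except the exception's role is hypothetical: the logger name `cruft`, the helpers `tripwire`, `handle_request` and `labels_hit`, and the sample function `old_discount`.

```python
import logging

# Hypothetical logger name. In a real deployment you would attach a
# FileHandler so hits land in a separate log file you are allowed to read.
cruft_log = logging.getLogger("cruft")


class CrapCodeHitError(Exception):
    """Raised when code believed to be dead is actually executed."""


def tripwire(label):
    # Step 3: call this on the first line of each suspect code segment.
    raise CrapCodeHitError(label)


def old_discount(price):
    # Hypothetical suspect function.
    # Risk: low. Remove after: 3 months. Marked by: <your name>.
    tripwire("old_discount")
    return price * 0.9


def handle_request(handler, *args):
    # Step 4: catch at the highest level you dare and log the hit.
    try:
        return handler(*args)
    except CrapCodeHitError as hit:
        cruft_log.warning("dead code hit: %s", hit)
        return None


def labels_hit(log_lines):
    # Step 6: parse the log to see which tripwires were triggered.
    prefix = "dead code hit: "
    return {
        line.split(prefix, 1)[1].strip()
        for line in log_lines
        if prefix in line
    }
```

Any label that never shows up in `labels_hit(...)` by the time its deadline passes is a candidate for removal. Note that in this sketch a hit aborts the request once the exception is caught at the top, so it is only suitable where a logged failure is acceptable.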

    I've personally cleaned up hundreds of thousands of lines of code using this mechanism on several large and complex sites, without a single failure.

  • by FienX (463880) on Saturday January 14, 2012 @12:45AM (#38695166)

    I've worked on enterprise asset management systems in a number of different industries including electrical utilities, natural gas pipelines, and military. In almost every company they've had some variant of an "abandoned in place" asset status. In cases like power plants, trying to remove a single cable from a series of cable trays or raceways is rarely, if ever, worth the effort and risk. Some cable trays have dozens of cables (and I'm not talking cat5) in them, sometimes half of which are "dead" but removing those from the middle of a stack of hot cables in a working power plant doesn't have much of an ROI.

  • Re:It doesn't matter (Score:5, Informative)

    by smi.james.th (1706780) on Saturday January 14, 2012 @01:26AM (#38695316)

    I am not a Microsoft fan, but Windows 7 is actually a very well-written OS, in my experience. If you have lots of RAM then it uses it; there's no sense in having 8GB of RAM if it's only using 250MB and paging the rest of what it needs.

    As a point of reference, have a look at this article [zdnet.com]. If you only have 512MB of RAM then Win7/64 will only use about 200MB of RAM.

  • Re:I'd bet there is. (Score:4, Informative)

    by Anonymous Coward on Saturday January 14, 2012 @01:36AM (#38695368)

    Dead code is a result of OOP development (C++/Java)

    Because a lot of OOP code has extra methods used to circumvent or enforce memory protection (private/public) with variables inside classes. Sometimes the methods are created in anticipation that they will be used, but all that code amounts to is a getVARname or setVARname repeated dozens or hundreds of times, when something like get(varname) and set(varname) would be more efficient. In C you don't have this problem because memory protection basically doesn't exist unless you roll your own.
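
The contrast drawn in this comment, one accessor pair per field versus a single get(varname)/set(varname), might look like the sketch below. Both classes and their field names are hypothetical, and whether the generic form is really "more efficient" is the commenter's claim, not something this example demonstrates.

```python
class PerFieldAccessors:
    """One getter and setter per field; unused ones linger as dead code."""

    def __init__(self):
        self._width = 0
        self._height = 0

    def get_width(self):
        return self._width

    def set_width(self, value):
        self._width = value

    def get_height(self):
        return self._height

    def set_height(self, value):
        self._height = value


class GenericAccessors:
    """A single get/set pair keyed by field name."""

    _fields = ("width", "height")

    def __init__(self):
        self._values = dict.fromkeys(self._fields, 0)

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        # Unknown names are rejected at run time rather than by a compiler.
        if name not in self._values:
            raise KeyError(name)
        self._values[name] = value
```

The generic version leaves no per-field methods to go stale, at the cost of moving name checking from the compiler or linter to run time.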

  • by rsilvergun (571051) on Saturday January 14, 2012 @01:48AM (#38695400)

    but recent (3.x) versions of OpenOffice ate my kid's documents. It really sucked. From what I can gather it's a known bug in the document recovery module that hasn't been fixed to this day. The program crashes, writes a blank document out as the 'recover' document, then cheerfully overwrites your original file and any of the automatically made backups. I suppose that somewhere along the line there was some user error. My kid probably could have said 'no' to something and stopped the whole mess. But seriously, she shouldn't have to. I've got a 500 frickin' gig drive in her machine. The biggest word doc I've ever seen in my life was 5 megs (mostly pictures). Why the hell do we still delete shit? Just make a huge undo buffer or something. I've got half a fscking terabyte. Come on OO.org, just use it already!

  • by theNAM666 (179776) on Saturday January 14, 2012 @01:52AM (#38695410)

    If you have newspaper or other similar material in your walls, which wasn't processed and designed as insulating filler, I have one word for you: mold.

    You'd better know.

  • by Nethead (1563) <joe@nethead.com> on Saturday January 14, 2012 @02:15AM (#38695472) Homepage Journal

    Ever been in the telco closets of a 50 year old office building? Old 9600 baud modems still powered up and connected to 66 blocks, old DS0 smartjacks with red lights, all next to Cat5 and fiber cross connects.

    Look above the drop ceiling of an old department store sometime and gander at all the serial cable wire that is covered by the Token Ring wire covered by the 10base5 wire that is covered by the ThinNet wire that is covered by the Cat5 wire that is covered by the fiber ducts. All that tangled in with the old 25 pair telco wire.

    If it's not on your work order, you don't touch it.

  • Re:It doesn't matter (Score:4, Informative)

    by westlake (615356) on Saturday January 14, 2012 @02:55AM (#38695618)

    And this, boys and girls, is how we end up with Windows 7/64 guzzling two gigs of memory after start-up.

    RAM is there to be used, not hoarded.

    In short, Windows 7 (unlike XP and earlier Windows versions) goes by the philosophy that empty RAM is wasted RAM and tries to keep it as full as possible, without impacting performance.

    Windows 7 memory usage: What's the best way to measure? [zdnet.com] [Feb 25, 2010]

  • Re:It doesn't matter (Score:5, Informative)

    by bonch (38532) * on Saturday January 14, 2012 @03:17AM (#38695696)

    No, it's not. Windows 7 will use as little as 200MB of RAM if you only have 512 physically available. You're misunderstanding what's actually going on as you fret over megabytes in Task Manager.

  • by Grishnakh (216268) on Saturday January 14, 2012 @05:16AM (#38696024)

    Actually, building materials (at least here in the USA) are dirt cheap. Labor by far is the biggest factor in a house's cost. There are some exceptions, like appliances, cabinets, granite countertops, plumbing fixtures, etc., but things like framing, drywall, nails and other fasteners, even electrical outlets (except the GFCI ones) are all dirt cheap. That's why there's waste; the contractor doesn't care if his workers waste $2 worth of materials as long as they're fast. However, when the plumbers are putting in expensive polished faucets, they're probably more careful since replacing one of those is worth several hours of labor.

  • by Anonymous Coward on Saturday January 14, 2012 @06:51AM (#38696304)

    LO 3.5 beta on Windows does have an update notifier that also offers to download the update for you.

  • Re:I'd bet there is. (Score:5, Informative)

    by Stevecrox (962208) on Saturday January 14, 2012 @08:07AM (#38696468) Journal

    The Eclipse Java Compiler can raise a warning if a private function is never called, and both the Eclipse compiler and FindBugs will raise warnings if an area of code is unreachable. FindBugs is able to detect if a variable is declared but never used (dead store) and will raise a warning. Lastly, CPD (a part of PMD) is able to look for identical code blocks, allowing you to merge duplicate functions.

    Sure, that doesn't cover public functions, but I don't think there is harm in unused getters and setters, and it's easy to find out whether a function is called through tools found in Eclipse. Just because Java developers don't use these tools doesn't mean they don't exist.

  • Re:It doesn't matter (Score:4, Informative)

    by shutdown -p now (807394) on Saturday January 14, 2012 @10:02PM (#38702740) Journal

    The problem with this is that I use task manager to see how close to the limit my system is, and to gauge how much memory to get on a system. It's all well and fine to use free memory for caching disk or whatever, but I'd like my "gas gauge" back please! Give me a memory monitor that actually tells me how much memory I could use for my apps, and how close to the edge I am.

    As things are, you need Process Explorer [microsoft.com] for that, and a decent understanding of virtual memory management in NT to understand what it actually tells you.
