Code Cleanup Culls LibreOffice Cruft 317

Posted by timothy
from the let's-just-call-them-introns dept.
mikejuk writes with an interesting look at what coders can get around to after a few years of creating a free office suite: dealing with many thousands of lines of deprecated code: "Thanks to the efforts of its volunteer taskforce, over half the unused code in LibreOffice has been removed over the past six months. It's good to see this clean-up operation but it does raise questions about the amount of dead code lurking out there in the wild. The scale of the dead code in LibreOffice is shocking, and it probably isn't because the code base is especially bad. Can you imagine this in any other engineering discipline? Oh yes, we built the bridge but there are a few hundred unnecessary iron girders that we forgot to remove... Oh yes, we implemented the new chip but that area over there is just a few thousand transistors we no longer use... and so on." Well, that last one doesn't sound too surprising at all. Exciting to think that LibreOffice (which has worked well for me over the past several years, including under the OpenOffice.org name) has quite so much room for improvement.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Friday January 13, 2012 @11:00PM (#38694688)

    There are probably dozens of extra nails that were just hammered in rather than be removed. There are extraneous pieces of lumber.

    And a house that was remodeled? I've seen newspaper used as filler. I've seen layers of roofing, with things buried in between layers.

    Frankly I don't know what's inside my walls, and I'm not sure I want to know.

    • by Anonymous Coward on Saturday January 14, 2012 @12:05AM (#38694984)

      I've seen chemical plants built with millions of dollars worth of unnecessary piping and valves, because the project timeframe meant that it was cheaper to install extra connections that might never be used and save engineering time than waste time re-engineering it.

      If removing unnecessary items can save thirty thousand dollars (say) at the cost of three days, removing the cruft is only worth it if the delay costs less than ten thousand a day.
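The break-even reasoning in that comment can be written out as a small sketch. The function name and parameters are illustrative rather than anything from the thread, and the figures are the commenter's hypothetical ones.

```python
def removal_is_worth_it(savings, delay_days, delay_cost_per_day):
    """Return True when the money saved by removing cruft exceeds the cost of the delay."""
    return savings > delay_days * delay_cost_per_day


# The commenter's numbers: 30,000 dollars saved at the cost of a 3-day delay.
# Dividing one by the other gives the daily delay cost at which removal stops paying off.
break_even_per_day = 30_000 / 3
```

With those numbers the break-even point is ten thousand dollars per day of delay, which is the threshold the comment states.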

      • Re: (Score:3, Interesting)

        by Anonymous Coward

        Good analogy.

        But Libreoffice still uses Java. I don't see that fitting into your analogy, because the Java dependency really has to be removed. It was put there only because OpenOffice was in the hands of Sun. Now Java is in the hands of Oracle. The Java dependency has to go.

    • Except building a house is a one-time endeavor. A better analogy would be to build a house on the foundation of another house, then remodel it a dozen times and add on a few additions. Soon thereafter, divorce your spouse and find another to guide the new additions, then find another site to move that house to, hopefully without having to deal with that awful foundation.

      • by Grishnakh (216268) on Saturday January 14, 2012 @02:32AM (#38695522)

        No, a better analogy is to build a house (full of extra materials as the parent said), and then use a giant replicator machine to mass-produce the house, almost instantly, and create thousands and thousands of new homes using that house as the basis. The wasted material in the one house is bad, but not that bad because it's one house, and it takes extra time and labor to do it more efficiently. But multiplied across thousands of identical copies, that wasted material adds up a lot. Plus, it's inefficient and you could have a better-performing house by doing a better job with the small details (better at energy efficiency for example). The slight increase in energy efficiency with that one house, realized by spending a bunch of extra time and effort removing wasted materials and doing a better job with various small details (like making sure the house wrap is applied extremely well rather than being hurried and missing some staples in important places), won't amount to much with just the one house. However, multiplied across many thousands of houses, those energy savings add up to a lot.

        The fact that software is easily and quickly replicated with perfect precision and little or no effort or time really makes it hard to make good analogies for it without resorting to Star Trek-style replicators; it's the only technology we have that's like that. And because it can be and is copied so easily, very different dynamics apply to it than to many other fields of endeavor.

      • Re: (Score:3, Interesting)

        by rgmoore (133276)

        Building a house isn't a one-time endeavor. Much like code, houses are never 100% finished. They're frequently repaired and less frequently remodeled, renovated, or expanded. If you look at photographs of the same house over the span of a century or more, it's sometimes hard to believe that the final version is the same building as the original. And when people work on their houses, they usually go for the most cost effective approach, even if that means leaving no longer used stuff in place because it

    • by Belial6 (794905) on Saturday January 14, 2012 @12:50AM (#38695188)
      If it is a house that I have owned, it would be Pez dispensers. Whenever we do remodeling, we make a point to slip a Pez dispenser into the walls. My wife and I figure that some day a long time in the future, someone will have a mildly amusing story.
    • by theNAM666 (179776) on Saturday January 14, 2012 @01:52AM (#38695410)

      If you have newspaper or other similar material in your walls, which wasn't processed and designed as insulating filler, I have one word for you: mold.

      You'd better know.

    • by dokc (1562391)
      A lot of European cities were rebuilt on the same place over and over again, after wars, natural catastrophes, crazy rulers,...
      There are layers and layers of building foundations and walls under the modern (and especially not so modern) buildings. It's not uncommon for construction to be stopped so that excavations can take place.
    • You've probably got a dead cat. Sometimes they get stuck between the layers. In old houses they were sometimes put there intentionally as protection against witches.

  • by kvvbassboy (2010962) on Friday January 13, 2012 @11:01PM (#38694694)

    Nice alliteration.

  • I'd bet there is. (Score:5, Informative)

    by JGuru42 (140509) on Friday January 13, 2012 @11:01PM (#38694698)

    It would not be very surprising to see a lot of dead code.

    I maintain the code for MoreTerra, a Terraria map editor program and I'm pretty sure I've got dead code in there and that's a pretty small project.

    With a large number of people working on the code it likely ends up slowly clogging up as no one quite knows what the others are doing.

    Dare I ask what type of dead code exists in something extra huge, but closed source, like the Windows code base or for MS Office? But I'd bet for all MS's faults that the code for Norton Antivirus is 10x worse.

    • Re:I'd bet there is. (Score:5, Interesting)

      by hedwards (940851) on Friday January 13, 2012 @11:31PM (#38694840)

      I'm mostly surprised that they're still getting performance improvements. It seems like they've done more over the last year than Sun did during the entire time it owned the project to unbloat it.

    • Re:I'd bet there is. (Score:4, Informative)

      by Anonymous Coward on Saturday January 14, 2012 @01:36AM (#38695368)

      Dead code is a result of OOP development (C++/Java)

      Because a lot of OOP code has extra methods used to circumvent or enforce memory protection (private/public) with variables inside classes. Sometimes the methods are created in anticipation that they will be used, but all the code amounts to is a getVARname or setVARname repeated dozens or hundreds of times, when something like get(varname) and set(varname) would be more efficient. In C you don't have this problem because memory protection basically doesn't exist unless you roll your own.

      • Re: (Score:3, Interesting)

        by Anonymous Coward

        I don't think they are talking about unused mutators. If anything Object Oriented programming makes it much easier to find and get rid of unused code BECAUSE of the data protection it implies. Having the code segmented and modular in different classes would make it worlds easier to find and remove dead code at all stages of development.

        But really, C, with their fancy "structs" and "flow control" just leads to unnecessary cruft, we should just stick with ASM and Goto, b/c that's way more maintainable ;)

        • Re:I'd bet there is. (Score:4, Interesting)

          by Anonymous Coward on Saturday January 14, 2012 @03:31AM (#38695746)

          Just to be Snarky, I'll point out that the Glasgow Haskell Compiler politely informs me whenever it finds a dead function. Functional languages are light years ahead of anything else when it comes to the Compiler actually being able to reason about the code it's compiling.

          • Re:I'd bet there is. (Score:5, Informative)

            by Stevecrox (962208) on Saturday January 14, 2012 @08:07AM (#38696468) Journal
            The Eclipse Java Compiler can indicate a warning if a private function is never called, the Eclipse Code Compiler and Findbugs will both throw warnings if an area of code is unreachable. Findbugs is able to detect if a variable is declared but never used (dead store) and will throw a warning. Lastly CPD (a part of PMD) is able to look for identical code blocks allowing you merge duplicate functions.

            Sure that doesn't cover public functions but I don't think there is harm in unused getters and setters and it's easy to find if a function is called through tools found in Eclipse. Just because Java developers don't use these tools doesn't mean they don't exist.
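The kind of check those tools perform can be illustrated with a minimal sketch. It is written in Python rather than Java for brevity, uses only the standard `ast` module, and is not how Eclipse or FindBugs work internally. The function name `find_uncalled_functions` and the `SAMPLE` source are made up for this example.

```python
import ast


def find_uncalled_functions(source):
    """Return names of functions defined in `source` that are never referenced in it."""
    tree = ast.parse(source)
    defined = set()
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            # Bare references such as helper() or callback=helper.
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            # Attribute references such as obj.helper().
            referenced.add(node.attr)
    return defined - referenced


# Hypothetical module text used only to exercise the sketch.
SAMPLE = '''
def used():
    return 1

def unused():
    return 2

print(used())
'''

print(find_uncalled_functions(SAMPLE))
```

As with the private-function warning described above, this only sees one file. A function that other modules call, or that is reached through reflection, would be reported as unused even though it is not.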
  • Automate it (Score:4, Insightful)

    by wisnoskij (1206448) on Friday January 13, 2012 @11:06PM (#38694720) Homepage

    Sounds like they already put a lot of work into this, but someone should tell them that you can automate things like removing unused code.

    • Re:Automate it (Score:4, Insightful)

      by rgmoore (133276) <glandauer@charter.net> on Saturday January 14, 2012 @03:00AM (#38695628) Homepage

      I'm pretty sure that they don't want to automate it. One of the first things Libre Office did after they forked from OO.o was to come up with a list of "easy hacks" for people who wanted to get involved but didn't know where to start. That includes stuff like dead code removal and translating comments from German to English. By leaving that stuff marked out but undone, they hope to ease new people into the project. That may not be the most efficient way of doing this kind of thing, but if it helps to recruit new developers it will do a lot more for the project in the long run than just getting rid of the cruft. It's a big difference between a project run by paid coders on a tight budget and one that depends on a variable number of volunteers.

    • Re:Automate it (Score:4, Interesting)

      by AmiMoJo (196126) <mojo@NOspaM.world3.net> on Saturday January 14, 2012 @04:51AM (#38695938) Homepage

      Yeah but they don't help much. The compiler will kill off any code that really isn't used so to make noticeable performance improvements you have to do stuff at the architectural level. Maybe someone wrote a function but then later there were performance issues and it was replaced in some code but not elsewhere. Now you have two functions doing the same thing but the compiler and automated tools can't really tell that. The other classic one is where you have features that are no longer used or no longer make sense but are still possible to invoke, and again you need to work at the architectural level for that stuff.
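The gap described here can be made concrete with a rough sketch, in Python for brevity. Exact copies are the easy case, which is what copy-paste detectors such as the CPD tool mentioned elsewhere in the thread look for. Two functions that compute the same thing in different ways are not caught. The names `find_identical_bodies` and `SAMPLE` are invented for this illustration.

```python
import ast
from collections import defaultdict


def find_identical_bodies(source):
    """Group names of functions whose parameters and bodies parse to the same syntax tree."""
    groups = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            # ast.dump omits line numbers by default, so copies at different
            # positions in the file produce the same key.
            key = ast.dump(node.args) + "".join(ast.dump(stmt) for stmt in node.body)
            groups[key].append(node.name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical module: total_a and total_b are copies; total_c does the
# same job with a hand-written loop.
SAMPLE = '''
def total_a(xs):
    return sum(xs)

def total_b(xs):
    return sum(xs)

def total_c(xs):
    t = 0
    for x in xs:
        t += x
    return t
'''

duplicates = find_identical_bodies(SAMPLE)
```

Here `duplicates` contains only the `total_a`/`total_b` pair. `total_c` is functionally equivalent but structurally different, so a purely syntactic comparison misses it, which is the situation the comment says needs a human looking at the architecture.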

  • oooh yes (Score:4, Interesting)

    by Mr2cents (323101) on Friday January 13, 2012 @11:09PM (#38694744)

    I've been working on a project where there were 3 separate wrappers around a database, each returning different objects containing the same data... So you had to convert those each time two modules using different wrappers needed to communicate. I tried to clean it up a bit, but eventually I stopped because my manager was frowning upon that because "I broke working code". Also there were parts that I didn't know if they were still in use. I also ran a profiler and found 80% of the functions never got called. That doesn't mean it's dead code of course, but looking at the function names I got an eerie feeling with a bunch of them. Anyway, I learned a lot about how not to manage software, I quit the company since then and I can only hope things have changed over there.
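A profiler-style measurement like the one described can be sketched in a few lines. This is a Python illustration using the standard `sys.setprofile` hook, not the tool the commenter used. The functions `hot`, `cold`, and `main` are placeholders.

```python
import sys


def record_called_functions(entry_point):
    """Run entry_point() and return the names of Python functions called during the run."""
    called = set()

    def profiler(frame, event, arg):
        # "call" fires when a Python-level function is entered.
        if event == "call":
            called.add(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        entry_point()
    finally:
        # Always remove the hook, even if the entry point raises.
        sys.setprofile(None)
    return called


def hot():
    return 1


def cold():
    return 2


def main():
    hot()


called = record_called_functions(main)
never_called = {"hot", "cold", "main"} - called
```

The caveat from the comment applies directly: `cold` shows up in `never_called` only because this particular run never exercised it. A different input or a rarely used feature could still reach it, so the result is a list of suspects rather than proof of dead code.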

    • Re:oooh yes (Score:5, Informative)

      by zidium (2550286) on Saturday January 14, 2012 @12:17AM (#38695026) Homepage

      Mr2Cents,

      Your actions are indicative of a person who is not yet truly a craftsman of the software engineering trade.

      Speaking from personal experience dealing with huge, complex, unmaintainable PHP legacy systems for the last ten years, let me tell you a far better path:

      1. Search the code base for what may be directly calling the code.
      2. Set debug breakpoints at the start of each piece of cruft code and rigorously test the app.
      3. Create a custom exception (e.g. CrapCodeHitException) and throw it at the beginning of each code segment you want to remove. If you don't hit any of the exceptions after, say, a week of normal browsing doing other things, plus testing, then proceed to step 4.
      4. Catch the CrapCodeHitExceptions at the highest level you dare, log this into a separate log file you will have permission to read. Commit the code into a releaseable branch so that it ends up on your QA and staging servers.
      5. Get approval to have the logging code be pushed to staging. Add comments above each cruft piece of code stating a) the level of risk you think if it is removed, b) when one should feel free to remove it (pick increments like 3 mo, 6 mo, 1 yr, based on risk), c) your name. If shit hits the fan cuz of removal, you want to man up and accept responsibility so your peers don't waste precious cycles needlessly troubleshooting why this "perfectly fine" code was seemingly arbitrarily removed.
      6. After each time of your comments has elapsed, if the code was never triggered (parse the logs!), feel free to remove it. Please leave a note behind that you removed such and such, tho, and stick your name on it. Remove these notes after a year.

      I've personally cleaned up 100,000s lines of code using this mechanism on several large and complex sites, without a single failure.
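Steps 3 and 4 of that list can be sketched as follows. The comment is about PHP; this is a Python approximation, and everything except the `CrapCodeHitException` name (the decorator, the logger name, `legacy_export`, `handle_request`) is hypothetical.

```python
import functools
import logging

# Separate logger so hits can be routed to their own log file (step 4).
tripwire_log = logging.getLogger("crap_code_hits")


class CrapCodeHitException(Exception):
    """Raised when code believed to be dead is executed."""


def suspected_dead(func):
    """Step 3: raise at the start of a function that is a candidate for removal."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        raise CrapCodeHitException(func.__name__)
    return wrapper


@suspected_dead
def legacy_export():
    return "old report"


def handle_request(action):
    """Step 4: catch the tripwire at a high level and log the hit."""
    try:
        return action()
    except CrapCodeHitException as hit:
        tripwire_log.warning("suspected-dead code was reached: %s", hit)
        return None
```

If the log stays empty for the waiting period chosen in step 5, the decorated function becomes a candidate for deletion in step 6. Note that in this sketch a hit aborts the request rather than running the old code, which matches throwing at the start of the segment as the comment describes.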

  • by Rubinstien (6077) on Friday January 13, 2012 @11:15PM (#38694768)

    ...lots of stuff is left lying about which might not be used any longer on the off chance that it might be adapted to some future purpose. Sounds like genetics.

    • Don't leave code around that is no longer needed. Just don't. I've deleted heaps of code in my career that was obviously unreachable. And remember that all of your history should be in source control, so the code isn't permanently dead anyway. If you *really* need to look at how a method used to work you can check out an old version of the application and look. And that is a fairly rare occurrence anyway. It is *far* more important to make sure the current version of the code is readable.
      • by Rubinstien (6077)

        I've deleted heaps of code too, but I seldom remove a still-functional API, even if nothing is currently using it. Generally add a comment to that effect, though. I've been grateful in the past when someone prior to me decided to keep something that was no longer in use but still potentially useful: http://slashdot.org/comments.pl?sid=1445528&cid=30120820 [slashdot.org] . If the code has unit test stubs I try to keep unused API functional and testable as well. It generally takes little effort to do so.

        Agree as wel

  • by Kenja (541830) on Friday January 13, 2012 @11:18PM (#38694778)
    so most people won't notice a new build.
    • That's a good point - why the heck isn't there an update script or something on Windows & Macs? I get you don't want to push nightly builds but major point releases couldn't hurt. I think my in-laws are still on OO 2.3...
  • by RelaxedTension (914174) on Friday January 13, 2012 @11:19PM (#38694786)

    Can you imagine this in any other engineering discipline? Oh yes, we built the bridge but there are a few hundred unnecessary iron girders that we forgot to remove...

    Those would be perfectly valid if upon discovering your girder was 3 inches too short you could instantly create a copy of it, set the original aside, then alter and test that copy of the girder. Then you might leave a few extras lying around.

    • Those would be perfectly valid if upon discovering your girder was 3 inches too short you could instantly create a copy of it, set the original aside, then alter and test that copy of the girder. Then you might leave a few extras lying around.

      Only in a lab environment. If the bridge was taken out of the lab, and placed on a busy river, then even if you wanted to alter the girders, the traffic wouldn't allow it. You'd have to wait until midnight on Christmas Eve to fix not just the girder but make al

  • Bad examples (Score:5, Insightful)

    by intx13 (808988) on Friday January 13, 2012 @11:22PM (#38694806) Homepage

    Bridges often have unused structural elements: walk-ways made unsafe by modern traffic levels, maintenance accesses unused for safety reasons, supports made redundant beyond the factor of safety by bridge improvements, etc. Chips and boards too: FPGAs with 10% utilization, chip designs re-purposed with functional components disabled, subsystems replaced in boards by new designers not confident enough to remove the old design, etc.

    Cruft in software is more often removed because (1) software has a potentially longer lifetime than hardware and (2) it's a lot easier to remove an uncalled function from a program than a girder from a bridge! Software cleanup should be an expected and planned part of a project's life cycle.

    • Re:Bad examples (Score:4, Interesting)

      by Ethanol-fueled (1125189) on Friday January 13, 2012 @11:40PM (#38694882) Homepage Journal

      ...subsystems replaced in boards by new designers not confident enough to remove the old design, etc.

      It sounds crazy, but I work with a real-life example, a beamforming [wikipedia.org] circuit board that utilizes a certain technique, but has all the legacy components utilizing another technique that was never even implemented!

      In that case, it wasn't a matter of confidence, but probably corporate sloth - engineers are expensive, and so they figure that paying the board-house more for the extra components per board would be cheaper than getting an engineer to redesign the board.

      • Re:Bad examples (Score:4, Insightful)

        by kbielefe (606566) <karl.bielefeldt+ ... m ['ail' in gap]> on Saturday January 14, 2012 @12:54AM (#38695206)

        It's not crazy. A major board redesign will set a schedule back three months or more, so if you have two options and aren't sure which one will work, it's not uncommon to design for both if you have the room. Maybe you're evaluating two vendors. There are also usually components that are only used during development. Sometimes there's an experimental or premium feature that requires an extra chip, but you don't want to make two boards. Of course, most of the time unused components get left off in mass production, but developer's boards or ones from prototype runs might still have them.

  • by Forbman (794277) on Friday January 13, 2012 @11:35PM (#38694862)

    Oh yes, we built the bridge but there are a few hundred unnecessary iron girders that we forgot to remove...
    Well, look at bridges built in the 1800's compared to the ones today. Would we build a modern bridge today using wrought iron links http://en.wikipedia.org/wiki/Clifton_Suspension_Bridge [wikipedia.org]? Each building made in a certain period in a way represents a degree of refinement compared to its predecessors. Better materials, better methods. Buildings in general cannot be "cleaned up" the way code can, where "cruft" today was yesterday's conservative design.

    Read a book about the differences in the construction of the World Trade Centers versus the Empire State Building, for example (the WTC has sibling buildings still around using the same techniques, such as the Aon [nee Amoco] Building in Chicago)...

    • by epyT-R (613989)

      or, today's conservative design is yesterday's+cruft added by redundant bloat in managed runtimes that're already available from the host OS. this happens when programmers graduate and take their professors' ivory tower vacuum 'everything should be portable/who cares about performance/efficiency computers are so fast anyway' mentality with them to their employers.

  • by tragedy (27079) on Friday January 13, 2012 @11:46PM (#38694910)

    The quoted section in the summary asks if we could imagine this in other engineering disciplines. As the rest of the summary points out, it happens all the time in microchips. It also happens a lot in civil engineering, including bridge building. Removing things takes work. Unless there's work to be saved by doing it, or some way to profit from selling what's removed as scrap, or it's a safety issue to leave it, most engineers won't remove old parts of a structure. Consider underground pipes. How often are they removed when they're replaced? If the new ones are being laid down where the old ones went, they'll be replaced. Otherwise, 90% of the time they'll just leave the old ones there. Same goes for just about everything. Old installations of any kind are full of stuff that no longer serves any purpose. Brackets and supports for heavy equipment that isn't used anymore, old wiring and panels, concrete slabs that some mystery object used to sit on, etc. When was the last time you saw anyone take away some 30 ton piece of equipment then pay more money to have the floor where it used to sit un-reinforced? Now, sometimes they do. Usually it's when the place is being sold and the new owners are re-modelling. Other times the owners do decide to do a major cleanup. That's exactly what's being done here with libreoffice. Makes it no different than any other engineering discipline then.

    Incidentally, if it's truly "dead" code, then it shouldn't actually be compiled, so it's not like the bridge engineer left in a bunch of extra girders, it's more like he's keeping addendum 6-c to revision 12b of the plans for section 3 in the same file cabinet as revision 13 rather than shifting it to a storage box and warehousing it.

    • by deodiaus2 (980169)
      I once inherited a motor design which I was modifying. There was a strut going along the entire length. I was thinking of removing it, as I could not find a use for it. However, I wasn't sure if there might not be a manufacturing reason for it, or to provide additional structural support..
      So I left it in.
      Why? Because 10 years ago, I was fired because I removed some venting slots on another generator. I removed them because they only provided minimal ventilation, but really interfered with the magn
    • by Anonymous Coward on Saturday January 14, 2012 @12:15AM (#38695018)

      A nice engineering example is the stone pylons at each end of the Sydney Harbour Bridge. They were built to support the cranes that were used in constructing the steel arch of the bridge. Since the bridge's completion they've served no structural purpose whatsoever.

      As the parent poster suggests, it would have cost time and money to remove them. However, in bridge building they plan around that - a bit of extra effort was put in at the start and the pylons were designed and built in such a way that they looked good after the bridge was finished. They were left in place as a feature of the completed structure and, as they were built in sandstone, they do a reasonably good job of making the bridge work visually with the feel of the historic precinct beneath the southern end of it.

      Dead code rarely adds anything to the aesthetics of software.

    • by thegarbz (1787294)

      This happens in all facets of engineering, not just civil structures. Go into any oil refinery that's more than 10 years old and you may find whole units left in place filled with nitrogen as it's too costly to remove.

      The only time unused stuff is cleaned up in the engineering world is if the realestate becomes valuable, or if it is mandated to by some safety / environmental code.

      • Re: (Score:3, Informative)

        by FienX (463880)

        I've worked on enterprise asset management systems in a number of different industries including electrical utilities, natural gas pipelines, and military. In almost every company they've had some variant of an "abandoned in place" asset status. In cases like power plants, trying to remove a single cable from a series of cable trays or raceways is rarely, if ever, worth the effort and risk. Some cable trays have dozens of cables (and I'm not talking cat5) in them, sometimes half of which are "dead" but r

        • by thegarbz (1787294)

          I know what you mean. The plant I work at currently has had a method of cut and cap for old decommissioned / damaged and irreparable cables. They were historically buried in the ground for some 500 motors on site.

          Now we have more failing cables and the ground is full. Actually not only is the ground full but so are the switchroom entries. The engineering behind the infrastructure going in to run cables (some of them as big as 240mm^2) is amazing. Cable trays are no longer installed by electrical contractors b

    • by Nethead (1563) <joe@nethead.com> on Saturday January 14, 2012 @02:15AM (#38695472) Homepage Journal

      Ever been in the telco closets of a 50 year old office building? Old 9600 baud modems still powered up and connected to 66 blocks, old DS0 smartjacks with red lights, all next to Cat5 and fiber cross connects.

      Look above the drop ceiling of an old department store sometime and gander at all the serial cable wire that is covered by the Token Ring wire covered by the 10base5 wire that is covered by the ThinNet wire that is covered by the Cat5 wire that is covered by the fiber ducts. All that tangled in with the old 25 pair telco wire.

      If it's not on your work order, you don't touch it.

      • by Anonymous Coward on Saturday January 14, 2012 @08:44AM (#38696598)

        We are currently preparing to move to another office building in another town.

        Our old premises are just like you describe. Built in 1964.
        Not counting anything else: There are 600 kilometers of Cat5e cabling in the building-complex. (I was involved in the Coax -> Cat5e upgrade back in '97. Still remember some details.)

        Actually it is a good thing.
        The recycle value of the wiring is more than the value of the building and the grounds.

        With today's fucked up real-estate market for office buildings we couldn't sell the old location to anybody.
        But because of the recycle value we had no trouble selling the entire site to a demolition company.

        Win-win for everybody:
        We get money for what is basically an unsellable building-complex.
        They will break it down and recycle it. Gives them about a year of guaranteed work and a reasonable profit when due to the recession their business is at an all time low.
        Afterwards they will sell the cleared grounds to the City Council who are desperate to get a reasonably priced area of land at the edge of the city-center by 2014 so they can build a new City hospital. (The current hospital is on land not owned by the City. The contracts end in 2016 and can not be renewed because that would require a re-check of the environmental status of the grounds and everybody knows there is pollution there. That wasn't an issue in 1986 when the old hospital was built, but new legislation passed in 2008 making it an issue now.)
         

  • by Freddybear (1805256) on Friday January 13, 2012 @11:57PM (#38694956)

    Human DNA (and just about every other species as well) is full of things like inactive duplicate genes (some with slight alterations), pieces of old retroviruses, and other mutations and replication errors that have been "commented out". Plus a whole lot of sequences which we don't know what they're good for yet.

  • From summary:

    Exciting to think that LibreOffice has quite so much room for improvement

    Sorry but removing the dead code is not really "room for improvement". It's rather "fixing code that lack(ed) proper project management".

  • 80286 microcode (Score:2, Interesting)

    by Anonymous Coward

    Intel's microcode to support 16-bit protected mode became obsolete as soon as the 80386 was released, but they had to support it for backward compatibility, in case someone tried to install Windows 3.0 on an IBM AT clone, for instance. Probably that microcode has been carried forward ever since. Also, there are a lot of CISC instructions such as SCAS* with the "REP" prefix which were heavily used in assembly language in the eighties, but which are now deprecated and typically slower than the RISC-style re

    • The only people bashing CISC are sheltered academics who have never been out in the real world or have heard of benchmarks. On my Core i7, for example, the benchmarks [sourceforge.net] show that string instructions usually are the fastest way to do things.

      A basic rep stos yields 25 GBps. So does the fastest recommended SSE method using movntps and prefetch. A RISC-style fill loop only manages 12 GBps. You can improve its performance by unrolling the loop four times, getting the same 25 GBps as a string instruction with a lot

  • by junglebeast (1497399) on Saturday January 14, 2012 @04:45AM (#38695920)

    "...and it probably isn't because the code base is especially bad."

    Yes, it does mean that. Thousands of lines of deprecated code implies it was written in a sloppy and disorganized way and this is not only an indicator but, I would argue, a definition, of a terrible code base.
