
Performance Bugs, 'the Dark Matter of Programming Bugs', Are Out There Lurking and Unseen (forwardscattering.org) 68

Posted by msmash from the dark-matter-of-things dept.
Several Slashdot readers have shared an article by programmer Nicholas Chapman, who talks about a class of bugs that he calls "performance bugs". From the article: A performance bug is when the code computes the correct result, but runs slower than it should due to a programming mistake. The nefarious thing about performance bugs is that the user may never know they are there -- the program appears to work correctly, carrying out the correct operations, showing the right thing on the screen or printing the right text. It just does it a bit more slowly than it should have. It takes an experienced programmer, with a reasonably accurate mental model of the problem and the correct solution, to know how fast the operation should have been performed, and hence if the program is running slower than it should be. I started documenting a few of the performance bugs I came across a few months ago, for example (on some platforms) the insert method of std::map is roughly 7 times slower than it should be, std::map::count() is about twice as slow as it should be, std::map::find() is 15% slower than it should be, aligned malloc is a lot slower than it should be in VS2015.


  • It's stupid to call them "the dark matter of programming bugs". We were just accustomed to this being the way Microsoft did things, not a bug, a feature.

    That stems from Microsoft, originally writing for IBM, being paid per thousand lines of code. As such it made sense that software was not written efficiently because the programmer was not rewarded for efficiency, it merely had to fit within available memory. Unfortunately it seems that this practice has not stopped given the sheer size of Microsoft oper

  • If performance sucks, buy a faster computer. Speed covers a multitude of sins.

    • In many simple situations, yes.

      But there are plenty of cases where you operate at such a large scale that spending programmer time on optimizing performance is a good tradeoff. For example, in one customer case a 1% total increase in efficiency maps to a saving of 5000+ euro/month in the costs they pay to host the solution; on a yearly basis that buys quite a few programmer days of optimization.

  • Shhh...don't tell him about scripting languages... (Score:4, Insightful)

    by xxxJonBoyxxx ( 565205 ) on Wednesday March 22, 2017 @12:18PM (#54088683)
    >> that the user may never know they are there

    They will if they try to run a lot of them on a machine with finite resources, like a phone. Or it's a process that's iterated frequently, like a "big data" operation. But if the end user STILL doesn't notice it...then it's hard to call it a bug.

    On the other hand, the performance/just-get-er-done trade-off is well known to programmers of all stripes. (At least I hope it is - are people really finding new value in the article?) There's the quick and dirty way (e.g., a script), and then there's the "I can at least debug it" way (e.g., a program developed in an IDE), and then there's the optimized way, where you're actually seeing if key sections of code (again, especially the iterated loops), are going as fast as possible. Generally your time/cost goes up as your optimization increases, which becomes part of the overall business decision: should I invest for maximum speed, maximum functionality, maximum quality, etc.
  • Is it a problem, if no one notices? Most code can be optimized more than it is, but if everyone's happy with how it works, what's the problem?

  • "A little bit" (Score:1)

    by Anonymous Coward

    Like windows update doing something in N*N instead of N?

    Yeah it's a little bit slower... At the beginning.

    Programmers need to always keep in mind that their software is always going to be used, by some people, for a longer duration than intended, and with larger datasets than expected.
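
To make the N*N versus N point concrete, here is a minimal C++ sketch. The duplicate-removal task and both function names are illustrative assumptions; nothing here is taken from Windows Update itself. Both functions return the same result, so a correctness test cannot tell them apart, but the first one degrades badly as the input grows.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical task: remove duplicates while keeping first-seen order.

// Quadratic version: every element triggers a linear scan of the output so far.
std::vector<int> dedupe_quadratic(const std::vector<int>& input) {
    std::vector<int> out;
    for (int value : input) {
        bool seen = false;
        for (int kept : out) {            // O(N) scan inside an O(N) loop
            if (kept == value) { seen = true; break; }
        }
        if (!seen) out.push_back(value);
    }
    return out;
}

// Expected-linear version: a hash set remembers which values were already kept.
std::vector<int> dedupe_linear(const std::vector<int>& input) {
    std::vector<int> out;
    std::unordered_set<int> seen;
    seen.reserve(input.size());
    for (int value : input) {
        if (seen.insert(value).second) {  // .second is true only for new values
            out.push_back(value);
        }
    }
    assert(out.size() == seen.size());    // one output element per distinct value
    return out;
}
```

With a few hundred elements the two are hard to tell apart; the difference only shows up with the "larger datasets than expected" mentioned above.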

  • The first example: std::insert should do nothing if an item is already present. So calling std::insert only if the object is not yet present is 100% equivalent to calling std::insert directly, but it is 7 times slower.

    This violates the important principle that when using a library, the obvious way to do things should be the fastest. So hacks are required to make your code fast, and that shouldn't happen.

    I assume the explanation is probably that std::find is small enough to be inlined, while std::insert

    • Re: (Score:2)

      by pruss ( 246395 )

      Actually, the explanation in the article is that there is a memory allocation for a node done *before* checking whether the object is present. So if the object is present, there is a pointless memory allocation and deallocation done. Nothing to do with inlining, and an easy fix for the library: just swap the order of the check for presence and the memory allocation.
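
The workaround discussed in this subthread can be sketched roughly as follows. The helper names `insert_if_absent` and `insert_direct` are made up for illustration, and the key and value types are arbitrary; only `std::map::find` and `std::map::insert` are real library calls.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Workaround: look the key up first and only call insert() when it is absent,
// so a library that allocates a node before checking never gets the chance.
// Returns true if a new element was inserted.
bool insert_if_absent(std::map<int, std::string>& m, int key, const std::string& value) {
    if (m.find(key) != m.end()) {
        return false;                      // key present: skip insert() entirely
    }
    m.insert(std::make_pair(key, value));
    return true;
}

// The "obvious" call. Same observable behaviour: insert() leaves the map
// unchanged when the key already exists and reports that via .second.
bool insert_direct(std::map<int, std::string>& m, int key, const std::string& value) {
    return m.insert(std::make_pair(key, value)).second;
}
```

Note the trade-off: when the key is absent, the workaround walks the tree twice (once in find(), once in insert()), so on a library without the allocate-first behaviour it may well be the slower of the two. Measuring on the target platform is the only way to know.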

  • It is a losing battle to try and solve performance in the programmer space. The compiler does a much better job of optimization thanks to a multitude of compiler tricks, including both static and dynamic analysis, cache analysis and so on. The programmer trying to write the most efficient code should rather spend his/her time trying to use out-of-the-box algos as far as possible, as the compiler knows how to fine-tune those. Next they should run a profiling tool like jprofiler and see where the job is actually sp

    • Re: (Score:2)

      by guruevi ( 827432 )

      The article is talking about Visual Studio, possibly the worst compiler in the world. There isn't much optimization going on there.

  • Two Solutions (Score:4, Insightful)

    by UnknownSoldier ( 67820 ) on Wednesday March 22, 2017 @12:27PM (#54088789)

    Programmers love to use the cop-out

    "Premature Optimization is the root of evil"

    dogma which is complete bullshit. It tells me your mindset is:

    "Oh, we'll "fix" performance issue later."

    Except later never comes. /Oblg. Murphy's Computer Law: [imgur.com]

    * There is never time to do it right, but there is always time to do it over.

    As Fred Brooks said in Mythical Man-Month.

    "Show me your flowcharts and conceal your tables, and I shall continue to be mystified.
      Show me your tables, and I won't usually need your flowcharts; they'll be obvious."
    -- Fred Brooks

    Which can be translated into the modern vernacular as:

    * Show me your code and I'll wonder what your data structures are,
    * Show me your data and I'll already know what your code is

    There are 2 solutions to this problem of crappy library code.

    1. You are benchmarking your code, ALONG THE WAY, right?

    Most projects "tack-on" optimization when the project is almost completed. This is completely BACKWARDS. How do you know which functions are the performance hogs when you have thousands to inspect?

    It is FAR simpler to be constantly monitoring performance from day one. Every time new functionality is added, you measure. "Oh look, our startup time went from 5 seconds to 50 seconds -- what the hell was just added?"

    NOT: "Oh, we're about to ship in a month, and our startup time is 50 seconds. Where do we even begin in tracking down thousands of calls and data structures?"

    I come from a real-time graphics background -- aka games. Every new project our skeleton code runs at 120 frames per second. Then as you slowly add functionality you can tell _instantly_ when the framerate is going down. Oh look, Bob's latest commit is having some negative performance side effects. Let's make sure that code is well designed, and clean BEFORE it becomes a problem down the road and everyone forgets about it.

    2. You have a _baseline_ to compare against? Let's pretend you come up with a hashing algorithm, and you want to know how fast it is. The *proper* way is to

    * First benchmark how fast you can slurp data from a disk, say 10 GB of data. You will never be FASTER than this! 100% IO bound, 0% CPU bound.
    * Then, add a single-threaded benchmark where you just sum bytes.
    * Maybe, you add a multi-threaded version
    * Then you measure _your_ spiffy new function.
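
The baseline steps above can be sketched in C++ roughly like this. It is a simplified, in-memory version under stated assumptions: the disk-read and multi-threaded steps are left out, 64-bit FNV-1a stands in for "your spiffy new function", and the names `sum_bytes`, `fnv1a` and `seconds_for` are invented for the example.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <numeric>
#include <vector>

// Baseline: touch every byte exactly once. A hash over the same buffer
// has to read the same bytes, so this is the number to compare against.
std::uint64_t sum_bytes(const std::vector<std::uint8_t>& data) {
    return std::accumulate(data.begin(), data.end(), std::uint64_t{0});
}

// Stand-in for the function under test: 64-bit FNV-1a (an assumption).
std::uint64_t fnv1a(const std::vector<std::uint8_t>& data) {
    std::uint64_t h = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (std::uint8_t byte : data) {
        h ^= byte;
        h *= 0x100000001b3ULL;                // FNV prime
    }
    return h;
}

// Times a single call of f(data) in seconds. The result is written to `out`
// so that it is actually used by the caller.
template <typename F>
double seconds_for(F f, const std::vector<std::uint8_t>& data, std::uint64_t& out) {
    const auto start = std::chrono::steady_clock::now();
    out = f(data);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

In use, you would fill one large buffer, call `seconds_for(sum_bytes, ...)` and `seconds_for(fnv1a, ...)` several times each, and look at the ratio rather than the absolute numbers. A single timed call is noisy, so treat this as a starting point, not a benchmark harness.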

    Library Vendors, such as Dinkumware who provide the CRTL (C-Run Time Library), _should_ be catching these shitty performance bugs, but sadly they don't. The only solution is to be proactive.

    The zeroth rule in programming is:

    * Don't Assume, Profile!

    Which is analogous to what carpenters say:

    * Measure Twice, Cut Once.

    But almost no one wants to MAKE the time to do it right the first time. You can either pay now, or pay later. Fix the potential problems NOW before they become HUGE problems later.

    And we end up in situations like this story.

  • Check the links, decent code and analysis. Short and simple. I recently found a very similar bug in both PHP and HHVM with their trim() function (and variants thereof). In both PHP and HHVM, trim() unconditionally allocates more memory, even if there is no white-space on either end of the string to trim. It is faster to write PHP code to check for white-space on both ends and then conditionally call trim() on a string.
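
The same "check first, then do the work" idea can be shown in C++ (the language of the article's examples). This is only an analogue of the PHP workaround, not PHP's or HHVM's implementation; `is_ws`, `needs_trim` and `trim_in_place` are hypothetical names, and the whitespace set is an assumption that does not exactly match PHP's default.

```cpp
#include <cassert>
#include <string>

// ASCII whitespace set assumed for this sketch.
bool is_ws(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f' || c == '\v';
}

// Cheap check: looks at the first and last character only.
bool needs_trim(const std::string& s) {
    return !s.empty() && (is_ws(s.front()) || is_ws(s.back()));
}

// Trims in place, and returns immediately when there is nothing to trim,
// so the common "already clean" case costs two character comparisons.
void trim_in_place(std::string& s) {
    if (!needs_trim(s)) return;
    const char* const ws = " \t\n\r\f\v";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) {  // the string is whitespace only
        s.clear();
        return;
    }
    const std::string::size_type last = s.find_last_not_of(ws);
    s.erase(last + 1);                 // drop trailing whitespace
    s.erase(0, first);                 // drop leading whitespace
}
```

The design point is that the fast path lives inside the function, so callers do not need to wrap every call in their own check.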

  • The programmer is writing crappy and unoptimized code.
  • It's a myth of the cult of unit-testing that all tests passed == no bugs. Correctness bugs outlive all executed tests, and are a worse form of dark matter because they give no outward signs.
  • I interact with a piece of code I don't maintain that has this issue. A piece of Visual Basic (don't get me started) that copies from a source to a destination. However, instead of using B=A to move the data, the author used copy(A) and paste(B). The data therefore goes through the clipboard, slowing the process down.

    The issue might not be noticeable on a small amount of data. However, I use this piece of code to move gigabytes of data every day.
  • Showed this to Netflix once and they stated 'This fixes everything we are currently having issues with'. Apparently the entire industry has implemented APIs in distributed architectures in such a way as to create architectural cross-cutting concerns... https://www.slideshare.net/bob... [slideshare.net]

  • We were required to install it on our Linux servers - we run CentOS (same as RH). Every few days, the stupid monitor is suddenly eating 99%-100% of the CPUs... for *hours*. Overnight.

    I attached strace to it, and it's in some insanely tight loop, looking at its own threads.

    Maybe if I prove that it's doing it on multiple servers (it is, but I have to catch it - nothing's reporting this, unless it runs the system so hard it throws heat-based machine checks), and put a ticket in, and *maybe* the team that forc

  • Being an old fart, in my day, I remember the worst performance problems were caused by programmers with their own badly written library of functions and objects that they included everywhere, most of those were from their very first weeks of being a programmer and they sucked badly.

  • And if you think the performance N.F.R. is dark, wait 'til you find out about security.

    NFR [wikipedia.org]

  • In 1973 or 1974 I was the systems programming manager at the National Academy of Sciences after our IBM mainframe had been upgraded to the first version of the OS supporting virtual storage. And many programs, mostly Fortran programs, were running much slower than they used to. The problem was two-dimensional arrays and how they were initialized and searched. If you're looping through the wrong subscript in a big array, you cause a page fault every time you increment it. Flash storage makes that a much smal
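
The loop-order problem described here can be sketched in C++. Note the assumption: C++ arrays are conventionally stored row-major, whereas Fortran is column-major, so the "wrong" subscript is the opposite one in Fortran. The `Matrix` type and the function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A rows x cols matrix in one contiguous, row-major buffer:
// element (r, c) lives at index r * cols + c.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c, double fill)
        : rows(r), cols(c), data(r * c, fill) {}

    double at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Friendly order: the inner loop walks adjacent memory.
double sum_row_order(const Matrix& m) {
    double total = 0.0;
    for (std::size_t r = 0; r < m.rows; ++r)
        for (std::size_t c = 0; c < m.cols; ++c)
            total += m.at(r, c);
    return total;
}

// Unfriendly order: the inner loop jumps `cols` elements per step, so on a
// large array consecutive accesses can land on different cache lines or pages.
double sum_column_order(const Matrix& m) {
    double total = 0.0;
    for (std::size_t c = 0; c < m.cols; ++c)
        for (std::size_t r = 0; r < m.rows; ++r)
            total += m.at(r, c);
    return total;
}
```

Both functions visit the same elements, which is what makes this a performance bug rather than a correctness bug: only a timing measurement on a large matrix reveals the difference.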
