Memory Checker Tools For C++? 398

Posted by kdawson
from the heaps-and-bounds dept.
An anonymous reader writes "These newfangled memory-managed languages like Java and C# leave an old C++ dev like me feeling like I am missing the love. Are there any good C++ tools out there that do really good memory validation and heap checking? I have used BoundsChecker but I was looking for something a little faster. For my problem I happen to need something that will work on Windows XP 64. It's a legacy app so I can't just use Boost's uber nifty shared_ptr. Thanks for any ideas."
This discussion has been archived. No new comments can be posted.

  • by DrXym (126579) on Wednesday June 06, 2007 @04:59AM (#19408549)
    I've played with Boundschecker, Purify (& Quantify) and Fortify. My experience of these tools is that they either take a painfully long time to run, throw up too many spurious warnings or crash outright after eating all available memory / disk space.

    They might be useful for small apps but if you have a massive app they are almost more trouble than they are worth.

    It's hard to say what you can do except foster safe coding practice and highlight the common pitfalls such as memory leaks, buffer overflows etc. Many compilers can help detect heap / memory overruns because the debug libs put guard bytes on the stack & heap that trigger exceptions when something bad happens. There are also 3rd party libs such as Boehm [hp.com] which help with memory leak / garbage collection issues and dump stats. I'd say using STL & Boost is also a very good way of minimizing errors, simply because doing so avoids having to write your own implementations of arrays, strings etc. which are bound to be less stable.

  • Re:Boost? Ugh (Score:2, Insightful)

    by kazade84 (1078337) on Wednesday June 06, 2007 @05:21AM (#19408659)
    If you wanna "get the job done" then boost is sometimes exactly what you need. I'm not saying that you should memorize the whole library but sometimes a boost library is available for what you want to do. I couldn't live without boost::lexical_cast, boost::shared_ptr, boost::tokenizer and boost::python is genius. You *could* write all of those things yourself but honestly, why?

    If you wanna code in C++ then you'd better get used to the "weird syntax" of templates and especially the boost libraries, they ARE the basis of most of the additions to the standard library in TR1 so they will become the "C++ norm"
  • Re:Boost? Ugh (Score:5, Insightful)

    by Viol8 (599362) on Wednesday June 06, 2007 @05:28AM (#19408693)
    "you'd better get used to the "weird syntax" of templates and especially the boost libraries"

    I'm used to template syntax (though I think it's ugly and Stroustrup could have done a lot better) but Boost makes it worse by overloading operators and then using them in ways never intended, producing syntax that a plain C++ coder wouldn't even recognise, never mind understand what it's doing, e.g. the gratuitous overload of () for matrix ops where a simple function call would have been much cleaner and easier to follow.
  • by gladish (982899) on Wednesday June 06, 2007 @05:31AM (#19408707)
    This is a result of the way that most people are developing code. They build some huge app and then finally realize that it doesn't work because it's riddled with defects and memory leaks. What should have been taking place is the creation of small unit tests which were then run under some runtime analyzer like Purify. With that said, I've used Purify, Insure++, and Valgrind. I found Insure++ to be the best. I will admit that the code runs much more slowly but I was amazed at the stuff that it found. I've never used the Windows version, but you can get it for Windows with dev studio integration.
  • by Anonymous Coward on Wednesday June 06, 2007 @05:33AM (#19408713)
    Purify has been considered the best solution for this problem for many years. (How many readers have their 'Purify Mug'?)

    Linux developers can use Valgrind, which is also very good and is free. But it won't run on your platform.

    Then there are the static checking tools like Coverity. I believe that they do great things, though I have never used them. If you are a big company I think it would be well worth getting them to talk to you; you would probably find it intellectually interesting, if nothing else. There are other tools in the same field; Wikipedia has a list.

    You may find that Purify is too slow. It has various options that you can tweak. It also benefits from having loads of RAM (steal it from your colleagues while they have lunch). But basically you need to live with the speed and either be patient or hack your application to go straight to the problematic bit.

    In my experience this sort of debugging is always painful, and the lesson it teaches us is to *not put the f***ing bugs in the code in the first place!* By that I mean:

    - Avoid dynamic memory allocation when possible (i.e. use std containers instead).
    - Every time you type 'new' or 'malloc', think to yourself "where does this get deleted/freed?"; ideally the call to delete or free should be a few lines away from the call to new or malloc and it should be blindingly obvious that they occur in pairs.
    - Be really clear about ownership of pointers.
    - Use smart pointers (like the Boost scoped_ptr and shared_ptr) when appropriate.
    - Avoid pointer arithmetic.
    - Don't use a NULL sentinel value.

    Every time you find yourself doing one of these "bad" things, try to remember your last epic all-night debug session with Purify and fix it....

    By following these sorts of practices, I have managed to avoid any nasty memory-allocated related bugs for a few years. But of course it doesn't help with your legacy codebase.
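
    The practices listed above can be sketched in a few lines. This is only an illustration, not code from the poster: Widget, sum_first_n, make_widget and live_after_scope are hypothetical names, and std::unique_ptr is used on the assumption that a C++11 compiler is available (it fills the role the post gives to Boost's scoped_ptr).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical type used only for illustration.
struct Widget {
    static int live;          // counts live instances so a leak would be visible
    Widget()  { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// A std container owns its storage: there is no new[] / delete[] pair to get wrong.
int sum_first_n(std::size_t n) {
    std::vector<int> values(n);
    std::iota(values.begin(), values.end(), 1);   // fills 1, 2, ..., n
    return std::accumulate(values.begin(), values.end(), 0);
}

// Ownership is explicit in the signature: the caller receives the only owner.
std::unique_ptr<Widget> make_widget() {
    return std::unique_ptr<Widget>(new Widget());
}

// Returns the number of live Widgets once the inner scope has ended.
int live_after_scope() {
    {
        std::unique_ptr<Widget> w = make_widget();
        assert(Widget::live == 1);
    }   // w is destroyed here, and so is the Widget it owns
    return Widget::live;
}
```

    The one remaining 'new' sits directly inside the smart pointer that will delete it, which is about as close as the pairing of allocation and release can get.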
  • Re:Boost? Ugh (Score:0, Insightful)

    by Anonymous Coward on Wednesday June 06, 2007 @05:42AM (#19408761)
    Overloading operators is exactly the norm in standard C++; it is one of the most ingrained principles with which C++ was designed. Creating DSELs (Domain Specific Embedded Languages) like boost::spirit (where you basically write EBNF syntax in C++ to generate parsers) was a key goal when C++ was developed, and still is.

    So you don't immediately know what a given operator does? So what, it does what it is supposed to do in this context.
    This helps to really raise the abstraction level of the language to solve problems of the problem domain, in a very efficient way.
  • by RatCommander (212548) on Wednesday June 06, 2007 @06:39AM (#19409015)
    Valgrind's default memcheck tool is an excellent way of finding memory errors - ranging from extremely subtle to obvious. In addition, Valgrind can be used as a code profiler, cache simulator and many other things. It really is an excellent tool - I recommend it to anybody writing C++.
  • by peterpi (585134) on Wednesday June 06, 2007 @07:16AM (#19409199)
    I love this comment, from Bjarne Stroustrup's home page (http://www.research.att.com/~bs/bs_faq2.html#memory-leaks [att.com])

    Q) How do I deal with memory leaks?

    A) By writing code that doesn't have any. (goes on to advocate vector & string)

    And also: C++ is my favorite garbage collected language because it generates so little garbage (http://www.research.att.com/~bs/bs_faq.html#really-say-that [att.com])

    Over the past 6 months or so, I've really made an effort to better my usage of C++ (using Effective C++, Effective STL and C++ Coding Standards). With a combination of STL, references, RAII, std::string and boost::shared_ptr, all of my memory, ownership & null-pointer problems just went away. I hardly ever actually write 'new' any more. The Java model of just leaking objects and hoping they'll get collected sooner or later seems horrible.

    But I'm not maintaining old code, so this is completely -1 Offtopic.
  • Re:Boost? Ugh (Score:3, Insightful)

    by maxwell demon (590494) on Wednesday June 06, 2007 @07:23AM (#19409229) Journal

    "I suppose you like adding vector components manually, instead of doing v1 + v2?"

    No , something like vectorAdd(v1,v2) would be a lot more readable and a damn site easier to grep for. Idiot.
    Then probably we should remove the operators for built-in types as well. After all, you could use functions like doubleAdd(a, b). As a bonus, you'd not get nasty surprises when mixing unsigned and signed integers. intGreater(a, n) would always give you the expected answer, even if a is negative and n is unsigned. If you'd want to compare in unsigned arithmetic, you'd use uintGreater(a, b) instead. And what dereferencePointer(p) does is self-evident, unlike *p. Also, removing all operators would greatly simplify the parser, because the only types of expressions it would have to parse would be constants, variables and function calls.

    But I just see you signed your post with "Idiot." Thus I guess I shouldn't have taken it seriously anyway. :-)
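
    The signed/unsigned surprise mentioned in passing above is real, whatever one thinks of the sarcasm around it. A minimal sketch: intGreater is modelled on the name imagined in the post, while builtinGreater and demo_sign_surprise are hypothetical names added for comparison.

```cpp
#include <cassert>

// What the built-in operator does: a is converted to unsigned before comparing.
bool builtinGreater(int a, unsigned n) {
    return a > n;   // many compilers can warn about this mixed comparison
}

// Hypothetical helper: compares by mathematical value instead.
bool intGreater(int a, unsigned n) {
    if (a < 0) return false;                 // a negative value is never greater
    return static_cast<unsigned>(a) > n;     // safe: a is non-negative here
}

void demo_sign_surprise() {
    assert(builtinGreater(-1, 1u));    // -1 wraps to a huge unsigned value
    assert(!intGreater(-1, 1u));       // the answer most readers expect
    assert(builtinGreater(5, 1u) && intGreater(5, 1u));
}
```

    Since C++20 the standard library offers std::cmp_greater in <utility> for exactly this kind of comparison, so the helper does not have to be written by hand on a recent compiler.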
  • Re:um (Score:4, Insightful)

    by Shadowlion (18254) on Wednesday June 06, 2007 @07:23AM (#19409237) Homepage
    > and works fine in legacy apps

    Regarding legacy applications, I think the point was that he can't go back through the app and rewrite everything to use smart_ptr.
  • by Anonymous Coward on Wednesday June 06, 2007 @07:51AM (#19409417)
    Most people can't understand Purify's output, and I've run across coders who actually believe their code can't be as bad as Purify says it is.

    For example, this code has serious issues:

    extern string method_that_returns_string_object();
    char *ptr;
          .
          .
          .
    ptr = method_that_returns_string_object();
          .
          .
          .
    That actually will compile, and seem to "work". But it's horribly wrong, and Purify will find the problems.
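
    A sketch of the hazard this snippet seems to describe, under stated assumptions: with std::string the assignment as posted would not compile, so the post presumably involved a string class with an implicit conversion to a pointer. The version below uses std::string and c_str() to show the same lifetime bug, a pointer into a temporary, next to a safe variant. The function body and the names safe_length and demo_safe_pointer are invented for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Stand-in for the post's method_that_returns_string_object(); the body is invented.
std::string method_that_returns_string_object() {
    return std::string("a string long enough that it will not fit in a small inline buffer");
}

// BROKEN (left as a comment so the sketch stays safe to run):
//   const char *ptr = method_that_returns_string_object().c_str();
//   The temporary string is destroyed at the end of that statement, so ptr
//   dangles and any later read through it is undefined behaviour.

// Safe variant: keep the string object alive for as long as the pointer is used.
std::size_t safe_length() {
    const std::string s = method_that_returns_string_object();
    const char *ptr = s.c_str();      // valid while s is alive and unmodified
    return std::strlen(ptr);
}

void demo_safe_pointer() {
    assert(safe_length() == method_that_returns_string_object().size());
}
```

    The broken form often appears to work because the freed memory has not been reused yet, which matches the "seems to work" description above and is the kind of access a runtime checker reports.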

    And FWIW, I've used Purify on massive apps, and found huge problems that the developers didn't even know were there. On one project, they couldn't explain why their "perfect" app kept crashing, either. Worse for them, I had been hired as a consultant to fix their problems that they couldn't seem to believe existed (HINT: your boss hired someone from the outside...), and after watching the team flail and spend literally almost a man-year trying to find one memory bug, I finally had enough of "advice giving" being ignored and got on their system, linked their app under Purify, ran it, and found the bug - a double delete of an object from two different threads. It all took me about fifteen minutes. I did that in front of their management. I made my point.

    Purify (and like tools) are a great help. Not using them is like trying to build a house without power tools. Yeah, it can be done. But what would you think if you hired a builder to make your house and his team showed up carrying hand saws? Oh, and you are paying that team to hand-saw all the lumber...

    What would you think of that builder?

    Yet, when a developer asks for tools like Purify, management often balks. Because 1) they're shortsighted, and 2) developers don't know how to use such tools.

    Like I said - what would you think of a construction company where the workers don't know how to use modern power tools to help their productivity?

    Well, you just put yourself in that category.

    Yes, Purify is somewhat slower than running without Purify. But it's a lot faster than most other full-memory checking methods. If you're worried about speed, link against the Win32 debug libraries - they'll at least show problems with double free() calls, access of free()'d and deleted objects, etc., and without much of a performance penalty.
  • Re:um (Score:3, Insightful)

    by Evanisincontrol (830057) on Wednesday June 06, 2007 @08:14AM (#19409595)
    I hope that was supposed to be sarcastic, otherwise you are the worst developer I've ever heard of. Rewriting an entire legacy application just to use shared pointers is downright stupid. He might as well just redesign the entire software and build it from the ground up... but then you're not "maintaining" anymore. You've completely redefined your job description.
  • by seniorcoder (586717) on Wednesday June 06, 2007 @08:19AM (#19409653)
    We are running many high speed financial message processing applications. A crash for any reason (including a leak) would be very costly for us.

    We pre-allocate pools of objects at startup and then re-use them. No other memory is allocated or freed while the process is running. Our pools of reusable objects are monitored very carefully as an object that isn't released back to its pool when the job is done is akin to a memory leak. Use of sentries to automatically release objects back to the pools when they fall out of scope is mandatory.

    So my answer to the problem is:
    1. Use sentries (or some other mechanism) to guarantee memory is released.
    2. Don't allocate except at startup.
    3. No need for elaborate tools due to the above.

    I'm sure that not every application's data usage would fit into this model, but it is surprising how many can.
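
    A minimal sketch of the pool-plus-sentry approach described above. The poster's actual code is not shown, so everything here is assumed: Message, MessagePool and PooledMessage are hypothetical names, and the pool is single-threaded for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message type; the real objects would be application specific.
struct Message { char payload[256]; };

// Fixed-size pool: every allocation happens in the constructor, i.e. at startup.
class MessagePool {
public:
    explicit MessagePool(std::size_t count) : storage_(count) {
        free_.reserve(count);                        // so release() never allocates
        for (std::size_t i = 0; i < count; ++i) free_.push_back(&storage_[i]);
    }
    Message *acquire() {                             // returns nullptr when exhausted
        if (free_.empty()) return nullptr;
        Message *m = free_.back();
        free_.pop_back();
        return m;
    }
    void release(Message *m) { if (m) free_.push_back(m); }
    std::size_t available() const { return free_.size(); }
private:
    std::vector<Message> storage_;
    std::vector<Message *> free_;
};

// Sentry: hands the object back to the pool when it falls out of scope.
class PooledMessage {
public:
    explicit PooledMessage(MessagePool &pool) : pool_(pool), msg_(pool.acquire()) {}
    ~PooledMessage() { pool_.release(msg_); }
    PooledMessage(const PooledMessage &) = delete;
    PooledMessage &operator=(const PooledMessage &) = delete;
    Message *get() const { return msg_; }
private:
    MessagePool &pool_;
    Message *msg_;
};

void demo_pool_sentry() {
    MessagePool pool(2);
    {
        PooledMessage m(pool);
        assert(m.get() != nullptr && pool.available() == 1);
    }   // the sentry releases the message here, even on early return or exception
    assert(pool.available() == 2);
}
```

    Watching available() over time gives the monitoring the post mentions: a count that never returns to its startup value points at an object that was not released.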

    We have seen some leaks in our applications. These were tracked down to STL internally leaking. They weren't generally very large and therefore we continue to live with them.

    On the subject of garbage collectors, some of our colleagues use Java and .NET. Both sets of colleagues have had major performance problems caused directly by the garbage collectors kicking in and consuming vast CPU power while they did their thing. The result was a failure to process messages in a timely manner in our high speed environment. The solution in both languages was to use pools of reusable objects and never cause their reference counts to drop to 0. Thus they implemented the very same mechanism that we use in C++ and avoided the garbage collectors.

    So don't think that a garbage collector is the solution. Perhaps in less demanding applications it is a potential answer.

    Lastly, I strongly dislike anything from Rational. I find them overpriced unreliable bloatware (YMMV). Purify used to be good some time ago, but those days are long gone.

    I echo what others have said above. You are a developer. You know your requirements. Build a simple tool to monitor and check your usage. For us it was managed pools of re-usable objects.
  • Re:um (Score:5, Insightful)

    by bhsurfer (539137) <bhsurfer.gmail@com> on Wednesday June 06, 2007 @08:47AM (#19409931)
    The first thing I was told by my boss when I got hired was "You're going to look at this app and want to rewrite it from scratch. Don't do it, that's not what we want you for." Software doesn't need to be pretty, you just make improvements as you can and leave the ugly but solid code alone until necessary. It's an extremely rare situation to have the luxury of a complete redesign/rewrite.

    I guess that's a long way of saying "I agree completely with what you just said."

  • by Rogerborg (306625) on Wednesday June 06, 2007 @08:49AM (#19409963) Homepage
    > However, if one manages their objects correctly, the current generation of jvms perform quite well IMHO.

    If one manages their objects correctly, C/C++ perform quite well too.

  • by peterpi (585134) on Wednesday June 06, 2007 @09:12AM (#19410277)
    I didn't explain it all that well. What I mean is; I love destructors.

    A good example of what I'm talking about is a std::ifstream versus a java.io.FileInputStream. If you make an ifstream on the stack, you can be absolutely certain that when it goes out of scope, the destructor will be called and the file closed. You can be certain that it will happen, and you can also be certain when it happens; at the very point it goes out of scope.
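
    The ifstream case can be sketched as follows. The helper names first_line and demo_scope_closes_file, and the idea of writing a small scratch file, are assumptions made purely for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Reads the first line of a file; returns an empty string if it cannot be opened.
std::string first_line(const std::string &path) {
    std::ifstream in(path.c_str());
    std::string line;
    std::getline(in, line);
    return line;
}   // in's destructor closes the file here, on every exit path

void demo_scope_closes_file(const std::string &path) {
    {
        std::ofstream out(path.c_str());
        out << "hello\nworld\n";
    }   // out is flushed and closed at this brace, not at some later time
    assert(first_line(path) == "hello");
    std::remove(path.c_str());   // clean up the scratch file
}
```

    Nothing in either function mentions close(): the point at which the handle is released is fixed by the braces.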

    With a heap based FileInputStream, you have no such guarantee. You leak it, and you just hope that the finaliser gets called soon (if at all). I've had more than one occasion where I've been leaking FileInputStreams quicker than the garbage collector cares to clean them up, and sooner or later the OS says 'no' and you get an exception. And it's very difficult to reproduce, because it's all down to the whim of the garbage collector, and you always go slower when you're looking for a bug.

    Of course the answer to this is to say "Well you should Close() your input stream beforehand". But that's just as bad as saying "You should delete your heap based objects" in C++. It's that situation of having to manually shut down objects that seems old fashioned to me.

    Maybe there's a better way these days, I've been away from Java for a couple of years now.

    (I do enjoy coding in either language though!)
  • C++ errors suck! (Score:1, Insightful)

    by Anonymous Coward on Wednesday June 06, 2007 @10:01AM (#19410907)
    Yes, absolutely true that error messages are useless. The blame for this can be shared between the language, the compiler, and the library. The worst I have encountered is Boost.Spirit where it would churn out messages hundreds of lines long with no clue whatever what the real problem was; you might as well just make random changes until the message goes away. In the end this makes you code in a very defensive style, not deviating far from the examples that you've copied-and-pasted from the documentation.

    I would say that this is a serious enough problem that we ought to stop and fix it before developing yet-more-complex libraries. One attempt to fix it at the language level is the introduction of 'concepts' in the next version of C++, which allows template classes to specify properties that their parameters must have - and which presumably allows more sane error messages when the properties do not hold. An attempt to fix it at the library level can be seen in the message that you cite: "property map not found". Yes, it's embedded in a load of stuff from the compiler, but maybe that's the best it can do.

    I'd be very interested to know whether any of the other compilers give more comprehensible error messages than g++.
  • by SparkyFlooner (1090661) on Wednesday June 06, 2007 @10:04AM (#19410955)
    You don't let the garbage collector determine when shared resources get closed. You explicitly close it yourself when you're done with it. Closing a stream and deleting the object that manages it are two different things
  • Re:um (Score:4, Insightful)

    by bhsurfer (539137) <bhsurfer.gmail@com> on Wednesday June 06, 2007 @10:42AM (#19411551)
    I meant "ugly" in the sense of "Not the way I would have done it" rather than in a "Holy shit, what a freakin' mess! This guy should be bagging groceries, not writing software!" kind of way. I certainly do not think that clever tricks and mounds of complex spaghetti code that were designed by avalanche are maintainable, believe me.

    I also have (unfortunately) written enough ugly stuff that when I go back later I say "I can't believe I actually did something that stupid."

    You live, ideally you learn, and when you look at code you wrote 5 years ago you likely slap your forehead in embarrassment - that's how you know you're getting better. That, and when your coworkers aren't trying to slash their wrists when they get handed something you wrote...

  • Re:um (Score:5, Insightful)

    by joto (134244) on Wednesday June 06, 2007 @10:44AM (#19411587)

    If it's ugly it's not solid. Ugly code is hard to understand at first glance, and it's easy to introduce an error. Or do you consider code that's easy to make a mistake with as actually being "maintainable"?

    You are confusing two aspects here. Ugliness does limit maintainability. But it does not limit "solidness". "Solidness" would mean that the code actually works, and has a proven track record, such as being used in production for over 20 years. Code that has been in production for over 20 years is usually both solid and ugly.

    That ugly code is usually a monument to the "there's not enough time to do it right, but there's always enough time to do it over ... and over ... and over" and "ship it now - fix it later."

    Or it could be a monument to "the world is a complex place, and if you change anything here, and it causes the program to fail in some weird special case, your company is going to lose umpteen zillion dollars". While the reality is probably somewhere in between, rewrites should still be avoided like the plague. However, if you really have taken the time to understand what some nasty bit of code does, there's nothing wrong about cleaning it up. But most of the time, the ugly code is there for a reason.

  • by cant_get_a_good_nick (172131) on Wednesday June 06, 2007 @11:11AM (#19412079)
    GLIBC allows you to create hooks for the standard mem functions (malloc/realloc/free). Remember that g++ still calls these under new/delete so it works for C++ also.

    One of our guys coded up a simple shared lib that can be loaded with LD_PRELOAD that sets simple hooks of printing memory locations for new/realloc/delete. He then wrote a perl script that kept track of these things and spit out anything that was malloc'ed and not realloc'ed or free'd.

    I can't post it, because technically it's not my code, it's my company's. But his shared lib code is just 300 lines long, and shouldn't be hard to duplicate. The perl log filter is even more straightforward. Each malloc gets saved. Each free removes the malloc. Each realloc removes the old malloc and adds a new one. Anything left over is a leak.
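
    The bookkeeping of that log filter can be sketched in C++ rather than perl. The original script and its log format are not available, so the format below is an assumption: one whitespace-separated record per event, "malloc ADDR SIZE", "free ADDR" or "realloc OLD_ADDR NEW_ADDR SIZE". The name find_leaks is hypothetical too.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Returns address -> size for every block that was allocated but never freed.
std::map<std::string, std::size_t> find_leaks(std::istream &log) {
    std::map<std::string, std::size_t> live;
    std::string op;
    while (log >> op) {
        if (op == "malloc") {                 // each malloc gets saved
            std::string addr;
            std::size_t size = 0;
            log >> addr >> size;
            live[addr] = size;
        } else if (op == "free") {            // each free removes the malloc
            std::string addr;
            log >> addr;
            live.erase(addr);
        } else if (op == "realloc") {         // remove the old block, add the new one
            std::string old_addr, new_addr;
            std::size_t size = 0;
            log >> old_addr >> new_addr >> size;
            live.erase(old_addr);
            live[new_addr] = size;
        }
        // Unknown tokens are skipped; a real filter would want stricter parsing.
    }
    return live;                              // anything left over is a leak
}

void demo_find_leaks() {
    std::istringstream log(
        "malloc 0x1000 16\n"
        "malloc 0x2000 32\n"
        "realloc 0x1000 0x3000 64\n"
        "free 0x2000\n");
    std::map<std::string, std::size_t> leaks = find_leaks(log);
    assert(leaks.size() == 1 && leaks["0x3000"] == 64);
}
```

    Addresses are kept as strings because the filter only needs to match them, never to dereference them.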

    Override __malloc_initialize_hook with a pointer to your init_function. In your init_function, save the old functions at __malloc_hook, __free_hook, __memalign_hook and __realloc_hook and substitute your own. Now write your replacement functions: in each, do your logging, temporarily restore the old hooks and call the original function, then reinstall your hook on the way out to catch the next call. All of the hooks should be wrapped in a mutex to avoid re-entrancy problems.

    It's not a full memory detector, just does leaks, but it's non-intrusive, requires no recompiles, and is the best way we have to leak detect our huge server long running code.
  • Re:Boost? Ugh (Score:3, Insightful)

    by pzs (857406) on Wednesday June 06, 2007 @11:25AM (#19412351)
    Amen, brother.

    The only thing I can add to this is that an error message that only takes up 8 lines is a sissy error coming from BGL. I had errors that were multiple screenfuls. It seems somehow wrong when a tiny type error that can be fixed with maybe 3 or 4 well placed characters can be so verbose. I guess that's C++ for you.

    Peter
  • Re:two points (Score:5, Insightful)

    by Anonymous Brave Guy (457657) on Wednesday June 06, 2007 @12:15PM (#19413093)

    It's not automatically bad, but using semi-automated memory management like this tends to reduce the emphasis on constructing things only when they're needed and destroying them immediately when you're done with them. This concern, known as "Java bloat syndrome" in honour of the language that first popularised it, can lead to major performance problems in applications that manipulate a lot of data, and is a favourite mistake made by the cult of "hardware is cheap, so optimisation doesn't matter".

    The thing is, this sort of care-free programming philosophy is natural in languages like Java, so languages like Java have had to learn from their early mistakes and adapt. There have been dramatic improvements in GC technology since those early days, and today there isn't the same degree of performance penalty associated with relying on GC to clear everything up.

    However, this sort of behind-the-scenes magic isn't really the "C++ way". You can do it, but tools like shared_ptr don't have the same level of sophistication as full-blown GC. Using them requires some care from programmers, and as the grandparent post said, this can lead to problems if the programmers come to rely on them more than they ought.

    FWIW, I'm not sure I'd have described things in quite such black-and-white terms as the GP, but I can see the underlying point and I think it's a valid one.

  • Re:Boost? Ugh (Score:3, Insightful)

    by NoOneInParticular (221808) on Wednesday June 06, 2007 @02:24PM (#19415123)
    The effect in Java is like every pointer (or object reference or whatever you want to call them) were shared_ptr (except better).

    Nope. Garbage collection solves one problem, memory management, but does not solve the more general issue of resource management. Incorporating a few file handles, database connections or what have you into objects in Java leads immediately to manual resource management issues. You cannot wrap a couple of resources in an object and have deterministic release behaviour unless you explicitly code for it. Shared pointers (reference counting) do cater for this, albeit at a performance cost. RAII is impossible in Java, yet commonplace in C++, with or without reference counting. They're just different, each with tradeoffs of their own, mkay?
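
    To make the deterministic-release point concrete, here is a small sketch. Connection is a hypothetical stand-in for a file handle or database connection, and std::shared_ptr is used on the assumption of a C++11 compiler (boost::shared_ptr behaves the same way for this purpose).

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource wrapper: opening and closing are modelled by a counter.
class Connection {
public:
    explicit Connection(int &open_count) : open_count_(open_count) { ++open_count_; }
    ~Connection() { --open_count_; }   // runs exactly when the last owner lets go
    Connection(const Connection &) = delete;
    Connection &operator=(const Connection &) = delete;
private:
    int &open_count_;
};

void demo_deterministic_release() {
    int open_connections = 0;
    {
        std::shared_ptr<Connection> a = std::make_shared<Connection>(open_connections);
        std::shared_ptr<Connection> b = a;     // two owners, one resource
        assert(open_connections == 1);
        a.reset();
        assert(open_connections == 1);         // b still owns the connection
    }   // b goes out of scope: the connection is closed right here
    assert(open_connections == 0);
}
```

    The release point is a property of the program text, not of a collector's schedule, which is the difference the post draws between reference counting and garbage collection.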

  • by ShakaUVM (157947) on Wednesday June 06, 2007 @04:38PM (#19416993) Homepage Journal
    >>If you write a lot of code, you WILL make mistakes like memory leaks. A lot of them. If you don't think so, you're living in
    >>fantasy land and you're nowhere near as good a coder as you think you are.

    Pfft.

    Actually, good coding habits will indeed work.

    We were three people coding a 100,000 line program, 0 memory leaks. C.
  • Re:two points (Score:4, Insightful)

    by shutdown -p now (807394) on Thursday June 07, 2007 @12:45AM (#19420617) Journal
    In other words, programmers shouldn't use shared_ptr as if it were a replacement for GC. When it is worded thus, I can fully agree with that (and indeed, anyone who understands how reference counting works, will agree as well). The nice thing about shared_ptr is that, unlike GC, it is still fully deterministic, and so it properly preserves the "C++ spirit".
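
    One concrete reason shared_ptr is not a GC replacement is the reference cycle. The sketch below is an illustration under assumptions: Node and both function names are hypothetical, and std::shared_ptr / std::weak_ptr stand in for their Boost equivalents.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node that can refer to a peer either strongly or weakly.
struct Node {
    std::shared_ptr<Node> strong_peer;
    std::weak_ptr<Node> weak_peer;
};

// Two nodes owning each other keep each other alive after the scope ends.
bool strong_cycle_outlives_scope() {
    std::weak_ptr<Node> watch;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;     // cycle: neither count can reach zero
        watch = a;
    }
    const bool still_alive = !watch.expired();
    // Break the cycle by hand so that this sketch itself does not leak.
    if (std::shared_ptr<Node> a = watch.lock()) a->strong_peer.reset();
    return still_alive;
}

// Making one direction weak lets both nodes die when the scope ends.
bool weak_cycle_outlives_scope() {
    std::weak_ptr<Node> watch;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_peer = b;     // a owns b
        b->weak_peer = a;       // b only observes a
        watch = a;
    }
    return !watch.expired();
}

void demo_cycles() {
    assert(strong_cycle_outlives_scope());
    assert(!weak_cycle_outlives_scope());
}
```

    A tracing collector would reclaim the first pair without help; reference counting needs the programmer to decide which edge is the non-owning one.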
  • Re:Boost? Ugh (Score:3, Insightful)

    by John Betonschaar (178617) on Thursday June 07, 2007 @05:09AM (#19421521)
    I was not talking about Boost, that's a completely different subject (on which I mostly disagree with you as well but let's forget about that). I've read most of your replies in this topic, and that's what I'm basing my judgement on. You cannot seriously believe that things like operator overloading, design patterns, RAII, Boost, templates and the other things you see no purpose for are all 'flavour of the month paradigms' and only useful for programmers to show off. If you do, you are not an experienced developer to me, at least not one that moved along with advancements in the field. But maybe you're just trolling by blowing your arguments completely out of proportion.

    If you really consider yourself a skilled C++ programmer you'd acknowledge that C++ provides 1001 ways to do the same thing. For some purposes using operator overloading and templates is better, for other purposes using method overloading and OO inheritance is better. Same goes for other problems C++ offers multiple solutions for. Sometimes multiple inheritance is ok, sometimes it is terrible. Heck, sometimes using 'goto' even makes sense. If you not only consider yourself a skilled C++ programmer but also a skilled software engineer, you'd also acknowledge that code re-use and design patterns are almost always good things if applied properly, irrespective of the implementation language. If you say 'design patterns' is just new and cool terminology for clueless programmers you probably never even opened the de-facto standard work about design patterns (Gamma et al.) and browsed it a little. It's just common solutions to recurring problems, that can save you a lot of work because you don't have to re-invent them yourself. It's just design re-use on the architectural level, which is even more important than re-use on the implementation level.

    I think you should try coding up some Java, C#, D or Python some day. You'll probably be disgusted by all the 'paradigms of the month' they applied to those languages, how much re-use and design patterns are incorporated into them etc. You think it's just because the people who created the languages wanted to show off?
  • by maxwell demon (590494) on Thursday June 07, 2007 @05:37AM (#19421635) Journal
    Why should I ever have to do that? Either I'm checking the correctness of operator+, then I'm interested in the implementation of it, and don't care at all about who might call it where for whatever reason. Or I'm checking the correctness of code which uses operator+, and then I already know where it is used (in the code I'm checking right now, and I don't care if it is called from anywhere else).

    Indeed, with generic programming, the same code may call several different implementations of operator+, depending on what type it is used on. The same goes BTW for normal named functions. And the same is also true for virtual member functions (operator or not) in an OOP context.

    I'd say if you have to find the callers of operator+ in order to check if it is implemented correctly, there's something fundamentally wrong with your code.
  • Re:two points (Score:3, Insightful)

    by TheRaven64 (641858) on Thursday June 07, 2007 @07:27AM (#19422111) Journal
    Most real projects use a build system that invokes the compiler once for each file, and then the linker once at the end. While GCC may be run hundreds of times, each invocation only lasts for a few seconds. If your code is taking longer to compile, then you should possibly consider structuring it better.
  • by Viol8 (599362) on Thursday June 07, 2007 @10:45AM (#19424367)
    Meanwhile , back in the real world...
  • by Anonymous Coward on Thursday June 07, 2007 @09:30PM (#19432911)
    "Why should I ever have to do that?"

    Oh I dunno, because you've changed the implementation and need to know where in the program it's used so you can create suitable tests? Just a wild guess.


    Oh, come on, really? You didn't immediately think of searching for the name of the class on which operator+ is overloaded? The class name will appear in the function parameters or the local variable declarations for code where the unit tests might need to be examined. Given that hint, you should be able to think of where else you might need to look for the class name. If you cannot handle that, consider sticking to that language where "&" is used for string catenation, as you noted elsewhere. Presumably you were referring to VB6 or earlier?

    - T
  • by Anonymous Coward on Friday June 08, 2007 @03:14AM (#19434815)
    I was a developer for a "top 5" web browser for 6 years. I often worked on porting to small platforms and I also focused on how to improve performance on these platforms.

    Before I go into a rant, I must first rant about how much I hate Java. I feel that it was a great proof of concept and that they should have taken what they learned and gone back and done it better. Java is a lot of great ideas implemented poorly due to lack of experience. I think they should have spent more time with the SmallTalk guys who actually almost had it right to begin with. Hell, evolving SmallTalk would have been far more intelligent than turning C++ into SmallTalk.

    Ok... here's the thing. I tried a lot of competing products. I tried memory checkers, memory allocators (and SmartHeap is the shit!), I tried memory profilers, hand instrumented memory logging, etc... what I've learned are a few things...

    Garbage collection (even reference counting) can improve performance greatly, but it had little or no impact on fragmentation. A system that I slapped together as a malloc/free new/delete override proved quite successful at drastically improving browsing performance. What I did was that instead of deleting memory, I queued deletes and when the pool needed to be grown, I would process the deletes or I would process the deletes during idle cycles.

    This just made the program seem faster during runtime... obviously, the added overhead just made it slower.
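
    The queued-delete idea can be sketched roughly as below. The browser's actual override is not shown in the post, so this is an assumption-laden illustration: DeferredFreeQueue, defer and drain are hypothetical names, and a real version would hook new/delete and avoid allocating inside the queue itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Collects blocks to be freed later, e.g. during idle time or before growing a pool.
class DeferredFreeQueue {
public:
    DeferredFreeQueue() {}
    ~DeferredFreeQueue() { drain(); }           // never leave queued blocks behind
    DeferredFreeQueue(const DeferredFreeQueue &) = delete;
    DeferredFreeQueue &operator=(const DeferredFreeQueue &) = delete;

    // Called where the program would otherwise call free(): cheap, no heap work now.
    void defer(void *p) { if (p) pending_.push_back(p); }

    // Called from the idle loop or when memory runs short; returns blocks freed.
    std::size_t drain() {
        const std::size_t n = pending_.size();
        for (void *p : pending_) std::free(p);
        pending_.clear();
        return n;
    }

    std::size_t pending() const { return pending_.size(); }
private:
    std::vector<void *> pending_;               // note: push_back may itself allocate
};

void demo_deferred_free() {
    DeferredFreeQueue queue;
    queue.defer(std::malloc(16));
    queue.defer(std::malloc(32));
    assert(queue.pending() == 2);               // nothing has been freed yet
    assert(queue.drain() == 2);                 // the cost is paid here instead
    assert(queue.pending() == 0);
}
```

    As the post says, the total work goes up rather than down; the gain is only in when the cost is paid.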

    To explain why a web browser is one of the most rigorous tests of a memory environment... just think of the hundreds to thousands of DOM nodes/elements/etc..., script objects, images, etc... there are in a single page. Add that each element is typically represented by a single allocated object. Consider that images can decode to 100 megs in size (yes, it happens), most often closer to 2-3 megs for background images.
    - A web browser can't use a memory system optimized for specific object sizes.
    - Due to the dynamic nature of the objects, object reusability is not really an option.
    - Scripts can grow or shrink memory usage thousands of times per second.
    - Browsers typically contain 3rd party code from plugins which need to interact with the browser.

    I can go on and on... a web browser is possibly the worst memory management nightmare on the planet. Often I worked with customers that were developing their own operating system. They used to tell me that my web browser must suck because making it stable on their system was a pain in the ass and often required them to either change or rewrite their entire memory management system to get good performance on embedded devices. Then I'd explain to them that up until now, their memory manager has been having a friendly snow-ball fight with a penguin, now it's running for its life from an avalanche caused by a mean Yeti.

    So here's the deal, GC really didn't pay off for us... helped a little, but I have to say that once we started using reference counting and simple GC, the quality of the code got really poor. Just look at Symbian for an example of a product that suffers from using a great memory management system that increases coding complexity 10 fold. It makes it so that you spend all your time coding for the memory manager and you run out of time to make the program itself work.

    Now on the other hand, I played with a few Java web browsers and learned something important. Java is made for phones. If it has no other purpose (and I'm convinced it doesn't), it's for embedded devices. Because of relocatability, fragmentation doesn't occur (in a good VM) and applications run much better. The GC + Relocation system is REALLY REALLY REALLY good, if I were to start writing a new web browser today, I'd find a good alternative to Java and get moving on it.

    Oh... last thing about auto-pointers, they're a blessing and a curse. For the most part, I find the best solution to be to use a proper system library like Qt instead of boost or STL. Qt seems to actu
