Memory Checker Tools For C++?
An anonymous reader writes "These newfangled memory-managed languages like Java and C# leave an old C++ dev like me feeling like I am missing the love. Are there any good C++ tools out there that do really good memory validation and heap checking? I have used BoundsChecker but I was looking for something a little faster. For my problem I happen to need something that will work on Windows XP 64. It's a legacy app so I can't just use Boost's uber nifty shared_ptr. Thanks for any ideas."
um (Score:5, Informative)
Re:two points (Score:5, Informative)
Boudewijn
valgrind, libgc (Score:2, Informative)
Note this is not a 'memory management tool', but one to help you find and clean up memory leaks. There is no way to do proper garbage collection using the STL's allocators, though there is a 'gc' library http://www.hpl.hp.com/personal/Hans_Boehm/gc/ [hp.com] which tends to do the job. I haven't used it, though projects like Mono http://www.mono-project.org/ [mono-project.org] use it extensively.
Purify (Score:2, Informative)
valgrind and gc.h (Score:1, Informative)
If you didn't already know of such tools then go do some research; what you want probably already exists.
STLPort (Score:4, Informative)
Re:two points (Score:4, Informative)
shared_ptr is a blessing and a curse. It saves you from manually destructing objects held in a collection (good) but too many developers use it for lifetime management (bad).
You are looking for PageHeap (Score:5, Informative)
For more information look here:
http://support.microsoft.com/kb/286470 [microsoft.com]
Valgrind (Score:5, Informative)
Re:Boost? Ugh (Score:1, Informative)
Personally, I find that
shared_ptr<Foo> f (new Foo);
and also scoped_ptr are simple to use and very useful.
As with a lot of things there can sometimes be a tradeoff between the slope of the learning curve and the expressive power that you get as a result. Yes, there are more than a couple of Boost libraries that are over my head and where I write "standard" code instead. But Boost is not all-or-nothing - you should just use the bits that you like.
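For anyone who hasn't seen these in use, here is a minimal sketch of what the one-liner above buys you. It is written against std::shared_ptr and std::unique_ptr, which is an assumption on my part: they are used here as standard-library stand-ins for boost::shared_ptr and boost::scoped_ptr so the snippet has no external dependency. Foo and its live counter are made up for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class that counts how many instances are currently alive.
struct Foo {
    static int live;
    Foo() { ++live; }
    ~Foo() { --live; }
};
int Foo::live = 0;

// Returns the number of Foo objects still alive after the inner scope closes.
int live_after_scope() {
    {
        std::shared_ptr<Foo> f(new Foo);   // same shape as the line above
        std::shared_ptr<Foo> g = f;        // second owner of the same object
        std::unique_ptr<Foo> s(new Foo);   // sole owner, cannot be copied
        assert(Foo::live == 2);            // two objects, three handles
        assert(f.use_count() == 2);        // f and g share one object
    }   // all three handles go out of scope here and both objects are deleted
    return Foo::live;
}
```

No delete appears anywhere, and nothing is left alive once the scope ends, which is the whole appeal for the simple cases.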
Re:Valgrind (Score:3, Informative)
http://www.ibm.com/developerworks/java/library/j-
Re:Fluid Studio's Memory Manager (MMGR) (Score:3, Informative)
Re:Boost? Ugh (Score:2, Informative)
Re:I've used... (Score:5, Informative)
It is a very handy feature for finding leaks, buffer overflows, etc. The only other product I've used to find memory problems on recent incarnations of Windows is Purify. The MS solution is infinitely simpler because it's built into the environment and it's narrowly focused on memory problems. To state the obvious and in fairness to Purify, MSDev is infuriatingly fussy when it comes to building debug modules for IBM's Purify.
Re:um (Score:3, Informative)
I must agree - Purify is it. (Score:3, Informative)
Yes, it isn't cheap. Just do it. You'll thank me.
The productivity increase alone will make it worthwhile for management.
Re:STLPort (Score:5, Informative)
Re:Boost? Ugh (Score:3, Informative)
And Boost.Build is muuuuuch more powerful than makefiles. You can try to use Boost.Build v2 in your own projects - it's very very useful.
Re:Boost? Ugh (Score:2, Informative)
By the way, if you're not developing for Boost and aren't using one of 8 or 9 libraries, you don't *have* to run their build system. Just pointing Visual Studio or GCC to $BOOST_ROOT is all you need to do. It sounds like taking 5 minutes to read the "getting started" page would have saved you a lot of grief
Re:Boost? Ugh (Score:1, Informative)
I would also recommend checking scoped_ptr, boost::random, boost::threads, boost::date_time and boost::filesystem. I use them all extensively in a single library that has to compile on gcc (Mac & Linux), MSVC80 and (!) Borland C++.
As for the complaints about the boost build system - bjam. I really have to disagree with you. You *don't* need to use it. Use it only to compile boost libraries, and then integrate them using whatever tool you want. I personally recommend cmake - it generates Xcode, MSVC, Borland projects or simple Makefiles among others... I use it to generate all of these.
Yes, it generates awful error messages, but one could argue that the STL and template metaprogramming in general generate awful error messages. It is something that is expected to be fixed in the new C++ standard.
I will agree with grandparent that some boost libraries are mental masturbation - boost::lambda comes to my mind.
Detailed article on memory usage (Score:4, Informative)
"Monitoring Your PC's Memory Usage For Game Development" [gamasutra.com]
While the title says it's for game development, I found that the meat of the article applies to any windows developer.
Re:Valgrind (Score:3, Informative)
It's been a few years since I last programmed in C++, but Valgrind indeed really saved the day on a regular basis. Also look into (KDE-)frontends if you think looking at text output is not very convenient. Couldn't agree more with this part, it's the best tool I've seen - and it's free software, too.
Afraid I have to disagree here though. First of all, a language is not the thing that's "slow" or "fast". It may be the case that no very efficient compilers or virtual machines exist (for a particular language). I will admit that it is hard(er) to create efficient implementations of some languages (functional languages, logic-based languages), but C# and Java are definitely not among those. Second, Java VMs used to be slow in 1997. It's 2007 now, I'd suggest you try again (and be surprised). Third, I definitely do not agree that "you're not missing out on anything special" by using C++ instead of any garbage collected language (not necessarily Java or C#). For one, you're (do I need to state the obvious here?) missing out on garbage collection! I would say garbage collection is a clear advantage in the vast majority of programming scenarios. I would even argue that it's the biggest practical advancement in programming since the introduction of the procedural paradigm - perhaps even more important than object orientation.
Last of all, you're complaining about "language complexities" in Java/C# and (thereby) suggesting C++ is better in this regard? Hmm, I guess my sarcasm detector must be broken or something.
Re:Boost? Ugh (Score:2, Informative)
Re:two points (Score:3, Informative)
In order to use it, you just have to include 'gc.h' in your files, which replaces 'new' with 'gcnew' using macros. Alternatively, you can bypass the standard library and provide a replacement for 'malloc' and 'free', but I did not manage to do that (due to the configuration bugs mentioned above).
Re:I've used... (Score:1, Informative)
http://www.codeproject.com/tools/visualleakdetect
Memory Validator (Score:4, Informative)
Re:I must agree - Purify is it. (Score:3, Informative)
We 'use' Insure++ as well, but unfortunately 'using it' is limited to tracking down arcane, semi-valid C++ constructs in our code that Insure++ b0rks on. Insure++ is supposed to be a pretty advanced tool, but it is not actively maintained anymore and it's full of limitations that make it almost unusable for existing codebases. Especially stuff with templates, stuff using classes from external namespaces and dodgy C++ constructs that compile without warnings on every compiler I know will make Insure++ instrumentation fail. Also, linking the instrumented source files together fails on 1 out of every 10 object files, especially if they also link against third-party static libs (.a on linux). Sometimes you can fix this, but most of the time it is completely unclear why the thing fails. This effectively renders the whole tool useless, as you cannot be sure there will be no problems left in the non-instrumented parts of your codebase. And in those cases where instrumentation *does* work, the output you'll get when running the instrumented binaries is sometimes really unclear, confusing or downright nonsense.
Last but not least, Parasoft support is awful. We've been told some of our linkage problems, for which we filed bug reports, would be fixed in a later release. No new version was released for 1.5 years, and on multiple inquiries about this we did not even get a reply. Then, all of a sudden Parasoft released a new version a while ago, which we found out about 4 months later as they did not bother to notify us about it...
Re:two points (Score:5, Informative)
You'll be able to use C++0x in gcc-4.3 with a switch.
I also heard that std::auto_ptr is being deprecated (not removed), I guess in favor of rvalue references.
Finally, there [open-std.org] is a motion to include garbage collection in the C++ language. This is sponsored by none other than Hans Boehm among others.
I realize this doesn't help immediately.
Re:Valgrind (Score:4, Informative)
Memory Validator (Score:2, Informative)
http://www.softwareverify.com/cpp/memory/index.ht
-----------
Fight Entropy!!! Fight Entropy!!! Figth Etnropy! !
iFgth Etnrop!y ! giFth tErno!py ! giFt htrEno!p y! --- Well maybe
not...
Use a testing framework (Score:4, Informative)
SmartHeap (Score:1, Informative)
Some Stuff For Roll Your Own Types (Score:1, Informative)
Use _CrtSetDbgFlag to get the memory manager to test the heap periodically during use among other things. Or use _CrtCheckMemory to do it strategically.
Using _CrtMemCheckpoint and _CrtMemDumpAllObjectsSince can check to see if any heap blocks have been left on the heap in a range of code. You can use _CrtSetBreakAlloc to break on the allocation to locate the point where a widowed block was allocated.
You can also use GetProcessHeaps and HeapValidate to check heap integrity. In particular, it can be used to check all heaps in the process.
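A rough sketch of how the calls mentioned above fit together, in case it saves someone a trip to MSDN. The wrapper names (enable_debug_heap, heap_is_consistent, leaves_blocks_behind, all_process_heaps_valid) are my own and purely hypothetical; only the _Crt* and Heap* calls are real. The CRT parts only do anything in an MSVC debug build, so everything is guarded and falls back to a no-op elsewhere.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

#if defined(_WIN32)
#include <windows.h>
#endif
#if defined(_MSC_VER)
#include <crtdbg.h>
#endif

// Ask the debug CRT to check the heap on every allocation and to dump leaks
// at exit. No-op outside MSVC (and effectively a no-op in release builds).
inline void enable_debug_heap() {
#if defined(_MSC_VER)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);   // read current flags
    flags |= _CRTDBG_CHECK_ALWAYS_DF | _CRTDBG_LEAK_CHECK_DF;
    _CrtSetDbgFlag(flags);
#endif
}

// "Strategic" check: call this at points where you suspect corruption.
inline bool heap_is_consistent() {
#if defined(_MSC_VER)
    return _CrtCheckMemory() != 0;
#else
    return true;   // no debug heap available, nothing to check
#endif
}

// Runs body() and reports whether it left new blocks on the heap.
template <typename F>
bool leaves_blocks_behind(F body) {
#if defined(_MSC_VER) && defined(_DEBUG)
    _CrtMemState before, after, diff;
    _CrtMemCheckpoint(&before);
    body();
    _CrtMemCheckpoint(&after);
    if (_CrtMemDifference(&diff, &before, &after)) {
        _CrtMemDumpAllObjectsSince(&before);   // lists the widowed blocks
        return true;
    }
    return false;
#else
    body();
    return false;  // cannot tell without the debug heap
#endif
}

// Validates every heap in the process, not just the CRT's.
inline bool all_process_heaps_valid() {
#if defined(_WIN32)
    DWORD count = GetProcessHeaps(0, nullptr);       // how many heaps exist?
    std::vector<HANDLE> heaps(count);
    count = GetProcessHeaps(count, heaps.data());
    if (count == 0 || count > heaps.size()) return false;  // could not enumerate
    for (DWORD i = 0; i < count; ++i)
        if (!HeapValidate(heaps[i], 0, nullptr)) return false;
    return true;
#else
    return true;
#endif
}
```

The allocation numbers printed by the dump (the ones in curly braces) are what you would feed to _CrtSetBreakAlloc on the next run to stop at the offending allocation.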
Re:two points (Score:3, Informative)
I've deployed Boehm GC in real world applications before now, and consider the quality of the collector pretty good. It isn't a real-time collector, but on a reasonably low-end computer (350MHz PII) running applications with average memory requirements (total ~50MB in objects of around 100 bytes - 2K in size) delay times were minimal (around 10ms, which I arranged to occur after event processing when it was found that no new event was waiting, hence was rarely detected by the user). You can use any type you want with it -- there are at least three different approaches:
* Make your classes inherit from a 'gc' base class
* Use a placement new operator (e.g. "new (GC) MyObject;" or "new (PointerFreeGC) char[16384];")
* Use the provided 'malloc' replacement library to replace all dynamic memory allocations with a GC allocation.
I chose to go the second route, because it gave me the most control. I had to spend a little time debugging problems caused by forgetting to include the (GC), but certainly less than I would have spent debugging undeleted objects had I not used the collector.
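To make the second route concrete without dragging in the collector itself, here is a sketch of the mechanism behind the "new (GC) MyObject" syntax. It deliberately does not use the Boehm headers: TrackedHeap, Tracked and Event are hypothetical stand-ins, and the "heap" just counts requests and forwards to malloc. The point is only to show how the extra argument in parentheses picks a different allocator, and what happens when you forget it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a collector-backed heap: it only counts requests
// and forwards to malloc. A real collector would reclaim the memory itself.
struct TrackedHeap {
    std::size_t allocations = 0;
    void* allocate(std::size_t n) {
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        ++allocations;
        return p;
    }
};

TrackedHeap& tracked_heap() {
    static TrackedHeap heap;
    return heap;
}

enum TrackedTag { Tracked };   // plays the role of the (GC) argument

// Placement form, selected by writing: new (Tracked) T(...)
void* operator new(std::size_t n, TrackedTag) {
    return tracked_heap().allocate(n);
}
// Matching placement delete, only called if T's constructor throws.
void operator delete(void* p, TrackedTag) noexcept {
    std::free(p);
}

struct Event {
    int id;
    explicit Event(int i) : id(i) {}
};

int demo() {
    Event* a = new (Tracked) Event(1);   // goes through the tracked heap
    Event* b = new Event(2);             // tag forgotten: ordinary heap
    int sum = a->id + b->id;
    delete b;                            // ordinary object, freed by hand
    a->~Event();                         // stand-in for what a collector
    std::free(a);                        // would eventually do for us
    return sum;
}
```

Both allocations compile and run fine, which is exactly why a forgotten tag is easy to miss: the untagged object simply never becomes the collector's responsibility.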
Re:um (Score:3, Informative)
"leave the ugly but solid code alone until necessary."
If it's ugly, it's not solid. Ugly code is hard to understand at first glance, and it's easy to introduce an error. Or do you consider code that's easy to make a mistake with as actually being "maintainable"?
If it's ugly, there's probably a non-ugly way to do the same thing that's better, more efficient, AND more maintainable.
Remember, 90% of the investment in code is in the ongoing maintenance. Having to "relearn" all the "cute little hacks" that make that ugly POS code work, every time you have to change something, is a waste of resources. That ugly code is usually a monument to the "there's not enough time to do it right, but there's always enough time to do it over ... and over ... and over" and "ship it now - fix it later."
I've written enough ugly code to know that if it's ugly, I'm not approaching the problem properly.
Re:two points (Score:2, Informative)
"Finally, there is a motion to include garbage collection in the C++ language. "
Call me old-fashioned, but I hope that's one that they will throw in the garbage. Call it something else if you're going to have garbage collection as an integral part. As a set of libraries, or as a compiler-time switch, fine - but not as part of the core. That's not C++, that's D [digitalmars.com].
Re:Boost? Ugh (Score:3, Informative)
Huh? Where did you get that idea? The Boost libraries most definitely are intended for production use, in a wide variety of environments. From the home page:
They do aim to define and refine libraries so that they may eventually be appropriate for standardization, but the *primary* purpose of the Boost libraries is to provide tools that working programmers need, to get their production work done.
GCC mudflap (Score:3, Informative)
One caveat: in gcc 4.1 it failed to properly instrument some really basic C++ operation (I don't remember exactly which, something like copy construction of an object containing pointers) and reported spurious leaks because of that. It may have been fixed in 4.2.
Search for "mudflap" in the following page: http://gcc.gnu.org/onlinedocs/gcc-4.2.0/gcc/Optim
As well as http://gcc.gnu.org/wiki/Mudflap%20Pointer%20Debug
Microsoft Application Verifier (Score:2, Informative)
This tool allows you to enable PageHeap for your process, which is heap corruption detection built into the OS heap implementation. Upon freeing a block of memory, PageHeap will break into your debugger spewing tracing that a block has been corrupted. It can also provide the call stack when the block was allocated. Newer heap validation features are available in progressively more recent OS releases.
Memory tracking tools for Windows (Score:1, Informative)
- Use umdh, which is part of the Debugging Tools for Windows, to track memory leaks. http://www.microsoft.com/whdc/devtools/debugging/
- Whatever you do, make sure you have proper symbols. The following article explains how to get symbols from the MS symbol server. http://www.microsoft.com/whdc/devtools/debugging/
- Use paged heap to track all other issues, like memory overruns, double frees and all other sorts of heap corruption. http://www.microsoft.com/technet/prodtechnol/wind
- Please note that if you run your application with app verifier checks on, you need to run it under a debugger. I would strongly suggest windbg or cdb instead of Visual Studio because they have extensions that greatly help you track down the issue ("!analyze -v" "!avrf" "!heap -p -a "). For more details see the windbg help. If your application is a service then you might consider running your machine under kd, which would trap all unhandled exceptions and application verifier reports.
- The following link has a very good windbg tutorial: http://www.codeproject.com/debug/windbg_part1.asp [codeproject.com]
That is all you need to debug any kind of heap issues.
Re:two points (Score:2, Informative)
Re:two points (Score:3, Informative)
Re:Boost? Ugh (Score:2, Informative)
Re:two points (Score:4, Informative)
They don't want to remove the old ways (including malloc/free). And they don't want to penalize people who don't use garbage collection. In other words, if you don't want to use garbage collection you won't be paying for it in code size/runtime.
But for them that do... It might be nice to offer it.
Re:Valgrind (Score:2, Informative)
"Speed isn't everything. Why, if you start with a situation where other problems are choking you, you don't even have to think about speed!" That's called tautology [wikipedia.org].
By the by, the programming languages you just argued for are ADA, PHP and Visual Basic.NET, whose libraries are enormous in comparison to Java or C#. Now, I do a fair amount of PHP when I'm bored, so don't get up in arms when I say this. That said, I want to point something out: there are very, very few genuinely large scale services written in PHP, C#, Java, ADA, or any of those languages.
When it comes down to it, there are two costs: programmer time and hardware time. If you're network blocked, you buy a bigger network pipe; a dedicated box with a dedicated guaranteed ten megabit unmetered line for a year costs about a week and a half of a programmer's salary.
But, more importantly, you never, ever, EVER get network blocked. That just doesn't happen on engineered systems. The network moves faster than any software you or anyone you know will ever write. Sure, one given pipe might fill; you just upgrade the pipe. It's relatively straightforward to find gigabit, and if you know what you're doing you can go up from there; you have almost certainly never in your life seen a system that can push a gigabit of data. Even just filling static HTTP requests at that speed can bring a heavy duty, well tuned box to its knees. I happen to be friends with a tech at one of the giant shared hosting farms; his company has over 200,000 customers, and it takes them a Quad Athlon to service the average 100 megabit pipe filled with average wordpress blogs.
Now, I'm not saying the richness of the API doesn't matter. I'm just saying that positing a system based on filling the network is a little like designing the heater in the car in case the sun goes out. If you have enough users to fill a pipe, you can afford the next pipe up, or you need to be less retarded in setting your prices. The pipe has nothing to do with your selection of language.
And, frankly, the idea that you have to step away from real languages to get a real API is silly. Or, haven't you ever actually looked through the standard libraries of languages like C++ and Erlang?
If your database is taking a half second to return, you've got incredibly serious problems. And yes, in the real world, a 2ms lag matters because of the queueing problem. Have a look at one of those graphs where a modern stage server like Lightstreamer or YAWS compares itself to Apache. Apache tends to drop off logarithmically. Every time you lag 2ms, that's two more ms of customers you have to deal with during the next query.
This is a limit flow problem, and most people get limit flow problems if you use a toilet as an analogy. Your webserver is similar to a toilet that doesn't have a stopgap. That means the bowl is always slowly filling with usage (ie, not water - say this is a train station) and you have to flush it every so often to keep things clean. The train station was well designed 50 years ago, but as population has gone up, the facilities are feeling the strain.
Now, there's a point at which you say "does it really matter if the toilet takes three minutes to flush, if it's 20 minutes between trains?" Well, actually, yes, because that means each toilet can only service six and two thirds people per train arrival. At home-bound rush hour, you're going to get an enormous line of people outside the bathroom that keeps getting longer and longer.
The problem with Apache is that the longer the line is, the slower the toilet flushes. See the issue now?
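For the numerically inclined, the arithmetic behind the analogy can be sketched in a few lines. This is a toy model under my own assumptions: a fixed service time, the same number of arrivals with every train, and hypothetical function names. It does not model the Apache effect where service gets slower as the line grows.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// People one stall can serve between two train arrivals.
double capacity_per_interval(double minutes_between_trains,
                             double minutes_per_flush) {
    return minutes_between_trains / minutes_per_flush;
}

// Length of the line after each train, assuming every train brings the same
// number of people and the stall serves at most `capacity` per interval.
std::vector<double> backlog_over_time(double arrivals_per_train,
                                      double capacity,
                                      int trains) {
    std::vector<double> backlog;
    double waiting = 0.0;
    for (int i = 0; i < trains; ++i) {
        // The line grows by whatever the stall could not absorb, never below 0.
        waiting = std::max(0.0, waiting + arrivals_per_train - capacity);
        backlog.push_back(waiting);
    }
    return backlog;
}
```

With a three minute flush and twenty minutes between trains the capacity is 20 / 3, or six and two thirds. Five arrivals per train never form a line; eight arrivals per train leave about four people waiting after three trains, and the line keeps growing from there.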
Re:um (Score:2, Informative)
After having worked on what I could only describe as some of the ugliest code on the face of the earth (written by people who previously made a living writing throw away code for PhD types (the kind of code that you use to find an answer, then never use again)) I would say the _real_ problem that makes one want to just go hang themselves in the server room with some cat5 is lack of documentation.
Nothing compounds poorly written code as much as poor (or non-existent over here) documentation. If you _have_ to hack code to make something work in a hurry... COMMENT THE CRAP OUT OF IT. Put a huge comment block, with asterisks and exclamation points and a link to the page in your project's documentation manager describing why you did it, what should be done down the road when time allows, and an apology note to the next guy who has to work on the code.
I probably used the wrong string class (Score:1, Informative)
Re:rtti (Score:3, Informative)
No, not at all. The current C/C++ specifications permit compilers to transform code in ways that can interfere with a garbage collector. Fortunately compilers do not do that as often as they could, but it seems like something important that should be addressed.
See Hans' paper Simple Garbage-Collector Safety [psu.edu] for details.