Forgot your password?
typodupeerror
Programming IT Technology

Experiences w/ Garbage Collection and C/C++? 112

Posted by Cliff
from the taking-out-the-trash dept.
dberger queries: "Java has helped garbage collection enter the mainstream programmer's life - but it's certainly not new or unique to java. There have been (and are) several ways to add garbage collection to C/C++ - the most active seeming to be Hans Boehm's free libgc. I'm curious if any of the Slashdot crowd has used this (or any other) C++ garbage collector in non-trivial commercial applications. If so - what were your experiences? If not, why not? (Before you ask, yes - I know that GC isn't the only difference between C++ and Java, but 'automagic memory management' is certainly part of Java's marketing luster)"
This discussion has been archived. No new comments can be posted.

Experiences w/ Garbage Collection and C/C++?

Comments Filter:
  • by cookd (72933) <douglascook@junoTIGER.com minus cat> on Tuesday September 16, 2003 @01:05AM (#6972241) Journal
    Classes can have inner classes as well as typedefs. Those are in the namespace of the class, so the namespace operator :: is used to access them.

    And the sort question was answered by somebody else, but here is a bit more on the subject: if a bunch of classes share portions of their interfaces, and the shared subset is enough to perform a useful operation, why not share the implementation of the operation? While you could certainly "tell a vector to sort itself", it makes just as much sense to "apply the sort operation to a vector". Sort is not a primitive operation on a vector, it doesn't require access to the internals of the vector for efficient implementation, so there is no reason to make it a member.

    The rest I agree with. A C++ master can do wonderous things, but there are few C++ masters out there. Very simply put, it is really tough to come up with the BEST way to do something in C++ -- there is always more than one way to do it, and doing it perfectly can take an impossible amount of time.

    Also, since many useful concepts are possible to implement optimally, yet not built-in to the language, there are way too many libraries. How many different ways are strings expressed? Too many.
  • by swillden (191260) * <shawn-ds@willden.org> on Tuesday September 16, 2003 @02:50AM (#6972753) Homepage Journal

    You repeat some common myths about GC; allow me to counter them.

    The obvious: CPU & memory overhead for the checking and tracking. I can't comment on the amount here, but it is a generalized solution, so you forego the optimization opportunities that you'd otherwise have.

    Malloc/free and new/delete (without pooling) are also generalized solutions, and they also consume CPU and memory overhead for checking and tracking. There is good reason to believe that in the right type of language (which C and C++ are not) that GC can actually be much more efficient than manual deallocation, mainly because it can do its work in larger batches, and because it can reorganize objects in memory to make allocating more efficient. Contrast a simple single-heap malloc implementation, which has to scan a free list looking for a sufficiently large block against a copying garbage-collected system where the allocation pool is simply a large contiguous block from which you just grab the first 'n' bytes.

    If you look on Boehm's web site, you can find a few papers comparing the performance of conservative GC for C with optimized malloc/free implementations. malloc/free wins, but not by as much as you'd expect.

    The subtle: Memory allocation can become a major bottleneck in multithreaded systems. Garbage collection has similar issues.

    Actually, GC *eases* the issues associated with recovering memory in multithreaded systems. Why? In a multi-threaded program with manual deallocation, both allocation and deallocation occur in every thread context. In a GC system, all deallocation is typically concentrated in a single thread, the GC thread. Allocation is still spread across threads but the required interlocking is hugely reduced since the GC thread can do all of the reclaiming, block coalescing and free list construction (if that's the technique used) without any interference from the other threads. It will have to acquire a mutex to place the recovered blocks back where the active threads can get them, of course.

    Generational, copying GCs can do even better, but not for C or C++.

    The irritating: you don't know when your destructors are called.

    As experience with finalize methods in Java has shown, you should really treat GC as a way of having infinite memory. The problem with finalizers/destructors is that not only do you not know when they'll be called, you have no way of knowing that they'll *ever* be called. That means that they're effectively useless and add significant complexity and overhead for little or no return.

    IMO, if you want to use C++ with GC, you should make sure that objects that have non-trivial destructors (those that do something besides memory management) get destructed normally, and just let GC handle the memory.

    Reference Counting Smart Pointer (RCSP for short)

    They're useful, but I'd hardly call them great. Reference counting is *far* more compute-intensive than scanning-type garbage collection. And then there's the problem of circular references, which will never be reclaimed. Of course, it's not that hard to avoid those situations most of the time, but with GC you don't have to care.

    Owning Smart Pointer (OSP) ... auto_ptr

    These are very useful, and conscientious use of them will eliminate 95% of memory leaks and dangling pointers. OTOH, they don't work when things get complex enough that ownership isn't simple and clear.

    Also, you can optimize your smart pointers for individual types (through template specialization). A great example is to give the no-longer-needed object back to a pool for later reuse.

    Generally, I would do this through specialized new/delete, rather than specialized smart pointers. Regardless of the mechanism, though, pooled allocation is the absolute best thing you can do to minimize the cost of memory management in your application. The reason is, of course, that you build the pooling based on your knowledge of the actual usage characteristics of the objects; knowledge that no general-purpose memory manager can possibly have.

  • by jdfekete (316697) on Tuesday September 16, 2003 @07:15AM (#6973656) Homepage
    I used BW Garbage Collection on an Information Visualization system here [umd.edu] available under the GPL.
    It works, but with some problems, mainly due to the fact that it doesn't knows enough of the OS, in particular large pages allocated by libraries that it has to scan for pointers. There is nothing that cannot be fixed in theory, but systems are not designed for it right now.
    On my visualization application, it spends seconds scanning some memory mapped zone opened by NVidia OpenGL implementation (this is a guess from looking at the stack trace in gdb).

    For the performance talk, there are interesting figures in the book Garbage Collection [amazon.com]
    and at Boem's site showing that reference counting costs more than garbage collection.
    However, reference counting is predictable and does not interrupt interactive applications at random moments.
    In particular for multithreaded applications, reference counts should be guarded by a mutex that is very expensive and can be avoided using a GC.

    One possible alternative to C++ with BW is using gcj, the Gnu Java Compiler. It uses the same back-end than G++ for producing optimized code but is closely tied to the BW garbage collection, providing information about how objects are organized in memory, improving the marking-time of the GC.
    gcj produces code that can be linked with C and C++ without having to resort to JNI.
  • Re:GC costs (Score:3, Insightful)

    by be-fan (61476) on Tuesday September 16, 2003 @11:25AM (#6975667)
    Your description is slightly inaccurate. He said that explicitly freeing allows the next alloc to reuse a given chunk of cache-hot memory, while the GC will ignore that memory and allocate a cache-cold chunk instead.
  • Very happy... (Score:3, Insightful)

    by DrCode (95839) on Tuesday September 16, 2003 @12:07PM (#6976224)
    A few years ago, I used the Boehme GC when writing a pair of compilers (Verilog/VHDL) in C++. I was very happy with the result, since it was rare for GC even to get called at all. It was also surprising how much simpler code gets when you don't have to worry about deleting objects.
  • by umofomia (639418) on Tuesday September 16, 2003 @04:51PM (#6979264) Journal
    There's a really good book [amazon.com] about everything you ever needed to know about garbage collection. Although most of the book deals with garbage collection techniques in general, it has two complete chapters devoted to implementing and using garbage collectors in C and C++ and which ones you should use depending on your application needs.
  • by swillden (191260) * <shawn-ds@willden.org> on Wednesday September 17, 2003 @01:21PM (#6987019) Homepage Journal

    iow, it's really worthwhile to think about memory management issues, who owns memory, etc, and not just for the free() call, but for the design itself. It pays back a lot to think of these things, and use GC for particular cases that can benefit.

    I agree with this, actually. I've seen many cases where having to think about object lifetimes has given me clearer insights into the problem domain and into the design, and resulted in better, tighter, cleaner and more maintainable code than would have been the case otherwise.

    However, I've also seen some systems where the cleanest, most flexible and most maintainable design made keeping track of object lifetimes a real bitch. So while I wholeheartedly agree that being forced to think about memory management can improve the overall quality of most designs, there are circumstances where having to manage memory manually forces you to choose an inferior application architecture. In those cases, GC is a *huge* win, since choosing the right structure is the single most important thing you can do to facilitate both implementation and maintenance.

    In addition, there are those cases where the code just doesn't matter that much; and GC is certainly an aid to getting the job done quickly, because it removes a large class of concerns from the programmer.

    I'm not necessarily a huge fan of GC, sometimes it's great, sometimes it sucks -- it's just another tool. But I had to respond in this thread because there are a lot of misconceptions about GC (and the alternatives, like refcounting smart pointers).

Passwords are implemented as a result of insecurity.

Working...