The New Garbage Man
We've all heard of "garbage collection" in reference to cleaning up after your memory allocations. Two graduate students at the Illinois Institute of Technology are proposing a new method of garbage collection and memory allocation called DMM. What they are trying to do is "map basic software algorithms such as malloc(), free(), and realloc() into hardware." It sounds as if it has some promise, but can it ever be as flexible as some of the memory managers used today?
even if... (Score:2)
If they were to do this, it would probably increase performance, but only marginally. At least in Linux programs, when I've done profiles, most of the cycles were taken up in console I/O: printf, etc.
It would also require a modification to the kernel - so that the brk() system call would simply call the hardware variants.
I think that it probably *would* remove flexibility, but at the same time I think that's a good thing. Part of the problem with memory leaks and stuff is the flexibility...
Then again, I might just be talking out of a bodily orifice.
If you can't figure out how to mail me, don't.
dump (Score:2)
They should read Stroustrup's Answers (Score:2)
I'm not giving everyone the right to go out and write inefficient code. It just seems that this research will not get very far. By the time they have something working, memory will be faster than their hardware solution. Sure, you could still use their hardware to make things even faster, but only if they get the cost down very low.
Bjarne looked at some specialized machines. They just weren't cost-justified. I would have to think that is the case here.
I'd rather see research into fabricating memory that ran at core processor speeds. That would speed up every memory access, not just the malloc()s and free()s.
--
malloc is pretty complicated to put in hardware (Score:3)
Flexibility? (Score:1)
But it seems like, unlike games and demos, most programmers are using "APIs" for their memory management. So I don't think switching to hardware would do anything but speed things up. It could also allow for system types that we haven't seen before, with much, much more use of dynamic memory.
[ c h a d   o k e r e ] [dhs.org]
It's always been a good thing... (Score:1)
(Of course, both problems become easier if one programs in an appropriately intelligent functional language, not C -- a language that facilitates integrating such things in a transparent way -- but that's most definitely another rant...)
When I hear questions like "but will it be flexible?" I can't help thinking that it sounds a little like whining from folks comfortable with their current programming model. Considering the number of problems that memory management gets programmers into, you'd think that any move towards a fast, automatic system (think George W. -- Java on cocaine) would be welcomed...
Did RISC teach us nothing !!! (Score:4)
New project, but old idea (Score:1)
"The Java virtual machine does not assume any particular implementation technology, host hardware, or host operating system. It is not inherently interpreted, but can just as well be implemented by compiling its instruction set to that of a silicon CPU. It may also be implemented in microcode or directly in silicon." from Chapter 1, the Introduction.
Since the JVM is responsible for garbage collection, this implies that the concept of implementing memory management in the hardware is at least 4 years old.
I find it hard to believe no one's thought of this idea before for C, C++ (and so on) either.
I wouldn't knock it till they try it or, at least, someone publishes some calculations that prove or disprove the possibility of major speed-ups.
Re:They should read Soustroup's Answers (Score:2)
Um, that's what they're doing, basically. These 'specialized machines' could be included in the next Intel/AMD chipset.
I'd rather see research into fabricating memory that ran at core processor speeds. That would speed up every memory access, not just the malloc()s and free()s.
Well, I'm sure we'll have 800MHz RAM in a few years; of course, CPU speeds will be around 3GHz by then. The only way to get (cheap) RAM speeds that fast is by stopping CPU development. Anyway, it doesn't really matter, because almost all memory accesses come out of the CPU catch. (If you have a PC, try disabling the l1 and l2 catches in the BIOS and see how fast your system is without them.) Speeding up raw main memory won't get you that much of an increase. And what would you rather pay for, a $25 chip that speeds up a few routines, or $10,000 worth of ultra-high-speed RAM?
[ c h a d   o k e r e ] [dhs.org]
Related link... (Score:2)
Lisp machines? (Score:3)
Why they used FrontPage (Score:1)
bad? (Score:1)
Re:They should read Stroustrup's Answers (Score:1)
Sorry to be anal about this, but it is "cache".
I understand they are trying to make a few specific things faster, but my point is: work on making the RAM run at processor core speed (which, for the most part, is what cache RAM runs at). What I'm thinking of is eliminating the cache and making main memory run at cache speed. The reason cache exists is that main memory is slow. If main memory were very fast, there would be no need for cache.
And, quite frankly, unless my $25 can speed my system up noticeably, I won't spend it.
Why would cheap RAM speeds stop CPU development? I could see increased RAM production lowering CPU production, but not stopping development.
--
Re:Related link... (Score:1)
He wrote a lot of code. Big deal. He doesn't have any theoretical training that would come into play here, just programmer feedback.
are they looking for speed? (Score:1)
Re:They should read Stroustrup's Answers (Score:1)
[ c h a d   o k e r e ] [dhs.org]
You Don't get it (Score:1)
1) Garbage collection is good for two reasons:
i) You simplify your interfaces. W/o garbage collection, your interfaces need to be designed to keep track of what has or hasn't been allocated. This makes programming simpler. If you don't use GC, you essentially are "reference counting", which is one of the worst-performing GC algorithms, in fact.
ii) You'll have fewer memory leaks. Memory leaks are difficult, often impossible, to detect in large applications. W/o GC, the amount of memory leaked is unbounded in the program size. With GC, you can formally prove that memory leaks will remain bounded.
So with all these good things going for GC, why is it not commonly used?
1) It is often slower.
2) It's not a standard part of C/C++. (But it is in Perl, Java and just about every other "modern" language).
The hardware knows when a reference/pointer is used. Software often doesn't, and has to rely on techniques like reference counting or mark&sweep. This would be a *lot* easier if a few simple hardware hacks were thrown in (not too many transistors). Remember, b/c of Moore's Law, CPU makers have more transistors than they know what to do with. It's kind of like "test and set bit": implementing this in hardware is waaay easier than implementing locks in software.
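To make the test-and-set analogy concrete, here's a minimal sketch of a spinlock built on that primitive. (It uses today's C++ std::atomic_flag, which is my anachronism, not something from the post; the hardware instruction underneath is the same idea.)

#include <atomic>

// test_and_set() compiles down to the hardware test-and-set (or an
// equivalent atomic exchange) on most architectures.
struct Spinlock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
    void lock()   { while (flag.test_and_set()) { /* spin until we win */ } }
    void unlock() { flag.clear(); }
};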
As for #2, well, talk to Bjarne about that..
-Amit Dubey
Re:Why they used FrontPage (Score:3)
FrontPage's lack of proper software garbage collection is the perfect environment to test hardware garbage collection in.
Wouldn't hardware dedicated to collecting software garbage tend to delete FrontPage entirely?
Non C Languages? (Score:1)
Instead of moving a few C functions to hardware I'm with most people here who think that we should just create a faster memory subsystem.
-- iCEBaLM
Re:Did RISC teach us nothing !!! (Score:2)
If we want things to be "DAMN fast", why don't we just write everything in assembly? Let's hand-tune all our code! Yeah! (David Cutler complained about this when first implementing NT at Microsoft, btw -- he wanted to develop for multiple platforms to force cross-platform compatibility, but MS programmers of the time would automatically code in x86 assembly to make things faster...)
Garbage collection improves the memory management model, which is the point in the first place. Which is better: slightly slower code that reduces the number of memory errors and security holes, or faster code with bugs?
(For comparison, which is better: portably written, slower C or fast, nonportable, nonreadable x86 assembly?)
Hardware design vs. HTML - what's important (Score:2)
I'm not saying that this is a good/bad idea, or that it will/won't improve performance. I don't know. I do know that the two men I've met are two of the very few people in IIT's CS department who might come up with something usable, regardless of their lack of English and HTML skills.
The English I saw browsing the pages was really not bad, and even if the page was done in FrontPage it's still better than the main IIT site (take a look if you don't believe me). Many successful open-source software projects involve people who do not use flawless English, but that does not affect the quality of the work.
Re:Non C Languages? (Score:1)
What do they call when they want memory? Malloc. It's what anything calls when it wants memory, ultimately.
Let me clarify: (Score:1)
2) This doesn't counter the RISC philosophy (i.e., many RISC chips have test&set bit instructions)
3) GC *IS* good. GC is for incompetent programmers just like "high-level" languages such as C were for incompetent programmers in the 1970s and 80s.
Re:This NINJA says... (Score:1)
Re:Did RISC teach us nothing !!! (Score:1)
The ASM is better, depending on what you're doing.
For example if you're writing some small low-level code,
A) It's short, so porting isn't really an issue. A few thousand lines of ASM code is just as easy to rewrite for another arch as it is to port.
B) ASM is just as readable as C these days. Take a look at MASM (Microsoft's assembler).
C) It's fast as hell, and for small critical sections of programs it's really useful.
Re:Better CPU == faster mem! WRONG! (Score:1)
Re:Flexibility? (Score:1)
But it seems like, unlike games and demos, most programmers are using "APIs" for their memory management. So I don't think switching to hardware would do anything but speed things up.
True, programmers use APIs for memory management, but what do these APIs call? malloc(), more than likely.
Not Garbage Collection, but malloc/free (Score:4)
They are going to insert malloc/free into hardware; that doesn't mean they will build garbage collection into the hardware (which is a completely different thing). However, as the project description says, it could improve garbage collection performance, but that is another issue.
DMM (Dynamic Memory Management) is a well-known term, not something new, as the Slashdot article makes it sound.
Building DMM into hardware is not a new idea. However, I'm not really sure if it has been successfully implemented in any "real" CPUs.
Re:Non C Languages? (Score:2)
But they all do it differently, and even the same language under different compilers allocates memory differently; this is what I'm talking about. If you implement C-style malloc(), free(), etc. in hardware, then only C languages will benefit, because other languages do it too differently for it to be useful to them.
-- iCEBaLM
Re:They should read Stroustrup's Answers (Score:1)
Re:You Don't get it (Score:1)
Kind of cool - for real time systems (Score:5)
People would love to write programs for such systems using a high-level language like Java or C++. The main issue is that real-time systems must be able to deterministically predict how long any operation will take. Problem is, with current implementations of runtimes, the garbage collector could kick in at any time and take some arbitrary amount of time to finish, totally messing things up.
Let's say you've got a software-controlled robot arm, putting together a car or something (cheesy example). A start command is executed, and the arm begins moving. Then, right before the stop command was supposed to be executed, in comes your garbage collector. It does its thing and finishes, and the stop command executes - 50 milliseconds too late, putting the arm out of position. Oops - I hope you weren't planning on selling that car.
But if you can deterministically predict when and how long all of the operations will take, voila, you can just take them into account when scheduling what happens and everything is all hunky-dory. The hardware part of it is a bit of a distraction I think - the deterministic part is what's cool.
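To make "deterministic" concrete, here is a sketch of the software version of that idea: a fixed-size-block pool allocator whose alloc and free are both O(1), so their cost can be budgeted exactly in a real-time schedule. (The names and sizes are made up for illustration; this is not from the article.)

#include <cstddef>

const std::size_t BLOCK_SIZE = 64;
const std::size_t NUM_BLOCKS = 1024;

alignas(void *) static unsigned char pool[NUM_BLOCKS][BLOCK_SIZE];
static void *free_list = 0;

void pool_init() {                        // thread every block onto a free list
    for (std::size_t i = 0; i < NUM_BLOCKS; i++) {
        *reinterpret_cast<void **>(pool[i]) = free_list;
        free_list = pool[i];
    }
}

void *pool_alloc() {                      // O(1): pop the head of the list
    void *p = free_list;
    if (p) free_list = *reinterpret_cast<void **>(p);
    return p;
}

void pool_free(void *p) {                 // O(1): push the block back
    *reinterpret_cast<void **>(p) = free_list;
    free_list = p;
}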
C-- has automatic garbage collection (Score:1)
Re:Unix programmers need to clean up after themsel (Score:2)
On the flip side, Windows should release the memory of a process after termination. Memory leaks should not persist past the termination of a process.
And your god, BG, should take that advice too; I've yet to see ANY program except for Netscape with more memory leaks than Windows.
If you can't figure out how to mail me, don't.
Re:even if... (Score:1)
malloc/free is a major bottleneck in many systems, especially on SMPs. So, yes, a faster (lock-free) malloc/free would speed up some programs several times over (not just marginally). I have several examples of programs that spend up to 80% of their time in malloc/free. But of course this depends on the kind of program. Multithreaded object-oriented programs running on SMP hardware are definitely one category of programs that would benefit from this.
It would also require a modification to the kernel - so that the brk() system call would simply call the hardware variants.
Most likely, but is that really a problem?
I think that it probably *would* remove flexibility, but at the same time I think that's a good thing. Part of the problem with memory leaks and stuff is the flexibility...
Yes, it would probably remove some flexibility (which is not a good thing). But I don't see what you mean when you say that flexibility is part of the memory leak problems.
Re:Garbage Collection is for Incompetent Programme (Score:4)
So why did those non-incompetent programmers who know all about managing memory on their own bother with writing these tools for the weak-minded? Mainly, it's that good programmers understand the principles of program design. One important design principle is that when you want to know what your program is doing at the top level, you shouldn't have to care what is really going on at the lower levels. That principle is the reason we have operating systems and compilers, for example. All of those things allow the programmer not to have to worry about what goes on under the hood, not because the programmer is stupid, but because the programmer's time is better spent solving the problem than dealing over and over again with low-level details that are no more specific to the program at hand than to any other program in the world.
Garbage-collected languages do the same thing- you trade a little speed (the idea that garbage collectors must be slower than hand allocation and deallocation actually turns out not to be true, incidentally, but in practice it often is) for the ability not to have to think about a ridiculously machine-level idea (id est, 'in which cell of memory am I going to write down this particular chunk of data?') so your code can just concisely reflect what your program actually does- quicker to write, easier to read, better all the way around. A smidgen slower, but who cares? You can certainly write programs of acceptable speed in garbage-collected languages, so it's no big deal unless you're writing something really speed critical, like tight graphics rendering loops or something, in which case us softheaded garbage-collection-needing incompetent programmers just use inline assembly. =)
Re:Garbage Collection is for Incompetent Programme (Score:1)
it's been done before... (Score:5)
None of those machines have caught on. The reason is that generally, you seem to do better if you invest the same kinds of resources into just making your processor faster.
Simply implementing existing malloc/free algorithms in hardware is unlikely to help. Those are complex algorithms with complex data structures, and they are probably memory limited. Where hardware could help is by changing the representation of pointers themselves: adding tag bits that are invisible to other instructions (allowing pointers and non-pointers to be distinguished, as well as space for mark bits), and possibly representing pointers as base+offset or some other kind of descriptor that makes it easier for the allocator to move memory around without the C/C++ program knowing about it. As a side benefit, this kind of approach could also make C/C++ programs much safer and help run languages like Java faster.
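In software, the closest analogue to those invisible tag bits is stealing the low bits of aligned pointers. A purely illustrative sketch (this is the software trick, not what the hardware proposal specifies):

#include <cstdint>

// On 8-byte-aligned allocations the low 3 bits of a pointer are always
// zero, so they can carry an "is pointer" flag or GC mark bits without
// widening the word.
const std::uintptr_t TAG_MASK = 0x7;

std::uintptr_t tag(void *p, std::uintptr_t bits) {
    return reinterpret_cast<std::uintptr_t>(p) | bits;
}
void *untag(std::uintptr_t v) {
    return reinterpret_cast<void *>(v & ~TAG_MASK);
}
std::uintptr_t tag_bits(std::uintptr_t v) {
    return v & TAG_MASK;
}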
But altogether, after having seen several attempts at this, I'm not hopeful. Unless Intel or Motorola do this in a mainstream, mass market processor, it will be very expensive and not perform as well. If you want those features, you end up getting better performance at a lower price by simply buying a mainstream processor without the features and simulating what you need.
sounds like TAG/CAM enhancement (Score:2)
Your basic cache/main-memory hit, miss, refresh, and stale address mapping functions, but with a more flexible and comprehensive interface.
Remember the VAX! (Score:5)
Argh! Moving stuff to hardware, when we have long struggled to get rid of esoteric hardware instructions?!? The whole idea behind RISC is to remove this kind of stupidness.
History is full of examples where this kind of hardware "feature" made things faster originally, but has since become a bottleneck instead of an improvement. "rep movsb" made sense on the 8086, but on a Pentium Pro it's just another thing that makes the chip more complicated, big, and power-consuming.
And as for garbage collection, those of you who say that "GC is slow" are just plain wrong. There is a lot of research on garbage collection, and nowadays a good GC may be faster than manual malloc/freeing. It does not lock up your programs either - there are incremental GC's which you won't notice. They are even used in real-time environments today.
A major hindrance for good GC's in commonplace programs is that people usually write in C(++), and you can't have a good GC in a programming language which allows pointers. Just to give an example: I once saw a program which used a doubly-linked list. But instead of storing "next" and "prev" pointers, it only used one value - the XOR of prev and next! It was possible to traverse, since you always knew the address of either the previous or the next entry, so getting the other was just an XOR operation. But please tell me, how the h* is a GC algorithm supposed to know this? Even if you use the loosest approach, "treat every single word in memory as a possible pointer", it will fail - since there are no real pointers at all!
So the result is this: There are lots of good GC's around, but they can't be used in C - so people don't know that they exist. More or less.
Re:Unix programmers need to clean up after themsel (Score:2)
He mentioned he was porting to windows. In my experience, the windows kernel is much worse at handling things like that than the unix kernel is. Sometimes you have to reboot to free the memory.
If you can't figure out how to mail me, don't.
Re:Better CPU == faster mem! WRONG! (Score:1)
No. Memory is limited by latency. It doesn't matter if the CPU is 100GHz; if it has to wait on the memory, the whole system will slow down.
It's like if you're in a car, you can go 150, the speed limit is 150, but the guy in front of you is going 50. The speed of your car doesn't matter, and the speed limit doesn't matter either. You're going 50.
Kindly hold back your flames till you know what you're talking about. It makes you look stupid.
If you can't figure out how to mail me, don't.
Re:They should read Stroustrup's Answers (Score:2)
No, the reason RAM is cheap is because it is slow. There's a difference.
Do you understand how a cache works? The speed of the cache is not what works the magic; that's only part of it. Caches work on the principles of temporal and spatial locality. RTFM.
The CPU spends hardly any time whatsoever accessing memory outside the cache; speeding up main memory to core speeds wouldn't speed the computer up very much at all.
There would be no need for a cache if the processor could read and write to main memory at core processor speeds.
What I'm saying is that I'd rather see main memory speed up and the elimination of cache. I don't want to turn main memory into cache. That won't work.
You are right about disabling cache slowing down your system. If you had no cache and your RAM ran at core processor speeds, your system would be faster than a system with core speed cache and low speed RAM.
Sure, pricing is the key point here. But the research dollars could be better spent. Processors are plenty fast today. I/O subsystems are the areas that need our attention. If core speed RAM came out, geeks like us would be the first in line to buy it. Everything is high priced when it first hits the market. Prices would drop.
--
Re:What's wrong with reference counting? (Score:1)
Say you wrote:

struct LinkedList {
    LinkedList *next;
};

LinkedList *a, *b;
a = new LinkedList;
b = new LinkedList;
a->next = b;
b->next = a;

If you then never used a or b again, a reference-counting GC wouldn't be able to collect them: both still have a non-zero reference count!
Considering that circular linked lists *do* happen, this is a BAD thing.
Re:Unix programmers need to clean up after themsel (Score:1)
Re:malloc is pretty complicated to put in hardware (Score:1)
And anyway, how hard can malloc and free be? I mean, sure, there are questions of optimizing for speed and lack of fragmentation, etc., but I hardly think it can be that complicated.
Garbage collection, on the other hand (which is what they're actually talking about) is slightly more mind-bending - especially when you get into incremental GC. Ouch - my head hurts.
But then, I don't know anything about memory management, so I could be wrong.
Re:Did RISC teach us nothing !!! (Score:1)
Um, no, I don't think so. You are wearing rose-tinted glasses my friend.
Unless you're knee-deep in assembly macros, in which case that's strictly speaking at a higher-level than assembly anyway.
Is this really anything new? (Score:1)
Re:bad? (Score:1)
How about making sure you're using the correct registers? Isn't improper saving and restoring of registers a performance problem? How about simply not using registers when you should? We have compilers now, though, that take care of that for you, and you don't even have to think about it at all.
GC could work the same way--if it were fast enough. Now, putting it into hardware is something I have more of a problem with, simply for cross-platform issues, but simply hiding memory allocation isn't necessarily a Bad Thing.
Not true: GC is easy!!! (Score:1)
I disagree with you that writing a GC has to be difficult. I've written several, and all of them have been a page or less of code. Here's the GC from my minimal Lisp interpreter, which I call Stutter:
void garbage_collect(void)
{
    CELL *cell;
    int i, count = 0;

    mark(binding_list);
    for (i = 0; i < protect_used; i++)
        mark(protect_table[i]);
    for (cell = heap, i = 0; i < heap_size; cell++, i++) {
        if (!cell_mark(cell)) {
            cell_car(cell) = free_list;   /* thread the dead cell onto the free list */
            free_list = cell;
            count++;
        }
        cell_mark(cell) = 0;              /* clear the mark for the next collection */
    }
}
Surely, this is not rocket science.
-- GWF
Re:They should read Stroustrup's Answers (Score:1)
The reason that we have all these layers of cache is that it is cost-effective and resource-effective. Good luck making full-speed RAM with low latency that is inexpensive. I believe it would require a complete paradigm shift.
Re:It's always been a good thing... (Score:1)
Re:They should read Stroustrup's Answers (Score:1)
Exactly. That's what needs to happen. Even if Intel and AMD manage to release a 1GHz processor, the thing will just be wasting clock cycles.
Processors are plenty damn fast today. But they are useless if they don't have information to process. If we could process things without making the CPU wait, then we would see some pretty snappy performance.
--
Re:My German is good!!! (Score:1)
Re:They should read Stroustrup's Answers (Score:1)
Re:What's wrong with reference counting? (Score:1)
I understand that this is working at a higher level than a garbage collector would work at, but it works very well.
Re:Not true: GC is easy!!! (Score:2)
Re:New project, but old idea (Score:1)
However, this example only illustrates the truism that any resources directed at specific hardware implementations would be better off being directed at making a general CPU faster. These chips don't exist today because Intel and AMD have pushed the envelope so far that they just aren't necessary, or couldn't keep up.
(For the record, I am a Java programmer so this isn't an anti-Java bias speaking).
Hotnutz.com [hotnutz.com] - Funny
Re:Not true: GC is easy!!! (Score:1)
Writing a good GC is NOT easy; that's why there are so few of them. (For instance, ref counting SUCKS!)
Gavin E. Mendel-Gleason
Re:It's always been a good thing... (Score:1)
Well, let's see...there's room to speed up functional languages. You can make them implicitly parallel (try that in C). They're easier to implement garbage collection in.
So which one would you choose?
Re:Did RISC teach us nothing !!! (Score:1)
Re:bad? (Score:1)
Sometimes, the only correct way to make a free() call is to not make it. You don't know and have no way of knowing if you've got the last active reference to the block of memory or not. In such cases, using anything other than garbage collection is not coding "correctly".
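A minimal illustration of that situation (hypothetical code, just to show the problem):

#include <cstdlib>

struct Node { int value; };

int main() {
    Node *n = static_cast<Node *>(std::malloc(sizeof(Node)));
    Node *a = n;    // one part of the program holds a reference...
    Node *b = n;    // ...and, unknown to it, so does another
    std::free(a);   // was that the last active reference? a's owner can't know
    // b now dangles: reading b->value here would be undefined behavior
    (void)b;
    return 0;
}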
--
Re:Remember the VAX! (Score:1)
So always be careful when you think you have a panacea, because you never do; you only have a false sense of security.
Re:Non C Languages? (Score:1)
If this functionality were moved to hardware, and the OS modified to support this, all languages would benefit. If the OS was not modified to support this, no languages would benefit, not even C.
I think you have some very fundamental misunderstandings about how memory allocation works. For the purposes we're discussing here, all non-stack based programming languages do it identically. Pascal, C, Ada, etc...
--
Re:Unix programmers need to clean up after themsel (Score:1)
It's not a matter of "better or worse", but of "different". If you're porting to Windows 3.1 or Windows 9x in 16-bit mode, then yes, you can corrupt the global heap by failing to release memory after a process terminates. That's because all global memory in those models is shared among all processes, and so the system can't clean up without possibly destabilizing some other process. Bizarre as it sounds, that's not a bug, it's a feature. It's one of the ways that Windows 3.1 and 9x preserved backwards compatibility with DOS, in which all processes had relatively direct access to the physical memory.
If you're porting to Windows 9x in 32-bit mode, or if you're porting to Windows NT, then, no, processes clean up after themselves when they shut down. That's because Win32 uses an explicit shared memory model, with shared memory retained against the file system. Thus, if two processes share memory, and the memory that one of them holds is released by process termination, memory held by the other process doesn't get corrupted.
C optimizations can make gc difficult too. (Score:1)
First of all, thanks for pointing out that hack for implementing a doubly-linked list with the XOR of prev and next. I love interesting tricks like that.
Garbage Collection by Jones and Lins points out that even if a programmer stays away from code like the hack in your example, an optimizing compiler can still make life difficult for a garbage collector. They give an example of this (pg. 247).
Re:Garbage Collection is for Incompetent Programme (Score:4)
void *mem_p;

void foo(void) {
    mem_p = malloc(somesize);   /* allocates into a global */
}

void bar(void) {
    /* ... uses mem_p ... */
}

void bang(void) {
    foo();
    bar();
}
We can refactor this to:
void foo(void *mem_p) {
    /* ... */
}

void bar(void *mem_p) {
    /* ... */
}

void bang(void) {
    void *mem_p = malloc(somesize);
    foo(mem_p);
    bar(mem_p);
    free(mem_p);   /* safe: bang() outlives every use of mem_p */
}
We can do this because the lifetime of the body of bang() completely encompasses the lifetime of mem_p, which neither foo() nor bar() does.
Unfortunately, OO programming makes GC significantly less optional. And exceptions make GC no longer optional at all. Consider the following C++ code:
{
    someclass *ptr1 = new someclass();
    someclass *ptr2 = new someclass();
}
Can you spot the bug? If the program runs out of memory trying to allocate the second class, new throws an exception. But since the exception isn't immediately caught, the stack is simply popped, and the memory pointed to by ptr1 is leaked.
The solution, as any C++ programmer will tell you, is to use smart classes and smart pointers, which implement a limited (reference-counted) form of GC. I.e., if the language doesn't have GC, you have to add it.
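A sketch of that fix using a modern reference-counted smart pointer (std::shared_ptr postdates this discussion and is my substitution, but the mechanism is what the poster describes):

#include <memory>

struct someclass { };

void demo() {
    std::shared_ptr<someclass> ptr1(new someclass());
    std::shared_ptr<someclass> ptr2(new someclass());  // may throw std::bad_alloc
}   // if the second new throws, ptr1's destructor still frees its object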
There are other reasons to use GC: by using GC you can more effectively divorce the implementation of a class from its interface. In a GC system you don't have to export information about the memory management aspects of the implementation. This is especially important if you have two different classes implementing the same interface, but with different memory management requirements.
Which is faster, GC or manual allocation, often depends upon both the program in question, and the GC algorithm implemented. There is some data indicating that copying garbage collection is the fastest in complex situations- what you lose copying the data you win back in better cache and virtual memory utilization.
Implementing malloc() in hardware not very smart (Score:1)
Reference counting is the more basic solution to the same problem.
> If you don't use GC, you essentially are
> "reference counting", which is one of the worst
> performing GC algorithms, in fact.
Maybe so, but wouldn't it be a lot smarter to implement a couple of instructions that would speed up reference counting considerably, instead of implementing a couple of high-level functions?
Re:Remember the VAX! (Score:1)
Call me pedantic, but Java and Lisp have pointers, they just don't have pointer arithmetic.
Re:Not true: GC is easy!!! (Score:1)
Come on, saying that's all you need for a complete garbage collector is like saying this code is a complete garbage collector:
void gc(void)
{
    mark();
    sweep();
}
Yes, now that I look at it, you're right! Garbage collection is easy and requires only two lines of code!
Get real. Show me a GC that (a) deals with objects for varying type and size, not an array of CELLs, (b) preserves blocks of memory pointed to by nothing in the heap but by a pointer on some processes' stack, and (c) finds pointers on the heap while not knowing the internal structure of the objects allocated there.
If it doesn't satisfy those three requirements, let's not pretend it's anything other than a toy. Even at that, we're talking about a fairly primitive GC. Once you've solved those problems, make sure your GC is (a) fast, (b) thread-safe, and (c) reasonably real-time. Then we're talking about something that might be considered for use in a real program...
I've written several garbage collectors myself, and I can assure you they don't take a page or less of code. Not if you understand the real issues involved in writing real garbage collectors.
--
Re:Better CPU == faster mem! WRONG! (Score:1)
Only to a certain point. Once the CPU speed overreaches the latency of the memory, the memory becomes the roadblock.
It introduces "wait states": clock cycles where the CPU isn't doing anything but waiting for the memory to respond. These add up.
If you can't figure out how to mail me, don't.
Re:Garbage Collection is for Incompetent Programme (Score:1)
I work for a large software company. One of my major jobs is training new developers in "best practices". What you've just said is one of the classic things that people in academic circles believe...but which is utterly and unforgivably wrong. I spend a lot of time teaching them that they do care very much about how the system works, and that they care a lot about how the memory allocator in their system works.
You see, most modern commercial software is well-designed. But efficient design will only take you so far. One of the key performance bottlenecks on a modern computer is simple and straightforward: page faulting. The basic joke we tell is that doubling the speed of your processor gets you to your next page fault twice as fast...and it takes as long with a newer machine as it did before. This means that the cheapest way to get a performance improvement is to avoid page faults -- and you do that by avoiding heap allocation at all costs. If you need to do a malloc, do it in two stages: allocate a small buffer on the stack, test the space needed, and use that buffer if it's big enough. It's on the stack, so it won't get page faulted out. (But always check that it is big enough...there's this thing called a buffer overflow condition that you risk there.) Only fall back on the heap when you have to. Allocate several contiguous pages of memory to handle list nodes, and use them. Only go outside that block when you must. Etc.
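A sketch of the stack-first idiom just described (the function name and the 256-byte threshold are illustrative, not from any particular codebase):

#include <cstddef>
#include <cstdlib>

void process(std::size_t needed) {
    char stack_buf[256];                 // on the stack: no page-fault risk
    char *buf = stack_buf;
    bool from_heap = false;
    if (needed > sizeof stack_buf) {     // test the space needed first
        buf = static_cast<char *>(std::malloc(needed));  // heap only when forced
        if (buf == 0) return;
        from_heap = true;
    }
    /* ... use buf[0 .. needed) ... */
    if (from_heap) std::free(buf);
}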
And this is for any program. Do you allow the user to undo things he's done? Then you maintain an undo stack. Allocate nodes for it efficiently. Are you doing searches in dynamic lists? Allocate your tree efficiently. It's not enough to know that you need to use a B-tree or an AVL-tree to keep your data around; you must also make sure that you keep stuff compact. Do you write in C++? The x86 code to support C++-style exceptions (or Java-style exceptions) is fantastically expensive, since the x86 doesn't support dynamic context unwinding in hardware. Maybe you're better off doing nothing in your constructor, and using an explicit initialization step afterwards. That way, you don't need to handle the throwing of exceptions.
Don't get me wrong: good design is the most important key to successful coding. A well-designed piece of software can easily run ten times faster than a badly designed piece of software that does the same thing. But implementation is also critical, and that often means knowing not just how the algorithm works, but also how much operations cost on the target system.
Re:Kind of cool - for real time systems (Score:3)
Re:Did RISC teach us nothing !!! (Score:1)
> Um, no, I don't think so. You are wearing rose-tinted glasses my friend.
> Unless you're knee-deep in assembly macros, in which case that's strictly speaking at a higher level than assembly anyway.
Um, YES, certain processors have *extremely readable* assembly. I know for certain that the 68k has very nice asm opcodes (it's very much like C, and reads like it, too). And this is with a dumb two-pass assembler (no macros, nothing).
The only reason "assembly is hard" is Intel and their awful pocket-calculator-descendant chip, the x86. Now, this chip has extremely weird instruction rules, with weird register accesses (some faster than others for certain operations)...
Re:Did RISC teach us nothing !!! (Score:1)
Re:What's wrong with reference counting? (Score:1)
Let's say you had a function LinkedList *CircularLinkedList() that returns a circular linked list.
c = CircularLinkedList();
OK. The count of the first element in the list is incremented. No problem. You do some processing, then you do
free(c);
the first element in the linked list has its reference count decreased. Its count is now 1. And it will remain 1, even though the only guy referring to it is unreachable!!!
This is a general failure of reference-counting techniques. There are other shortcomings, including extra space to keep track of ref counts, and the cost of maintaining the counts themselves. Think about it: every time a pointer is assigned, passed to a function, or returned from a function, you need to increment or decrement some counter *somewhere* in memory. And this can't be parallelized like mark&sweep can and run while your main process is waiting; it takes a hit on the main process itself.
It may work "OK", but better methods have long since been developed. I hope that I've explained this well, but if you still think reference counting is good, then obviously I haven't... because this isn't a contentious point.
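For the record, the per-assignment tax looks roughly like this (a sketch; Obj and destroy() are hypothetical):

#include <cstdlib>

struct Obj { int refs; };

static void destroy(Obj *o) { std::free(o); }   // hypothetical reclaim

// What a single pointer store "*slot = src" costs under reference
// counting: two counter updates somewhere in memory.
static void assign(Obj **slot, Obj *src) {
    if (src) src->refs++;                  // new referent gains a reference
    if (*slot && --(*slot)->refs == 0)
        destroy(*slot);                    // old referent may die right here
    *slot = src;
}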
Re:Non C Languages? (Score:1)
IIRC, in early IBM MVS (360 vintage), memory was allocatable from several subpools and subpools could be allocated/freed themselves. Surely 30+ years of "progress" has not made storage allocation more primitive.
In reality you would want to allocate a "few" large blocks from the OS, then sub-allocate pieces of those blocks within a subsystem. When you have shared memory and long-lived processes, things get interesting.
Re:Did RISC teach us nothing !!! (Score:2)
I tend to write two kinds of programs: simple user utilities (most of which can be done in Perl to ignore this garbage) and security-critical, low-level daemons. The former are not speed-sensitive enough to write in assembly. In the latter, I am more concerned about being able to instantly determine program flow and implications than having to grok assembly.
I have no problem with people using assembly in very specialized applications. But I think anyone considering using assembly for any reason but as an absolute last resort should be sent to work on Microsoft BOB.
Re:They should read Stroustrup's Answers (Score:2)
See, there is a reason why high-frequency connectors get smaller and smaller. There is a reason why GHz people don't use BNC and such. To get good characteristics on a high-frequency signal you are almost forced to shrink your design or else adhere to insane tolerances. The first way leads to on-chip cache; the second leads to very expensive RAM. Rambus is a good early indicator of how RAM gets expensive as you bump up the clock speed.
There is also a problem of synchronization. Even within one fast enough chip you can't have one clock running the show, because of relativity issues. Engineers are starting to deal with this today, so it's not a far-away problem. But now imagine a physically remote RAM chip (remote here can mean as little as a millimeter away). Even if you could run it as fast as the microprocessor, you couldn't keep them in synch.
Having ultra-fast RAM is quite pointless, IMHO. You want local caches all over your chip. In the future, chips will likely have many units on the same die, each working from its own clock and with its own cache.
Re:Garbage Collection is for Incompetent Programme (Score:1)
I think this line of thinking is plain ol' wrong. I understand what you're saying, but I think you're giving up tons of performance, especially in cases of, as you said, REALLY complex data. For instance, if you're using something that takes care of all the memory management for you and allows you to concentrate solely on the problem, something like, say, Java, you might be able to express your data conversion routine quickly, but oh my god is it ever going to suck to actually parse a few gigs of data.
At that point, having efficient memory management AND a tight loop could save you hours or even days of mining time.
Bottom line: anything that gets me the answers I want from my programs with less programming effort is a good thing.
Be sure to put that at the front of your user manual, I'm sure the people waiting on your code to do what THEY want will appreciate your insight.
And, damn is it ever funny that that post got moderated up.
-- blue
it's a graduate student project (Score:1)
Oh, and as far as comments about their HTML or English go, neither is particularly relevant to their research. I've had both of these guys as TAs for CS classes here at IIT, and I had Dr. Chang as a teacher. All three of them are not native English speakers and have very heavy accents. But while they are difficult to understand, they are all extremely knowledgeable about hardware and hardware design, and are some of the brightest people I have met in college.
Loss of Control (Score:2)
BUT, if they do this and make it so you can, say, force the garbage collection to do its thing at a given time, maybe that could be do-able... still pretty crappy for high-speed apps though that need consistent speed...
Esperandi
Cool XOR pointer trick for doubly-linked lists (Score:1)
That's a cool trick! Although it probably doesn't make the program especially readable. Does anyone else have any good tricks like this? Preferably relating to garbage collection, but offtopic will do.
Re:Loss of Control (Score:1)
Re:Remember the VAX! (Score:2)
Many garbage collectors just collect before expanding the heap, so that's not so bad; it's arguably better to page a bunch of stuff in all at once than to page something in on every call of free.
Re:malloc is pretty complicated to put in hardware (Score:2)
Please write a memory management system and then come back. I'm sure there's some classic understatement using almost exactly those words that I can quote at you, but it doesn't come to mind..
Daniel
WRONG APPROACH to programming (Score:2)
From the site (DMM):
>The memory intensive nature of object-oriented languages such as C++ and Java
From a reply:
>why not make the hardware faster?
Fellow programmers,
am I the only one who still remembers Assembler and the intensive search for memory-preserving methods?
Am I the only one who tries to make things fast without thinking about the processor?
In the 70s we struggled for every byte we could spare (yes, and we created the Y2K problem).
"Star programmers" like me even changed the code during runtime, by moving new instructions over old ones.
Yes, it was very hard to read, but it was top efficient.
Fellow programmers,
are you all too young, or what happened to common sense?
If I have to solve a problem in my daily life with my machines, I FIRST check if a SHELL SCRIPT can do it.
Not because I'm too lazy to use C, but because it might be faster.
If you run The Shell, there are built-in commands and external commands, some having the very same name.
I.e., a built-in "test" runs faster than the external "test" command.
BUT: your

case "$1" in
    hello) COMMAND ;;
esac

runs *** NOT *** faster than

if test "$1" = hello
then
    COMMAND
fi

as long as "test" is built into your shell (and it is, folks).
Am I the only one who knows that?
Java needs a lot of memory.
It's hip.
But what is Java REALLY ?
Nothing but an interpreted language that just happens to have support from MS and Netscape (and, due to this, is now in our Linux kernels).
In the 80s we used (and I still do) REXX.
It's also an interpreter language, can also be compiled to object code.
It runs on MVS, VM, Unix, DOS, MS Windows.....
At that time, there were just no browsers (besides Mosaic).
It can do everything Java can;
the only reason you guys and girls use Java is that Big Companies ship their interpreters with their browsers,
and it looks as if Java runs on its own, like magic.
The only thing that runs on its own is Assembler.
Now I don't want to say you should use Assembler,
but I think I need to remind us all
that hardware becomes faster and faster,
and that because of this, programmers get lazy and write code that runs on state-of-the-art hardware
but would run 4 times as fast if they first thought about RESOURCES.
A good program will run on a 486 at a reasonable speed and on a PIII like "real time".
I want programmers to THINK about how they code.
It IS possible to write applications that do not need extensive memory and a fast CPU (or two),
IF the programmer first THINKS about how to write something and optimizes the code himself, not only with a standard optimizer (if he uses one at all).
Read "The Unix Programming Environment" by
Brian W. Kernighan and Rob Pike.
After that, your programs will run 4 times faster.
Replies greatly appreciated,
fine day y'all, george./
Paranoid Programming (Score:2)
Somewhat off-topic. I like to set pointers to NULL after their objects have been freed or are no longer valid. This helps prevent the nasty situation of code referring to an object that is now a chunk of memory on the free list, or worse, allocated to some other use.
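In code, the habit looks something like this trivial sketch:

#include <cstdlib>

// Free through a pointer-to-pointer so the caller's copy is nulled in
// the same step; a stray later use fails fast on a null pointer
// instead of scribbling on recycled memory.
void free_and_null(void **pp) {
    std::free(*pp);
    *pp = 0;
}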
This has nothing to do with GC. (Score:3)
There seem to be a number of misconceptions as to what this is all about and how it might be useful in practice. I will offer some opinions on some of the issues raised in other posts.
The first one, as already mentioned by a number of people, is that hardware implementation of malloc and free has nothing to do with GC. The most difficult part of GC is to determine which part of the memory is garbage (this is not as easy as it may seem) without using lots of memory (after all, garbage collectors are typically only activated when free memory blocks are running low), and for those garbage collectors running as an independent thread, avoid possible race conditions with the foreground processes. Other issues a garbage collector faces include how to reduce the impact on overall system performance, and how to decrease memory fragmentation.
A garbage collector is not something easy to design and implement. Making a good garbage collector especially requires the almost-impossible combination of profound understanding in the memory usage behavior of typical programs, logical reasoning abilities, and coding techniques. In addition to the garbage collector itself, you also need support from the language and compiler side, and you have to integrate all of these into a clean design. That is about as hard as things like this can be.
(Of course, you can also write a conservative pointer-finding GC like the libgc used by many functional language compilers --- but that is far from the state of the art in this business.)
The proposed hardware support has nothing to do with the points we mentioned above, therefore has nothing to do with GC. Then, is it possible to build some hardware support for garbage collection? Maybe, but I am not an expert in this field. Whatever the solution turns out to be, it will never be as simple as hardware implementation of malloc and free.
Second, this also has nothing to do with real-time systems. Many people seem to think that being "real time" means you have to code everything in assembly language and make things run as fast as possible, but that is simply not true. Being (hard) real time means operations must respond and complete within bounded time; as long as that bound is satisfied, there is no need for the task to be completed "very fast". The trick of building real-time systems is in making sure that nothing will delay the operation for a long time, and that requires careful analysis and implementation.
If you remain unconvinced, think of a typical hard real-time application: CD-R burning control programs. They must feed a continuous stream of data (say, 600KB/s for 4X burning) into the burner, or the disc will turn into a coaster. Is it necessary to code it in assembly? Absolutely not, because pumping data at 600KB/s is very little work for current architectures with 600MHz processors and 100MHz memory busses. However, does that mean you do not need to pay special attention to make it real time? Wrong, because although the hardware is several orders of magnitude more powerful than it needs to be, there is still the possibility that something will make the program stop working for a second or two and screw the party. It is the responsibility of real-time system designers to make sure that that does not happen.
Works only when chasing the list (Score:2)
--
Re:Hang on a minute.... (Score:2)
You need a pointer to the first entry in the list. Assuming that the last "next" pointer and the first "prev" pointer are NULL, it's easy to go from there. Sample:
p = first;
prev = NULL;
while (p) {
    /* do stuff with *p */
    tmp = p->xoredstuff ^ prev;
    prev = p;
    p = tmp;
}
Traversing in the other direction is left as an exercise for the reader.
Pointers v. References (Re:Remember the VAX!) (Score:2)
Well, more or less. As part of a programming languages class I took in school, I actually had to write a garbage collector for a Scheme-like language. While one could implement references in a Java/Scheme/LISP-like language in the above fashion, that's really slow; it adds another layer of indirection that isn't really necessary. If the run-time system provides information about the types of objects in memory (which is required for Java and Scheme anyway -- instanceof and symbol?), then the GC can reach in and directly change the pointer values within the program's stack or whatever.
From the point of view of the applications programmer, though, the two implementations are equivalent.
As far as I'm concerned, the primary difference between pointers and references is that with a reference to an object of type Foo, you are guaranteed that the object referred to is a valid instance of type Foo. This is not the case with pointers. In C/C++ terms, int i; float *pF; pF = (float *)(&i); The object at *pF is not a valid instance of a float.
(Yes, you can in fact break this guarantee in C++ using a series of really crufty typecasts. C++'s references are broken, in a number of respects.)
Of course, as MassacrE points out, references are almost always implemented internally in terms of pointers, because that's what the underlying architecture tends to support.
Re:You're wasting your time with GC (Score:2)
It's a misperception that garbage collection is about debugging, though- the point is that you can write programs that aren't convoluted through the need to allocate and deallocate memory cells for remembering new things. The C/C++ idiom of things like
void get_string(char *dummy_string)
{
    /* ... fill in the caller-supplied buffer ... */
}
makes sense only because you've seen it a million times, and because you're thinking about the way that memory works. The much more intuitive Java style- that is,
String getString()
{
    return new String(/* ... build the result here ... */);
}
reflects your intent much more naturally, but is only really possible because of garbage collection. (I know you can allocate and then return in C/C++, but I also know that it can turn out to be a huge mess to figure out exactly when to deallocate, and I know of nobody who writes anything that isn't trivial by returning a newly-allocated pointer).
The point is simply that if you want to be able to design and implement programs correctly and rapidly, a facility that lets you specify only "what my program does" without bothering you with details is invaluable. One aspect of speed is how fast your program will run, and that's certainly important. Another aspect of speed is how quickly you can write a correct program, and high-level abstractions make application development far quicker. And for most of your code, how rapidly it executes isn't really a big deal. The idea around Rice University, where I go to school, is that you should write your programs in as high-level a language as possible, figure out where it's necessary to make the code faster, and then rewrite those pieces in C or assembly or whatever (that idea is based on the premise that in typical programs, a very small subset of the code takes up the vast majority of the execution time). You get speed where you need it and flexibility (ease of maintenance, reusability, etc.) everywhere else: the best of both worlds. And you wrote your program in half the time it took the other guy to write the C++ version.
Re:Loss of Control (Score:2)
Says who? Most gc's operate in threads. Make your critical operations atomic. Presto, no interruptions.
Re:Garbage Collection is for Incompetent Programme (Score:2)
Re:Loss of Control (Score:2)
Esperandi
Re:They should read Stroustrup's Answers (Score:3)
Here's an example. Say fast memory is 10x the price of slow memory and 10x the speed. Also, assume that a cache one tenth the size of the main ram has a miss rate of 0.5%. Given these specs, you can have either:
N megs of fast memory with an average access time of Q.
5N megs of DRAM with N/2 megs of cache, with an average access time of 1.045Q.
That's five times the capacity for a 4.5% speed hit. In the real world the situation is even more favorable for a memory hierarchy.
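Working the numbers: a hit costs Q and a miss costs 10Q, so the average is 0.995 x Q + 0.005 x 10Q = 1.045Q. The two configurations even cost the same: 5N megs of slow memory plus N/2 megs of 10x-price cache adds up to the price of N megs of fast memory.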
BTW, nobody just uses ram anymore, everyone has swap space too. Why don't you argue for disks that run at core speed?
Ryan Salsbury
Re:You're wasting your time with GC (Score:2)
GC misconceptions (Score:2)
Some of the articles that have been posted seem to miss the point. Several people suggested that this design goes against the principles of RISC. I am puzzled. The RISC philosophy is about maximizing efficiency by reducing CPU complexity. But this is memory management research, i.e. it proposes a new MMU design, not a new CPU. It is like suggesting that 3D accelerator boards are contrary to the principle of RISC design because they involve complex hardware. There's no contradiction in having a simplified CPU and complex off-chip hardware to back it up.
Others have suggested that there's no point to this work because a hardware implementation of malloc() and free() would run only marginally faster than their software counterparts. I suggest reading some of the publications on their Web site, particularly their Introduction to DMMX. [iit.edu] They aren't merely trying to implement malloc() and free() in hardware, and the solution they describe would allocate and sweep the heap in constant time. If the scenario described in this paper is feasible, it could be pretty interesting stuff.