More Effective Use of Shared Memory on Linux
An anonymous reader writes "Making effective use of shared memory in high-level languages such as C++ is not straightforward, but it is possible to overcome the inherent difficulties. This article describes, and includes sample code for, two C++ design patterns that use shared memory on Linux in interesting ways and open the door for more efficient interprocess communication."
SysV IPC is obsolete (Score:4, Informative)
$ man shm_open
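For reference, a minimal POSIX sketch (the name "/myshm" is just a placeholder; link with -lrt on older glibc). Unlike ftok() keys, the name is explicit, so a collision is at least something you can see and rename:

#include <fcntl.h>      // O_CREAT, O_RDWR
#include <sys/mman.h>   // shm_open, mmap
#include <unistd.h>     // ftruncate
#include <cstdio>

int main() {
    const size_t size = 4096;
    int fd = shm_open("/myshm", O_CREAT | O_RDWR, 0600);   // named object, no ftok() key
    if (fd == -1) { std::perror("shm_open"); return 1; }
    if (ftruncate(fd, size) == -1) { std::perror("ftruncate"); return 1; }
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { std::perror("mmap"); return 1; }
    static_cast<char*>(p)[0] = 42;   // any process that maps "/myshm" sees this byte
    munmap(p, size);
    // shm_unlink("/myshm") removes the name once everyone is done with it
    return 0;
}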
Re:SysV IPC is obsolete (Score:4, Funny)
(talking about ftok)
Me: But, what is done to prevent clashes if different programs use the same key?
Prof: Nothing.
Me: Eh? That's fucking sabotage. (I used "cholerous", but that was in Polish)
Prof: And that's why we won't use SysV IPC in subsequent lessons.
The authors here use a static key of 0x1234...
Re:SysV IPC is obsolete (Score:5, Funny)
Well, that should be a safe choice, because no sane person would use 0x1234, therefore this key is still unused.
Re:SysV IPC is obsolete (Score:5, Funny)
Re:SysV IPC is obsolete (Score:2)
Re:SysV IPC is obsolete (Score:2, Redundant)
Damn, that's the code I have on my luggage. Remind me to change it.
Re:SysV IPC is obsolete (Score:2)
I'm not so sure about this -- I'd rather have a conflict with a sane person than with an insane one...
Re:SysV IPC is obsolete (Score:2)
That is actually the default password (1234) for half of the Spanish ADSL routers from Telefonica (the incumbent phone company in Spain).
And most people will never change it.
And they are accessible from both the internet and wifi (most of them are wifi-enabled).
so go laugh out loud
shmem (soon in Boost!) (Score:5, Informative)
And it will soon (hopefully) be a part of Boost [boost.org]!
Re:shmem (soon in Boost!) (Score:5, Interesting)
Depends on your definition of "complex objects".
From the documentation:
Virtuality forbidden
This is not a problem specific to Shmem; it is a problem for all shared memory object placing mechanisms. The virtual table pointer and the virtual table are in the address space of the process that constructs the object, so if we place a class with virtual functions or inheritance in shared memory, the virtual table pointer placed there will be invalid for other processes.
Basically, I would have been surprised if they had found a solution for that. But I guess it cannot be portably solved. Instead, the system would have to be prepared for it. I could imagine that objects in a shared library (so the same code is guaranteed to be shared by both processes) could be placed in shared memory, if the compiler/runtime system provided the means for it (say, instead of the pointer to a VMT, it would contain an offset into the constant data section of the shared library, and something to identify the library with, say a system-wide unique active library index which is generated by the dynamic linker).
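To make that concrete, here's a hedged sketch of what does work: construct a plain, vtable-free struct directly in the mapping with placement new (SharedState is a made-up type; anything with virtual functions or raw pointers would not survive the trip to another process):

#include <sys/mman.h>
#include <new>   // placement new

// Safe to place in shared memory: no virtual functions, no pointers, no references.
struct SharedState {
    int  counter;
    char message[64];
};

// struct Bad { virtual void f(); };   // its vptr would dangle in every other process

int main() {
    void* mem = mmap(nullptr, sizeof(SharedState), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;
    SharedState* s = new (mem) SharedState{};   // construct in place, zero-initialized
    s->counter = 1;
    // After fork(), parent and children all see the same SharedState
    // (proper synchronization is still needed, of course).
    return 0;
}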
Re:shmem (soon in Boost!) (Score:3, Interesting)
You can get virtual functions this way and it will be fast enough but not very "nice", of course.
Re:shmem (soon in Boost!) (Score:2)
const (Score:4, Funny)
Re:const (Score:2)
I once made a class that abstracted a file as an array of records. To be able to make a const version of this that could still read the backing file, I had to mark the read pointers into the file as "mutable". There is still changeable state; it's just more controlled.
const is really not a very good guarantee that things won't be changed. The language lets you get around it in too many ways.
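A minimal sketch of that idiom (RecordReader, readNext and the layout are made up for illustration): the records are logically const, but the cached read position has to be mutable so const member functions can still walk the backing file.

#include <cstddef>
#include <cstdio>
#include <string>

class RecordReader {
public:
    explicit RecordReader(const char* path)
        : file_(std::fopen(path, "rb")), pos_(0) {}
    ~RecordReader() { if (file_) std::fclose(file_); }

    // Logically const: reading doesn't change any record, but it does
    // advance the cached read position -- hence "mutable" below.
    bool readNext(std::string& out) const {
        if (!file_) return false;
        std::fseek(file_, pos_, SEEK_SET);
        char buf[128];
        std::size_t n = std::fread(buf, 1, sizeof(buf), file_);
        pos_ = std::ftell(file_);
        out.assign(buf, n);
        return n > 0;
    }

private:
    std::FILE* file_;
    mutable long pos_;   // the "read pointer" the comment above refers to
};

int main() {
    const RecordReader reader("records.bin");   // hypothetical backing file; const works
    std::string rec;
    while (reader.readNext(rec))
        std::printf("read %zu bytes\n", rec.size());
    return 0;
}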
This is nothing new (Score:4, Interesting)
Anyway, old stuff. Wake me up when you start talking about the newer tricks with shared memory.
CML (Score:4, Informative)
http://portal.acm.org/citation.cfm?id=113470 [acm.org]
In particular, the things you synchronize on are first-class. Also you can speculatively send/receive things. Normal "select" is only for reading. You don't have to manage your memory either.
There are other concurrent languages, but CML is nice in that it has a formal semantics, so unlike typical languages like "C", "C++", Erlang or Java, a program has a meaning other than "whatever the program does when I run it."
You can implement the primitives of CML in your favorite higher-order language, so you don't have to be limited by ML. That's what's in Reppy's book.
A proper implementation can achieve speeds that are about 30x faster than pthreads for typical tests like "ping/pong".
Re: (Score:2)
Hardware-enforced sharing: OLD HAT (Score:5, Interesting)
Re:Hardware-enforced sharing: OLD HAT (Score:4, Informative)
Re:Hardware-enforced sharing: OLD HAT (Score:2)
Doors (Score:5, Interesting)
Re:Doors (Score:3, Interesting)
Comments on comments (Score:2, Insightful)
Unix Domain Sockets? (Score:2)
And? (Score:4, Interesting)
And why is this news? Is it so difficult that nobody has done it? No, that can't be -- the shm stuff can be wrapped. Is it so important that it rates a "design pattern"? Not that either -- the pattern illustrated isn't the best solution.
So, just what is this article? Methinks fluff. Sort of in line with "How to implement co-routines with setjmp/longjmp" thing. Or, "Restructuring data to assist processor cache residency". And "How to remove locks from performance critical MP code".
Except not as interesting or useful.
Ratboy.
There are better ways (Score:5, Informative)
A lot of shared memory synchronization and/or caching problems can be solved on Linux through the effective use of a few simple things:
1) shm_open (for separately started processes that need to coordinate through shared memory), or mmap(MAP_SHARED|MAP_ANONYMOUS) for a process that will fork children which need to communicate/share among themselves and with the parent.
2) Use <asm/atomic.h>'s "atomic_t" integer type within that shared memory array (atomic_t* my_shm_array = mmap(....)). The atomic_t type has several functions defined in that header for atomic read, write, increment, etc. on the Linux hardware platform at hand. On most sane (cache-coherent) SMP architectures, reading and writing are already atomic operations, so this basically devolves to just setting and getting integers like normal, with a little bit of syntactic sugar (struct { volatile int val; }) to make sure the C compiler doesn't optimize away things that it shouldn't. And you can implement a whole lot of sane algorithms using nothing but shared memory integer reads and writes, with no locking or special atomic increment ops. (A rough modern sketch of points 1 and 2 follows this list.)
3) If you need more advanced or complex locking on the shared memory for synchronization, use Linux's futexes. They're in the man pages, and they're really fast.
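The promised sketch of points 1 and 2, using C++11 std::atomic in the shared mapping instead of the kernel's atomic_t header (which isn't really meant for userspace); the counts are arbitrary:

#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <atomic>
#include <new>
#include <cstdio>

int main() {
    // Anonymous shared mapping: visible to this process and its fork()ed children.
    void* mem = mmap(nullptr, sizeof(std::atomic<int>), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) { std::perror("mmap"); return 1; }
    auto* counter = new (mem) std::atomic<int>(0);

    if (fork() == 0) {                   // child
        for (int i = 0; i < 100000; ++i)
            counter->fetch_add(1);       // atomic increment: no lock, no syscall
        _exit(0);
    }
    for (int i = 0; i < 100000; ++i)     // parent
        counter->fetch_add(1);

    wait(nullptr);
    std::printf("count = %d\n", counter->load());   // 200000
    return 0;
}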
Re:There are better ways (Score:2, Interesting)
2) Don't you know that NPTL is already doing this for you? On the fast path, NPTL's POSIX mutexes just do atomic operations and avoid the syscall. Stick to the standard API and let the platform guys (libc, kernel,
3) You don't want to do this, seriously! If futexes were that consumable by the general public, then why did the glibc guy write a looooooong paper describing how to use them?
Not much experience on this but... (Score:2, Informative)
not really usefull (Score:2, Informative)
Why would someone use a shared memory block for threads which are all running in the same memory space anyway?
We come to the conclusion that the code is quite useless for inter-thread communication too. All in all - useless.
This is a mediocre way to get IPC. (Score:4, Informative)
On top of those mechanisms, even slower interprocess communication systems are typically implemented, such as OpenRPC and CORBA. (For even more inefficiency, there's XPC. In Perl. But I digress.)
Because of this history, there's a perception that interprocess communication has to be slow. It doesn't.
What you really want looks more like what QNX [qnx.com] has - fast interprocess messaging that interacts properly with the scheduler. QNX has to have interprocess communication done right, because it does everything through it, including all I/O. This works out quite well. You take a performance hit (maybe 20% for this), but you get much of that back because the higher levels become more efficient when built on good IPC.
The QNX messaging primitives are available for Linux, [cogentrts.com] although the implementation isn't good enough for inclusion in the standard kernel. That work should be redone for the current kernel.
IPC/scheduler interaction really matters. If you get it wrong, each interprocess transaction results in an extra pass through the scheduler, or worse, both the sending process and the receiving process lose their turn at the CPU. This is easy to test. Start up two processes that communicate using your IPC mechanism. Measure the performance. Then start up a compute-bound process and measure again. If the IPC rate drops by much more than a factor of 2, something is wrong. Don't be surprised if it drops by two orders of magnitude. That's an indication that IPC/scheduler interaction was botched.
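A rough sketch of that experiment over a plain pipe (just an illustration of the method, not anyone's production benchmark): time N round trips, then rerun with something like 'yes > /dev/null' hogging a core and compare the rates.

#include <unistd.h>
#include <sys/wait.h>
#include <chrono>
#include <cstdio>

int main() {
    int p2c[2], c2p[2];
    if (pipe(p2c) == -1 || pipe(c2p) == -1) return 1;
    const int rounds = 100000;
    char c = 'x';

    if (fork() == 0) {                          // child: echo every byte back
        for (int i = 0; i < rounds; ++i) {
            if (read(p2c[0], &c, 1) != 1) _exit(1);
            write(c2p[1], &c, 1);
        }
        _exit(0);
    }

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < rounds; ++i) {          // parent: ping, wait for pong
        write(p2c[1], &c, 1);
        if (read(c2p[0], &c, 1) != 1) return 1;
    }
    auto secs = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - start).count();
    wait(nullptr);
    std::printf("%.0f round trips/sec\n", rounds / secs);
    return 0;
}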
Sun addressed this in the mid-1990s with their "Doors" interface in Solaris, which had roughly the right primitives. But that idea never caught on.
The article here implements a message-passing system via shared memory, which is not exactly a new idea, even for UNIX. I think it first appeared in MERT [bell-labs.com], in the 1970s. It's an attempt to solve at the user level something that the OS should be doing for you.
Shared memory is a hack. It's hard to make it work right. With it, one process can crash other processes in hard-to-debug ways. Sometimes you need it because you're moving vast amounts of data (by which I mean more than just a video stream), but that's rarely the case.
Re:C++ has bigger memory issues (Score:3, Informative)
Re:C++ has bigger memory issues (Score:5, Insightful)
C++ is an actual Object Oriented language, which is of course half the problem.
If you mean a pure OO language like Java, in which everything is an object except for primitives and it takes ten classes and wrappers just to read a file, well then C++ isn't exactly an Object Oriented language as such. Perhaps you mean Smalltalk or the like.
I tell you what though, C++ is still around after all this time. With all the hype surrounding Java, Perl, C#, Python, etc, etc, etc, C++ programmers are still there beavering away with the god-awful syntax Stroustrup left them with. Even after all the improvements, all the innovation and all the additional research into computer languages, for a hell of a lot of tasks there is still no real alternative to C++.
I don't say this as a C++ fanboy, even though I am "somewhat" fond of the language when it is used properly, and not in garbled and unreadable line noise. I say this simply as a statement of fact. There is still no successor to C++.
I don't want garbage collection so much as I want a cleanup and rationalisation of the syntax. GC would be nice, but forcing more readable code would be even better.
Re:C++ has bigger memory issues (Score:2, Flamebait)
Mostly because it allows sloppy, C-style, programming which is easy. It also leaks like a sieve most of the time and has all the security problems seen in C. It doesn't HAVE to but its design means that it does. And it's ugly as hell.
Easy programming languages always hang around longer than they're needed because most programmers, sad to say, are uninterested in quality and very interested in meeting deadlines.
There is still no successor to C++.
Re:C++ has bigger memory issues (Score:2)
C/C++ doesn't prevent you from coding secure, leak-free programs. All it does is shift the responsibility for security and memory management from the language to the programmer. If you're a sloppy programmer then, yes, you need a better language than C/C++.
Re:C++ has bigger memory issues (Score:2)
That is what I said, yes.
TWW
Re:C++ has bigger memory issues (Score:2)
C/C++ doesn't prevent you from coding secure, leak-free programs.
That is exactly what I said.
Re:C++ has bigger memory issues (Score:2, Insightful)
Excuse me? C may be many things, but easy isn't one of them. Ask any beginner. And as for the "security problems" in C, as you well know, C was designed as a replacement for assembler. It wouldn't have been much use if it didn't give you complete flexibility wrt memory access, even if that means breaking some of the rules that hand-held high-level programmers use as a crutch because they're unable to code to the metal.
"are uninterested in quality and very interested in meeting deadlines"
Re:C++ has bigger memory issues (Score:2)
Compared to most languages, C is very easy to pick up the basics of. That's why C-like languages rule the programming world. There are some tricky parts but the basic structure of a C program is something even beginners grasp very quickly, and I've tutored quite a few.
If you have a mortgage and a family and your boss threatens you with the sack if you keep missing deadlines very few if any programmers will take the "moral" sta
Re:C++ has bigger memory issues (Score:2)
Yes, but in every instance I have seen C++ programmers end up implementing reference counting (by hand) in order to make these problems tractable. Unfortunately reference counting is a horribly inefficient solution compared to a decent garbage collector. Often C++ programmers don't understand why reference counting is a bad idea, assuming that just because its large cost is spread over many operations it mu
Re:C++ has bigger memory issues (Score:2, Offtopic)
Eh?
File inputFile = new File("temp.txt");
FileReader in = new FileReader(inputFile);
Or if you want to use the new IO library from 1.4 which gives you memory mapping and locking on sections of files,
FileChannel in = new FileInputStream("temp.txt").getChannel();
Re:C++ has bigger memory issues (Score:2)
Re:C++ has bigger memory issues (Score:2)
No, that was actually quite deliberate. When people say they can open and read a file in Perl or C with only one line, they haven't included any error handling, so I think it's only fair to do the same here.
Re:C++ has bigger memory issues (Score:2)
open my $filehandle, '<', 'filename' or die 'omfgwtfbbq!';
Re:C++ has bigger memory issues (Score:2)
I wouldn't call instantly ending a program with an error message "handling" a problem. If you have no demands on program reliability, you can always throw a RuntimeException("No such file: " + fileName) in Java too.
Re:C++ has bigger memory issues (Score:2)
I've used them. What did you not like about them?
FileReader (or BufferedInputStream) are still awkward because they use byte and string arrays. Those really are hard to work with without doing conversions in Java, whereas in C the char arrays are usable without conversion.
Ok, there is also RandomAccessFile [sun.com] which gives you methods for reading all primitives, byte arrays, UTF-8 Strings, whole lines....
Used fo
Re:C++ has bigger memory issues (Score:2)
Last time I checked, new FileInputStream("foo.txt") returned a sane one class wrapping around InputStream, which is actually as close to the ideal OOP model as it gets.
Java's big advantage over older-generation programming languages is that char != byte, which makes all kinds of UTF-8 and UTF-16 handling transparent and very comfortable.
Re:C++ has bigger memory issues (Score:2)
Your statement is contradictory. All types are objects in a pure OO language. Yet Java's primitives are not. Thus Java is not a pure object oriented language.
Re:C++ has bigger memory issues (Score:2)
Re:C++ has bigger memory issues (Score:2)
Of course, there is still some debate about what "object oriented" means, and what it means to be "pure" OO. And by debate I mostly mean a combination of Usenet flaming and the academic equivalent of trash-talking. Suffice it to say that there is not a universally accepted definition
Re:C++ has bigger memory issues (Score:2)
It supports procedural, OO and functional programming, though the last still needs improvement. Contract programming is in the works, and will amongst other things alleviate all those nasty template error messages. Fine-grained GC is already in TR1 and should be available from major compilers pretty soon. It is interesting to note that GC is being done in the standard library, unlike C#, which does it in the core language.
Re:C++ has bigger memory issues (Score:2)
Re:C++ has bigger memory issues (Score:2)
Oh good gods yes! As an example of how all over the place C code can get, try looking into the mplayer source tree. Men have but gazed upon its grim pointer-laden facade, and gone instantly mad. Mplayer.c is 4000 lines long!
Re:C++ has bigger memory issues (Score:2, Insightful)
that point in the code. With C++ you can use operator overloading, polymorphism, hidden conversions, template specialisation and lots of other stuff which makes the actual code more implicit than C and hides what's actually going on. Sure, if you spend some time looking at the code you'll figure it out, but C++ doesn't IMO lend itself to skim reading as much as C does.
Re:C++ has bigger memory issues (Score:2)
With C++ you can use operator overloading, polymorphism, hidden conversions, template specialisation and lots of other stuff which makes the actual code more implicit than C and hides what's actually going on.
C++ code tends to more clearly express the programmer's intent. C code tends to more clearly express the implementation details. When reading C, you start by looking at what the code does, then work upward to figure out what the programmer intended to accomplish. When reading C++, you st
Re:C++ has bigger memory issues (Score:2)
I'd say it is entirely a programmer problem. C++ programmers who write C programs (for whatever reason) tend to organize things just as they would a C++ program.
What does it take to be a successor? (Score:2)
So, what would it take to satisfy your criteria for a proper successor to C++, if C#, Java, Pike, Python, and many others all fail to qualify?
Re:What does it take to be a successor? (Score:2)
Isn't it obvious? Something better!
Re:What does it take to be a successor? (Score:2)
So, what would it take to satisfy your criteria for a proper successor to C++, if C#, Java, Pike, Python, and many others all fail to qualify?
One of the overriding goals behind the design of C++ was that it had to be as efficient as C. Stroustrup made many design decisions to ensure that C++ is never slower than C except in a few cases where the programmer makes a decision to invoke one or more of a small set of features that simply cannot be provided without overhead, and even there he t
Re:C++ has bigger memory issues (Score:2)
Re:C++ has bigger memory issues (Score:2)
Is this really such a problem?
The standard wrapper classes(Integer, Double, etc.) have been around since 1.0 and are part of java.lang so they don't even have to be explicitly included in the code.
I've never been confused about when I need an object versus the efficiency of a primitive data type.
And as of 1.5, the collections classes automatically box and unbox primitives into their corresponding wrappers.
Re:C++ has bigger memory issues (Score:5, Interesting)
People who think they always need to "new" objects in C++ have spent way too much time using Java.
Here's another hint -- pass objects to functions as const references: this way, no copy of the object is allocated for the call (in fact, no memory is allocated at all). The biggest drawback is that you can only call "const" methods on the object, but this is outweighed by not using pointers. Not that I don't like pointers, they just increase the complexity and should be used prudently. And as my
Re:C++ has bigger memory issues (Score:2)
Re:C++ has bigger memory issues (Score:3, Informative)
a) The called function _cannot_ modify the argument. This becomes important to the code surrounding the function call.
T x(...);
foo(x);
If foo is declared "void foo(T const&)", then you *know* that x has not changed. If instead it's declared as taking a plain reference, you can't know.
b) You can pass const objects or objects with limited lifetimes.
foo(T());
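Spelled out as a compilable sketch (Widget and describe() are made-up names, standing in for T and foo above):

#include <iostream>
#include <string>
#include <vector>

struct Widget {
    std::vector<int> data;
    std::string name() const { return "widget"; }   // const member: callable through const&
};

// Pass by const reference: no copy is made and the callee cannot modify w.
void describe(const Widget& w) {
    std::cout << w.name() << " holds " << w.data.size() << " items\n";
    // w.data.push_back(1);   // would not compile: w is const
}

int main() {
    Widget w{{1, 2, 3}};
    describe(w);            // w is guaranteed unchanged afterwards
    describe(Widget{});     // temporaries bind to const& too -- the foo(T()) case above
    return 0;
}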
Re:C++ has bigger memory issues (Score:2)
Re:C++ has bigger memory issues (Score:2)
I can see no reason for making mandatory garbage collection part of the C++ language. It wouldn't make it one bit more useful; C++ doesn't prevent you from using garbage collection in your programs. If C++ needs a garbage collector(s) then it (or they) should go into the standard library (if anywhere) and not the language itself. Restricting the use of a language will never make it more useful.
Re:C++ has bigger memory issues (Score:3, Interesting)
Getting back to the original premise of the story, can you even do OS-level shared memory (SysV or POSIX) with Java? OS-level semaphores? Any meaningful kind of IPC? OS-level anything? I mean without godawful JNI nonsense.
Re:C++ has bigger memory issues (Score:2)
Not really, no.
Getting back to the original premise of the story, can you even do OS-level shared memory (SysV or POSIX) with Java?
I dunno. Never tried. I would imagine not, given the sandboxing that goes on.
TWW
Re:C++ has bigger memory issues (Score:3, Interesting)
STL, for example, is not an OO library. Yet it has proved to be immensely useful.
One place where garbage-collected languages fall down is in the management of resources. Limited resources such as files or sockets must be explicitly released by the programmer. This demonstrates that you simply cannot ignore the lifetime of objects with a garbage collector. And I also assert here that memor
Fanboy mods... (Score:3, Informative)
There are some subjects that draw fanboy clubs here on Slashdot.
Some examples: Java, AMD, Apple, Ruby.
Try criticizing any of them here, you'll be down-moderated to (-1) pretty quickly. OTOH, praise any of those and you'll get moderated up, no matter how stupid or inconsistent the comment is.
Re:Fanboy mods... (Score:2, Informative)
Re:C++ has bigger memory issues (Score:2)
I wish I had
Re:C++ has bigger memory issues (Score:2)
Though it still doesn't solve the problem of bad coders.
Re:C++ has bigger memory issues (Score:2)
I would dispute the second part: has code quality improved in these houses, or anywhere in the last 15 years for that matter? Has the number of late, canceled, or over-budget projects decreased?
I simply don't believe that the currently popular languages have anything substantial to offer when it comes to solving the big problems of computing. C++ is a hack, but it is a very po
Re:Microsoft code? (Score:2)
if(x = foo() != -1)
when the programmer meant
if((x = foo()) != -1)
?
If you write your comparisons with the constant first, the compiler will tell you. In particular,
if(-1 != x = foo())
results in a compile-time diagnostic. It's not about "being unfamiliar with C++ syntax". Quite the opposite, some
Re:Microsoft code? (Score:3, Insightful)
Re:Microsoft code? (Score:2)
Re:Microsoft code? (Score:2)
Re:Microsoft code? (Score:2)
The programmer actually meant:
Re:Microsoft code? (Score:2)
I consider this cleaner than the combined assignment/compare in the while condition. YMMV. Note that if you ever need to replace the simple call to foo() by something more complex, you'll have to switch to this form anyway, or it will become unreadable quite quickly.
And yes, it's still single entry/single exit - it's just that the exit is syntactically in the middle of the loop body.
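For reference, a hedged sketch of the form being described (foo() and process() are placeholders for whatever calls the original examples used):

#include <cstdio>

static int foo() { return std::getchar(); }    // returns EOF (-1 on common platforms) when done
static void process(int x) { std::putchar(x); }

int main() {
    // The exit is syntactically in the middle of the loop body,
    // but it is still the loop's single exit point.
    for (;;) {
        int x = foo();
        if (x == -1)
            break;
        process(x);
    }
    return 0;
}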
Re:Microsoft code? (Score:2)
Re:Microsoft code? (Score:3, Insightful)
I like my manga right to left, my code left to right.
Re:Microsoft code? (Score:2)
Now, you hopefully don't think C++ declaration syntax (which is this way due to C backwards compatibility) is something you should imitate. Indeed, Bjarne Stroustrup himself said: "I consider the C declarator syntax an experiment that failed." [slashdot.org] (in the answer to question 3, paragraph 4)
Re:Microsoft code? (Score:2)
c = getc(stdin);
if (EOF != c)
{
Re:Microsoft code? (Score:2)
b) One is a habit, the other is a typo. I've been writing in English for far longer than C++, and I still create typos in English every once in a while. The idea that a smart C++ programmer would never accidentally spell a common idiom wrong is absurd.
A smart C++ programmer is one who recognizes commo
Re:Microsoft code? (Score:3, Informative)
BTW, the very first file is not valid C++: All identifiers which contain double underscores are everywhere and under all circumstances reserved for the implementation. This also includes __COMMON_H__. Change it to e.g. COMMON_H to get valid C++.
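That is, a guard along these lines (COMMON_H chosen as suggested above):

#ifndef COMMON_H        // no leading or double underscores: those names are reserved
#define COMMON_H

// declarations shared between translation units go here

#endif // COMMON_H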
Well, at least his main function returns int.
Re:10 fold speed improvement - Dekkers mutex ! fas (Score:2)
Only those who know about futexes.
Re:10 fold speed improvement - Dekkers mutex ! fas (Score:5, Informative)
This one is worth remembering as one to avoid -- it's based on the idea of a busy-wait. Look at the while(test) {
There's a reason this algorithm lies at rest in academic journals: it's only useful as a teaching tool.
Re:10 fold speed improvement - Dekkers mutex ! fas (Score:3, Insightful)
- It does busy waiting. If one thread holds the 'mutex' for a long time, the other thread will take a lot of CPU for nothing.
If you really need to take the resource as soon as it is available without giving up the processor, then have a look at "spin locks" (a minimal sketch follows at the end of this comment).
- It is not very scalable.
First, you need one version of the algorithm for mono proc, one for bi proc, etc. Of course you could put them all in a shared lib and select one at runtime.
Second, the algo seems to be O(
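The minimal spin lock sketch promised above -- C++11 std::atomic, not the article's code, and only sensible when the critical section is very short:

#include <atomic>
#include <thread>

// A minimal test-and-set spin lock: lock() busy-waits until it wins the exchange.
class SpinLock {
public:
    void lock() {
        while (locked_.exchange(true, std::memory_order_acquire))
            ;                                   // spin (busy-wait)
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
private:
    std::atomic<bool> locked_{false};
};

int main() {
    SpinLock lock;
    long counter = 0;
    auto work = [&] {
        for (int i = 0; i < 100000; ++i) {
            lock.lock();
            ++counter;                          // protected by the spin lock
            lock.unlock();
        }
    };
    std::thread t1(work), t2(work);
    t1.join(); t2.join();
    return counter == 200000 ? 0 : 1;
}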
Re:10 fold speed improvement - Dekkers mutex ! fas (Score:4, Informative)
The guy had a clue. Your algorithm is a busy-wait loop, so your CPUs will be maxed at 100% while waiting, and the thread will be pushed by the scheduler to lower priority, and so on...
Re:10 fold speed improvement - The Phd was idiot (Score:3, Informative)
That code makes a huge fundamental assumption: that write order is preserved. In other words, if you do:
Write to location 3 on processor 1 (take the lock)
Read from location 30 on processor 1 (do stuff with the lock held)
Read from location 3 on processor 2 (check the lock)
that the reads and writes will appear in order. On ALL modern processors, this assumption is not true: it's possible for the write to location 3 to occur AFTER the read from location 3 on processor 2. It works gre
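A hedged sketch of the same store/load reordering hazard in C++11 terms, and how seq_cst atomics (which imply the needed memory barriers) rule it out -- this is a litmus test, not the article's code:

#include <atomic>
#include <thread>
#include <cstdio>

// Store-buffering litmus test: with plain writes, both threads can read 0
// (each CPU's store sits in its store buffer past the other's load).
// With seq_cst atomics that outcome is forbidden.
std::atomic<int> x{0}, y{0};
int r1 = 0, r2 = 0;

void t1() {
    x.store(1, std::memory_order_seq_cst);   // "take the lock"
    r1 = y.load(std::memory_order_seq_cst);  // "check the other lock"
}

void t2() {
    y.store(1, std::memory_order_seq_cst);
    r2 = x.load(std::memory_order_seq_cst);
}

int main() {
    std::thread a(t1), b(t2);
    a.join(); b.join();
    std::printf("r1=%d r2=%d (r1==0 && r2==0 cannot happen here)\n", r1, r2);
    return 0;
}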
Re:10 fold speed improvement - The Phd was idiot (Score:2)
The code assumes that writes are atomic. This will almost always hold for 8 processes and usually for 32, but if the flags array is larger than a word, atomic writes go out the window.
yeah, fast, and 10-fold chance of odd failures (Score:5, Informative)
Google for Dekker's algorithm and memory barrier - you will find better explanations of the problem there than I could type up in my limited time here right now.
Re:yeah, fast, and 10-fold chance of odd failures (Score:2)
Re:yeah, fast, and 10-fold chance of odd failures (Score:2)
This code may very well work in the narrow scenario in which you're using it, but it's NOT a general purpose solution to the problem.
Cache choherency is NOT sufficient (Score:3, Interesting)
CPU AA:
    resource = produce_something();
    turn = BB;
    flags[AA] = FREE;

CPU BB:
    flags[BB] = BUSY;
    consume(resource);

The problem is that AA is free to reorder its writes. So, the actual order could be:

    flags[AA] = FREE;
    flags[BB] = BUSY;
    consume(resource);
    resource = produce_something();
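A hedged sketch of the usual fix in C++11 terms, reusing the names from the fragment above ("ready" plays the role of flags[AA]): publish with a release store and wait with an acquire load, so the write of the resource cannot be reordered past the flag update.

#include <atomic>
#include <thread>
#include <cassert>

int resource = 0;                        // the data being handed over
std::atomic<bool> ready{false};          // plays the role of flags[AA]

void producer() {                        // "CPU AA"
    resource = 42;                       // stands in for produce_something()
    ready.store(true, std::memory_order_release);   // cannot be hoisted above the line above
}

void consumer() {                        // "CPU BB"
    while (!ready.load(std::memory_order_acquire))
        ;                                // wait until published
    assert(resource == 42);              // guaranteed to see the produced value
}

int main() {
    std::thread p(producer), c(consumer);
    p.join(); c.join();
    return 0;
}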
Re:Cache choherency is NOT sufficient (Score:3, Insightful)
Preying on the non-comp SCI mods, I see. (Score:5, Insightful)
Turn to page 55 of your copy of Operating Systems: Design and Implementation by Tanenbaum. See where he says, "For a discussion of Dekker's algorithm, see Dijkstra (1965)."? How do you get through a proper comp sci honours degree to the point where you can take a masters and then a PhD without reading Dijkstra?
How about you crack open that copy of Operating Systems (4th ed) by William Stallings, which has a discussion of concurrency and Dekker's on pages 208-213? How can you get past a 2nd/3rd-year introductory operating systems class without having gone over this topic?
You are a troll. A troll preying on the fact that most of the moderators here have no idea about computer science, and have not taken a whiff of a real operating systems class.
For the record, Peterson's algorithm (published in 1981) is a much simpler solution to your problem. It's on page 56 of the Tanenbaum book, and also discussed in Stallings on page 213. There's a new 5th edition of the Stallings book, but the index will take you to the correct chapter/page in short order.
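For the curious, a hedged sketch of Peterson's algorithm for two threads, using C++11 seq_cst atomics to stand in for the textbook's assumption of sequentially consistent memory (this is not the article's code):

#include <atomic>
#include <thread>

std::atomic<bool> wants[2] = {{false}, {false}};
std::atomic<int>  turn{0};
long counter = 0;                 // shared data protected by the lock

void lock(int me) {
    int other = 1 - me;
    wants[me].store(true);        // I want in
    turn.store(other);            // but the other side goes first if we collide
    while (wants[other].load() && turn.load() == other)
        ;                         // busy-wait
}

void unlock(int me) { wants[me].store(false); }

void worker(int me) {
    for (int i = 0; i < 100000; ++i) {
        lock(me);
        ++counter;                // critical section
        unlock(me);
    }
}

int main() {
    std::thread a(worker, 0), b(worker, 1);
    a.join(); b.join();
    return counter == 200000 ? 0 : 1;
}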
Java-trolls are clueless, as usual... (Score:2, Insightful)
Re:Java-trolls are clueless, as usual... (Score:2)
Re:High-level language? (Score:3, Informative)
I've never heard that before... everyone I know, and all the literature I've read that described programming languages, considers assembler as "low level" and anything at a higher level of abstraction as "high level." With the exception of a few folks who try to describe C as a "mid level language" or as "high level assembler."
Calling C++ a "low level language" is absolutely a mistake