Seeking Multi-Platform I/O Libraries?
An Anonymous Coward asks: "I'm just getting ready to plunge into a new project, and joy of joys have been given complete freedom when it comes to the implementation language - so long as the program will build and run on both x86 Linux and Windows. Now, I don't need a GUI, this is systems stuff only (processing binary executables in fact, so lots of bitfiddling and big nasty algorithms over hairy data structures) so pretty much all I need are standard IO libraries. C is currently at the top of my list..but what other language should I be looking at? I'm happy to learn a new one, and have the go ahead to do it..like I say, they want absolute speed. Can someone suggest a better language? C++ is out, it does come with a speed hit (using C++ properly anyway, not as a souped-up C). If I'm gonna take the speed hit, I may as well consider something like Ocaml which might let me claw the speed back with better algorithms and data structures.."
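For a sense of what "bitfiddling over binary executables with only standard IO" looks like in C, here is a minimal sketch. The function names (read_le32, bits, read_header) are made up for illustration and are not from the question. It assumes the fields of interest are little-endian, as they are in x86 executables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode a 32-bit little-endian value from four bytes.
   Assembling it byte by byte gives the same result on any host byte order. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Extract `width` bits starting at bit `lo` (bit 0 is the least significant). */
uint32_t bits(uint32_t value, unsigned lo, unsigned width)
{
    uint32_t mask;
    assert(width >= 1 && width <= 32 && lo < 32 && lo + width <= 32);
    mask = (width == 32) ? 0xFFFFFFFFu : (((uint32_t)1 << width) - 1u);
    return (value >> lo) & mask;
}

/* Read up to `n` bytes from the start of a file; returns the bytes actually read.
   The "rb" mode matters on Windows, where text mode translates line endings. */
size_t read_header(const char *path, unsigned char *buf, size_t n)
{
    FILE *f;
    size_t got;
    assert(path != NULL && buf != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    got = fread(buf, 1, n, f);
    fclose(f);
    return got;
}
```

Nothing here is specific to Linux or Windows apart from the "rb" detail, so the same source should build on both.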
-1, Flamebait (Score:4, Funny)
[ in my best announcer voice ]:
Let's get ready to RUMBLE!!!
--
Evan
Re:-1, Flamebait (Score:1)
Templates, for example, allow for similar common factoring of code without resorting to inheritance and virtuals.
C++ certainly has its downsides, but please don't reject it based on arguments that have been repeated over and over without basis.
Re:-1, Flamebait (Score:2)
Ideally, perhaps, but not in reality. There's definite, significant overhead in using much of C++'s standard library. Time a loop using IOstreams (std::cout) against one using std::printf(). In GCC 3.0, at least, you should find the C++ significantly slower. (In fact, slower than a similar test in Java.) C++ is a good language with a lot of critics, but just because the criticisms have all been heard before doesn't mean they lack merit. One can say most of the performance problems aren't inherent in the language, and that'd be true, but what difference does it make if all of the implementations are so bad?
Re:-1, Flamebait (Score:2)
You're right on the money there, unfortunately. There is no theoretical reason for C++ to be slower than C at all; if you implement the equivalent functionality to, say, virtual function dispatch in C, then it's likely to run with at least the same overhead as if the compiler does it for you in C++.
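To make "the equivalent functionality to virtual function dispatch in C" concrete, here is a rough sketch of a hand-rolled function-pointer table. The type and function names (shape, shape_ops, shape_area and so on) are invented for illustration. It only shows the mechanism and makes no claim about how any particular compiler lays out its vtables.

```c
#include <assert.h>
#include <stddef.h>

struct shape;

/* A hand-rolled "vtable": one table of function pointers per concrete type. */
struct shape_ops {
    long (*area)(const struct shape *self);
};

/* Every object carries a pointer to its type's table, much as a C++ object
   with virtual functions typically carries a hidden vtable pointer. */
struct shape {
    const struct shape_ops *ops;
    long w, h;
};

long rect_area(const struct shape *s)     { return s->w * s->h; }
long triangle_area(const struct shape *s) { return s->w * s->h / 2; }

const struct shape_ops rect_ops     = { rect_area };
const struct shape_ops triangle_ops = { triangle_area };

/* The "virtual call": load the table pointer, load the function pointer,
   then make an indirect call. */
long shape_area(const struct shape *s)
{
    assert(s != NULL && s->ops != NULL);
    return s->ops->area(s);
}

/* Sum the areas of a mixed array of shapes through the indirect call. */
long total_area(const struct shape *shapes, size_t n)
{
    long sum = 0;
    size_t i;
    for (i = 0; i < n; i++)
        sum += shape_area(&shapes[i]);
    return sum;
}
```

Each call through shape_area pays for two loads and an indirect call, which is the same kind of work a virtual call does.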
OTOH, standard library implementations mostly suck. At least they're improving, though, sometimes drastically; check out SGI's latest. And the template implementation actually helps the optimiser a lot in some cases. When you've finished comparing cout and printf, try qsort vs. the C++ library sort and spot the difference.
IMHO, though, the biggest problem is still the optimisers behind the compilers. We've got decades of experience optimising C to high levels. We've got perhaps a quarter of that optimising things like templates, exceptions and so on in a C++ compiler. Again, it's getting better, and sometimes quantum leaps get made as a new implementation technique is discovered, but it's still got a way to go. That is where the really big practical disadvantage lies, at least for now.
Re:-1, Flamebait (Score:1)
Try lots of implementations.... (Score:2, Insightful)
Re:Try lots of implementations.... (Score:2)
If speed were not so critical, I'd suggest Perl, actually. With the speed demands, and the need for cross-platform IO, I think C is probably what you want to use.
/Janne
Re:Try lots of implementations.... (Score:2)
Re:Try lots of implementations.... (Score:1)
If I'm not mistaken, he said "lots of bitfiddling and big nasty algorithms over hairy data structures", not "text processing".
But then again, you'd probably recommend Perl for embedded real time applications, too...
Re:Try lots of implementations.... (Score:2)
Perl is no good at real-time tasks, of course. I doubt I would consider Perl for heavily calculation-oriented applications either.
/Janne
Consider Python... Wait! Don't leave!!! (Score:4, Interesting)
You'll be happier, your fellow programmers will be happier, your successor programmers will be happier, and the chewy parts of your code will still be really fast. Think about it.
Re:Consider Python... Wait! Don't leave!!! (Score:3, Interesting)
The problem is that under large workloads (which is normal for us) you end up with python spending more time marshalling and unmarshalling objects. It's a PITA. I blame this mostly on SWIG (which I am NOT a fan of. Don't get me started on what the maintainers consider good development practice.)
Python's a great choice if you can do it all natively. It's also a great language to prototype in and then "translate" to another language like C++ or C or Java. (depending on task and preference.) But I wouldn't do the python+swig thing.
[Note: I'm only posting anonymously to protect my identity. There are certain political factions at work that read
Re:Consider Python... Wait! Don't leave!!! (Score:2, Interesting)
1. Using indentation instead of braces kills the religious "coding convention" wars before they have a chance to start. It's easy to read, it makes what you read and what the parser reads consistent (Never chased a mismatched indentation/braces case, have you?), and it just plain works. Where did that function start? Any editor worth its while can tell you that, most of them already have a macro that does this. If you ever used Scintilla/SciTE you'd probably never go back to "find matching" only style editors unless you were forced to - collapsing functions makes a lot of sense even in the curly brace world (more so in Python's indentation world).
2. There are add-ons that can enforce that, but that would be missing the point. The Python interpreter and language specification go to some length to catch these kinds of errors, and although it's a long way from e.g. C or Java, it caters for the common cases. Typos in long variable names may create annoying bugs, but ones that are _always_ easy to identify and fix. True, they wait for run time rather than compile time; personally, the number of bugs of this kind that I get is consistently low enough for this not to matter (and, since Python code tends to be an order of magnitude shorter than any other language except Lisp or APL, it's more than worth it. Plus, there's a Lint for Python if you insist). Variable declarations are NOT free documentation. "Object my_object = new Blah();" is not more informative than "my_object = Blah()". It's the variable's name that's the documentation, rarely its type.
3. Oh jesus. C++, Java, Smalltalk, LISP and just about any other language does this too. What language are you using? Plus, try Scintilla and you'll be amazed at what a GOOD language-sensitive editor can do (for any of the above languages).
Re:Consider Python... Wait! Don't leave!!! (Score:1)
Vim 6 has very nice folding functionality.
john
I have a similar question: (Score:2, Informative)
But seriously.. Every language provides standard support for file IO, unless it's totally half-assed*.
If you actually want to get helpful answers, you might provide a little more information. For instance: How much analysis will you be doing on the files? How much data are you dealing with? Is this probably going to be blocking on input all the time, or does run speed actually matter? How large and/or complicated will your program be? Does cost of deployment really matter?
* Or halfway totally assed, or whatever.
Re:I have a similar question: (Score:2)
What you are asking for does not need an entire language, just a library.
Get the best library out there, and scan through the list of its "binding" languages, then pick the one you are most comfortable with.
Short answer:
Common Lisp (CLISP + CLOCC + emacs + ilisp) is a killer.
Yes, try O'Caml! (Score:4, Informative)
If you just write the same program you would have written in C, the speed will be quite good, probably about 20% CPU-slower than C. (And if your program is IO-heavy, you might not notice this at all.)
If you have any sort of limited time or interest (as most projects do), you'll be able to write a much better program in O'Caml than you would in C, because:
- Because it's safe, you won't need to ever spend time tracking down or debugging core dumps or memory leaks. Because it's statically typed, a large percentage of bugs are caught at compile-time.
- If your program is interacting with the network, you won't need to worry about buffer overflows, format string bugs, or most of the common security problems.
- O'Caml has a much richer core language than C, with support for algebraic datatypes, pattern matching, higher-order functions, threads, modules, and objects. You can do a lot of great stuff with these.
- O'Caml has a nicer (though not as nice as, say, SML) module system, which keeps your program from getting unmanageable, and helps isolate faults to a particular module.
And by better, I also mean faster -- development wisdom says that algorithms and data structures are what matter most, not just the instruction-level efficiency of your code.
Of course, if you don't know the language, then it will have a higher startup cost for you. But I think it's worth it; you'll learn a different programming style that can help you think in new ways even when you're writing code in Old School languages. =)
Use OCaml (Score:2)
Of course, it has a portable IO lib - just because the corresponding module for more low level stuff is called "Unix" doesn't mean that it isn't available on Windows as well, with some restrictions [inria.fr].
c++ is out? (Score:5, Insightful)
Re:c++ is out? (Score:2)
Right on.
Benchmarking is the key. And, it pays to do it every few years or so, as compilers and hardware and software platforms evolve.
While not related directly to your I/O question, a colleague found that earlier benchmarks we had done for floating point intensive calculations which showed FORTRAN beating C++ by about a factor of two were outdated. Current tests show them comparable in speed (as long as you're not too careless with your C++).
I think I/O in C++ can be reasonably fast for most purposes, but again, as long as you're careful about how you do it.
By all means, benchmark!
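For anyone who wants to take the "benchmark!" advice literally, a minimal timing harness in standard C might look like the sketch below. The names time_per_call, sum_job and sum_to_n are hypothetical. clock() is coarse, so real measurements need many repetitions and a workload big enough to register.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Run fn(arg) `reps` times and return the average time per call in seconds,
   as reported by the standard clock() function. */
double time_per_call(void (*fn)(void *), void *arg, int reps)
{
    clock_t start, end;
    int i;
    assert(fn != NULL && reps > 0);
    start = clock();
    for (i = 0; i < reps; i++)
        fn(arg);
    end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / reps;
}

/* Hypothetical workload: sum the integers 0..n-1. The result is stored
   through a volatile field so the store cannot be optimised away. */
struct sum_job {
    long n;
    volatile long result;
};

void sum_to_n(void *arg)
{
    struct sum_job *job = arg;
    long i, s = 0;
    for (i = 0; i < job->n; i++)
        s += i;
    job->result = s;
}
```

Swap sum_to_n for the two candidate implementations you want to compare, and run each several times before believing the numbers.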
Re:c++ is out? (Score:3, Informative)
Re:c++ is out? (Score:4, Informative)
Moreover, because the compiler knows what you're actually trying to do, it can often perform optimizations that are not possible in C. For the example of virtual function calls, the equivalent in C (both in terms of functionality and efficiency) is calls using function pointers. The difference is that in C++ the compiler often knows the dynamic type of an object (if it's an actual object and not a pointer or reference) and can optimize away the virtual function call and replace it with a static call (or even inline the function). The C compiler is unable to do that.
So yes, there are features in C++ that have a performance penalty, but they have no equivalent in C, so the comparison is invalid.
As for ocaml or other FP languages, I think it's a good idea to try them. Besides the productivity and maintainability gains, you may also have actual efficiency benefits. Again, because the compiler knows what you're trying to do in a high(er) level language, sometimes it can perform obscure but very effective optimizations that can beat what an average or even good C programmer can do.
Re:c++ is out? (Score:1)
This is the police. Put down the bong and come out with your hands up.
Re:c++ is out? (Score:1)
The next additional overhead involves virtual methods. A call to a virtual method costs much more than a call to a normal method, because a memory lookup into the vtable is involved.
The nice thing about C++ is that you opt-in to these extra features. Don't use virtual methods if you don't need them. And if you do need them but can't live with the overhead, sometimes you can use templates instead, which use up more memory but are just as efficient as normal classes.
Don't forget the 90-10 rule. 90% of the program time is spent running 10% of the code. Even if you build your entire program using virtual functions, at the end you can profile it, find those 10% (usually less than that, according to my experience) and optimize them using improved algorithms, time-memory tradeoffs, inlining, and other such methods. I find it hard to believe you'll regret using C++ over C because of the performance hit.
Re:c++ is out? (Score:1)
Re:c++ is out? (Score:1)
Re:c++ is out? (Score:1)
Re:c++ is out? (Score:2)
My work experience is that c++ is not easily portable.
All c++ compilers I've worked with on various unixen had some kind of brain damage that made most of the advanced c++ features (like templates) near unusable.
If you really have an open house (Score:1)
Java, Nothing but Net (Score:1)
Java is very portable and can do all that bit fiddling just as well as C. The syntax is very similar to C, so it shouldn't take long to adapt.
Once you have written the program for Linux, the exact same code would work on Windows. Write the program once, not twice. Save yourself some time.
You won't have to worry anywhere near as much about messing up a pointer somewhere or about allocating the wrong amount of memory.
Performance? If you're worried about performance, then you have not used a recent copy of Java. Find Java 1.3 or 1.4 and try it for yourself. I've got a Java program that scans through about 6,500 Novell user accounts in under two minutes. Performance is not a problem unless you want speedy GUI.
Since you're not needing a GUI, I think Java would be an excellent choice.
Re:Java, Nothing but Net (Score:1)
And at some point, if you decide you want a GUI, take a look at IBM's SWT instead of AWT and Swing. Programs written with SWT are indistinguishable from native programs on Windows, Linux and Solaris (That's all I've tested BTW, and in addition SWT has bindings for Photon so you can run it on QNX!).
Re:Java, Nothing but Net (Score:1)
But if you do get into a performance problem in a particular section of your program (from the sound of it it'll probably be an algorithm-related part), you can always implement that part in C and call it from Java using JNI. This still leaves most of your program cross-platform, while solving the performance bottleneck.
Use Logo!!!! (not...) (Score:2)
Really.
The better you know a language, the faster you will be able to write your app, the more optimized it will be, fewer bugs, etc. This is common sense.
(I was going to have a really smart-assed comment on Logo, but I'll reserve that for later....)
Compiled FORTH (Score:1)
more than just a language performance question... (Score:5, Insightful)
But all of these opinions presume that you're fairly experienced in these languages. Ignore them.
Language experience/familiarity is THE factor here, so don't discount it. Someone who has been eating and breathing Java would likely produce speedier code than someone who is just learning C, for example.
Your employer/client wants SPEED. This project involves hairy and complicated bit fiddling. I would suggest NOT using this project to learn a new language, for the risks outweigh the rewards in this situation.
If you choose to use a new language for this critical job, you're setting yourself up for disappointment. Do not forget that you're going to have to go through all the growing pains associated with a new language. You're going to spend weekends tracking down (and learning from) all the newbie mistakes one makes with a new language. You are going to encounter new and unfamiliar bugs at all levels - logical design, physical design, semantic, syntactic.
Do you really want to spend your nights and weekends figuring out what the heck is throwing some particular Java exception seemingly at random? Why your C++ function template specialization is being ignored?
Learning a new language is exhilarating, but that will quickly turn to FRUSTRATION when you run into that weekend-long show-stopper bug.
With your product being measured by performance, and with deadlines looming... When it comes down to crunch-time, I think the choice is OBVIOUS!!
Choose a different, fun project to learn a new language. But for this product you're delivering, I would encourage you to stick with the tools you know and love.
Best,
Captain Abstraction
Multi-threading libraries (Score:1)
Use C (Score:5, Funny)
#include <stdio.h>
Says it all really.
Cheers,
Ian
Re:Use C (Score:2)
#include <stdlib.h>
Yeah, that does say it all. I have been working on such a thing. I have been attempting to do a cross-platform library, so that it will at least be source compatible.
This is really difficult to do. If all you are doing is memcpy, file IO, printf, then it is possible. If you get into sockets then it gets a little more machine dependent. User log on and off is even worse.
One option is to pick a cross-platform C API. glib may work. I think there is a port to Windows and if not it still should work under Cygwin. Its speed is not that bad and it gives you things like sockets and linked lists and all the things you'd need for a daemon process or simple non-GUI program.
Memory-mapped files (Score:1)
learn C++ (Score:1)
Nevertheless, C++ can be fast, powerful, and simple as well. People have problems with C++ if they don't understand it well or if they work with people who don't understand it well. That is a real problem (most commercial and open source C++ programs and libraries are awful), but don't blame the language.
Java (Score:2)
Don't forget the Apache Portable Runtime (Score:2, Informative)
Apache 2.0 is based on an excellent platform independent IO library (and many other cross platform data types, data structures, etc), the Apache Portable Runtime. It's written in C, and it's fast.
http://apr.apache.org/
wow (Score:5, Funny)
I saw O'Caml!!
You quiche eating wanker, how COULD you forget assembly? Isn't that what programming is all about? And WHY are you comparing C, a fine assembly macro language, to O'Caml, a shitty ML dialect used by equally hard-wanking mathematicians and abstractly thinking creatures? If these wankmaticians knew how the world operated, they would not have invented recursion let alone APPROVED of induction as a sane, cornerstone principle in their so called "art". Induction is only possible as long as the "counter" register can hold your index, and recursion is the crackwhore narcissistic twin sister of iteration (there is nothing she does, iteration can't do with a well placed label and a jump.)
Listen to me son, read Quine, Boole and DeMorgan, get the manual to your processor, and "script" at the level of the ONE TRUE ABSTRACTION LAYER.
Re:wow (Score:2)
Using C++ properly???? (Score:4, Insightful)
Object oriented development is a tremendous thing, useful for many things, and a marvel of overcoming complexity through abstraction.
BUT, OOP is not the solution for everything. There are many problems that don't need an object structure, and should be written another way. Above all, drop the notion that C++ should be used only a certain way to be proper. The latest cool feature of C++, the Standard Template Library, isn't even object oriented - it's GENERIC, because that type of programming just was the right thing to do for that library.
TCL (Score:2)
For me, using TCL my performance increased by 60% (especially when using its [Incr TCL] OO Extension).
TCL works on most unices, Windows, Mac, VMS, Palm Pilot...
The Tk graphical library is so successful that other languages (Perl, Prolog, Python) are using it.
No, thanks. (Score:2)
One Word: (Score:3, Funny)
QBasic.
Re:One Word: (Score:1)
QBasic was a cool language, and I used it a lot in my childhood days....when I was learning programming...
But now I just love C/C++....it's more structured and coool....I just love it...
Memories... from the corner of my mind... (Score:1)
I'm a worse programmer today, and the worst part is, I can't remember any of it...
How about AT&T's sfio (Score:1, Interesting)
What speed hit? (Score:3)
A smart C++ programmer can use template metaprogramming in a library like Blitz++ [oonumerics.org] to automatically build code optimised for the job. To write the equivalent code in C is possible but it's much more laborious and harder to maintain.
There are good reasons not to use C++. Performance isn't one of them.
Re:What speed hit? (Score:1)
Oh, and before someone says, Just apply the prelink patch... Tried it, but I can't spare my computer for the days it would take to compile all that C++ code.
Re:What speed hit? (Score:1)
That's like saying c sucks because my linear search of my 100meg dataset is slow. Perl is much faster when I use binary search, therefore Perl is always faster than c.
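The linear-versus-binary search point can be sketched in plain C with the standard library's bsearch(). The wrapper names below (linear_search, binary_search, cmp_int) are made up for illustration, and the binary version assumes the array is already sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Linear search: up to n comparisons. Returns the index of key, or -1. */
long linear_search(const int *a, size_t n, int key)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (a[i] == key)
            return (long)i;
    return -1;
}

/* Three-way integer comparator in the form bsearch() expects. */
int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x;
    int b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Binary search over a sorted array: roughly log2(n) comparisons.
   Returns the index of a matching element, or -1 if there is none. */
long binary_search(const int *a, size_t n, int key)
{
    const int *hit;
    assert(a != NULL);
    hit = bsearch(&key, a, n, sizeof a[0], cmp_int);
    return hit ? (long)(hit - a) : -1;
}
```

On a dataset with millions of entries the algorithmic gap dwarfs whatever the language itself costs.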
Now, you *have* raised one valid point about template heavy c++: It can be a bit slower to compile. So far, this has never been an issue for me. By far the longest compiles I've had to deal with were output from yacc, which is of course, POCC (plain ole c code).
Re:What speed hit? (Score:1)
Re:What speed hit? (Score:2)
Re:What speed hit? (Score:2)
duhhhhh (Score:1)
Newbie, shoot thyself (Score:1)
Re:VB (Score:1)
you didn't mention any platforms other than win32, or win16
General purpose vs. best of breed (Score:2)
You can choose to use a general-purpose language which has a good spread of capabilities, or you can go with a best of breed language in the area you are trying to work in.
For general projects, I use a mix of Python and C++. I'd say the best of breed languages for text would be Perl, math would be Haskell, and for getting down to the metal would be Assembler.
For what you are trying to do, the no-brainer choice would be souped-up C, i.e. C which uses a few C++ features to make your life easier.
Use K (Score:2)
Some of the K programming maxims are that memmap is better than read/write (the native file I/O is memmap), operating over bulk data is better than scalar data (the language is built around bulk operators), and terse code is good.
There is a warning, though. K is very elite and may be too elite for you (it was for me at first), but it is very easy to learn.
Kylix / Delphi (Score:2)
That's easy... (Score:1)
The Great Computer Language Shootout (Score:1)
Ada 95 (Score:1)
You can download a non-supported version (Windows and Linux) from
ftp://ftp.cs.nyu.edu/pub/gnat
or wait a few weeks for gcc 3.1 to be released (since the Ada 95 GNAT backend will now be included)
PHP (Score:1)
It has a few things going for it.
** Faster than a turtle
** Anyone can code it
** Doesn't show up during random drug testing