Skein Hash... In Bash
First time accepted submitter Matt16060936 writes "...Last night (err.. 3am this morning) I finished an implementation of the Skein 512-512 hash algorithm (version 1.3). I'm a fan of Skein and hope it wins the SHA-3 competition next year. One of the nice things about Skein is how quickly it's been adopted by many platforms and implemented in many languages. To that end, I present Skein 512-512 implemented in Bash."
Ah, Bash (Score:5, Insightful)
The cause of, and solution to, all of life's problems. Here's to Bash! If it weren't for you, I would be a terrible sysadmin. Thanks to you, I am a terrible sysadmin!
Re:Ah, Bash (Score:4, Funny)
Don't bash bash.
Re: (Score:1)
Yes, I actually saved that post all these years. It's funny.
Optimized for 64-bit processors (Score:2)
Re: (Score:2)
So, just like SHA-2?
Re: (Score:2)
~/p4/scripts/skein $ uname -a
Linux mjm 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:50 UTC 2011 i686 i686 i386 GNU/Linux
~/p4/scripts/skein $ time
70d0...7e
real 4m30.224s
user 3m19.416s
sys 0m28.850s
Re: (Score:2)
You have got to be kidding. Implementing a hash algorithm in a scripting language is an inside joke. The overhead of the language in this case is easily ten thousand times the overhead of the actual calculations, and more likely ten times that.
Re: (Score:3)
A slightly slower than usual hash algorithm is not likely to be noticeable on a modern client, even a handheld device.
All else equal, a 64 bit algorithm should get roughly twice the amount of work done per operation as a comparable 32 bit algorithm, so the performance overhead on 32 bit architectures isn't nearly as bad as it looks. Ten to twenty percent maybe.
And if you are really concerned about performance, you write an implementation in assembly language.
Re: (Score:2)
A slightly slower than usual hash algorithm is not likely to be noticeable on a modern client
Even if the hash is run every time a process is started, in order to check that the executable has not been altered since it was installed?
Re: (Score:2)
Yes. You could probably run several thousand rounds of Skein on an input each second, and that's being conservative.
Re:Optimized for 64-bit processors (Score:4, Informative)
On the slowest 32 bit processor tested (32 bit, ARM v4, 75 MHz), NIST benchmarked Skein using portable C code at just under 1 megabyte / sec. That is about twice as slow as SHA-2 on the same processor, and certainly slow enough that you might notice in the case you mention. On a modern 64 bit processor (Intel Q6600, x86_64, 2.4 GHz), it is more like 286 MB / sec for Skein, about twice as fast as SHA-2. See here [nist.gov] (pdf).
The striking difference between 32 bit and 64 bit implementations is much more than I would have guessed, but that may be merely a matter of optimization. For now it looks like a good excuse to use SHA-1 or SHA-2 when doing the sort of thing you describe on slow processors. For something like SSL or IPSEC, you aren't likely to notice the difference, because the bandwidth to a typical mobile device just isn't that high.
Re: (Score:2)
All else equal, a 64 bit algorithm should get roughly twice the amount of work done per operation as a comparable 32 bit algorithm, so the performance overhead on 32 bit architectures isn't nearly as bad as it looks. Ten to twenty percent maybe.
It depends on which processor you're using. x86-64 not only has a crapload more registers but it actually has general purpose registers. x86 instructions often (maybe even usually) expect your operands to be in certain registers and put the results in certain registers, to the extent that x86 really has zero general purpose registers. This is mitigated by register renaming on modern platforms. On a modern machine the cost might not be immense. On an older machine, it will be. Also, longer words mean less lo
Re: (Score:2)
It's going to run really, really, really slow on 8 bit computers.
Re: (Score:1)
Not to mention 4 bit computers. [wikipedia.org]
Not an issue (Score:2)
Look at page 25 of the Skein Whitepaper [schneier.com]. Using C implementations, Skein-512 outperforms SHA-512 and Skein-256 is only about 75% slower than SHA-256 on a 32 bit CPU. That isn't unacceptably slow.
first post (Score:2)
Damn. Slashcode won't let me post the Skein encrypted version of first post (Filter error: That's an awful long string of letters there.)
Re: (Score:2)
I did, but you guys just couldn't read it.
Re: (Score:1)
Thank goodness, since hashes aren't encryption.
Re: (Score:2)
That depends on how well the hash function distributes its hashes, the size of the original message, and how good your rainbow tables are :p
Hmmm (Score:2, Insightful)
Amazing how code is always produced in the middle of the night...
Re: (Score:3)
This applies to much more than just code. I think it's mostly because of lack of distractions, but there's also the "I'll just finish this and go to bed" mentality.
http://www.phdcomics.com/comics/archive.php?comicid=1219 [phdcomics.com]
Re: (Score:2)
Unless you're a /. geek. Then you're likely writing code in the middle of the night.
Sourceforge? (Score:1)
Why didn't you post this on Github?
At least it has syntax highlighting. I'm no expert at Bash, but I don't think anyone would appreciate reading it with no syntax highlighting.
Re: (Score:2, Interesting)
People have been doing clever things in bash long before there was syntax highlighting. In fact, it took me years to finally accept that as anything other than clutter. Though, I still can't stand colored output of the "ls" command.
Having cut my teeth using line editors over 300 baud modems (or terminals wired into the mainframe via serial cables), I am sometimes amu
Re: (Score:2)
I can totally relate to what you are saying because I am one of the worst offenders... yet I'm a painter. I have all these cool brushes and rollers and pneumatic sprayers and scrapers and paint in all manner of colors and shades. But when I get to the client site, I look like an idiot because I have to resort to scraping off old paint with my fingernails and painting several coats of paint with my fingers. It takes forever and I gotta tell you it gets aggravating. But what am I gonna do? I got so used to al
Re: (Score:2)
LOL ... funny.
No, seriously ... when I first saw syntax highlighting I was literally like "WTF is this crap". It was just a jumble of colors on the screen, and I found it quite distracting.
I was used to working on monochrome VT52s and VT100s ... so the first time I saw syntax highlighting, I turned it off. To this day, I find the color output of "ls" actually conveys less information than knowing that an "@" is a symlink.
Now, of course, I'll take syntax highlighting any day of the week. But it really did
Intercal (Score:2)
Come back with the intercal implementation.
Re: (Score:2)
Intercal is peanuts compared to Malbolge.
Re: (Score:2)
Malbolge is a child's plaything compared to Vogon poetry.
Next, LOLCODE (Score:1)
n/t
Self-encrypting (Score:1)
Sadly the hash of the bash script is only marginally less readable to me than the source.
Re: (Score:2)
A skim of the functions looks like it is a clever implementation of bitshifting left and right (in varying amounts), as well as a block portion. May I suggest Applied Cryptography (http://www.schneier.com/book-applied.html) ? While it may not cover this particular hash algorithm (perhaps recent versions do?), a lot of the actions used here are covered there. The first half of the book (third?) is non-code, and VERY informative to anyone interested in how encryption works.
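For anyone wondering what "bitshifting left and right" buys you here: the published Threefish/Skein design is built from 64-bit addition, rotation and XOR, and the shell has no rotate operator, so a rotate has to be assembled from two shifts. The following is only a rough sketch of that idea, not code taken from the submitted script. The names rotl64 and mix are made up for illustration, and it assumes the shell does its arithmetic in 64-bit signed integers (which is why the right shift has to be masked).

```bash
# Hypothetical helper: rotate a 64-bit value left by n bits (1 <= n <= 63).
# The shell's >> is an arithmetic (sign-extending) shift, so the bits that
# wrap around are masked down to the low n bits before being OR-ed back in.
rotl64() {
    echo $(( ($1 << $2) | (($1 >> (64 - $2)) & ((1 << $2) - 1)) ))
}

# Hypothetical helper: one add-rotate-XOR mixing step on two words.
#   y0 = (x0 + x1) mod 2^64      (the addition simply wraps around)
#   y1 = rotl64(x1, r) XOR y0
# The rotation amount r is a parameter here; the real constants come from
# the Skein specification and are not reproduced in this sketch.
mix() {
    y0=$(( $1 + $2 ))
    y1=$(( $(rotl64 "$2" "$3") ^ y0 ))
    echo "$y0 $y1"
}

rotl64 1 4     # prints 16
mix 1 2 1      # prints 3 7
```

Every rotate in this form costs a subshell, which goes some way towards explaining the run times people are reporting in this thread.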
Fixed Applied Cryptography link (Score:2)
Bah -- fixing my link. http://www.schneier.com/book-applied.html [schneier.com]
Dreadfully slow (Score:2)
I've had times when I'd have found it useful to have something like base64 or md5 in shell script form to require fewer dependencies on ancient installations, assuming it could be made to work with anything approaching acceptable performance.
Unfortunately this isn't it. A 194-byte file took 3 seconds. The skein script itself (10K) takes two and a half minutes, and it's ridiculously memory intensive too. The process grew to 150 MB in size.
And the "Useless use of cat" Award goes to (Score:2)
Matt Tomasello for:
echo 'Usage: cat FILE | skein [ARGS]'
Re: (Score:1)
978-0517545164
Re: (Score:2)
Looks very entertaining, will make a great fit for an old unix admin/friend and cat lover.
Re: (Score:2)
If the Skein bash function was written to take in text input and spit out a hash, using cat would allow using a filename as the source of the text you want SKEIN'd. If you tried doing "Skein [args] FILE", you would just get the hash of the filename.
How do you propose the SKEIN function determine whether the second argument is a filename, or the text to be hashed? Should he add that function as a further switch, and also implement the function of reading the contents of a file (which would presumably use "
Re: (Score:2)
/. ate his "less than" sign. What he means to say is:
Where bash handles opening, reading, and piping the file into STDIN. No need to start another process.
Re: (Score:2)
I would recommend the standard Unix conventions for stuff like this:
$ skein [args] [file] [--text 'some text']
Re: (Score:2)
Am I being trolled?
If the Skein bash function was written to take in text input and spit out a hash, using cat would allow using a filename as the source of the text you want SKEIN'd. If you tried doing "Skein [args] FILE", you would just get the hash of the filename.
skein < FILE
surely?
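To make the comparison concrete, here is a small sketch showing that both spellings feed the same bytes to a filter that reads standard input. skein_stub is a made-up stand-in (it just calls cksum), not the script from the story, and the temporary file is only there so the example is self-contained.

```bash
# Stand-in for the real script: any filter that reads standard input would do.
skein_stub() { cksum; }

tmp=$(mktemp)
printf 'hello\n' > "$tmp"

# Form 1: an extra cat process copies the file into a pipe.
via_cat=$(cat "$tmp" | skein_stub)

# Form 2: the shell opens the file itself and hands it over as stdin.
via_redirect=$(skein_stub < "$tmp")

[ "$via_cat" = "$via_redirect" ] && echo "same result"   # prints: same result
rm -f "$tmp"
```

The output is identical either way; the only difference is the extra process and pipe in the first form.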
Re: (Score:2)
No, I didn't think of that, as I deal with the Linux CLI for about 10% of my job and I usually don't deal with reading input from files. It has been a long, long time since I have used that file pipe. I used to deal with it in Windows, and I got tired of using it when it didn't work properly with various commands (like set), so I've always used alternatives, and that habit sticks with me in Linux as well.
Really, I don't get the big complaint about using cat to handle it, though, several people here seem to be advoc
Re: (Score:2)
It didn't eat my less than symbol, I just didn't think of using a file input pipe.
Re: (Score:2)
Simply put, you follow the conventions established by md5sum and other command line hashing tools that have existed for many years/decades prior.
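As a rough illustration of that convention: md5sum-style tools hash each file named on the command line, and read standard input when no file (or "-") is given. The sketch below assumes nothing about the actual skein script. hash_files is a hypothetical wrapper, and cksum stands in for the real hash.

```bash
# Hypothetical wrapper following the md5sum-style convention:
#   hash_files FILE...   hashes each named file
#   hash_files           (or "-") hashes standard input
# cksum is only a placeholder for whatever hash command is really used.
hash_files() {
    # No arguments: behave as if "-" (standard input) had been given.
    if [ "$#" -eq 0 ]; then
        set -- -
    fi
    for f in "$@"; do
        if [ "$f" = "-" ]; then
            sum=$(cksum)
        else
            # The shell opens the file; give up if it cannot be read.
            sum=$(cksum < "$f") || return 1
        fi
        # One line per input: the digest, two spaces, then the name.
        printf '%s  %s\n' "$sum" "$f"
    done
}
```

With this shape, "hash_files FILE", "hash_files < FILE" and "cat FILE | hash_files" all work, so nobody has to argue about which one belongs in the usage message.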
Re: (Score:2)
Would that involve adding more bash code to his already long script that he probably doesn't ever want to have to re-work on?
Also, as others have pointed out, you could use unix file pipes to handle reading the file; adding more arguments to it really would be retarded.
Re: (Score:2)
Matt Tomasello for:
echo 'Usage: cat FILE | skein [ARGS]'
I find this kind of pedantry a bit dull. When I'm hacking around on things at the command line I almost always use that form. If you're building up a pipeline of various parts to get something done, and you need to insert at the beginning, or reorder components, it's much easier if the input to the pipeline is a clean, separate term.
Trivial example:
cat FILE | foo is easily edited to head FILE | foo in a way that foo < FILE is not.
Re: (Score:2)
Hacking around on the command line is one thing. Writing examples is another. When you save your code for long-term re-use or reference purposes it should be well written.
Re: (Score:1)
But what's the problem with using cat here? It's not necessary, but it doesn't harm either, does it?
Also, well written code means maintainable code. Maintainable code means code that is easy to change. Which speaks for cat.
BTW, I've even found uses for the apparently useless use of cat in cat | command or command | cat (note: no filename on the cat!). That's when using programs which test whether standard input or standard output is a terminal, and I want the non-terminal behaviour even though I'm typing in
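For readers who have not seen this kind of check: programs that behave differently on a terminal usually ask whether a file descriptor is a tty, and the shell equivalent is test -t. A minimal sketch follows; stdin_kind is a made-up name used only for illustration.

```bash
# Hypothetical helper: report whether standard input (fd 0) is a terminal.
stdin_kind() {
    if [ -t 0 ]; then
        echo "terminal"
    else
        echo "not a terminal"
    fi
}

# Fed through a pipe, standard input is never a terminal.
echo | stdin_kind      # prints: not a terminal
```

Run interactively, plain "stdin_kind" would report a terminal, while "cat | stdin_kind" reports a pipe even though you are still the one typing, which is the trick described above.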
Re: (Score:2)
But what's the problem with using cat here? It's not necessary, but it doesn't harm either, does it?
In fact it does do harm. Not even counting tangentials like "it encourages people to do this reflexively so they will get bitten when the difference really matters", it execs one more program and uses a good chunk more memory.
Maintainable code means code that is easy to change. Which speaks for cat.
I dispute that "cat foo | grep bar" is easier to change than "grep bar BTW, I've even found uses for the apparently useless use of cat in cat | command or command | cat (note: no filename on the cat!). That's when using programs which test whether standard input or standard output is a
Re: (Score:2)
Using cat effectively doubles the required IO operations.
A solution to your "left arrow to something":
arrow up
ctrl-arrow left
ctrl-w
Re: (Score:1)
<file grep something
There you go.
Skein Hash... (Score:2)
...not to be confused with Skin Rash.
Re: (Score:2)
>Is this really 'newsworthy'?
inasmuch as anything newsworthy for nerds can be said to be non-newsworthy for nerds, and the line between nerd and non-nerd moves with the subject, then yes, this could be newsworthy for bash or skein nerds
on the other hand, it's the first i've heard that they were cooking up a SHA-3. so maybe there are multiple layers of nerdiness involved.
tomorrow i expect to see your posting of how you've implemented it in brainfuck.
Re: (Score:2)
it's the first i've heard that they were cooking up a SHA-3. so maybe there are multiple layers of nerdiness involved.
It's hardly [slashdot.org] the [slashdot.org] first [slashdot.org] time [slashdot.org] it's been mentioned [slashdot.org] on Slashdot.
Turing-complete (Score:1)
Newsflash: a known algorithm implemented in a Turing-complete language. Uhm...
Re: (Score:2)
Newsflash: a known algorithm implemented in a Turing-complete language poorly suited for that algorithm
fixed that for you.
I implemented it last night using my abacus and slide rule
Re: (Score:1)
I always like to challenge people to try it in Brainfuck or with a single instruction computer. Turing completeness is more of a theoretical thing as it ignores difficulty, and polynomial time Turing machine stuff only partly captures that.
Re: (Score:1)
I've done it on my (purely theoretical) single instruction computer.
Now, it wasn't exactly difficult because the single instruction of that computer has been well chosen.
Here's the implementation:
Validation? (Score:1)
Now do the validation suite for it. That's really the hard part.
Re: (Score:2)
There's nothing in it that fundamentally requires bash, you just need a shell which supports arrays. It might not perform as well, however, without some of the bashisms. I'd say porting this will be of minimal effort.
Error in script? (Score:2)
me@here$ uname -m
i686
me@here$ getconf LONG_BIT
32
me@here$ echo $(( 1 << 32 ))
4294967296
me@here$ echo $(( 1 << 64 ))
1
Your test would consider my ARCH as 64 when it is clearly not. But then why does left-shifting 64 times not overflow?
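One possible explanation, offered as an assumption rather than a confirmed diagnosis of the script: the shell appears to do its arithmetic in the widest integer type the C library offers (normally 64 bits) regardless of the machine's word size. That would explain why 1 << 32 does not overflow on i686. A shift count of 64 is then simply out of range, and on x86 the count tends to wrap around to 0, which would give 1. A more direct test is to measure the arithmetic width instead of inferring it from one shift. A small sketch, where arith_bits is a hypothetical helper name:

```bash
# Hypothetical helper: count how many bits the shell's arithmetic really has.
# Keep doubling 1 until the sign bit is set and the value turns negative.
# The body runs in a subshell, so x and bits do not leak into the caller.
arith_bits() (
    x=1
    bits=1
    while [ "$x" -gt 0 ]; do
        x=$(( x << 1 ))
        bits=$(( bits + 1 ))
    done
    echo "$bits"
)

echo "shell arithmetic width: $(arith_bits) bits"
```

If this reports 64 on a box where getconf LONG_BIT says 32, then the shift test is measuring the shell's integer type and not the CPU architecture.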