Genome Methods Applied to Reverse-Engineering 94
L1TH10N writes "Wired News has an article on a truely innovative way of approaching network protocol reverse-engineering. Marshall Beddoe, a security analyst, is using algorithms from bioinformatics to analyse closed-source and secret network protocols, an approach he calls "Protocol Informatics". According to Beddoe, network conversations are full of "junk" -- usually the actual data being sent -- which interferes with the analysis of the occasional command sequence that controls what to do with that junk. This has parallels with bioinformatics, which has to deal with a similar problem of finding known DNA sequences separated by long gaps of unknown data. Biologists have devised complex algorithms to discover whether DNA sequences are descended from the same ancestors by comparing the genetic differences with the known mutation rates of certain DNA components. Beddoe applied the same principles to mutating network conversations of evolving network protocols."
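The alignment idea in the summary can be illustrated with a small sketch. The summary does not say which algorithm Beddoe uses, so this assumes Needleman-Wunsch global alignment, a textbook bioinformatics method, with arbitrary scoring values. The function names `align` and `render` and the sample messages are made up for illustration and are not taken from his tool.

```python
from typing import List, Optional, Tuple

Column = List[Optional[int]]  # byte values, with None marking a gap


def align(a: bytes, b: bytes, match: int = 1, mismatch: int = -1,
          gap: int = -1) -> Tuple[int, Column, Column]:
    """Globally align two messages (Needleman-Wunsch, assumed scoring)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a: Column = []
    out_b: Column = []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    out_a.reverse()
    out_b.reverse()
    return score[-1][-1], out_a, out_b


def render(col: Column) -> str:
    """Show gaps as '-' and non-printable bytes as '.'."""
    return "".join("-" if v is None else (chr(v) if 32 <= v < 127 else ".")
                   for v in col)


if __name__ == "__main__":
    # Two hypothetical captured requests with different "junk" in the middle.
    _, x, y = align(b"GET /a HTTP/1.0", b"GET /index.html HTTP/1.0")
    print(render(x))
    print(render(y))
```

Columns where both rows carry the same byte are candidates for fixed protocol fields, while gapped or mismatched columns are more likely to be variable-length payload. A real tool would need a better scoring scheme and many more than two messages.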
After today's Nobel prize in physics... (Score:4, Funny)
Re:After today's Nobel prize in physics... (Score:2, Interesting)
I firmly believe that bioinformatics is going to be the next IT. Programmers will use compilers that create genetic sequences for bio-machines and bio-computers (the debugging process is the main scary part). The odd contrast to present IT
Re:After today's Nobel prize in physics... (Score:2)
Re:After today's Nobel prize in physics... (Score:2)
I agree about bioinformatics being the next big thing, but it's really just another kind of information technology. Same idea, different system. Damn, I gotta learn how to code.
Re:After today's Nobel prize in physics... (Score:2)
You can code. It's just that, since you're a slashdotter, you don't have a compiler.
Re:After today's Nobel prize in physics... (Score:2)
Novel? This stuff [ucsd.edu] is old hat [ucsd.edu].
Re:After today's Nobel prize in physics... (Score:1)
Re:After today's Nobel prize in physics... (Score:2)
Re:After today's Nobel prize in physics... (Score:1)
I'm sure you have all read about swarm-style AI taken from nature. Google boids if you haven't. It's a relatively good model. It isn't too hard to imagine emergent behaviour used in future technological issues... Certainly copying nature is a valid method of problem solving. No point trying to reinvent what already works.
Now it would be truly interesting... (Score:5, Interesting)
If that could be implemented somehow (an attached appliance or something), it could drastically cut the amount of spam that goes through.
Re:Now it would be truly interesting... (Score:1)
Re:Now it would be truly interesting... (Score:2)
I was thinking along similar lines (Score:4, Insightful)
At least until the spammers figured out how to make spam look so much like certain types of legit email that we started losing good email...
At which point (Score:1)
Wouldn't that be ironic, that spam actually DID provide a cure for cancer or some other disease? And you wouldn't even have to read it or buy anything!
Re:Now it would be truly interesting... (Score:1)
http://isg.ee.ethz.ch/tools/postgrey/
shouldn't it be... (Score:2, Funny)
Re:shouldn't it be... (Score:2)
Will It Read .doc Files? (Score:4, Funny)
Perhaps these techniques can be applied to the never-ending task of creating an accurate converter for MS Word .doc-uments?
Yes, simple document conversion is possible but until 100% accuracy is possible the race is not won.
Re:Will It Read .doc Files? (Score:1)
Re:Will It Read .doc Files? (Score:2, Interesting)
Bert
Who started his own company and now understands first hand what his former secretary had to endure when battling with that productivity killer. We need competition to get rid of it. Any measure against Microsoft should involve opening the standard.
Re:Will It Read .doc Files? (Score:2)
Modeling (Score:3, Insightful)
I don't know what I'm talking about... but it's cool anyway.
Re: (Score:2)
So... (Score:2, Funny)
Illegal in the US.. (Score:3, Funny)
Re:Illegal in the US.. (Score:4, Informative)
Re:Illegal in the US.. (Score:2)
Computer forensic has other clues... (Score:5, Interesting)
Re:Computer forensic has other clues... (Score:1)
You mean like when someone defaces a webpage with "Roight! USA eats chunder! AUSSIES RUL3!!1!one!1!" they can figure out that the perp is (obviously) Canadian?
Contrasts: Datastreams to DNA (Score:5, Insightful)
"Junk" in DNA (e.g., "latent" DNA) is probably not junk, we just don't know the function (yet). No scientist worth their salt would admit that (at least not in earshot of a grant proposal review committee!)
Re:Contrasts: Datastreams to DNA (Score:2, Informative)
> junk
Actually there's an article in this month's SciAm that talks exactly about this. Very interesting
http://sciam.com/article.cfm?chanID=sa006&colID=1
Re:Contrasts: Datastreams to DNA (Score:2)
Exactly? The article you've linked to (what I can see of it; I'm not a subscriber) appears to be about RNA's role in the regulation of genes.
There's nothing about "Junk DNA", although I know introns play a role in the regulation of a gene's translation. Nobody calls the DNA in those regions "junk" DNA, though.
Having not been able to read the full article, however, I may have missed some important link into the "junk" DNA to
Re:Contrasts: Datastreams to DNA (Score:1)
I can't think of many scientists who think about "junk" DNA anymore...but if I ever get my research finished and published, then I'll add one more nail to the coffin.
Re:Contrasts: Datastreams to DNA (Score:2, Informative)
From what I've read there is a case that there is real Junk in the DNA. Various sequences which at some point in the past served a purpose but now (like the human appendix) the original function is no longer relevant. I've also read somewhere that some of the DNA is actually a sort of virus which eons ag
Protection from genetic damage (Score:2, Interesting)
I read something about this in NewScientist a while ago. Blocks of a certain base (guanine?) either side of important regions of DNA, which are more susceptible to damage (by free radicals?), serve to protect the important code, by be
Re:Contrasts: Datastreams to DNA (Score:2)
If you're disassembling the code of a program, the data is just junk that gets in the way until you figure out what the code is doing. Of course the ASCII comments in data may be useful, and from what I can tell, DNA doesn't seem to have any text strings in it, so for now it's just junk.
I haven't looked into the pattern matching stuff the bio guys are using, but it's very handy to be able to take a bit of a program and find out where the common library functions are h
Network Protocols vs. Building Blocks of Life (Score:5, Insightful)
"They're working on uncovering the mysteries of life itself; we're just hacking network protocols," he said. "Which sounds more important to you?"
I don't think Beddoe should cheapen the reverse engineering aspects of networking compared to biology. We may still be years away from finding a cure to cancer, AIDS, etc. and there's a good chance that biology work in this area might not be as fruitful. After all, (without getting into a religious debate, here) man was not created by man, whereas network protocols are. Because of this, it is relatively easier for us to reverse-engineer something that was created by another human, because we know how they think. Evolution or creation, we don't know much about our own building blocks, because we don't know how either God thinks, or the universe fully works.
While his software is great for "hacking network protocols", the biologists paying attention to his work might not find what they are looking for. The inputs very well may be just too vast for his ideas to provide any help.
On the other hand, the Samba team and the SpamAssassin author will most likely enjoy this.
Re:Network Protocols vs. Building Blocks of Life (Score:1, Flamebait)
Funny, I was taught that every person now alive was created by man, or more exactly, was created by man and woman.
Don't make me explain why... it's kind of gross, and outside the domain of most /.'ers anyway.
Re:Network Protocols vs. Building Blocks of Life (Score:2)
Not an apt analogy (Score:2, Insightful)
Genome sequences are much more consistent. It's all data, processed by RNA computers.
Re:Not an apt analogy (Score:4, Insightful)
Re:Not an apt analogy (Score:1)
Re:Not an apt analogy (Score:2)
Reinventing the wheel. (Score:1, Funny)
true+ly = ? (Score:2, Informative)
Gary Larson's prior art (Score:2, Funny)
tech-transfer... coming to IT near you (Score:4, Interesting)
Universal principles of information communication (Score:5, Insightful)
Re:Universal principles of information communicati (Score:3, Informative)
Re:Universal principles of information communicati (Score:2)
Re:Universal principles of information communicati (Score:2, Informative)
Unification and Backtracking (Score:2)
Looks like a nail (Score:2, Funny)
Talk about Race conditions (Score:2)
Seriously, how much would a Big Red Button have cost?
DNA vs. DMCA (Score:1, Flamebait)
Re:DNA vs. DMCA (Score:2)
The real problem is a ratio: N:C. N = the number of people; C = the people's ability to communicate with one another. N:C is too large. And the units for C are unknown (and, reflexively, a factor in C). We're too busy driving N to collapse. All hope lies
Bioinformatics links (Score:5, Informative)
Also, figuring out biology seems to be a lot harder than figuring out networking; at least there are all kinds of nefarious things, but also serendipitous things, found. For example, in one presentation I just heard, a U.S. scientist announced that they had discovered an entire signalling network in human cells that was like the one found in yeast cells. And apparently more proteins can be encoded than the number of genes, because of alternate orderings (counting from different displacements in the gene, I think; ask a real bioinformatics expert). One talk I heard a year ago that stuck with me was by a scientist who had devised a way to find signalling pathways in cells quickly: by forcing the cell to die if certain requirements were not met, he created a parallel computer that allowed him to discover a whole swath at once. There is also a lot of math and statistics, as well as a lot of biological knowledge, behind it. It is not strange to see various statistical tests, references to different computer programs they used for analysis, or a mention of simulated annealing (well, maybe that one not so often; it came up yesterday though).
One interesting thing is that they (the H-Invitational people / Japan Bioinformatics Consortium) have, I believe, twice held what they call annotation jamborees, much like a hackfest! In 2002 they had 120 scientists gather (mostly from Japan but from all over the world) in a big room with a computer per person. They locked them in for 10 days and annotated, IIRC, over 20,000 genes, basically doing some man-years of work in a week, inputting data so it can be searched, analyzed, and cross-referenced.
They do have a comparison between mouse and human genome there. I wonder if something similar could be done in open source, in terms of annotating and indexing a library of open source code in different languages; really, all in one pseudo-language would be more useful perhaps. Anyway, biologists are learning from computer scientists learning from mathematicians, and someone famous has said that in the future, all science will be computer science.
Bioinformatics people are doing text mining and data mining, but there are also many flavors and types of analysis programs designed to penetrate and match up information as encoded by tiny molecules, folded proteins, genes, and so on. Here are some links to get started. Also note the Perl for bioinformatics books, and there was a big O'Reilly bioinformatics conference archived from 2003 and other links too (see bio.oreilly.org link below).
I cannot speak for everyone, but I can convey what I have heard: that there have long been communication gaps that have held back some of this, actually cultural differences. For example, physicists like pure math and biologists deal in dirty, wet things. When people successfully combine different perspectives in this area, [more] discoveries start getting made. In Japan at least, they are trying to figure out how to grow more bioinformaticists, since students tend to go only towards either biology or computer science (why study twice as hard?). But there seems to be a lot of interesting stuff in there for both sides.
PLoS Bio article [plosbiology.org]
some clusty [clusty.com]
faq [bioinformatics.org]
Re:Bioinformatics links (Score:5, Informative)
Re:Bioinformatics links (Score:2)
Note that the cassette model of alternative splicing is not mutually exclusive with the 'diff
Biologists are aware of this (Score:4, Interesting)
The majority of study is computer research applied towards biological methods and models, but I'm sure some of the cs geeks will be reading this article and grab the work done by the bio geeks.
And in the end, we will all have the best mouse trap ever.
Another good source (Score:2)
http://www.ietf.org/rfc.html
crossover of underlying math (Score:2)
There's a pdf here [unicaen.fr] on the subject or you could read the go
Ahem... (Score:1)
http://www.acsac.org/2003/beststud.html [acsac.org]
Re:Ahem... (Score:1)
Could someone explain.. (Score:2)
I guess he should write a script to create a huge amount of very similar programs, and compile them all to create binary trees. Are there standard methods for analyzing such a data set? Is it just simple multivariate statistics?
sounds exciting (Score:1)
bioinformatics is reverse engineering too but (Score:2)