Software Archaeology 434
Plug1 writes "Salon (day pass needed) has an article about preserving software for historical purposes. It discusses source code archiving, and the effect the DMCA is having on attempts to catalog and analyze legacy code. It will be a shame if in the future a wealth of information is locked away because knowledge of the underlying technology is lost."
My first program (Score:2, Offtopic)
Re:My first program (Score:4, Funny)
10 PRINT "HELLO WORLD"
20 GOTO 10
30 END
At which point you have created your first programming boo boo.
MY first program (Score:3, Funny)
20 beep
30 goto 10
Even years ago I was much more 1337 than yu0 !
Re:My first program (Score:3, Interesting)
It really would be a travesty of progress if we lost all those wonderful "Hello World" programs to history.
Fortunately, we have the classic ACM "Hello World" [latech.edu] project to remind us of past glory.
Please understand... (Score:5, Insightful)
I really don't want strong crypto keeping me out of stuff that I OWN, or MY CONTENT.
It'd be a neat experiment to create a Linux driver that emulates TCPA chips so that stupid software thinks you're auth'ed.
Re:Please understand... (Score:2)
Re:Please understand... (Score:3, Insightful)
And once MS requires it, how's Linux going to fit in there? I'd figure that MS TCPA computers would have to be signed to even speak to other MS machines. We can't have traffic going out of the network that isn't validated for internal traffic.
Re:Please understand... (Score:5, Insightful)
We are talking here about file formats 30 years old, or even less. Try to imagine what will happen in 200 years. Most of our history will be written to electronic media, and for people that will live in 200 years, the file format used for that media will very probably be undecipherable.
What is the solution? Some say that we need to convert all documents to a more recent file format every x years. That will really become a pain in the ass as the number of archives goes higher and higher.
Another trick could be to describe the file format in full and attach that description to every file. That, of course, brings up the problem of what file format to use for that description... (will even plain ASCII files still exist in 200 years? Maybe not, but I think it is reasonable to expect that people will at least still have an idea of how to read them...)
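To make the "attach the description to every file" idea concrete, here is a minimal sketch in Python. The marker line and the field names (`description-bytes`, `payload-bytes`) are made up for illustration and are not any real standard. The only assumption is the one the post already makes: that a future reader can still read plain ASCII.

```python
MAGIC = b"SELF-DESCRIBING-ARCHIVE 1\n"  # hypothetical marker line, not a real standard


def wrap(payload: bytes, description: str) -> bytes:
    """Prefix the payload with a plain-ASCII description of its format."""
    desc = description.encode("ascii")  # raises if the description is not pure ASCII
    header = b"description-bytes: %d\npayload-bytes: %d\n\n" % (len(desc), len(payload))
    return MAGIC + header + desc + payload


def unwrap(blob: bytes):
    """Recover (description, payload) using only the ASCII header."""
    if not blob.startswith(MAGIC):
        raise ValueError("missing self-description marker")
    # The header ends at the first blank line; everything after it is body.
    header, _, body = blob[len(MAGIC):].partition(b"\n\n")
    fields = dict(line.split(b": ", 1) for line in header.split(b"\n"))
    desc_len = int(fields[b"description-bytes"])
    payload_len = int(fields[b"payload-bytes"])
    description = body[:desc_len].decode("ascii")
    payload = body[desc_len:desc_len + payload_len]
    return description, payload
```

Someone who opens such a file in a text viewer sees the marker and the description first, which is the whole point. It does not solve the regress the post mentions: the description itself still has to be written in something the reader understands.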
Comparing this to the problem faced with dead languages gives a good idea of the repercussions... There are already countless documents written in ancient times that we cannot decipher because the language used to write them is lost. People work all their lives trying to understand a dead language. But with computers, we're not talking about something that happened 4000 years ago, but 30 years ago... That means that in the course of your lifetime, you could see file formats go obsolete 3 times!
Someone will need to find a solution for this, and preferably before the problem happens for real...
Re:Please understand... (Score:2, Funny)
Re:Please understand... (Score:4, Insightful)
The real worry I'd have is how someone will be able to get the stuff off of the media if the directory and interface standards change. Will their advanced computers even be able to read the disk to see that goatse.jpg is on that disk? Even if they had the algorithm to decode the image, they might not see it's there.
Re:Please understand... (Score:5, Insightful)
we have this problem... (Score:3, Interesting)
Re:Please understand... (Score:5, Insightful)
an archaeologist of tomorrow can figure out ascii.
To be sure.
And will they be able to figure out PowerPoint?
And how about Secure PowerPoint 2005 with automatic DocuSafe technology that incorporates encryption with a public key that is automatically downloaded over the network from microsoft.com after your VISA card number has been authenticated with citibank.com?
No, tomorrow's archaeologists will miss out on the whole indecipherable morass that is today's data formats.
Documents and presentations will look indistinguishable from random noise.
And, honestly, a lot of what gets attached in those formats looks that way already to me in 2003.
Re:Please understand... (Score:5, Insightful)
And this was with a language which itself was very easily mapped to the letters (every consonant mapped to a letter, vowels omitted).
The rules which encode a file may be much more complicated. Just look at the most common compression methods (Run Length for instance), and how they add another layer on top of the already encoded contents. And they remove something very important for decipherment from the data: the redundancy. Then the subjects that are stored in files are much more diverse. We have not only language, we have music and graphics, 3D data and cryptographic certificates, configuration files and program binaries.
Just being able to know what the file is about, and thus having an idea of how to get started, can prove to be more complicated than any decipherment of archaeological texts.
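For anyone who hasn't seen Run Length encoding, a minimal sketch in Python follows. The (count, value) byte-pair layout is just one simple variant picked for illustration, not the layout of any particular file format.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs; each run is capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte repeats and the count fits in one byte.
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode by expanding every (count, value) pair."""
    out = bytearray()
    for count, value in zip(encoded[0::2], encoded[1::2]):
        out += bytes([value]) * count
    return bytes(out)


print(rle_encode(b"AAAAAABBBCCCCCCCC"))  # b'\x06A\x03B\x08C'
```

The 17 input bytes shrink to 6, and the repetition that would tip off a code-breaker is gone: what remains is count bytes interleaved with data bytes, which is meaningless unless you already know the pairing rule.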
DMCA doesn't extend, for now. (Score:2)
Things such as the DMCA will become global. To the least common denominator...
Or at least if you want to trade with everyone else on the planet, so it's a pseudo-enactment...
Re:Please understand... (Score:5, Informative)
TCPA hardware is not the same as DRM, and is not evil
The TCPA hardware specifies a cryptography co-processor on the mainboard. This can be used for DRM, but it can also be used for offloading things like SSL from the CPU. Emulating the hardware would be no good. Under *NIX, it would just be mounted at /dev/crypto (or something), and emulated if the hardware were not available. It is the software which manages DRM.
Re:Please understand... (Score:3, Informative)
Before continuing to demonstrate your ignorance on this subject, you might wish to visit this site [nra.org] and enlighten yourself. At the very least, you might consider at least not automatically taking what these [bradycampaign.org] maroons [vpc.org] say as gospel. This [amazon.com] is also highly-recommended reading.
It's just a sug
Re:Please understand... (Score:3, Insightful)
Everyone I know that's involved in "sports shooting" also considers it to be practice for that mythical day that the evil man breaks in and tries to
Re:Please understand... (Score:2)
HA HA! (Score:5, Funny)
Re:Please understand... (Score:3, Informative)
Explain the Pyramids? (Score:5, Funny)
Re:Explain the Pyramids? (Score:5, Insightful)
The Egyptians could very well have written down the instructions for building them. There have been numerous opportunities for that information to have been destroyed. Or they may have viewed their construction as too sacred and only passed down information on a need-to-know basis.
Our problem is that we charge for rocks and lack the motivation. We just assume we couldn't build such things as they did but never really bother to try.
Ben
Re:Explain the Pyramids? (Score:2)
Linus got it right when he said not to have a backup, simply let other people mirror your work.
Of course, this makes a free market for what information gets preserved. It is interesting that free information has a greater chance of surviving a large worldwide disaster than proprietary information.
Re:Explain the Pyramids? (Score:3, Insightful)
Re:Explain the Pyramids? (Score:3, Funny)
Central Point Software (Score:5, Interesting)
Re:Central Point Software (Score:5, Funny)
Re:Central Point Software (Score:2)
KFG
full article text, no pass required (Score:2, Informative)
The only problem, of course, is that they don't know it. All the images are recorded in an obsolete digital format, JPEG, and nobody knows how to unscramble the data. As a r
Re:full article text, no pass required (Score:5, Insightful)
Re:full article text, no pass required (Score:2, Funny)
Yes, and jaywalking is illegal, too.
Re:full article text, no pass required (Score:5, Insightful)
Re:full article text, no pass required (Score:3, Interesting)
I don't like the idea of reposting an entire article on Slashdot, either, but there's no other way for some of us to read what's being talked about.
Re:full article text, no pass required (Score:2, Insightful)
Having the Day-Pass system is only useful if it actually works.
Re:full article text, no pass required (Score:3, Insightful)
What if the software archaeologists don't have the req
Re:full article text, no pass required (Score:4, Insightful)
Re:full article text, no pass required (Score:3, Insightful)
Any entity that begins to implement anti-consumer actions in order to stay afloat is doomed to begin with (RIAA, SCO, etc.) If you can't stay out of the red by simply providing your service with a *reasonable* amount of revenue-generating methods, then that should tell you that either:
a) You need better revenue-generatin
Re:full article text, no pass required (Score:4, Informative)
The Day Pass is a great idea, but every so often I notice it plain doesn't work. Today, I tried to go to the article mentioned here, only to be redirected again and again to the same partial-content page. The Sprint ad never appears. This is under Win 2000, both from IE and from Mozilla 1.4. I'd guess a technical problem on your (Ultramercial's?) side.
In these circumstances, I'd consider the posting of the entire article forgivable (although the poster didn't state Day Pass problems as the reason, which puts his/her motives in question). Otherwise, I agree it's rather uncivilized behavior.
Re:full article text, no pass required (Score:3, Interesting)
I thought that was a bug! Every time I have to search again for the article I wanted to read. Since you're using cookies anyway, why not store the article you read the teaser for in the cookie so you can be taken to the full article immediately after you view the ad -- or at least give it as an option.
Re:full article text, no pass required (Score:2, Interesting)
It's not like anyone here follows ad-links anyways.
Re:full article text, no pass required (Score:3)
Re:full article text, no pass required (Score:3)
While your idea may have ethical merit, that too takes precious time and energy for a proper cost/benefit analysis and a few philosophy pro
Is this irony? (Score:5, Funny)
So kudos for reposting this valuable information to Slashdot! Without the efforts of others like you, internet surfers in generations to come might never understand the importance of, well, the efforts of others like you.
Re:full article text, no pass required (Score:4, Informative)
Preserve the Hardware as Well? (Score:5, Interesting)
Re:Preserve the Hardware as Well? (Score:3, Interesting)
Re:Preserve the Hardware as Well? (Score:5, Interesting)
It was fictional, and very tongue in cheek, but it made an interesting point. How the hell will you play your archived media if you don't have a player? And, not just a player, but support equipment as well -- a display that can connect to the player, a power supply that is the right voltage, amperage, and number of cycles, compatible cabling, etc. It could turn out to be quite a trick to get all the requirements together, just to do something as simple as play an old tape.
Perhaps what's needed is to define a single "data archival standard", and by law require that it be backwards compatible with version 1 of the standard, forever. Then, convert all current data to the version 1 standard, once and for all. We have a good candidate right now: DVD-RW and CD-RW. Preserve those standards, so that all future disk players can at a minimum play current-day CD's and DVD's, and we might be ok. Of course, you'd have to use archival-quality CD's and DVDs, because the cheap ones only last five years (the good ones last a hundred or more, they've got extra coatings to prevent degradation, etc).
Why not? Current DVD players already accept CDs. Just take the current DVD writer as a standard and design all new devices to be backwards compatible (on physical size, too -- i.e. a current, standard-size CD should be usable).
Heh... (Score:3, Interesting)
Re:Heh... (Score:2)
Re:Heh... (Score:3, Insightful)
Software development as an art/craft/science/whatever you think it is has evolved rapidly. There are "fashions" in code - try reading 20 year old C code: the language itself hasn't really changed much, but you will immediately notice the difference. People have tried things that failed, and have found interesting solutions that are now forgotten. This will all be lost.
What would literature be like if we hadn't access to th
Knuth is only one foundation that won't be lost (Score:4, Interesting)
It really all started when some engineers decided that machine code was too hard and invented assembler. Nowadays it's not even necessary to know what a bit is or how an ALU works to make programs. Just point and click and you've got yourself a brand spanking new database app courtesy of VB.
No one ought to knock VB because it really is the best tool for what it does, but it also lowers the barrier to entry for would-be programmers. This can only lead to worse programs.
The most fundamental concept in computer science is logic, not algorithms (or worse, programming languages). If a 'programmer' hasn't written a program in a low level language like C or assembler, the hiring manager should beware. Without hands-on experience with the fundamentals of computer science that person is lacking at the most basic level, regardless of whether he knows 1 language or 50 languages. He is handicapped.
It's a good thing to abstract, but it's also important to remember and study the bases of our science.
Re:Knuth is only one foundation that won't be lost (Score:3, Insightful)
This is coming from someone who started in assembler and has been programming for over 20 years now (primarily various assembly, C, C++), but I completely disagree with that statement. It's all in the context. Applications are about solving problems and if VB is the best tool for a particular problem, then it and the programm
Re:Knuth is only one foundation that won't be lost (Score:5, Insightful)
Bullshit.
"Computer science is about computers in the same way astronomy is about telescopes" --Edsger Dijkstra
Programming isn't about knowing how to twiddle bits in registers or even how to leverage strengths of a particular processor.
Programming is about dealing with complex problems which can be solved by manipulation of information. I would say that the quality a programmer needs most of all is not logic or math, but just the ability to hold and manipulate large and complicated structures inside his head. And no, it doesn't have anything to do with assembler, low-level languages, ALUs, bits, etc. etc.
Re:Knuth is only one foundation that won't be lost (Score:3, Insightful)
Bah. (Score:2, Insightful)
That's like saying that a journalist is lacking in his ability to write if he's not fully competent in Latin. Just because someone doesn't know how to allocate memory doesn't mean he can't code in a language that does it for him automatically.
Coming Soon... (Score:4, Funny)
Re:Coming Soon... (Score:4, Funny)
"E"diana Jones:: "DONT LOOK AT IT MARION!!"
Old computer Scholar: "It's beautiful
Re:Coming Soon... (Score:3, Funny)
Isn't Music Software? (Score:2)
Re:Isn't Music Software? (Score:2)
Just a thought you guys.... (Score:4, Offtopic)
They changed to a registration/fee based model, but allowed 1 day passes for whatever reason.
Nothing can hurt them more than being slashdotted by a bunch of people using a day pass.
someone has already copied the contents of the article into a comment which is good because it saves them bandwidth, but
This is why things like the DMCA and DRM come about - people thoughtlessly violating other people's copyrights/etc, and/or taking their services for granted.
I'm no better than anyone else, I do the same thing.
I guess my point is: either support the people who provide services you enjoy (music, video, news, web content, porn, whatever), or quit complaining when they finally start defending themselves.
Re:Just a thought you guys.... (Score:3, Insightful)
Re:Just a thought you guys.... (Score:3, Informative)
Copying an entire article from another web site is also copyright infringement, unless of course the copyright terms of the article permit it.
Salon probably makes some money per page view. They want you to look at their web site, not copy text off of it. Copying an entire article is almost certainly copyright infringement, and makes whoever doe
Re:Just a thought you guys.... (Score:2)
It's a matter of survival (Score:5, Informative)
Technically, it's copyright infringement, but Salon isn't going to devote resources to suing Slashdot or Slashdot readers. If we were going to go that route, we'd start with the Freerepublic assholes, who actively want us to go bankrupt and do everything they can to help us down that road. To slashdot readers, the best appeal I can make is simple.
We want to make a living at what we do, so we can keep doing it. I want to keep paying great technology writers like Rachel Chalmers and Sam Williams to do interesting stories. If we convince enough readers to watch our ads or subscribe, we'll pull off this magic trick. So basically, the way I see it, any time a Slashdot reader posts the full text of a story on Slashdot, it's a vote against our survival, which is ironic, since you wouldn't be posting the stories if you didn't think there was some merit in them, right?
Re:Just a thought you guys.... (Score:2, Insightful)
Assuming that people go to the site to read the article (as opposed to reading it here from the comment in which the whole article was posted) it would drive up the number of ads served which would be a good thing. I would think
Fair Use (Score:2, Insightful)
If it doesn't then there is still the matter of the government (the US at least) being able to do whatever it pleases with copyrighted material. In this case the government's authority to copy what it wants is a good thing.
The Library of Congress is already making archival copies of copyrighted music and it is going to continue this despite any hypothetical protestations of the RIAA. Why? Because it is deemed necessary for the preservation of culture.
Storage of old data / hardware (Score:5, Interesting)
Re:Storage of old data / hardware (Score:2, Insightful)
You're right, those things cost money to keep them going. And for what? A novelty? These things were doing any wor
securing for limited times (Score:2)
Particularly since the expressed intention of copyright is to give protection to creators for a limited time (and then have the work pass into public ownership), from Article I of the U.S. Constitution:
To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries;
No, it won't (Score:3, Funny)
Consider conferences on Geek Culture someday, where Prof. Bipperton Fusslebeak delivers a sad, academic commentary on contemporary culture:
"An Analysis of the Correlation between Increased Use of Open Source Software, and Slashdot Posts Centered Around Deviant Sexual Behaviors in the Post-.Com Era".
I'm doing my part (Score:4, Funny)
I'm doing my part to make sure that the porn images of the Internet don't meet this similar fate. I have recorded my voice describing each of the images in my collection, and encoded it into the open-source OGG format. Much of the recording has consisted of little more than "Mmmmmmmmmmm, yeah baby", but I think that speaks volumes.
This is a major reason... (Score:3, Interesting)
Haunt (Score:3, Interesting)
I went looking for it again a couple of years ago, but it has been lost. It was written in a language which no longer exists: OPS-4. Even the original source code has disappeared. All that is left is a partial port, to another language which no longer exists (OPS-5). Here is a brief description by the author [umich.edu].
Looking at the source code for the partial port gives some of the feel of the game:
Other technologies go obsolete too, So what? (Score:5, Interesting)
Why is obsolete software technology worth preserving where obsolete manufacturing technologies are not? In 100 years, will we really need access to the billions of JPEGs that were spewed out by digital cameras everywhere? I am not arguing for ignoring history (even though those that learn from history are also doomed to repeat it), but I am wondering about the double standard. What realms of human knowledge and invention are worth saving, and which are not?
BTW, for the record, I still have old documents and applications from my Mac 128k and I might even have a paper tape copy of an old APL program that I wrote 25 years ago. But then I am a certified packrat.
Re:Other technologies go obsolete too, So what? (Score:4, Insightful)
Some of this data is useless, today. In the future, someone might find it useful. Do we allow this data to degrade, and then possibly launch a new satellite to collect new data? In some cases that's not even possible: how do you gather climate data from the 1970s?
The main problem is the tape backup companies no longer support the old tape drives, and new tape drives don't support the old tapes and tape formats.
Funny thing is, 5 years ago, I was there with everyone else saying that we should put this data on CD ROM, because that format will never, ever, ever go away. Now, I'm not so sure - if they ever straighten out the DVD standard, I can see a future, 10 years from now, when you won't be able to buy a new device that can read a CD ROM.
A joke (Score:5, Funny)
The year was 2015. Joe, a programmer, was getting up in years and decided he wanted to have his body frozen after he died. He made the arrangements, and when the time came, he was frozen and placed in a government facility. Time passed, and he was forgotten.
Jump ahead a few centuries... suddenly Joe finds himself conscious again! He is on a lab table surrounded by strange looking people in uniforms. Their leader, speaking through a translator, welcomes Joe back to life.
Joe is amazed! There are so many questions he wants to ask, but first he says, "Why did you bring me back to life?"
The leader answers, "Well, the year is 9999. Y10k is coming up, and your file says you know Cobol."
Aha! (Score:3, Funny)
*snaps whip*
"Fetch me another Mountain Dew, Shorty!"
Another red herring from salon? (Score:5, Insightful)
If you have the source code for something then you have no cause to fear the DMCA, since you don't need to decrypt it. And if you don't have the source code, where is the value? Is there really any value in running lotus 123 for the Apple//? Perhaps if you have an Apple//, but so what? You cannot "fly over the code" from any height (as was mentioned in the article) because you don't have any code to fly over. You have an executable, and the "structure" there is quite different than looking at source code.
If you want source code for DOS, hit freedos.org and download it. It's not Microsoft's source, but so what? It does the very same job and, in many cases, it's superior to the original. Works that have value will be replicated and emulated; works that have no value simply have no value - where is the need (or logic) in "preserving" them?
Formats not the problem (Score:4, Insightful)
Take the Domesday Project (in the UK) as an example: an Acorn Archimedes laserdisc full of content relating to life in the 20th century. The problem came when they wanted to get the data off... and couldn't easily find a compatible laserdisc reader.
Of course, the format of the data is an issue. But if you can't get the data off the media, then the format of it isn't going to matter in the slightest.
Maybe not legally, but it *will* be preserved... (Score:4, Interesting)
He goes on to explain that they use these 'ancient' systems to understand and gain insight into current systems, adding that nothing really changes, it just gets added to (and that no one really understands the full system).
I believe Gibson's insight will be proven real, and that Software Archaeology is *essential* for the future, DMCA or no DMCA.
The alternative is stagnation in the evolution of computer systems. This cannot happen, although it might in America.
The part/parts of the World that don't succumb to DMCA fever will become the new tech leaders (and probably a great immigration target for us lot!)
DMCA is already taking bite (Score:2)
Fine if the manuals are still printed and available, however such manuals are hardly a big money spinner for the companies involved.
Re:DMCA is already taking bite (Score:2, Informative)
IDSA spiders are trolling around and spotting FTP/web sites with "video game" in the text that offer files like pacman.zip and streetfighter2.zip for download.
C&D notices are automatically being sent. None of it has to do with the DMCA, but with regular old copyright law, since the IDSA assumes the games are being put up for download.
Whatshisface (who had the big manual site and shut it down) just couldn't be bothered to explain to anyone at the IDSA what f
sounds familiar... (Score:2)
Isn't that the basis for just about every post-apocalypse story out there? It's scary to think that we are already seeing signs of it.
Even fictional characters think the DMCA is evil!
archive.org (Score:3, Informative)
Reverse engineering (Score:2, Interesting)
What, Me Pedantic? (Score:3, Interesting)
"It might seem silly now but put yourself 1,000 years in the future," says Booch, chief scientist at IBM's Rational Software subsidiary. "It's not too hard to imagine."
This assumes that (a) humans will still be drinking coffee 1,000 years from now, (b) we will still have college professors and (c) they will still have need of drink coasters.
I believe that 1,000 years from now we will consume our caffeine in pill form only, be schooled by robots and will obtain our liquids from intravenous bags.
Bloatware (Score:3, Insightful)
With each new version, the software gets bigger and bigger and bigger. It is an archaeological wonder in itself. Another name for this coding style is bloat. Linux has many of the same things going on.
This argument about the need to preserve prior formats has been around for quite awhile. The truth of the matter is that software is largely an evolutionary process. Most file formats build upon the past, so there is a tendency for software to naturally preserve its path.
Of course, for Grady Booch, who wants to be recognized as an intellectual giant a thousand years from now, the main question is whether his name will evoke the same awe as, say, Euclid and Archimedes. He is, after all, one of the trinity of OO modeling approaches.
Mandatory source code deposit (Score:4, Insightful)
Don't just complain, DO SOMETHING (Score:4, Informative)
Difficult not impossible (Score:4, Interesting)
Eyeglasses (Score:4, Interesting)
Those of you of moderate to low income (I'm talking. . . making less than 7 figures per year, to put it in perspective with pre-Renaissance nobility), who require corrective eye lenses, imagine yourself unable to beg, borrow, or steal a pair of glasses for yourself. Even crude ones.
Eventually, the secret got out, and now we have a global multi-billion dollar industry.
In other words, the very concept of IP is just plain evil.
Rock^H^H^H^H HTML will never die (Score:3, Interesting)
The only thing we need to do is maintain our compliance to standards! Because barring the end of the world, HTML and other standards will never die. They'll just get turned into kernel options with a default of NO.
Anyone have a spare 8" floppy drive? (Score:4, Insightful)
My dad still has a program he wrote on punch cards someplace.
That's the trouble, isn't it? Even if the data survives, the hardware to read it might not.
Anyone else think of Vinge? (Score:4, Informative)
In that book, code-as-data is taken to an extreme, and the best programmers have the title "Programmer Archaeologist", since they spend little time writing new code; instead they look through old code to find something written for a similar situation. It isn't that old programmers are better-- it's that the software contains facts and information that are of value.
Whereas on Star Trek someone might look through an ancient captain's log to learn about a bizarre planet/new race/weird disease/strange technology, in Vinge's book that sort of specialized information is stored in the source code for software that was written at the time to deal with the situation.
Dark Ages II (Score:3, Informative)
Re:here's an easy howto: (Score:5, Interesting)
Re:here's an easy howto: (Score:3, Funny)
What about CD-R's exposed to mountain dew?
Re:here's an easy howto: (Score:3, Insightful)
One modern 80GB hard disk.
80GB = 80,000,000,000 bytes = 80,000,000,000 ASCII characters.
One standard printed US-letter-sized page is 80 x 60 characters, or 4800 characters.
80,000,000,000 characters / (4800 characters/page) = 16,666,667 pages (rounded off).
This is potentially ju
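The page count worked out above is easy to double-check. A quick sketch in Python, under the same assumptions the post makes (one byte per ASCII character, every page completely full, a decimal 80 GB disk); the function name is just for illustration:

```python
def pages_needed(total_chars: int, columns: int = 80, rows: int = 60) -> int:
    """Number of full-text pages needed to print total_chars characters."""
    chars_per_page = columns * rows          # 80 x 60 = 4800 characters
    # Ceiling division: a partly filled last page still counts as a page.
    return -(-total_chars // chars_per_page)


DISK_BYTES = 80 * 10**9                      # 80 GB, decimal gigabytes
print(pages_needed(DISK_BYTES))              # 16666667
```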