An Update On Microsoft's 'GitHub Arctic Vault Program' (news.com.au)
news.com.au reports:
The GitHub Arctic Vault program is part of the now Microsoft-owned code repository GitHub...aimed at preserving the information for generations to come...
"We chose to store GitHub's public repositories in the Arctic World Archive in Svalbard [a Norwegian island] because it is one of the most remote and geopolitically stable places on Earth and is about a mile down the road from the famous Global Seed Vault," said GitHub vice president of special projects Thomas Dohmke. Mr Dohmke said open source code in particular was worth preserving... "Ultimately, it's time to create multiple durable backups of the software our world depends on..." Other treasures include the original source code for MS-DOS (the precursor to Microsoft Windows), the open source code that powers Bitcoin, Facebook's React, and the publishing platform Wordpress...
"The Arctic Code Vault was just the beginning of the GitHub Archive Program's journey to secure the world's open source code," GitHub vice president of special projects Thomas Dohmke told news.com.au. "We've partnered with multiple organisations and advisers to help us maximise the GitHub Archive Program's value and preserve all open-source software for future generations." One of those partners is Norwegian archival experts Piql, who specialise in very-long-term data storage. The company uses around 200 silver halide and polyester film reels designed to last a thousand years to store the information...
Strange choice of software (Score:2, Insightful)
MS-DOS, Bitcoin, and React???? These are the important pieces of software that humanity will care about after a nuclear apocalypse? Yeah right...
Re: (Score:3)
Re: (Score:2)
Wait a minute! What about cute cat videos? After the apocalypse, the most important thing would be to lift people's spirits, hence cats on Roombas would be very high on the list of priorities.
Re: (Score:2)
GitHub is mostly source code, not binaries. Preserving it requires complete digital precision, especially if we want to preserve the history. Most video or audio archival means can survive a few percent of the data being erratically lost. Digital source code is not so tolerant.
Re: Strange choice of software (Score:2)
Digital source code storage can trivially be made error-tolerant, simply by storing multiple copies with checksums (or, if you want to be more space-efficient, by using an equivalent but more sophisticated encoding like forward error correction). Of course that increases the amount of storage space you'll need, but compared to the storage requirements for video, even the most wasteful redundancy would be small.
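For illustration only, here is a minimal Python sketch of the "multiple copies with checksums" idea. The function names (`store_with_redundancy`, `recover`) and the choice of SHA-256 are assumptions made for this example, not anything the archive is known to use.

```python
import hashlib
from typing import List, Optional, Tuple

# Hypothetical record layout: (copy of the payload, hex SHA-256 of the original)
Record = Tuple[bytes, str]


def store_with_redundancy(payload: bytes, copies: int = 3) -> List[Record]:
    """Return `copies` identical records, each carrying its own checksum."""
    digest = hashlib.sha256(payload).hexdigest()
    return [(payload, digest) for _ in range(copies)]


def recover(records: List[Record]) -> Optional[bytes]:
    """Return the first copy whose checksum still matches, or None."""
    for data, digest in records:
        if hashlib.sha256(data).hexdigest() == digest:
            return data
    return None  # every copy was damaged


if __name__ == "__main__":
    source = b'int main(void) { return 0; }\n'
    records = store_with_redundancy(source, copies=3)
    # Simulate damage to two of the three copies.
    records[0] = (b'int main(void) { return 1; }\n', records[0][1])
    records[1] = (b'', records[1][1])
    assert recover(records) == source
```

This only detects damage and falls back to an intact copy; it cannot repair a copy, which is where forward error correction would come in.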
That said, cockroach-DNA storage is the most robust solution. You won't be abl
Re: (Score:2)
> Digital source code storage can trivially be made error-tolerant, simply by storing multiple copies with checksums (or, i
Simply put, checksums do not prevent "rm -rf /". Checksumming and snapshots are valuable against certain classes of failure, not against all. Nor do checksums prevent obsolescence or failure of the systems used to _read_ from older media, especially if the data is stored encrypted and the key is lost or the index to the archival system is lost.
DNA storage is not digitally robust. It's hi
Re: Strange choice of software (Score:4, Informative)
I know nobody reads the fucking article but...
The 02/02/2020 snapshot archived in the GitHub Arctic Code Vault will sweep up every active public GitHub repository, in addition to significant dormant repos. The snapshot will include every repo with any commits between the announcement at GitHub Universe on November 13th and 02/02/2020, every repo with at least 1 star and any commits from the year before the snapshot (02/03/2019 - 02/02/2020), and every repo with at least 250 stars. The snapshot will consist of the HEAD of the default branch of each repository, minus any binaries larger than 100KB in size - depending on available space, repos with more stars may retain binaries.
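Those inclusion rules can be sketched as a small predicate. This is only a reading of the quoted text, not GitHub's actual selection code: the `Repo` class and `in_snapshot` function are hypothetical, the announcement year is assumed to be 2019, and 02/03/2019 is read as February 3, 2019.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Dates taken from the quoted text; the announcement year (2019) is an assumption.
ANNOUNCEMENT = date(2019, 11, 13)
YEAR_BEFORE = date(2019, 2, 3)
SNAPSHOT = date(2020, 2, 2)


@dataclass
class Repo:
    """Hypothetical summary of a repository at snapshot time."""
    stars: int
    last_commit: Optional[date]  # None if there are no commits at all
    public: bool = True


def in_snapshot(repo: Repo) -> bool:
    """Apply the three inclusion rules from the quoted text."""
    if not repo.public:
        return False
    if repo.stars >= 250:  # significant repos, even if dormant
        return True
    if repo.last_commit is None:
        return False
    if ANNOUNCEMENT <= repo.last_commit <= SNAPSHOT:  # recently active
        return True
    # At least one star and some activity in the year before the snapshot.
    return repo.stars >= 1 and YEAR_BEFORE <= repo.last_commit <= SNAPSHOT


if __name__ == "__main__":
    assert in_snapshot(Repo(stars=0, last_commit=date(2019, 12, 1)))
    assert not in_snapshot(Repo(stars=0, last_commit=date(2019, 6, 1)))
```

Checking only the most recent commit is enough here, because any commit inside a window that ends at the snapshot date implies the latest commit is inside it too. The rule about dropping binaries over 100KB is left out of the sketch.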
They just listed some of the best known repositories.
dotnet/core
torvalds/linux
python/cpython
bitcoin/bitcoin
rails/rails
docker/machine
openssl/openssl
nodejs/node
Homebrew/brew
php/php-src
twbs/bootstrap
microsoft/TypeScript
apache/hadoop
v8/v8
Alamofire/Alamofire
gatsbyjs/gatsby
fastai/fastai
jimweirich/builder
zeit/next.js
WordPress/WordPress
rust-lang/rust
golang/go
angular/angular
jquery/jquery
ruby/ruby
facebook/react
CocoaPods/CocoaPods
jupyter/notebook
zeromq/libzmq
postgres/postgres
microsoft/MS-DOS
Netflix/chaosmonkey
robbyrussell/oh-my-zsh
xamarin/xunit
grafana/grafana
graphql/graphql-js
github/gh-ost
rspec/rspec
libgit2/libgit2
Many more
Re: (Score:1)
Re: (Score:2)
You're thinking of DOS 3.1+
Only DOS 1.25 and 2.0 are available on GitHub.
I don't see the point (Score:3)
This smacks of "publicity stunt". Storing physical seeds in the arctic makes some sense - but there's nothing magical about storing data in a cold, remote location versus somewhere else.
Additionally, this source code is already mirrored in numerous locations around the world. If some global cataclysm was ubiquitous enough to somehow destroy all those copies... it is unlikely any survivors will care in the least about setting up a new Wordpress server.
Re:I don't see the point (Score:4, Funny)
What about my critical Libra holdings? I might need to buy food, water, and shelter, how will I access those without this very important archive?
Re: (Score:2)
Incoherent, unmerged, forked, and obsolete clones of git repos scattered around the world are vulnerable to profound "split brain" merge difficulties. This is especially true for the "laptop" environments where developers do local work and never remerge from upstream. GitHub has become a critical reference site for the Linux kernel and for git software itself at https://github.com/git/git/ [github.com]. github.com is potentially vulnerable to flushing, to people accidentally or deliberately deleting the back-end stora
Re: (Score:2)
Re: (Score:2)
From the article:
"One of those partners is Norwegian archival experts Piql, who specialise in very-long-term data storage. The company uses around 200 silver halide and polyester film reels designed to last a thousand years to store the information... "
With that said, a thousand years ain't shit. (But I am looking forward to Firefox version 3284751063 and Windows Eleven Billion.)
Micro$oft getting ready to kill us all off. (Score:2, Funny)
As soon as they have all the Open Source code locked away, they're getting ready to wipe us out.
Re: (Score:2)
Re: (Score:2)
Simply put, no. This is a backup, not the primary repository for anything.
React! Ha! (Score:3)
React can't even keep their core design patterns consistent for a couple years at a time. What benefit at all [besides marketing BS] could there be to putting the source code into a vault like this for potentially thousands of years?
Original source code for MS-DOS? (Score:1)
Who cares (Score:2)
Nobody competent is on GitHub anymore since MS bought it. And none of that code is of any use anyways without the machines to execute it. And if computers have to be re-developed, it will be easier and cheaper to write what software is needed instead of getting some ancient, probably bug-ridden stuff to work again.
Re: (Score:2)
Are you under the impression that GitLab, SourceForge, Bitbucket, or any other commercial or personal git repository will be stable as a corporation for the next 100 years?
Re: (Score:2)
Are you an MS shill? The competent people have moved off. GitHub is dying.
Goddammit, I'm getting old when ... (Score:2)
... a story needs this:
... MS-DOS (the precursor to Microsoft Windows) ...
Had we stuck with DOS ... (Score:2)
... with the speed increases and storage capacity we have now, we'd be slicker'n mockingbird shit on a sycamore limb.
When the aliens land (Score:2)
When the aliens land and find the Arctic Vault, they'll be wondering what the hell "Tetris" was and why it was so fucking important to preserve it for millennia. "It must have been important, look at all the trouble they took to preserve it!"
wikileaks (Score:2)
Depends who wins the 2020 election (Score:2)
Cobol (Score:2)
Most of the important code needed to reboot the economy is still written in Cobol.
And if the Javascript code suffered from bitrot, would anyone even notice?
Typically Microsoft (Score:2)
Always just in time...