RDF For Desktop Metadata?
claes writes "There is an article "Metadata for the desktop" that suggests that RDF should be used to describe data in desktop environments. This is an interesting idea. RDF is already used by Creative Commons to attach license metadata to its works. Mozilla also supports it.
RDF was designed for the web, but can it also find its way to the desktop? And what metadata is most important to describe?"
The killer app for metadata on the desktop (Score:5, Funny)
Suppose today I want to see shaved asian hardcore action. Now provided that metadata searches are integrated into the OS(like they will be in Tiger), all I need to do is a quick metadata search on my hard drive and boom, there is what I am looking for.
I mean, provided there was a decent standard (a porn standards body would rule!) and good regex capabilities built into the OS, I would be willing to pay for porn. I know that there are comments built into the JPEG standard, but there are all sorts of porn file formats, and it would be helpful to have a universal standard across them. It saves time, and beats trying to search on Google and going through a lot of crap just to get to something good. I am a man on the run, I have places to go, I can't be bogged down by my porn. Plus, think of the people that get to categorize this stuff (well, the fun stuff anyway, not goatse), what an awesome job that would be!
I should probably post AC, but I figure this post is bound to earn me at least one fan and/or freak.
Re:The killer app for metadata on the desktop (Score:5, Funny)
Just check your email. If it's not there now, it will be soon enough.
Re:The killer app for metadata on the desktop (Score:2)
Are you, perchance, a geek?
Re:The killer app for metadata on the desktop (Score:2)
I guess even more so, in that I watch but have yet to participate in it.
Geek squared!
Re:The killer app for metadata on the desktop (Score:1)
Calling Autopr0n! (Score:3, Funny)
Re:The killer app for metadata on the desktop (Score:1, Funny)
Re:The killer app for metadata on the desktop (Score:2)
Definition:...? (Score:5, Interesting)
Re:Definition:...? (Score:2, Informative)
Re:Definition:...? (Score:2, Insightful)
Re:Definition:...? (Score:4, Insightful)
I don't see why the thread starter wanted a definition from Slashdotters in the first place, since it's already pretty well described [wikipedia.org] and AFAIK the word doesn't have several meanings.
Re:Definition:...? (Score:1)
Re:Definition:...? (Score:3, Informative)
Say you have a digital photo. It's from a vacation you took in 2002 to Hawaii, and it shows you, your partner, and one of your children, but not your other kids and no pets. All that info could be kept as metadata of those pictures, and more.
The same can be done for finance info for the year 1999 for you, or 2001 for your partner, or music files bought from a certain place, by a certain artist and band.
While each of the filetypes above can have their own metadata (exif for images, co
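To make the photo example concrete, here is a minimal Python sketch of metadata as plain key/value records that can be filtered. The field names (`year`, `place`, `people`), the file names, and the sample values are all made up for illustration; they are not taken from EXIF or any other real standard.

```python
# Hypothetical metadata records; field names, file names and values are illustrative only.
photos = [
    {"file": "beach.jpg", "year": 2002, "place": "Hawaii", "people": {"me", "partner"}},
    {"file": "luau.jpg", "year": 2002, "place": "Hawaii", "people": {"me", "child"}},
    {"file": "snow.jpg", "year": 2001, "place": "Vermont", "people": {"partner"}},
]


def find(records, **criteria):
    """Return file names whose metadata matches every criterion.

    A set-valued field matches when it contains the requested value.
    """
    hits = []
    for record in records:
        for key, wanted in criteria.items():
            value = record.get(key)
            if isinstance(value, set):
                if wanted not in value:
                    break
            elif value != wanted:
                break
        else:
            # No criterion failed, so this record is a hit.
            hits.append(record["file"])
    return hits


print(find(photos, year=2002, place="Hawaii", people="partner"))
```

The point is only that once the facts are stored next to the file, "photos from Hawaii in 2002 with my partner in them" becomes a simple filter rather than a hunt through folders.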
Re:Definition:...? (Score:2)
Re:Definition:...? (Score:3, Funny)
Re:Definition:...? (Score:2)
Re:Definition:...? (Score:2)
Your computer already stores data about its files and such, but that metadata's readability by humans is a bit questionable (all the concepts except file name and an eventual comment only make
Implicit feedback for filesystem information (Score:4, Informative)
The big concern is keeping this data protected and private. You don't want to share all of your metadata with everyone, so security of these systems should be something to look at carefully.
What happened to forked files? (Score:5, Insightful)
While MacOS was at a disadvantage being one of the only ones to use it, wouldn't it have been an excellent advantage for ALL filesystems to be forked?
(I don't know the answer to this - anyone who knows more about filesystems, give your thoughts)
Re:What happened to forked files? (Score:2)
NOTE: take this with a grain of salt, I know very little about filesystems.
Re:What happened to forked files? (Score:1)
Re:What happened to forked files? (Score:5, Informative)
I think the new filesystem WinFS in Longhorn is basically just an evolution of NTFS streams to make them more accessible for the users. They've always been there, just not very accessible besides a limited set of text fields in the file properties dialog box in Windows. (i.e. they've always been able to hold custom data and have custom key names)
Re:What happened to forked files? (Score:3, Informative)
Well, one problem immediately springs to mind: the translation between different metadata formats. It's already a pain in the butt when transferring files of not-so-popular types to the Mac.
The second gripe I have with the Mac is that it's so friggin' hard to edit the metadata. AFAIK you can't even do it on OS 9 without software. Now assuming the user
Re:What happened to forked files? (Score:3, Insightful)
> for ALL filesystems to be forked?
Yes, but the trouble of compatibility remains. But there is a simple solution for this: fork as dir bundles: Instead of a file with a metadata fork you simply put the metadata file and the datafile into a dir and give that folder the name of the datafile. The current users copy the dir around and use its contents. But modern OSes treat the dir as if it is the datafile when the user interacts with it.
The metadata file sa
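The "fork as dir bundle" idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the metadata file name `metadata.json`, the JSON encoding, and the function names are hypothetical and do not come from the comment or from any existing OS.

```python
import json
import tempfile
from pathlib import Path

METADATA_NAME = "metadata.json"  # hypothetical file name, not from the comment


def write_bundle(parent, name, data, metadata):
    """Create a directory called `name` holding the data file and its metadata."""
    bundle = Path(parent) / name
    bundle.mkdir(parents=True, exist_ok=True)
    # The data file keeps the same name as the enclosing directory.
    (bundle / name).write_bytes(data)
    (bundle / METADATA_NAME).write_text(json.dumps(metadata), encoding="utf-8")
    return bundle


def read_bundle(bundle):
    """Return (data, metadata) for a bundle directory."""
    bundle = Path(bundle)
    data = (bundle / bundle.name).read_bytes()
    metadata = json.loads((bundle / METADATA_NAME).read_text(encoding="utf-8"))
    return data, metadata


with tempfile.TemporaryDirectory() as tmp:
    b = write_bundle(tmp, "report.txt", b"hello", {"author": "someone"})
    print(read_bundle(b))
```

Because the bundle is an ordinary directory, any tool that can copy directories carries the metadata along, which is the compatibility argument being made above.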
Re:What happened to forked files? (Score:2)
Re:What happened to forked files? (Score:1)
First off, the ability to use file type and other arbitrary metadata still exists in OSX (or HFS+, as the case may be). (More here [arstechnica.com].) This is above and beyond the much maligned resource fork.
The real issue both with resource forks and (to a lesser extent) filesystem level metadata is inter-system transport, ie how do you ftp the metadata along with the file. This is what made resource forks such a PITA.
Apple, it seems, has now moved away [apple.com] from putting the metadata in the FS,
Re:What happened to forked files? (Score:2)
Re:What happened to forked files? (Score:2)
People used BinHex format for that, back in the MacOS 5-ish days, and there was nothing about it that was inherently a pain. It was just that in those days, the open-source movement hadn't really taken off, and there were a lot of people still wasting their time on the dead-end shareware scene.
Re:What happened to forked files? (Score:2)
Re:What happened to forked files? (Score:2)
A VLIR file would have one sector, and that one sector pointed to multiple other sectors. One of the sectors was used for the "information sector" (info on the file), and simple VLIR files would then have the data in one of the other pointers.
More complex applications, like geoWrite, would use one pointer per page of the document, this
Re:What happened to forked files? (Score:2)
Re:What happened to forked files? (Score:2)
1) The thumbnail file can get corrupted and the folder cannot be viewed.
2) The thumbnail file takes space.
3) The thumbnail file cannot be copied -- so explorer complains every time you do a select-all of the folder, or try to copy the file.
4) If you burn the folder to disk, it prompt
Integration (Score:5, Interesting)
Re:Integration (Score:1)
What is wrong with you people? (Score:2, Insightful)
But why oh why do people think that XML-based solutions are the way to go? An RDF solution would be bloat beyond belief. Ok, so it's not that bad for a few files, but when we get down to it - we don't have just a few files. We have plenty of them.
So why not use something smaller? A simpler protocol?
We can still have RDF-frontends for those that crave their daily XML-fix. Get real.
Re:What is wrong with you people? (Score:2)
RDF primer [w3.org]. At first I thought it was extremely overcomplicated, but after reading some more I started to grasp the concepts. And they are not about storage formats. They are about semantics.
Re:What is wrong with you people? (Score:2)
And N3 [or Turtle] is a far better serialization than XML.
Semantics are important, but _agreement_ is even more so. The hope of RDF is that when we get away from the silliness of XML and start agreeing about how to speak about relations [in terms of SPO], we can start talking about more interesting things like schema and semantics.
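For anyone who hasn't met SPO before: every statement is a (subject, predicate, object) triple, independent of whether it is written down as XML, N3 or Turtle. A minimal Python sketch, with made-up subjects and values and a hypothetical `ex:` namespace (only the `dc:` prefix is meant to suggest Dublin Core), might look like this:

```python
# Triples as (subject, predicate, object); all subjects and values below are made up.
triples = [
    ("file:song.mp3", "dc:creator", "Some Band"),
    ("file:song.mp3", "dc:title", "Some Song"),
    ("file:photo.jpg", "dc:creator", "Alice"),
    ("file:photo.jpg", "ex:takenIn", "Hawaii"),  # "ex:" is a hypothetical namespace
]


def match(store, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in store
        if (s is None or ts == s) and (p is None or tp == p) and (o is None or to == o)
    ]


print(match(triples, p="dc:creator"))
print(match(triples, s="file:photo.jpg"))
```

Note that the query code never needs to know which vocabulary a predicate comes from, which is roughly why agreeing on the triple shape comes before agreeing on schemas.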
How much do you pay for HDD's (Score:2)
I have 300,000 files on my Windows box (Score:3, Informative)
The vast majority are very small files. How much more space would be required to give each one some RDF? And remember disk space is allocated in terms of sectors, or sometimes in blocks of several sectors, so small files waste proportionately more space.
And that's just on the Windows installation for my PC. I also have Slackware Linux and BeOS on other partitions. Quite likely there are very nearly a million files on my PC alon
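A rough back-of-the-envelope version of that worry, as a Python sketch. The numbers are assumptions of mine, not measurements: 300 bytes of RDF per file, a 4096-byte allocation unit, and each description stored as its own separate small file.

```python
import math


def allocated_bytes(size, cluster_size=4096):
    """Bytes actually reserved for a file of `size` bytes, rounded up to whole clusters."""
    if size == 0:
        return 0
    return math.ceil(size / cluster_size) * cluster_size


def sidecar_overhead(file_count, metadata_size=300, cluster_size=4096):
    """Total disk space used if every file gets its own small metadata file."""
    return file_count * allocated_bytes(metadata_size, cluster_size)


total = sidecar_overhead(300_000)
print(total, "bytes, about", round(total / 2**30, 2), "GiB")
```

Real filesystems may pack tiny files more tightly than one allocation unit each, so treat this as an upper bound under the stated assumptions rather than a prediction.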
This is largely irrelevant if you have experience (Score:4, Insightful)
For one thing, I always give my filenames relevant titles, not things like document06.doc.
Also, I already know how to search through files for content using basic grep or advanced Windows searching.
I mean, sure, meta data like ID3 tags for MP3s that I steal offline are important because my Nomad mp3 player indexes based on that info, but in general I'd say meta data is not quite as important as some may suspect.
Re:This is largely irrelevant if you have experien (Score:2)
Re:This is largely irrelevant if you have experien (Score:2)
Anyway, the grandparent has it exactly wrong: "normal" users who won't correctly name things and store them in a badly-thought out directory tree will
Re:This is largely irrelevant if you have experien (Score:2)
FS support for metadata (Score:5, Interesting)
I've heard the NTFS file system is designed to allow the system to add any number of properties (besides the obvious filename, last access time and permissions) to any stored file. This is likely to be exploited by Longhorn, which is planned to be capable of appending metadata to newly created files (for example, if you download a file from the Internet, the system would likely append an Originated-From-URL property to it).
What I wonder is, is there any filesystem in the FOSS world that supports something like this, or are there plans to make it supported before 20??, when Longhorn hits the stores? I see this as a critical feature that must be made available by non-Windows OSes.
Re:FS support for metadata (Score:3, Insightful)
Longhorn is using WinFS, which afaik is just a metadata layer slapped on top of NTFS.
Re:FS support for metadata (Score:1)
Quoting from http://www.digit-life.com/articles/ntfs/ [digit-life.com]:
Each file on NTFS has a rather abstract constitution - it has no data, it has streams. One of the streams has the habitual for us sense - file data. But the majority of file attributes are also streams! Thus we have that the base file nature is only the number in MFT and the rest is optional. The given abstraction can be used for the creation of rather convenient things - for example it is possible to "stick" one more stream to a file, having recorded
Re:FS support for metadata (Score:2)
The storage engine for WinFS will come from the MS SQL team, so that's hardly "slapped on top".
Re:FS support for metadata (Score:5, Informative)
Actually, you can. To add a metadata item called "hidden.txt" to a file called picture.jpeg, just type on the command line:
notepad picture.jpeg:hidden.txt
Notepad should say that it "created the file." You should notice that no new files have been created: just look for them with explorer. But you can later open this "file" and read and edit it.
You can do this with any file with any metadata name.
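The same `file:stream` naming can be used from a script. Below is a small Python sketch; the helper names are my own, and the write and read helpers assume you are on Windows with the file on an NTFS volume (elsewhere they refuse to run rather than quietly creating an ordinary file with a colon in its name).

```python
import sys


def stream_path(file_path, stream_name):
    """Build the 'file:stream' name used on the command line above."""
    return f"{file_path}:{stream_name}"


def write_stream(file_path, stream_name, text):
    """Write text into a named stream. Assumes Windows and an NTFS volume."""
    if sys.platform != "win32":
        raise OSError("named streams need Windows and an NTFS volume")
    with open(stream_path(file_path, stream_name), "w", encoding="utf-8") as f:
        f.write(text)


def read_stream(file_path, stream_name):
    """Read text back from a named stream. Assumes Windows and an NTFS volume."""
    if sys.platform != "win32":
        raise OSError("named streams need Windows and an NTFS volume")
    with open(stream_path(file_path, stream_name), "r", encoding="utf-8") as f:
        return f.read()


print(stream_path("picture.jpeg", "hidden.txt"))
```

On a suitable machine, `write_stream("picture.jpeg", "hidden.txt", "shot in 2002")` followed by `read_stream("picture.jpeg", "hidden.txt")` should round-trip the text without any new file showing up in the directory listing.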
Re:FS support for metadata (Score:2)
(Just to point out: hidden.txt doesn't actually show in the filesystem anywhere. You can name it whatever you want too.)
NTFS streams (Score:3, Informative)
If you move the file around the NTFS drive, or from one NTFS drive to another, then yes, the metadata goes with it. If you move it to a FAT volume though, the metadata is lost forever. Not a huge deal as NTFS is getting more and more users nowadays.
XP uses these metadata streams to some degree, actually. Some of the things in the properties page for a file are actually NTFS streams.
Longhorn will make more extensive
Re:NTFS streams (Score:2)
Ditto the viruses.
Re:NTFS streams (Score:2)
Indeed [symantec.com].
That one's pretty harmless and easy to spot since it's a proof of concept, but it's nevertheless a scary thought to have viruses that could hide themselves in pseudo-files that are not visible in any way if you don't know what to search for, and even then are only enumerable by weird, totally unrelated and/or undocumented functions...
Re:FS support for metadata (Score:2)
Streams don't look too hard to deal with; it was just an ignored feature, like Windows Scripting: few paid attention until it was exploited with a virus.
Re:FS support for metadata (Score:2)
Software on Classic Mac OS did this years ago with the comment field--it was mighty handy.
Re:FS support for metadata (Score:3, Informative)
let's keep the Meta data simple... (Score:4, Insightful)
What
Where
When
Why
and possibly How...
Re:let's keep the Meta data simple... (Score:2)
Re:let's keep the Meta data simple... (Score:2)
If you think about it, the S-P-O relation from RDF is the thing that actually allows interoperability between different [namespaced] properties... since it's clear into which role all extension goes -- as new Properties.
Re:let's keep the Meta data simple... (Score:2)
DublinCore covers a lot of the digital artifact information [title, authors, publisher, &c.]. "Where" is WGS84, and "when" is actually covered pretty well by W6, if only because it's not clear what "when" is referring to.
If anything, I'd say W6 is really useful as a set of stakes in the ground for being super-classes
Spotlight (Score:2, Interesting)
Re:Spotlight (Score:3, Informative)
The search technology in Spotlight probably is inspired by live query from BeOS but first appeared at Apple in iTunes and later Preview for Panther.
Many former Be Inc. employees work at Apple now and some had worked at Apple before joining Be.
Re:Spotlight (Score:2)
It might be useful, however, to have an interface in the Finder to access/edit this metadata in Tiger.
Currently, there is a way to tag items with the comments field accessed from Get Info. The rest of the metadata is created in application. Steve Jobs touted Spotlight as working with curr
Can't wait.. (Score:3, Funny)
Re:Can't wait.. (Score:2)
It would be good for doing things like grepping, but I wonder if a system-wide SQ
Re:Can't wait.. (Score:2)
Re:Can't wait.. (Score:2)
Re:Can't wait.. (Score:2)
Or as another example, for a "various artists" album you could have the songs available as
Re:Can't wait.. (Score:2)
Re:Can't wait.. (Score:2)
Re:Can't wait.. (Score:2)
How I learned it:
Parents who use graphs have kids who use graphs.
Re:Can't wait.. (Score:2)
Trees [wikipedia.org] are graphs. A tree is a connected acyclic simple graph (i.e. any two vertices are connected by exactly one path).
What mrchaotica has done is add edges to the tree so that it has more than one path between some vertices. This makes the tree into just a graph.
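The definition above is easy to check mechanically. Here is a minimal Python sketch (the folder names are invented) that tests whether an undirected simple graph is a tree, using the fact that a connected graph on n vertices with exactly n - 1 edges has no cycles:

```python
def is_tree(vertices, edges):
    """True if the undirected simple graph is connected and acyclic."""
    vertices = list(vertices)
    if not vertices:
        return False
    # A tree on n vertices has exactly n - 1 edges.
    if len(edges) != len(vertices) - 1:
        return False
    neighbours = {v: set() for v in vertices}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # With n - 1 edges, being connected is enough to rule out cycles.
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(vertices)


folders = ["music", "artist", "album", "song"]
tree_edges = [("music", "artist"), ("artist", "album"), ("album", "song")]

print(is_tree(folders, tree_edges))                        # plain hierarchy
print(is_tree(folders, tree_edges + [("music", "song")]))  # one extra edge added
```

Adding the extra edge gives a second path between "music" and "song", so the structure stops being a tree and is just a graph.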
Re:Can't wait.. (Score:2)
I guess it's a common enough idea, I've been planning to write some scripts to do it for ages, but they have to cope with the crappy organisation in my music folder at the moment. Naming consistency is the key to making it work successfully. I was thinking of having one big flat folder with album sub-folders, each containing the artist / album name, then h
Re:Can't wait.. (Score:2)
Re:Can't wait.. (Score:2)
The well organized file tree is best assuming that you use the exact same tree to store the file as to access the file. Problems are that well organized doesn't come cheap and that the optimum tree structure changes over time. Add to that the fact that you really want the ability to recover information from files based on entirely different criteria than those used to initially store the file. The cloud is not a
discussions about winfs and rdf (Score:3, Interesting)
Haystack and Metadata efforts (Score:5, Interesting)
I have to say, their ideas are intriguing, but after using it... I think the big shortcoming is that it's tough to come up with a generalized user interface for manipulating any data thrown at it. Haystack tries at this, and I think, fails at providing any kind of cues or context that tell you what you are dealing with. In Haystack, every task and piece of information you deal with looks very much like every other piece of data because, as a design choice, Haystack gives every piece of data the same rank as every other piece of data.
Having different applications for different types of data usually makes sense, if only to limit the number of options presented to the user so they can make an intelligent decision about what action they want to perform. See this article on Slashdot about how users need limited choices [slashdot.org], since too many options make decision-making too difficult psychologically.
Inevitably, discussions around RDF and metadata always devolve into hand-wavy discussions on how the computer will be able to "magically" do smart things based on the metadata. But it really isn't magic and it isn't automatic at all. Equivalencies and mappings have to be created by humans along with the rules about what to do.
RDF uses many concepts from AI research. Anybody who has read about this branch of computer science knows that the discipline has pretty much given up on creating AI in the 'sci-fi' sense as an impractical dream. That's what makes the Loebner prize [loebner.net] so controversial. I don't expect that computers will be intelligent enough to relieve users of too much of the burden in assigning metadata.
RDF is a promising approach, but if you read the article, it makes a lot of assumptions about what needs to happen to make the benefits real. Among them are establishing standards for what metadata fields apply to different types of objects: photos, people, music, etc. That kind of standardization won't happen overnight, if at all.
The computer also needs to know what to do when it encounters that kind of data. The article mentions MIME and browsers and, in effect, says the browser can make a rational decision even if it hasn't seen a particular MIME type before. That isn't really true.. you have to install a plugin that tells the browser what to do, or have a registry that someone has put together where the browser can install the right plugin at the right time.
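The point about registries can be shown with a toy dispatcher in Python. The handler functions and the `HANDLERS` table are hypothetical; only `mimetypes.guess_type` is a real standard-library call, and it guesses from the file extension alone.

```python
import mimetypes


# Hypothetical handlers: someone has to write one for each type they care about.
def show_image(path):
    return f"displaying image {path}"


def show_text(path):
    return f"displaying text {path}"


# Hypothetical registry mapping MIME type to handler.
HANDLERS = {
    "image/jpeg": show_image,
    "text/plain": show_text,
}


def open_file(path):
    """Dispatch on the guessed MIME type; unknown types get no magic."""
    mime, _encoding = mimetypes.guess_type(path)
    handler = HANDLERS.get(mime)
    if handler is None:
        return f"no handler registered for {mime!r}: {path}"
    return handler(path)


print(open_file("holiday.jpg"))
print(open_file("notes.txt"))
print(open_file("model.xyz123"))
```

Knowing the type label is the easy part; the behaviour only exists because a person put an entry in the table, and a type nobody registered falls straight through.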
That said, KDE's unification of contact information and passwords does show some of the promise of metadata efforts. And Apple's Spotlight looks like a good solution as far as it goes. I guess I'm just trying to make the point that the magic of metadata needs to be taken with a fairly large hunk of salt.
Separate the apps, not the data. (Score:2)
I agree wholeheartedly that unifying desktop applications into one nebulous interface isn't a very useful way to give users access to their data. Mail clients make good mail clients, but they make lousy photo gallery browsers.
That said, what I do wish we'd see more of is an effort for dif
Many community websites don't permit RDF (Score:3, Interesting)
But when I tried to publish one article at Kuro5hin, the RDF code, which took the form of HTML comments, was displayed literally in the visible body of my article. That is, all the tags had been turned into entities so the tags appeared literally in the rendered text.
I think Kuro5hin's Scoop content management system doesn't permit HTML comments. Maybe it's not trying to suppress comments, but it didn't occur to Scoop's developers to allow them.
RDF on the web would likely be much more popular if one could count on publication sites allowing it in the submitted markup.
Another problem I had is that Creative Commons' recommended way to apply a license to a web page is not permitted by any of the community sites I frequent. CC-licensed web pages usually have a small banner that links to the license text. But for obvious reasons, sites like Slashdot and Kuro5hin don't permit images in article or comment submissions.
The result is that, even for the copies of my articles on my own website [goingware.com], I use neither RDF nor the CC banner, because I want to make it easy for others to copy my CC-licensed articles to sites that don't permit RDF or graphics.
The way I apply the license is the much-less-cool method recommended for plain text files. I have the following text appear in the body of my articles:
Re:Many community websites don't permit RDF (Score:3, Informative)
Hey thanks for the tip (Score:2)
Isn't it the same problem? (Score:4, Insightful)
But that's the problem! If it's not fun to organize items into folders, how is it any more fun to add metadata to a file? I'm not talking about text files. Text files are easy, because you can pull the metadata out of them automatically (in fact, you can do this now with search tools). I'm talking about files that have to be explicitly tagged with metadata, like pictures. How is adding metadata to each picture file to categorize your vacation pictures any less laborious than placing the vacation pictures into their own directory?
That's the problem as I see it. You still end up being a filing clerk! If people don't even organize their folders now, are people going to use metadata when it's available? Will improved search capabilities make users want to be clerks?
In a nutshell, isn't it the same problem?
Re:Isn't it the same problem? (Score:3, Insightful)
When I was a kid and would ask aloud where something was, my mum would say, "Look where you put it." It annoyed me to no end, of course, but years later I find myself "putting things where they belong" and emptying my mind of everything else, much like putting phone numbers in a phone book so one doesn't have to clutter up one's mind remembering any of them.
My own opinion is that there is no substitute for "putting things in folders." Boring, but true. Regular expressions and databases can go a long
Re:Isn't it the same problem? (Score:2)
It isn't. The file names are metadata. Links and Symlinks let you have multiple "metadata" entries. If directories represent categories, then you can link a picture into as many categories as applicable.
In terms of power, metadata support is equivalent to support for links. In fact, metadata could also be encoded into long file names - but th
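Here is what "directories as categories, links as tags" could look like as a Python sketch. The function name, the category names and the directory layout are my own invention, and it assumes a system where ordinary users can create symlinks (on Windows that may need extra privileges).

```python
import os
import tempfile
from pathlib import Path


def categorize(picture, library, categories):
    """Link `picture` into one directory per category under `library`."""
    picture = Path(picture).resolve()
    links = []
    for category in categories:
        folder = Path(library) / category
        folder.mkdir(parents=True, exist_ok=True)
        link = folder / picture.name
        # Skip the link if it is already there (even if it is a broken one).
        if not (link.is_symlink() or link.exists()):
            os.symlink(picture, link)
        links.append(link)
    return links


with tempfile.TemporaryDirectory() as tmp:
    pic = Path(tmp) / "IMG_0001.jpg"
    pic.write_bytes(b"fake image data")
    for link in categorize(pic, Path(tmp) / "by-category", ["vacation", "2002", "hawaii"]):
        print(link.relative_to(tmp), "->", os.readlink(link))
```

The picture is stored once and shows up under every category, which is the sense in which links can stand in for tags. What you don't get this way is a value attached to the tag, only membership.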
Creative Commons & Desktop Metadata (Score:2)
Watching the XML kiddies reinvent the wheel (Score:4, Informative)
Knowledge representation via "is-a" links has been tried, and it breaks down rather quickly. Read "Artificial Intelligence meets Natural Stupidity", by Drew McDermott, for a 20 year old critique of this concept. It's overkill for searching, and not powerful enough for reliable automated question answering.
The Cyc debacle [cyc.com] illustrates how much work you have to put into tagging to get very little out. After twenty years of that money sink, it's still useless.
Re:Watching the XML kiddies reinvent the wheel (Score:2, Interesting)
Have a look at N3 [w3.org] or N-Triples [w3.org] for starters.
Re:Watching the XML kiddies reinvent the wheel (Score:2)
And after five years (and yes a lot of cash), the Gene Ontology is an incredibly useful tool for biologists.
It's not the answer to everything, but it makes some things easier. This is enough.
Phil
Re:Watching the XML kiddies reinvent the wheel (Score:2)
They'd loaded it up with information about the MIT/Cambridge area, and information about the Middle East. Suggested queries were things like "Who is the king of Jordan", and "Is MIT in Cambridge?". So I tried queries like "Who is the king of Israel", which returned the name of the premier of Israel.
Inference was broken. I asked "Is MIT in Cambridge" - Yes. "Is Cambridge in Massachusetts?" - Yes. "Is MIT in Massachusett
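The three queries above form a transitive chain. As a point of comparison only (this is a toy, and says nothing about how Cyc is implemented), following "is in" links step by step takes a few lines of Python. The fact table is made up to match the shape of the queries.

```python
# Made-up facts in the shape of the queries above: thing -> place it is directly in.
located_in = {
    "MIT": "Cambridge",
    "Cambridge": "Massachusetts",
}


def is_in(thing, place, facts):
    """Follow 'located in' links step by step; True if `place` is reached."""
    seen = set()
    current = thing
    # `seen` guards against looping forever on circular facts.
    while current in facts and current not in seen:
        seen.add(current)
        current = facts[current]
        if current == place:
            return True
    return False


print(is_in("MIT", "Cambridge", located_in))
print(is_in("Cambridge", "Massachusetts", located_in))
print(is_in("MIT", "Massachusetts", located_in))
```

This only works because each thing has exactly one direct container here; a real knowledge base has to cope with many relations and many parents, which is where the effort goes.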
RDF (and OWL) in Pike (Score:4, Interesting)
I noticed the article made no mention of Pike (also the name of a fish - see language logo). Pike's a fine C-like scripting language ...that I know extremely poorly myself, but anyway..
From Pike's official homepage [ida.liu.se] (at the University of Linköping, Sweden):
Worth downloading [ida.liu.se] and checking out for other reasons [ida.liu.se] than "just" RDF & OWL [w3.org]. Free software, available under LGPL, GPL, and MPL (Mozilla Public License).
Look also at XMP (Score:2)
When looking into metadata, people should probably be sure to check out XMP [adobe.com]
It's from Adobe, and whereas RDF just says how to format metadata, XMP addresses what to include in your RDF, and how to place it into different types of files. They have free libraries, but it's simple enough to follow even with your own code. And... given that it's how all Adobe products are doing metadata, at least in the publishing world it will probably stay something to pay attention to.
Creative Commons has addressed this [creativecommons.org],
RDF (Score:2, Funny)
Re:RDF (Score:2)
Another format.. (Score:2, Insightful)
Good, yet another format to use/suffer!
No matter how good those formats are (XML/RDF/etc.), they all fail at the simplicity norm, the KISS principle.
In the example of the article, by not using a simple text-oriented format they unnecessarily complicate access to these values by any program, and that leads to the second point.
The computational cost involved in parsing / validating all those formats; the day that our CPUs can process hundreds or thousands of simultaneous parsings without a noticeable i
XML?!?!? (Score:2)
Forgive my zeal, I just really hate the XML for everything mentality.
Re:XML?!?!? *sigh* (Score:2)
Secondly, please explain how the implicitly-described files in your NTFS streams can be seamlessly shared over the Web in a composable way.
The point of RDF on the desktop is that it does statement-level meta-data very well, and is Web-integrated.
Re:XML?!?!? *sigh* (Score:2)
As for sharing metadata over the web (I am not talking about NTFS here, because until today I didn't even know it supported extended attributes), I think HTTP headers perfectly fit this purpose - they are metadata, after all. Just encode every attribute in an HTTP header.
Besides, the main use I see for metadata is to improve orga
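A sketch of the "one attribute per HTTP header" idea in Python, to see what it would and would not carry. The `X-Meta-` prefix and both function names are hypothetical, not any registered convention.

```python
PREFIX = "X-Meta-"  # hypothetical header prefix, not a registered standard


def to_headers(metadata):
    """Encode a flat metadata dict as (header name, header value) pairs."""
    return [(PREFIX + key, str(value)) for key, value in metadata.items()]


def from_headers(headers):
    """Recover metadata from header pairs; header names are case-insensitive."""
    metadata = {}
    for name, value in headers:
        if name.lower().startswith(PREFIX.lower()):
            metadata[name[len(PREFIX):].lower()] = value
    return metadata


headers = to_headers({"author": "someone", "year": 2004})
print(headers)
print(from_headers(headers + [("Content-Type", "image/jpeg")]))
```

Everything comes back as a flat string, so this handles simple attribute/value pairs about the file being served. Statements about other things, or values with structure, would need some extra encoding layered on top.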
Example of missing metadata (Score:2)
Metadata with clipart & Inkscape (Score:2)
There's been work on adding Dublin Core metadata support to Inkscape, for its next release [inkscape.org].
The need for the metadata support is entirely practical in this case: the Open Clip Art Library [openclipart.org] requires all SVG submissions have proper metadata embedded, to ensure licensing and authorship correctness. Also, there is an SVG Clip Art Browser [openclipart.org] that uses the metadata info for its display.
One interesting observation that's come up recently and is being discussed on the lists [inkscape.org] is what happens when you embed several
DEVONthink for the Mac (Score:2)
It text-indexes all supported Mac document files (Web, RTF, text, PDF, etc.), and can store anything (links, movies, PDFs, whatever). You can then do very fast searches.
Have a look.