Snap Shots

So I’ve had this long-running thought-thread about what we “mean” when we make certain kinds of links in a web page. In some cases, you are quite specifically trying to send someone to a destination. However, there’s another sense in which your intention is more to clarify/elaborate/disambiguate the linked text. For example, if you are writing about Jazz, you might want to make it clear which George Lewis you are writing about: the trombone player and educator or the hot jazz clarinetist. (For that matter, the Wikipedia disambiguation page for George Lewis lists those two and six more!)
Now, individual humans don’t have simple public universal identifiers (yet!) but some things do. Books are a very obvious example. Publishers have been assigning ISBN numbers to books for more than forty years, and this is done even by small publishers today. Often when someone links to a book at Amazon, they are doing it more to give you a clear idea of which book they are talking about, even though they are not particularly trying to get you to use Amazon.
To stay specific, imagine that I’m writing about a book I’ve been reading lately, Sweetness and Power. I could link to its Amazon page. But maybe you prefer to shop at Powell’s. (They do have it 2 bucks cheaper.) Or maybe you just want to see its LibraryThing page.
Or maybe you want to see if it’s available at your local library. If you’re a real geek, perhaps you’ve installed Jon Udell’s remarkable Library Lookup bookmarklet, but that only works once you’ve gone to the destination page. (This presumes you live in a more technologically enlightened library district than the Chicago Public Library’s, whose database is fundamentally at odds with any kind of citizen mashup cleverness.) You would prefer that I simply provide the ISBN alongside the book (or rather, in markup tags around the book), but there’s no commonly used way to do that.
There is a very old specification for this kind of thing called URN, which is in fact related to URL and URI, which you’re more likely to have heard of. (The W3C tries to explain this family of ideas, but if you’re not hardcore, you might be better off with Wikipedia.) In fact, a specification for URNs describing ISBNs was published five and a half years ago. Nevertheless, URNs have not really taken off, no doubt largely because of the bootstrapping problem: if no browser does anything smart with URNs, then what motivation do authors have for using a URN instead of a URL? Browsers know what to do with URLs.
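To make the ISBN-as-URN idea a bit more concrete, here is a rough Python sketch of how a page author's tooling might turn an ISBN-10 into a `urn:isbn:` identifier, checking the ISBN's check digit first. The function names are my own invention, and the ISBN below is just a sample with a valid check digit, not a reference to any book mentioned here.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Return True if the string is a well-formed ISBN-10 with a correct check digit."""
    # Hyphens and spaces are only for readability; drop them before checking.
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char in "0123456789":
            value = int(char)
        elif char == "X" and position == 9:
            value = 10  # "X" stands for ten, and is only allowed as the check digit
        else:
            return False
        # Weights run from 10 (first character) down to 1 (check digit).
        total += (10 - position) * value
    return total % 11 == 0


def isbn_to_urn(isbn: str) -> str:
    """Build a urn:isbn: identifier, refusing ISBNs that fail the checksum."""
    if not isbn10_is_valid(isbn):
        raise ValueError(f"not a valid ISBN-10: {isbn!r}")
    return "urn:isbn:" + isbn.strip()


if __name__ == "__main__":
    print(isbn_to_urn("0-395-36341-1"))
```

The checksum step is there because a single mistyped digit would otherwise produce an identifier that looks fine but points at nothing, or at the wrong book.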
So one group of semantic web explorers believes that the way forward is to let you spend relatively little effort changing the way you mark up pages while providing a lot of added value. The home base for this idea is microformats.org. They’ve done a lot more than I can fairly summarize here, but it feels like their developments are still largely theoretical. Furthermore, it doesn’t appear as though they have really tackled the question of URNs or unique identifiers for arbitrary things. What they have done is interesting, so forgive me for a brief sidetrack.
[Screenshots: tails1.png (left), tails2_large.png (right)]
There’s a Firefox Extension called “Tails” which demonstrates some of the promise. In the picture to the left, you’ll see the green tails logo and the tooltip, “1 object(s) found on this page.”
If no objects were found, the logo would be grey. If you click on the green logo, a window will pop up showing more information about what was found. As you can see from the image on the right, the pop-up is not very well integrated into the page, and if you look at it more closely, it’s pretty abstract and generic. Furthermore, since it lives in the status bar, it’s easy to miss the Tails logo turning green. No disrespect to the author, but it’s not the kind of thing that is going to appeal to the masses.
[Screenshot: wp_nav_popup.png]
For a more inspiring example, consider a neat feature I recently found on Wikipedia where you can have summary pop-ups for intra-Wikipedia links. That’s more like it. The things pop up accidentally once in a while when you’re browsing Wikipedia. I don’t find it happening enough to be annoying, and as a consequence, I remember that they are available and find myself using the popups once in a while. But we still haven’t really addressed the question of uniquely identifying things outside of the context of a specific web site. (One might argue that Wikipedia URLs could be de facto IDs, but they are too much at risk of changing for that.)
So anyway, in the course of checking some of these things out, I stumbled across Snap Shots. You may have already stumbled across them here, because I plugged in something which enables them on my site. You’re probably seeing popups over the URLs on this very page. They have even made a system of customized snap shots that provide different views for certain destination links, such as Wikipedia or YouTube. (I’m particularly interested in the YouTube ones because I didn’t like the pageload time when I embedded the YouTube videos in my recent JazzFest post.)
The caveat here is that these are provided by a commercial concern, Snap.com, an upstart startup search company. Then again, part of what I stumbled across in looking at Microformats is a serious discussion of the intellectual property risks in that project as they have documented things so far. (To their credit, members of the Microformats (or µF for short) community have acknowledged the need for clarification and have committed to responding within the next two weeks.)
In any case, it seems that a search company has some motivation for providing tools which encourage people to mark up their pages in ways that make it easier for the company to provide good search results. If they extended this Snap Shots idea to embrace URNs or something like them, they could also be using all those URNs in pages to help disambiguate searches. Depending on how evil they are, they might auction off snapshot rights so that Amazon got dibs on all ISBNs. I hope they don’t get awarded a patent on the idea of running pop-ups like this.
In the meantime, there’s still the bootstrap problem of getting people to use URNs. You might be able to do it for some kinds of things by analyzing the URL itself, although that would be fragile. Alternatively, perhaps we could cook up a microformat-style convention for using HTML class names or other standard markup to “sneak it in.” But we’ll have to save that for a future post…
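Just to show what those two ideas might look like in practice, here is a rough Python sketch. Everything in it is an assumption on my part: that some bookseller URLs carry an ISBN-10 as a path segment (the example.com URLs are made up), and that a page might mark a book with a hypothetical `class="isbn"` plus a `title` attribute holding the number. Neither is an established convention.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Ten characters: nine digits, then a digit or "X" as the check character.
ISBN10_SHAPE = re.compile(r"[0-9]{9}[0-9Xx]")


def isbn10_checksum_ok(candidate: str) -> bool:
    """Weighted sum (weights 10 down to 1) must be divisible by 11."""
    values = [10 if ch in "Xx" else int(ch) for ch in candidate]
    return sum((10 - i) * v for i, v in enumerate(values)) % 11 == 0


def guess_isbn10_from_url(url: str):
    """Return the first path segment that looks like a valid ISBN-10, else None.

    This is the fragile approach: it only works if the site happens to
    put the ISBN in the URL path, which is an assumption, not a rule.
    """
    for segment in urlparse(url).path.split("/"):
        if ISBN10_SHAPE.fullmatch(segment) and isbn10_checksum_ok(segment):
            return segment.upper()
    return None


class IsbnClassCollector(HTMLParser):
    """Collect title attributes from elements carrying the hypothetical 'isbn' class."""

    def __init__(self):
        super().__init__()
        self.isbns = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "isbn" in classes and attributes.get("title"):
            self.isbns.append(attributes["title"])


if __name__ == "__main__":
    print(guess_isbn10_from_url("https://books.example.com/dp/0395363411"))
    collector = IsbnClassCollector()
    collector.feed('<p>Reading <span class="isbn" title="0-395-36341-1">a book</span>.</p>')
    print(collector.isbns)
```

The checksum test in the URL version is what keeps it from grabbing any random ten-digit number in a path, but it still can't tell you whether the site meant that segment to be an ISBN at all, which is why explicit markup seems like the sturdier route.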