Do Tags Matter?

Monday April 11th 2005, 8:42 pm
Filed under: Semantic Web
Posted By: Matt

Tim Bray is still wondering about tags. In reference to Technorati Tags, he asks:

Are tags useful? Are there any questions you want to ask, or jobs you want to do, where tags are part of the solution, and clearly work better than old-fashioned search? I really want to believe that tagging is big, a game-changer, but the longer I go on asking this question and not getting an answer, the more nervous I get.

Well, we can’t have that, so let me take a stab at it, keeping in mind that what Tim is really asking is whether tag-based searches have the potential to give better results than keyword-based queries.

Do tags matter? Yes. It’s true that the utility of the Technorati Tags feature is still very limited, as I’ve discussed in the past. It takes a certain leap of faith to see where this is leading. At the same time, there is a clear advantage to the tag approach, which is that it involves a conscious effort on the part of content authors to help you determine whether their material is of interest to you. Any time you can set up a synergistic bond between human provider and human consumer, you’re going to get better results than you can through purely automated processing like massive full-text indexes. Blog authors want more readers and readers want a remedy for information overload; since both of these itches can be scratched by tagging, the necessary synergies are clearly present.

An example: as an enthusiastic developer of Firefox extensions (well, extension really, but I do plan to create many more), I’m interested in posts about Firefox. But forget about searching for the word “firefox” using a full-text engine. To me it looks like at most 10% of the hits are actually about Firefox as opposed to irrelevant posts that use the term tangentially. Now compare this to the corresponding tag-based search. Still not breathtaking, but a whole heck of a lot better.

And let’s not forget that full-text engines have been mature technology for decades, while tag-based systems are in their infancy. There are any number of things that Technorati could (and doubtless will) do to make their results better. For starters, they could filter based on language. All those Japanese posts (and there are lots of ‘em) are just pretty little pictures dancing on the page for us ignorant western types. Hardly edifying. And, as I’ve mentioned, it would be a lot easier to wade through the RSS feeds if they’d list the blog name and Cosmos statistics for each entry.

In fact, what I’d really like to see is a system like the one used for popular tags on del.icio.us. Instead of using the absolute number of bookmarks to determine what gets on the much-coveted popular page, they use the rate of growth (i.e. the first derivative). So instead of seeing the Slashdots and Boing Boings and other stuff-I-already-knew-about entrenched at the top of the list, you see what’s hot right now. The only problem is that the list is skewed to what the del.icio.us crowd is digging. If Technorati could duplicate this for the whole blogosphere, choosing a reasonable cut-off threshold so I only get links to a manageable number of posts, they’d have a whopping hit on their hands.
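
To make the rate-of-growth idea concrete, here is a minimal Python sketch. It is not how del.icio.us actually does it (I don't know their implementation); it simply assumes we have two snapshots of bookmark counts taken some interval apart. The function name `rank_by_growth` and the `min_new` and `top_n` parameters are made up for illustration, with `min_new` playing the role of the cut-off threshold mentioned above.

```python
def rank_by_growth(previous, current, min_new=5, top_n=10):
    """Rank items by bookmarks gained between two snapshots.

    previous, current: dicts mapping an item (e.g. a URL) to its total
    bookmark count at the earlier and later snapshot (hypothetical inputs).
    min_new: cut-off threshold; items that gained fewer are dropped.
    top_n: maximum number of items to return.
    """
    growth = {}
    for item, count in current.items():
        # Discrete stand-in for the first derivative: change per interval.
        gained = count - previous.get(item, 0)
        if gained >= min_new:
            growth[item] = gained
    # Biggest gain first; ties broken alphabetically for stable output.
    return sorted(growth, key=lambda item: (-growth[item], item))[:top_n]


if __name__ == "__main__":
    earlier = {"slashdot": 10000, "boingboing": 8000, "newpost": 2}
    later = {"slashdot": 10003, "boingboing": 8001, "newpost": 60}
    print(rank_by_growth(earlier, later))
```

With these made-up numbers the entrenched sites drop off the list even though their absolute totals dwarf the newcomer's, because only the change between snapshots counts.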


2 Comments »

  1. I think the value of tags is, as you say, up for debate. We’ll see. They are useful in a short range, but are also going to be very game-able (as you know, add [apple] to one and traffic increases disproportionately…). They may only really become useful when cross-referenced with other search results, user profile and other search metadata. It’s all about the metadata, as Jones tells me…

    Comment by paulpod — 4/12/2005 @ 1:47 am

  2. I too have faith that tags will improve findability. Full-text search can’t “see” the metadata provided by a tag so, though it’s the prevalent technology now, I think it will lose out to tag search, if the gaming problem can be dealt with.

    Comment by Mike Harper — 4/12/2005 @ 11:50 pm
