Thursday, June 11, 2009

Semantic Metadata & Sagacious Serendipity

I sat in numerous sessions at the Gilbane conference in San Francisco last week, listening to, and preparing to speak on, search. Personally, I want to see this word retired, as it conjures up phrases like “…in vain” and “desperately seeking,” and images of poor Diogenes schlepping around Athens with his lamp and cynicism. Or, closer to home, the time I had to find 40 pairs of seamless white tube socks for a snowman project for my son’s class (during a snowstorm, no less). Since that “in vain,” “desperately seeking” experience, I only volunteer to bake brownies.

But I digress. Now don’t get me wrong; I am not one of those protectionists who are against sharing content. Personally, I think sharing is a good thing. What I am against is the word search itself. Because which of us wants to search anyway? I’d rather be finding things, like the $100 bill on the sidewalk outside the OTB around the corner from my apartment. Or the brand new earring I had lost – and found – outside my car door. Find is about that Eureka! moment, that culmination of both relief and joy that comes with discovery. Search, to me, is futile and thankless toil.

Public site search, for the most part, makes me crazy. Like the time I went to a city’s business site to look for someone and was inexplicably given people with the same name from other cities and states. Upon further digging, I “found” the person I was looking for on the site. The search tool just didn’t filter the results by the geographic location I was actually in. It just pulled people with the same name and vomited out the results. Well really, how interested are you going to be in something that was just vomited at you?

Physical-world architects have long known that site satisfaction and return visits are highly correlated. They know that people explore their environments encumbered by whatever stresses are in their lives: Are they late for a meeting? In dire need of a restroom? Do they have specific destinations in mind – or are they out for a stroll and will respond to whatever catches their fancy? The physical world uses many different types of sensory cues to guide people.

The digital world is more limited when it comes to sensory cues -- but there is a way to create a framework to allow people greater site satisfaction and discovery. The key is metadata. As my friend Ali Rahman says, “metadata provide a big picture and a detailed view of your information.” Now, we are not talking about generic, run-of-the-mill metadata, the kind that records the file type, the dates created and modified, the format and so on. According to Kent State's College of Library & Information Science, that type is called administrative metadata.

No, the type of metadata I am talking about is more Xtreme, if you will. The academics at Kent State call it descriptive metadata, while the folks at Nstein prefer to call it semantic metadata (semdata??). It is metadata that is generated using a multi-faceted approach of computational and linguistic analysis. It not only extracts meaning from documents – but also embeds the synonyms, summary, categories, even the tone, in order to create a linguistic fingerprint. This linguistic fingerprint can then be matched against any other linguistic fingerprint – to find like pieces of content.
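To make the fingerprint idea a little more concrete, here is a minimal sketch in Python. It is not how Nstein or anyone else actually does it; it is a toy under stated assumptions. The synonym table, the `Fingerprint` class, and the equal weighting of terms and categories are all hypothetical, and a real system would derive synonyms and categories from linguistic analysis rather than a hand-written table. The sketch only shows the principle: once synonyms are embedded, two pieces of content can match even when they share no words.

```python
from dataclasses import dataclass

# Hypothetical, hand-written synonym table (an assumption for illustration).
SYNONYMS = {
    "car": {"automobile", "vehicle"},
    "automobile": {"car", "vehicle"},
    "buy": {"purchase"},
    "purchase": {"buy"},
}


@dataclass(frozen=True)
class Fingerprint:
    """Toy 'linguistic fingerprint': expanded terms plus assigned categories."""
    terms: frozenset
    categories: frozenset


def make_fingerprint(text, categories):
    """Lowercase the words, strip basic punctuation, and embed known synonyms."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    words.discard("")
    expanded = set(words)
    for word in words:
        expanded |= SYNONYMS.get(word, set())
    return Fingerprint(frozenset(expanded), frozenset(categories))


def jaccard(a, b):
    """Share of items two sets have in common (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(fp1, fp2):
    """Average of term overlap and category overlap (weights are arbitrary)."""
    return 0.5 * jaccard(fp1.terms, fp2.terms) + 0.5 * jaccard(
        fp1.categories, fp2.categories
    )


if __name__ == "__main__":
    ad = make_fingerprint("Buy a car.", {"autos"})
    review = make_fingerprint("Purchase an automobile!", {"autos"})
    recipe = make_fingerprint("Bake brownies", {"food"})
    # The ad and the review share no literal words, yet score as more alike
    # than the ad and the recipe because of the embedded synonyms.
    print(similarity(ad, review) > similarity(ad, recipe))
```

The tone and summary mentioned above are left out to keep the sketch short; they would simply be more fields on the fingerprint.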

Having this metadata means you can create interesting ways to guide people through the site. Go back to the shopping mall metaphor: The mall maps group stores by category – such as women’s shoes -- look at the map, check where you are -- and voila! you are on your way. In the digital world, commerce sites do a super job faceting information so that a person can be guided right to the shoe they want to buy, allowing people to search by Color, Brand, Size, even Heel Height!
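For the curious, faceting over structured data can be sketched in a few lines of Python. Everything here is made up for illustration: the shoe records, the field names such as `heel_height_in`, and the helper functions `facet_filter` and `facet_counts` are assumptions, not any particular commerce site's implementation.

```python
from collections import Counter

# Hypothetical catalog; each record has the same normalized fields.
SHOES = [
    {"brand": "Acme", "color": "red", "size": 8, "heel_height_in": 3.0},
    {"brand": "Acme", "color": "black", "size": 8, "heel_height_in": 1.5},
    {"brand": "Zephyr", "color": "red", "size": 7, "heel_height_in": 3.0},
    {"brand": "Zephyr", "color": "red", "size": 8, "heel_height_in": 0.5},
]


def facet_filter(items, **facets):
    """Keep only the items whose fields match every requested facet value."""
    return [
        item
        for item in items
        if all(item.get(field) == value for field, value in facets.items())
    ]


def facet_counts(items, field):
    """Count how many items fall under each value of one facet."""
    return dict(Counter(item[field] for item in items))


if __name__ == "__main__":
    # Narrow the catalog the way a shopper clicks through facets.
    for shoe in facet_filter(SHOES, color="red", size=8):
        print(shoe["brand"], shoe["heel_height_in"])
    # Counts like these are what a site shows next to each facet value.
    print(facet_counts(SHOES, "color"))
```

This only works because every record carries the same labeled fields, which is why the shoe store has it so much easier than a pile of articles and videos.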

But commerce sites are easy, since their data is fairly structured: the information is normalized and sits in fields with headers that say “heel height.” Prose and rich media are considered unstructured; they rely on synonyms and inferences and, as such, are much harder to correlate. Semantically analyzing and enriching content allows sites to marry related pieces of content – even if they do not explicitly use the same words! This allows content to be packaged together so people can find what they are looking for without breaking a sweat.

Serendipity is often referred to as an accidental discovery -- but in science, serendipity is linked with sagacity, which presupposes a framework that facilitates discovery. The use of semantic metadata provides a matrix of "triggers" that lets people have more meaningful explorations -- leading to those highly coveted finds.

Take it from Diogenes: search just doesn’t guarantee that you will locate what you are looking for.
