After much planning and beer consumption, InfoCamp PDX is a reality. On February 4, 2012, InfoGeeks from around the Pacific Northwest will converge to talk about information: what it is and how we find it, use it, structure it, design it. Really, whatever we want, because it’s an unconference.

Come join us! Registration’s open now. Bring your burning ideas for a session, or just come and join the conversation. Want to sponsor us? Send us an email at infocamppdx@gmail.com.

Hope to see you there!

Ah, December, the month of the top ten list and the year in review.

I love it; it’s like a cram session for people like me who don’t pay good attention. Come January, I’ll put my head back down and ignore what the rest of the world is doing. But for now, I’m energized as I belatedly stumble across everything that happened in 2011. (Hey, look, it’s HTML5. And is that a sparkly vampire chasing it?)

The folks at semanticweb.com have the standard “Best of the semantic web in 2011” roundup post running, and there’s a lot of good stuff there. Siri, of course, remains the golden child of the semantic app world, even if it sometimes acts like, well, a computer. And just as in 2010, much of the excitement and movement came around open data, linked data, government data and big data. All that data without clear models to structure it makes a modeler nervous, but it’s a good lesson in the need for pragmatism. If you can’t quickly develop and publish simple, general, reusable models (which mostly we can’t), people are going to move on without you.

Just as good is the site’s “Misses and missteps” article. I don’t think I’ve come across an annual post-mortem like this before, and I really appreciate everyone taking off the rose-colored glasses for a few minutes to take a critical look at the year. My favorite quote from it is this:

2011 was the year — well, the latest year — that the Semantic Web didn’t pan out.  The Semantic Web is the New AI: Technology that’s always on the verge of revolutionizing computing that never seems to deliver.  It’s a shame, but at least we’ve learned to focus on what’s practical and more likely to produce business value, semantic technologies such as text analytics are here-and-now rather than perpetually just over the horizon.

A little cynical? Maybe, but it fits my own view of the semantic world. We’re probably never going to achieve the full-blown vision of the semantic web’s original architects, and personally I’m okay with that. That vision is driving a lot of really practical, semantically influenced work that will infiltrate and improve all sorts of technologies and techniques, whether or not it results in purely semantic solutions.

Also, for historical completeness, here’s a post I wrote for SmartBear’s Software Quality Connection rounding up 2010’s semantic web highlights. I get points for posting the link within the calendar year of the year I wrote the article, right? No? Well, here’s to more regular blogging in 2012.

IA meets semantics

February 28, 2011

My worlds collide! I just caught up with the current issue of the Journal of Information Architecture, and it’s all about semantics and structured data. As an information architect who has never wireframed a website, I’m excited to see the IA community dive into the world of structured data and ontology. It’s great to see the likes of enterprise information architects and master data management practitioners add some semantic tools to their toolkits.

I'm almost afraid to click…

104 ways you’re wrong

August 5, 2010

Here’s a fun catalog of some of the many ways our cognitive processes trip us up. I’ve committed at least 37 of them today alone. How about you?

Cognitive Biases – A Visual Study Guide

Here's a short definition of an ontology that I wrote up for the website at work. There's a lot more that can be said, but I think the discussion of why ontologies are useful is of interest.
 

An ontology is a description of the entities in an area of interest, or domain, the attributes of those entities, and the relationships between them. This description is both formal, meaning it can be acted on by a computer, and human-readable.

One of the major strengths of an ontology is that it lets us organize information in terms of the problem we’re trying to solve, not the data we’re collecting. While data remains important in an ontology-based information system, it is structured according to the concepts of the domain, not the table structure of the database it’s stored in. This is important for two reasons. First, we can formalize relationships between pieces of data that would only be hinted at by foreign keys and naming conventions in a database. Second, and more importantly, it frees us to think about our problem space in terms of concepts and abstraction, not data: to take a model-driven approach instead of a data-driven one. Humans think in terms of models, not data. It is models that give meaning to data. As we deal with ever-increasing volumes of data, it is models that help us identify what’s important, organize it, hypothesize about it, and discover connections between disparate data.
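
To make the definition a little more concrete, here is a minimal sketch in Python of its three ingredients: entities, their attributes, and named relationships between them. Everything in it is an assumption made for illustration: the `Ontology` class, the `written_by` relationship, and the paper-and-author domain are hypothetical, and a real ontology would normally be written in a formal language such as OWL rather than in application code.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A thing in the domain, described by a kind and named attributes."""
    name: str
    kind: str
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    """A labeled, directed link between two entities."""
    subject: str
    predicate: str
    obj: str


class Ontology:
    """Hypothetical toy container, not a real ontology library."""

    def __init__(self):
        self.entities = {}
        self.relationships = []

    def add_entity(self, name, kind, **attributes):
        self.entities[name] = Entity(name, kind, attributes)

    def relate(self, subject, predicate, obj):
        # The relationship carries its own name, so its meaning is stated
        # outright instead of being hinted at by a foreign key.
        if subject not in self.entities or obj not in self.entities:
            raise KeyError("both ends of a relationship must be known entities")
        self.relationships.append(Relationship(subject, predicate, obj))

    def related(self, subject, predicate):
        return [r.obj for r in self.relationships
                if r.subject == subject and r.predicate == predicate]


# Made-up domain data, for illustration only.
onto = Ontology()
onto.add_entity("Paper 1", "Paper", year=2009)
onto.add_entity("Jane Doe", "Person")
onto.relate("Paper 1", "written_by", "Jane Doe")
print(onto.related("Paper 1", "written_by"))  # ['Jane Doe']
```

The point of the sketch is the shape of the question at the end: it is asked in the vocabulary of the domain ("who wrote this paper?"), not in terms of tables and join columns.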

 

Posted via email from Modelicious

Last week Google began including results from Twitter on their results page. The tweets are accessed through a timeline with a handle you can grab to scroll through results over time.

This is incredibly cool. At the same time, I can’t help noticing that while it presents a lot of information, it’s not immediately clear how to construct meaning from it.

Google talks about using the results to “’replay’ what people were saying publicly about a topic on Twitter.” That seems to describe the usage model pretty accurately: search, scroll through all results, and make of them what you will. It seems to lend itself to historical or anthropological purposes, rather than traditional search.

Here are some sample tweets returned by searching for “Obama”: This isn’t so great if you’re interested in policy, but it’s highly interesting if you’re investigating the Tea Party movement. Ditto with this result:

Up until now, if you were researching a group of people, you would search on the group’s name. With tweets, you really want to search on the topics the group publishes about. So this could change the average information consumer’s search strategies.

The Google Blog suggests this search to “relive” Shaun White’s Olympic glory. The idea of reliving it is interesting, because what’s being relived is not the actual moment, but the response of thousands of people to that moment.

(And, like everything else, it could really use semantic search to filter out stuff like this: )

To sum up: Twitter on Google is very cool. It will change the way we search, but right now not even Google knows a good way to use it. It dumps a huge amount of raw info on the searcher, and leaves it to the individual to navigate, sift, and construct meaning out of it.

But, it was only announced this week, and clever people are certainly already at work on innovative ways to build meaning out of the firehose that is the global tweetstream. A semantic search layer? Sentiment analysis? There’s a lot of possibility here.

By the time this posts, Google will probably have rolled this out worldwide. Have you tried it? What do you think?

Posted via email from Modelicious

Weirdly enough, this isn’t a rhetorical question. (That it’s not is one of the many things I love about my job.)

Lately I’ve been evaluating Proton, an upper ontology developed by SEKT, which is an EU initiative. Most of the ontologies I evaluate aren’t written by actual ontologists, which leads to a certain amount of ranting and despair on my part. Proton has been developed by a number of skilled ontologists and logicians, and it’s a pleasure to spend time with a well-thought-out model. Of course, modeling is an art, and I don’t agree with every modeling decision in Proton — but that’s okay, because my disagreements give me a lot of food for thought.

One idea that’s given me pause is whether or not notions of time and number are more abstract than other concepts in an ontology. In Proton’s model, classes generally descend from the Entity class, with the exception of a few system classes. Entity has three direct subclasses: Abstract, Happening and Object. The full hierarchy looks like this:
Proton top-level entities
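
For readers without the diagram, the top of that hierarchy can be roughly sketched as Python classes. This is only an illustration of the three subclasses named above. The `TimeNotion` class is a hypothetical stand-in, not a Proton class name, and placing it under Abstract is an assumption based on how the model is described in this post.

```python
class Entity:
    """Root of the hierarchy; Proton's system classes sit outside it."""


class Abstract(Entity):
    """Abstractions."""


class Happening(Entity):
    """Things that happen."""


class Object(Entity):
    """Things that exist."""


class TimeNotion(Abstract):
    """Hypothetical stand-in for a time concept (assumed placement)."""


def direct_subclasses(cls):
    """Names of the classes that inherit directly from cls."""
    return sorted(sub.__name__ for sub in cls.__subclasses__())


def ancestry(cls):
    """The is-a chain from cls up to the root, leaving out Python's object."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


print(direct_subclasses(Entity))  # ['Abstract', 'Happening', 'Object']
print(ancestry(TimeNotion))       # ['TimeNotion', 'Abstract', 'Entity']
```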

Thinking about it, I can’t come up with a single reason why the concept of time is more abstract than any other concept represented in an ontology. I understand that it’s modeling an abstraction — there’s no concrete thing in the world called Tuesday; it’s a concept in our calendar system. But there’s also not a concrete thing in the world represented by the class “Boston Marathon” or “US Currency” or “World Leader”. There are real instances of all those classes, but Tuesday March 9th is a real instance of the concept of Tuesday. So, is there a difference? 10 points to anyone who can explain it to me.

True Knowledge went into public beta last week. I’ve been playing around with the private beta for the better part of a year now, and there’s a lot I like about this system. So far, they haven’t received a fraction of the hype of some other knowledge bases (<cough> Wolfram|Alpha), but what they’re doing is more interesting, harder, and truly semantic.

One of the things I love about True Knowledge is that it exposes lineage. Run a query, click on the “How do we know this?” link at the bottom of the results, and you’ll see the facts and reasoning used to derive them. This visibility into the reasoning process should be standard operating procedure for any semantic application — without it, you have no way of assessing the quality of the information you’re getting. Lineage is noticeably absent from Wolfram|Alpha, which is one of my main complaints about it.
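
As a rough illustration of what exposing lineage involves, here is a minimal Python sketch in which every derived fact keeps a reference to the facts and the rule that produced it. It is not how True Knowledge works internally; the `Fact` structure, the `located_in` rule, and the source labels are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    source: str               # where an asserted fact came from
    derived_from: tuple = ()  # supporting facts; empty for asserted facts
    rule: str = ""            # rule that produced a derived fact


def infer_located_in(facts):
    """If A is in B and B is in C, conclude A is in C (one pass only)."""
    located = [f for f in facts if f.predicate == "located_in"]
    derived = []
    for a in located:
        for b in located:
            if a.obj == b.subject:
                derived.append(Fact(a.subject, "located_in", b.obj,
                                    source="inference",
                                    derived_from=(a, b),
                                    rule="located_in is transitive"))
    return derived


def explain(fact, depth=0):
    """Answer 'How do we know this?' as indented lines."""
    pad = "  " * depth
    reason = fact.rule or fact.source
    lines = [f"{pad}{fact.subject} {fact.predicate} {fact.obj} [{reason}]"]
    for support in fact.derived_from:
        lines.extend(explain(support, depth + 1))
    return lines


# Hypothetical source labels, for illustration only.
facts = [
    Fact("Portland", "located_in", "Oregon", source="user contribution"),
    Fact("Oregon", "located_in", "United States", source="reference import"),
]
derived = infer_located_in(facts)
print("\n".join(explain(derived[0])))
```

Because each conclusion carries its supporting facts with it, a reader can trace an answer back to its sources and judge them, which is the quality check described above.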

Similarly, True Knowledge lets you agree or disagree with any fact in their knowledge base. You can edit existing facts, or contribute new ones. I like this because it means (a) there’s a model they’re computing over and (b) the model is extensible. And their UI is a great example of how to painlessly elicit complex information from end users.

True Knowledge is smartly done, model-driven, and really different from any other “semantic” system I’ve demoed to date (what with actually relying on semantics and all). It’s been interesting to watch True Knowledge evolve, and my hope is that they’ll not only succeed, but become the gold standard for semantic web apps to come.

Posted via email from Modelicious

Is the semantic web a memex?

December 31, 2009

I agree with mc schraefel that the semantic web needs a better metaphor, or really any metaphor, to help people understand and embrace it. I’m just not sure the memex is the right one. It’s not a concept that’s easily recognizable by most people. And I’m not convinced that it’s an accurate metaphor.

Central to Vannevar Bush’s original description of the memex are paths of association between items, the connection made between point a and point b. While ontologies and semantic web apps let us label the relationship between two things, I’ve yet to see an application that lets you capture the path that led you to make that connection.

So, for instance, Zotero lets me say Paper 1 is related to Paper 2, but not that I followed a link to a citation in Paper 1, which led me to a Wikipedia page, which led me to Paper 2. Paper 2 and Paper 1 may have a generally meaningful relationship that any reader would recognize: a shared author, similar subject matter. Or their relationship may be meaningful only to me: there was some association I made along the path from Paper 1 to Paper 2 that may not matter to anyone else. However, that association (the dynamic path leading to the association, not the static association itself) may be a source of information or inspiration to me. Where is the system that lets me preserve it?
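
The difference between the two kinds of association is easy enough to show as data, even if building a system around it is not. The Python sketch below is purely illustrative: `Association`, `Step`, and the field names are hypothetical and are not taken from Zotero or any other tool.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    resource: str  # where I landed
    how: str       # how I got there


@dataclass
class Association:
    source: str
    target: str
    label: str
    trail: list = field(default_factory=list)  # empty means a static link

    def path(self):
        """The route from source to target, or just the two ends if no trail was kept."""
        stops = [step.resource for step in self.trail] or [self.target]
        return " -> ".join([self.source] + stops)


# What a static "related to" link records.
static = Association("Paper 1", "Paper 2", "related_to")

# The same link with the path that produced it.
with_trail = Association(
    "Paper 1", "Paper 2", "related_to",
    trail=[
        Step("Wikipedia page", "followed a citation link in Paper 1"),
        Step("Paper 2", "followed a reference on the Wikipedia page"),
    ],
)

print(static.path())      # Paper 1 -> Paper 2
print(with_trail.path())  # Paper 1 -> Wikipedia page -> Paper 2
```

The structure is trivial. The hard part is capturing the trail as it happens, across browsers, reference managers, and everything else involved in getting from one paper to the other.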

To the best of my knowledge, that system doesn’t exist yet. Really, that’s not too surprising: we’re still working on representing the relationship between two things, let alone the evolution and lineage of that relationship. There are thorny semantic and user experience questions related to the larger project, especially when working across the boundaries of information systems, as the semantic web does (or will). But it’s a worthwhile goal, and we should make sure that we get there and aren’t satisfied with representing static associations. Why? Because doing so creates rich context that starts to approximate the kind of implicit context humans generate all the time. It grounds our machine representations in human notions of time. And it facilitates that mysterious capacity humans have of sparking new ideas by juxtaposing two apparently unconnected things.

So my answer to my own question at the top of this post — and to dr. schraefel — is: not yet. But maybe someday.
