DrupalCon DC: Using Intelligent Web Services for Semantic Drupal Sites
Works best with well-written long-form prose (not well for tweets).
Puts the document into topic.
17 now, will probably go up to 300 and stop there.
semantic web compliant format: RDF
Then it goes a step further. For a subset, it takes you into the linked data world. Another emerging standard, a Tim Berners-Lee initiative.
for instance dbpedia
CIA World Factbook
Calais is a web service. Hard to demo.
When machines are talking to machines, usually takes us half-a-second to process a normal news article.
People have sent novels, it takes longer.
It would take an editor minutes at least to highlight such things.
As a publisher, right now, you could use the categorization ("Business Finance") to put it in the right place.
Calais is not a dictionary driven system.
80% of names for a South African affiliate are South African tribal names, and it mostly gets them right (knows they are names). There's no dictionary for this.
It had Barack Obama as president of Hilton hotels!!!
Has a web page about AIG - from XML, with facts about the company.
We find the right link to AIG in dbpedia
How do you create the local technologies companies sidebar for the story about local technology companies? Have an editor do it manually. Calais can do this.
How do you make money?
We don't. About enough to have our Christmas
Throwing it out in the world and having 30,000
Content interoperability. Thomson Reuters spends billions bringing in content and
We can embrace it. We can't own all the content in the world, but our stuff can work with all the content in the world.
The fact that it's not altruistic is a good thing
I'm a web director at a Danish newspaper. Your stuff is great-- but my newspaper is in Danish.
We have to roll out languages manually right now. Are doing French, with Spanish, German coming.
We will never do Danish the way we have to do it today; we are looking at an automated approach.
No plans to do
How to build a semantic web site using OpenCalais.
I founded Phase2 ("open source. open minds.")
got involved with open source early on
likes doing stupid things with bicycles and Drupal
playing around in semantic web stuff for [six months now]
I can read RDF just by looking at it.
lot of names for it:
GGG (Giant Global Graph), contextual web, web 3.0, linked data
What do we need to build a semantic site?
Drupal has great modules -- CCK, Views, ...
RDF, RDF CCK, FOAF, relations, SPARQL, SIOC, the Calais collection
auto-tagging of your nodes
What calais returns gets turned into a Calais term, which then integrates with Drupal core taxonomy, making it a taxonomy term.
all attached to nodes as taxonomy terms.
Configure which nodes Calais processes, whether terms are added automatically or just suggested, etc.
All this happens on node save.
Becomes terms in a free tagging taxonomy.
0.00 to 1.00 relevancy scale
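A minimal sketch of the tagging step described above, in Python rather than the module's own PHP. The sample entities, the scores, the function name `terms_for_node`, and the 0.25 threshold are all assumptions made up for illustration; this is not the Calais module's code. It only shows the idea: keep terms above a relevancy cutoff and file each under a vocabulary named after its entity type.

```python
from collections import defaultdict

# Hypothetical sample of what a tagging service might return for one node:
# (entity type, term name, relevancy score between 0.00 and 1.00).
SAMPLE_ENTITIES = [
    ("Company", "AIG", 0.82),
    ("City", "Washington, D.C.", 0.41),
    ("Person", "Barack Obama", 0.12),
]


def terms_for_node(entities, threshold=0.25):
    """Group entities at or above the relevancy threshold by entity type."""
    vocabularies = defaultdict(list)
    for entity_type, name, relevance in entities:
        if relevance >= threshold:
            vocabularies[entity_type].append(name)
    return dict(vocabularies)


if __name__ == "__main__":
    # Each key plays the role of a vocabulary, each value its terms.
    print(terms_for_node(SAMPLE_ENTITIES))
```

Raising the threshold trades coverage for precision: a low-scoring term is more likely to be a passing mention than the subject of the story.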
Fully integrated all of the terms calais knows about with views.
autodiscovery links-- like the RSS link that makes the icon appear.
We added: "this page is also available as RDF"
<link rel="meta" ...
RDF is ugly-- look away
"Where's the RDFa?"
It's a way of embedding the RDF data in an HTML document. Your users won't see it, but things that can read your page can
We don't have that just yet.
We don't want to reinvent the wheel, and there's a lot of RDFa work going into Drupal 7 theme layer.
"More Like This"
A collection of modules with a plug-in architecture that shows content related to your nodes; the related content can be on-site or off-site.
The collection of terms that you feel represents the essence of your node content.
You can also configure the relevancy for this.
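One way to picture "More Like This", as a rough sketch only: treat the node's high-relevancy terms as its essence and rank other nodes by how much of that essence they share. The function name `more_like_this`, the scoring rule (summed relevancy of shared terms), and the sample data are assumptions for illustration, not how the modules actually score matches.

```python
def more_like_this(node_terms, other_nodes, min_relevance=0.3, limit=5):
    """Rank other nodes by the summed relevancy of terms shared with this node.

    node_terms: dict mapping term name -> relevancy for the current node.
    other_nodes: dict mapping node title -> set of term names on that node.
    """
    # Only terms at or above the configured relevancy count as key terms.
    key_terms = {t: r for t, r in node_terms.items() if r >= min_relevance}
    scored = []
    for title, terms in other_nodes.items():
        score = sum(r for t, r in key_terms.items() if t in terms)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


if __name__ == "__main__":
    current = {"AIG": 0.8, "Insurance": 0.5, "Chicago": 0.1}
    others = {
        "Bailout update": {"AIG", "Insurance"},
        "Chicago weather": {"Chicago"},
        "Insurer earnings": {"Insurance"},
    }
    print(more_like_this(current, others))
```

Here "Chicago weather" drops out because "Chicago" falls below the relevancy cutoff for the current node.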
tells you everything about a topic.
the plugin for more like this and the plugin for geo are used here.
Show content in various contexts.
Integrated with Panels 2 to drag around the layout.
It's all about the URIs.
You put a URI
They can be dereferenced.
Washington, D.C., opencalais
owl:sameAs link goes to dbpedia page for Washington D.C., so you know
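To make the owl:sameAs idea concrete, here is a small sketch that stores triples as plain tuples and follows the sameAs link for a subject. The subject URI under example.org is a made-up placeholder, not a real Calais URI, and the helper name `same_as` is hypothetical; only the OWL predicate URI and the DBpedia resource URI are real.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Placeholder subject URI (an assumption for this example, not a Calais URI).
CITY_URI = "http://example.org/calais/city/washington-dc"

# Triples as (subject, predicate, object) tuples.
TRIPLES = [
    (CITY_URI, OWL_SAME_AS, "http://dbpedia.org/resource/Washington,_D.C."),
    (CITY_URI, "http://www.w3.org/2000/01/rdf-schema#label", "Washington, D.C."),
]


def same_as(triples, subject):
    """Return every URI the subject is declared equivalent to."""
    return [o for s, p, o in triples if s == subject and p == OWL_SAME_AS]


if __name__ == "__main__":
    print(same_as(TRIPLES, CITY_URI))
```

Because the object is itself a URI, a client can dereference it and pull in whatever the other data set knows about the same place.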
contextual geo data
The Calais geo module, built on Calais terms.
On a map, but wait, there's more: a link to DBpedia.
Or you can pick the most relevant company term and have its DBpedia information show there.
How does this all go together?
SPARQL query to get the data from the DBpedia endpoint
render it into HTML
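The two steps above (query, then render) could look roughly like this. It is a sketch under assumptions: the choice of `dbo:abstract` as the property to fetch, the `format` request parameter, and the function names are illustrative and not taken from the session. The code only builds the request URL and renders a result in the standard SPARQL JSON results shape; the actual HTTP fetch is left out.

```python
from html import escape
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_query_url(resource_uri):
    """Build a GET URL asking the endpoint for the English abstract."""
    query = (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <{resource_uri}> dbo:abstract ?abstract .\n"
        '  FILTER (lang(?abstract) = "en")\n'
        "} LIMIT 1"
    )
    # The "format" parameter is an assumption about the endpoint's options.
    params = {"query": query, "format": "application/sparql-results+json"}
    return DBPEDIA_ENDPOINT + "?" + urlencode(params)


def render_abstract(results):
    """Render the first binding of a SPARQL JSON result as an HTML snippet."""
    bindings = results["results"]["bindings"]
    if not bindings:
        return "<p>No data found.</p>"
    # Escape the text so remote data cannot inject markup into the page.
    return "<p>" + escape(bindings[0]["abstract"]["value"]) + "</p>"


if __name__ == "__main__":
    print(build_query_url("http://dbpedia.org/resource/Chicago"))
    sample = {"results": {"bindings": [{"abstract": {"value": "A city."}}]}}
    print(render_abstract(sample))
```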
Marmoset: microformats for search agents
allows search agents to h
We've packaged all these modules together into something we call Open Publish.
Drupal, an installation profile with a bunch of modules, all available on Drupal.org
plus theme and glue code
Released it last night, it's for everyone, can download it
Visit the "Linked Data Lounge" -- go right out of Ledorf room, then right again, then right again -- there are little rooms behind this room.
Topic hub configuration:
Calais Document Category Health and Industry Term Food
Industry Term Exercise
All these tools will work with content you created, or pulled in by feeds.
I can look at hot topics, what is trending on the site.
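A "hot topics" list can be as simple as counting how often each term appears across recently tagged content. The sketch below assumes that definition; the function name `hot_topics` and the sample data are hypothetical, and the real site may weight by recency or relevancy instead.

```python
from collections import Counter


def hot_topics(tagged_nodes, top_n=3):
    """Return the top_n most frequent terms as (term, count) pairs.

    tagged_nodes: iterable of term lists, one list per node.
    """
    counts = Counter(term for terms in tagged_nodes for term in terms)
    return counts.most_common(top_n)


if __name__ == "__main__":
    recent = [
        ["Food", "Exercise", "Health"],
        ["Food", "Health"],
        ["Food"],
    ]
    print(hot_topics(recent, top_n=2))
```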
Put Chicago term on the map, and it grabs the wikipedia page, population, and other info automatically.
Trying to be on the cutting edge, front of what semantic web is.
Integration with taxonomy? Yes.
Vocabularies created automatically? Yes.
Calais entity types are created as vocabularies in Drupal (that's a lot of vocabs)
Relevancy score in relation to content determines the boldness/size of terms on the suggestion page.
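A sketch of how a relevancy score might drive the look of a suggested term. The pixel range, the 0.5 cutoff for bold, and the function name `term_style` are assumptions chosen for the example, not values from the module.

```python
def term_style(relevance, min_px=12, max_px=24, bold_at=0.5):
    """Map a 0.00 to 1.00 relevancy score to an inline CSS style string."""
    # Clamp so out-of-range scores cannot produce odd sizes.
    relevance = max(0.0, min(1.0, relevance))
    size = round(min_px + relevance * (max_px - min_px))
    weight = "bold" if relevance >= bold_at else "normal"
    return f"font-size: {size}px; font-weight: {weight};"


if __name__ == "__main__":
    for score in (0.0, 0.5, 1.0):
        print(score, term_style(score))
```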