In its short life, the semantic web we knew so little about passed through the peak of inflated expectations, rounded the cape of unrealistic ambition and finally found a resting place in the great junkyard of unwanted technology in the virtual cloud. At one time our information industry seemed to have the most to gain (or lose) from the threats and opportunities presented by our recently lost friend. So, what went wrong?
The era passed with the recent announcement by Google, Yahoo and Microsoft of the launch of schema.org. Schema.org provides technical documentation on the ways in which the major search engines will recognize structured data in your web pages. It shows how to get rich snippets of content and data from your site directly into search engine results pages. Rich snippets are the next step in the evolution of search, because they allow search engines to read meaningful semantics into content on the web.
If rich snippets sound surprisingly like an application of the semantic web, then it’s for good reason. A huge amount of time and effort has gone into researching how to add layers of machine-readable information to the human-readable web, with the grand view that the machine-readable web would underpin a new wave of disruptive innovation. Web 3.0 would be the next big thing.
However, Google et al. have chosen not to base the next big thing in search, rich snippets, on semantic web technology. Schema.org eschews RDFa in favor of simpler HTML5 markup.
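To give a feel for how simple that markup is, here is a minimal sketch in Python. It assumes the HTML5 microdata attributes (`itemscope`, `itemtype`, `itemprop`) that schema.org documents; the sample book details, the `MicrodataCollector` class and the `extract_microdata` helper are made up for illustration, and the parser only handles flat, non-nested items. It is not how any search engine actually reads a page.

```python
from html.parser import HTMLParser

# Illustrative snippet with made-up book details, marked up with the
# HTML5 microdata attributes itemscope / itemtype / itemprop.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Book">
  <h1 itemprop="name">An Example Title</h1>
  <span itemprop="author">A. N. Author</span>
  <meta itemprop="datePublished" content="2011-06-01">
</div>
"""


class MicrodataCollector(HTMLParser):
    """Hypothetical helper: collects itemprop values from flat microdata."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._pending = None  # itemprop still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            # The itemtype URL names the kind of thing being described.
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # Elements such as <meta> carry the value in an attribute.
            self.properties[prop] = attrs["content"]
        else:
            # Otherwise the value is the element's text, read in handle_data.
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.properties[self._pending] = data.strip()
            self._pending = None


def extract_microdata(html):
    """Return (item type, {property: value}) for a single flat item."""
    parser = MicrodataCollector()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.properties


if __name__ == "__main__":
    item_type, properties = extract_microdata(SAMPLE_HTML)
    print(item_type)
    for name, value in properties.items():
        print(f"{name}: {value}")
```

The point of the sketch is that the structured data lives in a handful of extra attributes on ordinary HTML elements, with no separate vocabulary of triples to manage.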
For years semantic web purists have been preaching that the future is all about RDF and triples. Yet in the 12 years that theorists have been working on the semantic web, we have seen few convincing practical uses for the technology. The graph I’ve included above shows the rise and fall of Web 2.0 job postings compared to job postings requiring semantic web technologies. This makes a pretty clear case that the semantic web simply never took off.
Certainly there are niche applications, in taxonomy design for example, but there is no tidal wave of change. Search is the application with the most to gain from semantics (because it enables rich snippets), but the major search engines have abandoned RDF in favor of simpler, easier-to-use technology.
So, what does this mean for publishers? Firstly, semantics are still vitally important – it’s still critical to produce high quality content and metadata. There is no substitute for building a carefully designed XML workflow. This will ensure that semantic markup and metadata can be delivered to search engines and other downstream partners effectively. This also ensures you have the necessary foundation for maximum usability and discoverability for users within your site.
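As a rough sketch of what that delivery step might look like, the Python below turns a small XML metadata record into HTML carrying microdata attributes. Everything specific here is an assumption for illustration: the in-house element names (`title`, `author`, `published`), the `FIELD_TO_ITEMPROP` mapping, the `xml_to_microdata` helper and the choice of the schema.org `Article` type. A real workflow would use its own schema and templates.

```python
import xml.etree.ElementTree as ET
from html import escape

# Made-up metadata record, standing in for the output of an XML workflow.
SAMPLE_XML = """
<article>
  <title>Semantics &amp; Search</title>
  <author>A. N. Author</author>
  <published>2011-06-01</published>
</article>
"""

# Hypothetical mapping from in-house XML element names to schema.org properties.
FIELD_TO_ITEMPROP = {
    "title": "name",
    "author": "author",
    "published": "datePublished",
}


def xml_to_microdata(xml_text, item_type="http://schema.org/Article"):
    """Render a flat XML record as an HTML fragment with microdata attributes."""
    root = ET.fromstring(xml_text)
    lines = [f'<div itemscope itemtype="{escape(item_type)}">']
    for field, prop in FIELD_TO_ITEMPROP.items():
        element = root.find(field)
        # Skip fields that are missing or empty rather than emit blank markup.
        if element is None or not (element.text or "").strip():
            continue
        value = escape(element.text.strip())
        lines.append(f'  <span itemprop="{prop}">{value}</span>')
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(xml_to_microdata(SAMPLE_XML))
```

Because the metadata is already held as structured XML, adding search-engine markup becomes a small mapping exercise at the output stage rather than a change to how the content is authored.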
But it also means that we can lay to rest many of the technical questions around RDF, triples, inference engines, OWL and other esoterica. There are some parts of the semantic technology stack that I think are still very interesting and I’ll be talking about these more in a future post. But for now it is clear that XML workflows will deliver what you need to participate fully in the ecology of search. And henceforth we can lay to rest the Web 3.0 that never happened.