Microdata - semantic markup - schema.org


Semantic markup helps technologies such as search engines and web crawlers better understand what information a web page contains. Microdata is an HTML specification for nesting semantics within existing content on web pages; the major search engines Bing, Google, Yahoo! and Yandex use it to provide better search results.

Semantics (from Greek: sēmantiká) is the study of meaning.

So by adding semantics, put bluntly, the computers that index my blog know what it contains, or what its meaning is.

Microdata introduces five attributes that can be used on any element to give machines context about your data. These five attributes are: itemid, itemprop, itemref, itemscope and itemtype.
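
As a rough illustration of how the five attributes fit together, here is a small sketch. The element id, the urn:example identifier, the author name and the content are made up for illustration, and whether itemid carries any meaning depends on the vocabulary in use.

```html
<!-- itemscope starts a new item; itemtype names the vocabulary type it uses. -->
<!-- itemid gives the item a global identifier (hypothetical value here). -->
<!-- itemref pulls in properties that live outside this element's subtree. -->
<div itemscope
     itemtype="http://schema.org/Recipe"
     itemid="urn:example:crisp-bread"
     itemref="byline">
  <!-- itemprop attaches a named property to the enclosing item. -->
  <h1 itemprop="name">Crisp bread</h1>
</div>

<!-- Not a descendant of the div above, but part of the item through itemref="byline". -->
<p id="byline">Recipe by <span itemprop="author">Rune</span></p>
```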

Microdata is one of several ways to give a document meaning to machines, just as it has meaning to a human reader of the document.

For web crawlers and other machines to understand the meaning of my content, they need a shared vocabulary for the interpretation, so the search engines created Schema.org and have agreed to support and understand it.

So, I like crisp bread, and want to share my crisp bread recipe with the world. Being a 21st-century, web 2.0 kind of guy, I make a blog post to show the world how to make the deliciously crispy, healthy, fairtrade and ecological kind. I still do it the old way by adding some <div>, <li> and <p> tags to my HTML to structure the content, but because I want the search engines to know that this is _my_ recipe, I add some structured markup from the schema Thing > CreativeWork (http://schema.org/CreativeWork), using its Recipe type (http://schema.org/Recipe).
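
A minimal sketch of what such markup might look like follows. The recipe name and the two ingredients are taken from this post; the author name, the description text and the page structure are assumptions made for illustration.

```html
<!-- The outer div declares one item of type Recipe. -->
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Runes delicious crisp bread</h1>
  <!-- Author name is assumed for the example. -->
  <p>By <span itemprop="author">Rune</span></p>
  <p itemprop="description">Crispy, healthy, fairtrade and ecological crisp bread.</p>
  <ul>
    <!-- Each ingredient is a separate recipeIngredient property. -->
    <li itemprop="recipeIngredient">Flax seeds</li>
    <li itemprop="recipeIngredient">Sesame seeds</li>
  </ul>
</div>
```

The visible content stays the same for human readers; only the attributes are new.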

Let's say that some company creates a mobile application that lets you scan a box of flax seeds in the department store, and then finds recipes containing that ingredient based on a Google search. Since I added semantic markup to my HTML, Google knows that my HTML describes a http://schema.org/Recipe with the ingredient "Flax seeds", and the mobile application may tell the user:

The Recipe "Runes delicious crisp bread" containing Flax seeds is rated 4 stars based on 42 reviews, but you will also need the ingredient "Sesame seeds".

-Hey wait! I got the part with "you will also need the ingredient", but where did those stars come from?
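
The stars would come from a rating nested inside the recipe item, for instance a http://schema.org/AggregateRating. A sketch, assuming the 4 stars and 42 reviews from the example above and an otherwise made-up page structure:

```html
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Runes delicious crisp bread</h1>
  <!-- aggregateRating is itself an item, nested as a property of the recipe. -->
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    Rated <span itemprop="ratingValue">4</span> stars based on
    <span itemprop="reviewCount">42</span> reviews
  </div>
</div>
```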


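What is also cool is that the HTML5 microdata specification described a DOM API (document.getItems()) for extracting structured markup from web pages. That API has since been dropped from the specification and from browsers, but the markup is just as easy to read with ordinary DOM queries or any HTML parser. So if the lightweight mobile application only wants to extract the ingredients and http://schema.org/NutritionInformation from my recipe blog, it can do exactly that.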
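
As a sketch of such an extraction, the page below reads the ingredients with plain querySelectorAll calls. The markup and variable names are assumptions for illustration; nutrition information could be read the same way through its own itemprop.

```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Reading microdata with DOM queries</title>
</head>
<body>
  <div itemscope itemtype="http://schema.org/Recipe">
    <h1 itemprop="name">Runes delicious crisp bread</h1>
    <ul>
      <li itemprop="recipeIngredient">Flax seeds</li>
      <li itemprop="recipeIngredient">Sesame seeds</li>
    </ul>
  </div>

  <script>
    // Find the first recipe item on the page by its itemtype.
    var recipe = document.querySelector(
      '[itemscope][itemtype="http://schema.org/Recipe"]'
    );

    // Collect the text of every ingredient property inside that item.
    var ingredients = Array.prototype.map.call(
      recipe.querySelectorAll('[itemprop="recipeIngredient"]'),
      function (el) { return el.textContent.trim(); }
    );

    // Logs the two ingredient names from the markup above.
    console.log(ingredients);
  </script>
</body>
</html>
```

This is deliberately simplified: it does not follow itemref, and it would also pick up properties of items nested inside the recipe, which a complete microdata parser keeps apart.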

If the person scanning the flax seeds is lazy and doesn't want to make the crispy bread themselves, they might just want to check if my http://schema.org/Organization can provide a good http://schema.org/Offer. If my crispy bread becomes world famous (without question because of my good semantic markup), it might never be http://schema.org/InStock, so perhaps I should make it available by http://schema.org/PreOrder?
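
Such an offer might be marked up roughly as follows. This is a sketch only: the Product wrapper, the product name and the organization name "Rune's Bakery" are hypothetical, and the price is left out on purpose.

```html
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Crisp bread</span>
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    <!-- availability points at a schema.org enumeration value;
         use http://schema.org/InStock instead when there is stock. -->
    <link itemprop="availability" href="http://schema.org/PreOrder">Available for pre-order
    <!-- The seller is a nested Organization item (hypothetical name). -->
    <div itemprop="seller" itemscope itemtype="http://schema.org/Organization">
      Sold by <span itemprop="name">Rune's Bakery</span>
    </div>
  </div>
</div>
```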

If you are enthusiastic about using Microdata, I suggest you head over to the Schema.org Type Hierarchy, where you can see the complete schema.org vocabulary that you may use to mark up your content. It is a well-structured page with plenty of good examples for each type.

Bon appétit!