Giovanni Pirrotta

Just a curious person

Semantic Web Ingredients: the SPARQL language

October 27, 2014

In the previous post of this series I described the RDFa language and the main advantages of using it. Now it’s the turn of SPARQL, a sort of SQL for the Semantic Web world.

The SPARQL language is a W3C Recommendation that defines a standard way to retrieve data from RDF graphs distributed on the Web. A SPARQL query can join data across many triples and explore relationships that are not known in advance. Querying is based on pattern matching, in particular on triple patterns, which mirror the RDF triple model and provide a flexible way to find matches.

Triple patterns are just like triples, except that any part of a triple can be replaced with a variable. For example, in the pattern ?book ex:title ?title, the subject and the object are variables, marked with ?, that stand for unknown values, while the ex:title property acts as a constant. SPARQL variables can match any resource or literal in the RDF dataset.
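To make the matching idea concrete, here is a minimal Python sketch of how a single triple pattern could be matched against a small in-memory graph. It is only an illustration, not how a real SPARQL engine is implemented: the resource names (ex:book1, ex:person1, ...) and the helper functions is_variable and match_pattern are invented for this example.

```python
# Toy RDF graph as a set of (subject, predicate, object) tuples.
# The resource names below are invented for illustration.
GRAPH = {
    ("ex:book1", "ex:title", "UML Distilled"),
    ("ex:book2", "ex:title", "Test-Driven Development: By Example"),
    ("ex:book1", "ex:hasAuthor", "ex:person1"),
}


def is_variable(term):
    """A term is a variable when it starts with '?', as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, graph):
    """Return one {variable: value} binding per triple that fits the pattern."""
    solutions = []
    for triple in graph:
        binding = {}
        for pattern_term, triple_term in zip(pattern, triple):
            if is_variable(pattern_term):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(pattern_term, triple_term) != triple_term:
                    break
            elif pattern_term != triple_term:
                # Constants must match exactly.
                break
        else:
            # No break: every position of the pattern matched this triple.
            solutions.append(binding)
    return solutions


solutions = match_pattern(("?book", "ex:title", "?title"), GRAPH)
for row in sorted(solutions, key=lambda b: b["?book"]):
    print(row["?book"], "->", row["?title"])
```

Only the two ex:title triples produce a binding; the ex:hasAuthor triple is skipped because its predicate differs from the constant in the pattern.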

The SPARQL skeleton is:

# prefix declarations
PREFIX ex: <http://example.org/>
...

# result clause
SELECT ...

# dataset definition
FROM ...

# query pattern
WHERE {
...
}

# query modifiers
ORDER BY ...
For example, in the graph http://example.org/books.rdf we want to find all book resources (?book) and all person resources (?person) linked by the ex:hasAuthor predicate. The query returns each book title (?book_title) together with the name and surname of its author (?person_name, ?person_surname).

PREFIX ex: <http://example.org/>
SELECT ?book_title ?person_name ?person_surname
FROM <http://example.org/books.rdf>
WHERE {
       ?book a ex:Book;
             ex:title ?book_title;
             ex:hasAuthor ?person.
       ?person ex:name ?person_name;
               ex:surname ?person_surname.
}

The SPARQL structure

  • PREFIX clause defines prefixes for namespaces, so that URIs can be abbreviated;
  • SELECT clause defines the information we want to retrieve from the statement repository;
  • FROM clause defines the RDF graph(s) to explore, which can be local or remote sources; FROM NAMED, together with the GRAPH keyword inside the WHERE clause, lets us query several named graphs;
  • WHERE clause defines the graph pattern to match against the dataset; it is the most important clause of the query;
  • ORDER BY and the other query modifiers (such as LIMIT and OFFSET) order, slice and otherwise rearrange the query results.

The query above returns the following result:

------------------------------------------------------------------------
| book_title                             | person_name | person_surname |
========================================================================
|"UML Distilled"                         | "Martin"    | "Fowler"       |
------------------------------------------------------------------------
|"Test-Driven Development: By Example"   | "Kent"      | "Beck"         |
------------------------------------------------------------------------
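As a rough illustration of how the WHERE clause produces these rows, the Python sketch below evaluates the five triple patterns of the query one after another, extending the variable bindings at each step (a naive nested-loop join). The graph contents are an assumption: the actual triples in books.rdf are not shown in this post, so the resource names ex:book1, ex:person1, etc. are made up to be consistent with the result table. The helpers extend, evaluate and select are hypothetical names, not part of any SPARQL library.

```python
RDF_TYPE = "rdf:type"  # the SPARQL keyword `a` abbreviates rdf:type

# Assumed content of the graph; only the literals come from the result table.
GRAPH = {
    ("ex:book1", RDF_TYPE, "ex:Book"),
    ("ex:book1", "ex:title", "UML Distilled"),
    ("ex:book1", "ex:hasAuthor", "ex:person1"),
    ("ex:person1", "ex:name", "Martin"),
    ("ex:person1", "ex:surname", "Fowler"),
    ("ex:book2", RDF_TYPE, "ex:Book"),
    ("ex:book2", "ex:title", "Test-Driven Development: By Example"),
    ("ex:book2", "ex:hasAuthor", "ex:person2"),
    ("ex:person2", "ex:name", "Kent"),
    ("ex:person2", "ex:surname", "Beck"),
}

# The triple patterns of the WHERE clause, written out in full.
PATTERNS = [
    ("?book", RDF_TYPE, "ex:Book"),
    ("?book", "ex:title", "?book_title"),
    ("?book", "ex:hasAuthor", "?person"),
    ("?person", "ex:name", "?person_name"),
    ("?person", "ex:surname", "?person_surname"),
]


def extend(binding, pattern, triple):
    """Extend a binding so that pattern matches triple; None on conflict."""
    new_binding = dict(binding)
    for pattern_term, triple_term in zip(pattern, triple):
        if pattern_term.startswith("?"):
            # A variable already bound must keep the same value.
            if new_binding.setdefault(pattern_term, triple_term) != triple_term:
                return None
        elif pattern_term != triple_term:
            return None
    return new_binding


def evaluate(patterns, graph):
    """Join all patterns: start from one empty binding and refine it."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = extend(binding, pattern, triple)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


def select(solutions, variables):
    """Project each solution onto the variables listed in SELECT."""
    return [tuple(b[v] for v in variables) for b in solutions]


rows = sorted(select(evaluate(PATTERNS, GRAPH),
                     ["?book_title", "?person_name", "?person_surname"]))
for row in rows:
    print(row)
```

The shared variables ?book and ?person are what tie the patterns together: a binding survives only if every pattern can be satisfied with the same values for them.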

SPARQL is similar to SQL. Like SQL, it uses a SELECT clause to determine which subset of the matched data is returned, and a WHERE clause to define what to match, in this case graph patterns evaluated against the queried dataset. The bindings between variables and values are generally returned as a table, but we can also ask for other formats, such as JSON or XML.
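To give an idea of what the JSON format looks like, the sketch below serializes the two result rows into a structure shaped like the W3C SPARQL Query Results JSON Format, with a head listing the variables and one binding object per row. It assumes that every value is a plain literal; the function name to_sparql_json is invented for this example and is not part of any library.

```python
import json


def to_sparql_json(variables, rows):
    """Build a dict shaped like a SPARQL JSON result for a SELECT query."""
    bindings = []
    for row in rows:
        bindings.append({
            # Assumption: every value is a plain literal, not a URI.
            var: {"type": "literal", "value": value}
            for var, value in zip(variables, row)
        })
    return {"head": {"vars": list(variables)},
            "results": {"bindings": bindings}}


variables = ["book_title", "person_name", "person_surname"]
rows = [
    ("UML Distilled", "Martin", "Fowler"),
    ("Test-Driven Development: By Example", "Kent", "Beck"),
]

document = to_sparql_json(variables, rows)
print(json.dumps(document, indent=2))
```

Note that the variable names appear without the leading ? in this format.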

That’s all for now. In the next posts I will show how to develop a new ontology using the Semantic Web ingredients covered so far in this series. So, stay tuned!
