Gmane
From: Holger Knublauch <holger <at> SMI.Stanford.EDU>
Subject: Re: Protege, named graphs, and SPARQL
Newsgroups: gmane.comp.misc.ontology.protege.owl
Date: 2005-08-03 20:11:54 GMT
This is working indeed!  We now have an implementation that wraps a live 
Protege OWL triple store as a Jena Graph (and Model).   This means that 
arbitrary Jena query services can be executed within Protege.

The relevant call is

OWLModel owlModel = ...;    // Protege model
Model model = JenaModelFactory.createModel(owlModel);   // Jena model

I also added a quick-and-dirty SPARQL query tab to Protege (see 
screenshot).  It is still extremely primitive, but hopefully useful in 
the long run.  All of this is in CVS and will be part of the next beta.

Special thanks to Andy and Dan for valuable Jena guidance!  In fact, I 
only had to implement the Jena Graph and PrefixMapping interfaces, and 
within those mainly the find method.
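
To give a rough idea of what "implementing find" amounts to, here is a minimal sketch in plain Java. It does not use the real Jena or Protege classes: SimpleTriple, BackingStore and WrappingGraph are hypothetical names made up for illustration, and terms are plain strings. It only shows the contract Andy describes in the quoted message (null means "any", and other operations map down to find).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a triple; not the Jena or Protege class. */
final class SimpleTriple {
    final String subject;
    final String predicate;
    final String object;

    SimpleTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

/** Hypothetical stand-in for the store being wrapped. */
interface BackingStore {
    Iterable<SimpleTriple> allTriples();
}

/** Graph view whose only real work is find(); the rest maps down to it. */
final class WrappingGraph {
    private final BackingStore store;

    WrappingGraph(BackingStore store) {
        this.store = store;
    }

    /** Returns all triples matching the pattern; null in a slot means "any". */
    List<SimpleTriple> find(String s, String p, String o) {
        List<SimpleTriple> result = new ArrayList<>();
        for (SimpleTriple t : store.allTriples()) {
            if (matches(s, t.subject) && matches(p, t.predicate) && matches(o, t.object)) {
                result.add(t);
            }
        }
        return result;
    }

    /** contains() expressed through find(). */
    boolean contains(String s, String p, String o) {
        return !find(s, p, o).isEmpty();
    }

    /** size() expressed through find() with all slots left open. */
    int size() {
        return find(null, null, null).size();
    }

    private static boolean matches(String pattern, String value) {
        return pattern == null || pattern.equals(value);
    }
}
```

A real wrapper would of course delegate to the store's own indexes instead of scanning every triple; the linear scan just keeps the sketch short.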

Holger

Seaborne, Andy wrote:
> 
> 
> Holger Knublauch wrote:
> 
>> Thanks very much for the pointers, Dan.  This relates to a common 
>> request by our users, who want to use query languages like RDQL inside 
>> of Protege.  Protege doesn't have a native implementation of an RDF 
>> query engine yet, but - like Jena - has an underlying triple store [1].
> 
> 
> Internally, Jena works on an abstraction that is a symmetric triple store. 
> (Holger - you probably know that; this is context for everyone.)
> 
> A Graph is a set of Triples, a Triple consists of three Nodes, and there 
> are no restrictions on those nodes (RDF-isms such as allowing only URIs in 
> the property slot exist at the higher levels).
> 
> http://jena.sourceforge.net/javadoc/com/hp/hpl/jena/graph/Graph.html
> 
> Any implementation of a Graph can be used with the rest of the system.  
> The simplest implementation has little more than an implementation of 
> "find(S,P,O)" (nulls for "any"), and all the other operations map 
> down to this in various ways (there is also a "contains").
> 
> From this basis, an implementation can choose to add various facilities 
> for increased performance, such as handling a conjunction of triple patterns 
> (the Jena databases turn these into a single SQL statement where possible). 
> There are default implementations of all the functionality, so it is a 
> tradeoff between development effort and efficiency.
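
As an illustration of how a conjunction of triple patterns can fall back on nothing but find, here is a naive sketch in plain Java. It is not Jena's actual default implementation: PatternMatcher and its methods are hypothetical, terms are plain strings, and anything starting with "?" is treated as a variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: answering a conjunction of patterns using only find(). */
final class PatternMatcher {
    /** Triples as String[3] {subject, predicate, object}; a stand-in store. */
    private final List<String[]> triples = new ArrayList<>();

    void add(String s, String p, String o) {
        triples.add(new String[] {s, p, o});
    }

    /** The single primitive: null in a slot means "any". */
    List<String[]> find(String s, String p, String o) {
        List<String[]> out = new ArrayList<>();
        for (String[] t : triples) {
            if ((s == null || s.equals(t[0]))
                    && (p == null || p.equals(t[1]))
                    && (o == null || o.equals(t[2]))) {
                out.add(t);
            }
        }
        return out;
    }

    /** Matches all patterns jointly and returns one variable binding per solution. */
    List<Map<String, String>> match(List<String[]> patterns) {
        List<Map<String, String>> solutions = new ArrayList<>();
        solve(patterns, 0, new HashMap<String, String>(), solutions);
        return solutions;
    }

    private void solve(List<String[]> patterns, int index, Map<String, String> bound,
                       List<Map<String, String>> solutions) {
        if (index == patterns.size()) {
            solutions.add(new HashMap<>(bound));
            return;
        }
        String[] pattern = patterns.get(index);
        // Substitute variables bound so far; unbound variables become wildcards.
        String[] query = new String[3];
        for (int i = 0; i < 3; i++) {
            query[i] = isVar(pattern[i]) ? bound.get(pattern[i]) : pattern[i];
        }
        for (String[] t : find(query[0], query[1], query[2])) {
            Map<String, String> next = new HashMap<>(bound);
            boolean consistent = true;
            for (int i = 0; i < 3 && consistent; i++) {
                if (isVar(pattern[i])) {
                    String previous = next.put(pattern[i], t[i]);
                    // Reject a triple that binds one variable to two different terms.
                    consistent = previous == null || previous.equals(t[i]);
                }
            }
            if (consistent) {
                solve(patterns, index + 1, next, solutions);
            }
        }
    }

    private static boolean isVar(String term) {
        return term.startsWith("?");
    }
}
```

Each pattern costs one find call per partial solution, which is exactly the kind of work a store-specific implementation could collapse into a single query.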
> 
> http://jena.sourceforge.net/javadoc/com/hp/hpl/jena/graph/query/QueryHandler.html 
> 
> 
>>
>> Would it be possible to use a Jena query engine implementation in 
>> conjunction with Protege, so that instead of accessing the Jena triple 
>> store, it would operate on the Protege TripleStore classes?
> 
> 
> Yes.
> 
> The SPARQL query system is called ARQ.
> 
> There are at least two ways:
> 
> 1/ Create a "Protege graph" wrapping a Protege triple store.
>    This is the route several people have used for adding access to
>    other storage technologies (examples: a Lucene based store,
>    D2RQ for existing SQL data)
> 
> 2/ Modify ARQ's query engine or add a new query engine implementation.  This
>    can be as little as intercepting the few lines of code that connect
>    ARQ to Jena graphs.  In fact, there is very little tie between
>    Jena and ARQ for storage - it's all in the basic pattern matcher.
> 
> Route 1 is the most trodden and also allows use of the triple store from 
> any part of Jena.
> 
> Route 2 allows increasing degrees of specialised query processing, but 
> then only the query engine looks at Protege's triple store.
> 
> If you want XML document results then there are no Jena dependencies 
> (it's a format defined by DAWG).  This format is suitable for XSLT or 
> XQuery transformation.
> 
> Internally, ARQ uses Jena Nodes but not Triples. The Protege objects don't 
> look that different, with the exception of the property/resource split.  
> Externally, the API speaks Jena Resources/Literals, but you could convert 
> to Protege objects easily enough if you want API access.  The API is 
> quite a thin layer.  This is effectively what is being done anyway: 
> turning low-level Jena Node objects into higher-level Resources 
> happens throughout Jena.
> 
> Aside: Queries with the same variable in a subject/object slot and also in 
> the property slot break the Resource/Property distinction, but they 
> are rather unusual.
> 
> like ... WHERE { ?p rdfs:range :something . :x ?p ?o }
> 
>> If yes, then this would be an extremely useful mechanism to bridge the 
>> work in both communities!  It would for example become quite trivial 
>> to have a query front-end as a Protege tab.  How generic is the SPARQL 
>> implementation and which interfaces would we need to implement to hook 
>> in Protege's storage model at run time?
> 
> 
> I hope the brief outline above helps - feel free to ask for more 
> details.  The current version is ARQ 0.9.6.  Same license as Jena.
> 
> There is also remote access to stores and a server (Joseki3) 
> implementing the DAWG SPARQL protocol.  Joseki3 is less advanced than 
> ARQ but works (HTTP and SOAP).  It's only in CVS - the WG is still 
> thrashing out details of the protocol, so I haven't done a release build 
> yet, because changes to names and namespaces are more disruptive for a 
> distributed deployment.
> 
>     Andy
> 
>>
>> Holger
>>
>> [1] http://protege-owl.sourceforge.net/javadoc/edu/stanford/smi/protegex/owl/model/triplestore/package-summary.html
>>
>>
>>
>> Dan Brickley wrote:
>>
>>> (cc:'ing some folks who'll know more than me)
>>>
>>> Holger Knublauch wrote:
>>>
>>>
>>>> Nicolas F Rouquette wrote:
>>>>
>>>>
>>>>> There's work on "Named Graphs" here:
>>>>>
>>>>> http://www.w3.org/2004/03/trix/
>>>>>
>>>>> There's even a query language for Named Graphs, TriQL:
>>>>>
>>>>> http://www.wiwiss.fu-berlin.de/suhl/bizer/TriQL
>>>>>
>>>>> and an implementation of the whole thing in "NG4J"
>>>>>
>>>>> http://sourceforge.net/projects/ng4j
>>>>>
>>>>> However, this is based on Jena 2.x and as Holger pointed out, 
>>>>> protege-owl is based on Jena 1.x.
>>>>
>>>>
>>>>
>>>>
>>>> To avoid misunderstandings, we use Jena 2.x in Protege, and 
>>>> currently mostly the ARP parser and persistence mechanism.  We don't 
>>>> use Jena for the internal model representation at run-time, 
>>>> therefore we cannot easily reuse extensions to Jena in Protege.  You 
>>>> can always get a Jena OntModel from a Protege OWLModel at run-time, 
>>>> but this is not updated on changes.
>>>>
>>>>
>>>>
>>>>> Jena itself supports graphs as well. However, it is unclear whether 
>>>>> named graphs will show up in OWL
>>>>> and more precisely within OWL-DL.
>>>>
>>>>
>>>>
>>>>
>>>> Named graphs should work in OWL in principle, due to its layering on 
>>>> top of RDF (each OWL document is also an RDF document).
>>>>
>>>> A main reason why there isn't support for named graphs in Protege 
>>>> yet is that I didn't even know that they existed until they were 
>>>> recently brought to my attention at the Protege conference (thanks 
>>>> for this by the way).  I scanned the documents which you mention and 
>>>> see that this is a useful feature, and Protege should certainly 
>>>> support it in one way or another.
>>>>
>>>> From my naive point of view, a limitation seems to be that named 
>>>> graphs cannot easily be represented in the usual RDF/XML 
>>>> serialization.  This is the dominant file format though, and the one 
>>>> that is supported by Protege's parser.  So it seems that a solution 
>>>> would need to build upon a mechanism that splits each named 
>>>> graph into a separate RDF file.  This could certainly be implemented 
>>>> at save time.  We are considering adding more options to the save 
>>>> mechanism anyway, e.g. to automatically split instances and classes, 
>>>> to separate public and private classes etc.  So the idea of named 
>>>> graphs will also play a role there (and suggestions from the 
>>>> community are welcome).
>>>>
>>>> A quite different question then is how to represent the named graphs 
>>>> internally at run-time (so that the "cut points" can be determined), 
>>>> and how to allow the user to define them in the user interface. 
>>>
>>>
>>>
>>> I'd also encourage you to take a look at the SPARQL work on RDF 
>>> querying, if you haven't already.
>>>
>>> SPARQL --- now a Last Call Working Draft --- includes an approach to 
>>> querying a form of named graphs. There is also an XML format for 
>>> representing query results, and a protocol. The former relates
>>> (in ways I'm unsure about) to the problem of serializing named 
>>> graphs. I'm cc:'ing Andy Seaborne,
>>> who might have more to say about how the SPARQL work for Jena relates 
>>> to other forms of "named graph".
>>>
>>> http://www.w3.org/TR/2005/WD-rdf-sparql-query-20050721/#specifyingDataset 
>>>
>>> [[
>>> Specifying RDF Datasets
>>>
>>> A SPARQL query may specify the dataset to be used for matching.  The 
>>> FROM clauses give IRIs that the query processor can use to create the 
>>> default graph and the FROM NAMED clause can be used to specify named 
>>> graphs.
>>> ]]
>>>
>>>
>>> The spec has a number of examples such as the following, which might 
>>> give a
>>> flavour of SPARQL's graph-naming facilities (see spec for the
>>> corresponding target data and resultset)...
>>>
>>>    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>>>    PREFIX dc: <http://purl.org/dc/elements/1.1/>
>>>
>>>    SELECT ?who ?g ?mbox
>>>    FROM <http://example.org/dft.ttl>
>>>    FROM NAMED <http://example.org/alice>
>>>    FROM NAMED <http://example.org/bob>
>>>    WHERE
>>>    {
>>>       ?g dc:publisher ?who .
>>>       GRAPH ?g { ?x foaf:mbox ?mbox }
>>>    }
>>>
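
To make the dataset idea in the query above a little more concrete, here is a small sketch in plain Java of a default graph plus named graphs, with the query's WHERE clause evaluated by hand. TinyDataset and its methods are hypothetical, terms are plain strings, and this is not how ARQ or any other SPARQL engine is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory dataset: one default graph plus graphs keyed by name. */
final class TinyDataset {
    /** Triples are String[3] {subject, predicate, object}. */
    final List<String[]> defaultGraph = new ArrayList<>();
    final Map<String, List<String[]>> namedGraphs = new LinkedHashMap<>();

    void addDefault(String s, String p, String o) {
        defaultGraph.add(new String[] {s, p, o});
    }

    void addNamed(String graphName, String s, String p, String o) {
        List<String[]> graph = namedGraphs.get(graphName);
        if (graph == null) {
            graph = new ArrayList<>();
            namedGraphs.put(graphName, graph);
        }
        graph.add(new String[] {s, p, o});
    }

    /**
     * Hand-coded evaluation of
     *   ?g dc:publisher ?who . GRAPH ?g { ?x foaf:mbox ?mbox }
     * Each result row is {who, g, mbox}.
     */
    List<String[]> publishersWithMailboxes() {
        List<String[]> rows = new ArrayList<>();
        for (String[] d : defaultGraph) {            // ?g dc:publisher ?who
            if (!"dc:publisher".equals(d[1])) {
                continue;
            }
            List<String[]> graph = namedGraphs.get(d[0]);
            if (graph == null) {
                continue;                            // ?g must name a graph in the dataset
            }
            for (String[] t : graph) {               // GRAPH ?g { ?x foaf:mbox ?mbox }
                if ("foaf:mbox".equals(t[1])) {
                    rows.add(new String[] {d[2], d[0], t[2]});
                }
            }
        }
        return rows;
    }
}
```

The point of the sketch is only that ?g is bound from the default graph and then used to select which named graph the inner pattern is matched against.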
>>>
>>> http://jena.hpl.hp.com/ARQ/ has an online demo, 
>>> http://www.sparql.org/query.html
>>> that might be of interest. There's also usually one at 
>>> http://librdf.org/query/
>>> built on top of Redland, though that's offline today.
>>>
>>> cheers,
>>>
>>> Dan