Finding all steps in property path
I'm new to SPARQL and I'm trying to create a property path query that will spit out each intermediate step along the path. So far I have this:
select ?object
where {
<subjectURI> <isRelatedTo>+ ?object .
}
This gives me a list of all the relations to my subject URI throughout the path, no matter how distantly the relation (correct me if I'm wrong so far).
But, I'd like to see how the relations are organized. Something like:
<subjectURI> <isRelatedTo> <object1>
<object1> <isRelatedTo> <object2>
<object2> <isRelatedTo> <object3>
and so on...Is this possible?
Solution 1:
While there are some limitations in what property paths can do, depending on your exact requirements, you may be able to get what you need here. Consider this data:
@prefix : <urn:ex:>.
:a :relatedTo :b .
:b :relatedTo :c .
:c :relatedTo :d .
:a :relatedTo :e .
:e :relatedTo :f .
:f :relatedTo :g .
:h :relatedTo :i .
:i :relatedTo :j .
:j :relatedTo :k .
:k :relatedTo :l .
in which there are three :relatedTo
paths:
a --> b --> c --> d
a --> e --> f --> g
h --> i --> j --> k --> l
I realize that in your case, you had a specific subject, but we can generalize a little bit, and ask for each edge in each of these paths with a query like this:
prefix : <urn:ex:>
select * where {
# start a path
?begin :relatedTo* ?midI .
FILTER NOT EXISTS { [] :relatedTo ?begin }
# grab next edge
?midI :relatedTo ?midJ .
# get to the end of the path.
?midJ :relatedTo* ?end .
FILTER NOT EXISTS { ?end :relatedTo [] }
}
order by ?start ?end
$ arq --data data.n3 --query query.sparql
-----------------------------
| begin | midI | midJ | end |
=============================
| :a | :a | :b | :d |
| :a | :b | :c | :d |
| :a | :c | :d | :d |
| :a | :a | :e | :g |
| :a | :e | :f | :g |
| :a | :f | :g | :g |
| :h | :h | :i | :l |
| :h | :i | :j | :l |
| :h | :j | :k | :l |
| :h | :k | :l | :l |
-----------------------------
which shows each edge of each :relatedTo
path. You could make the output a little bit prettier, too:
prefix : <urn:ex:>
select (concat(str(?begin),"--",str(?end)) as ?path) ?midI ?midJ where {
# start a path
?begin :relatedTo* ?midI .
FILTER NOT EXISTS { [] :relatedTo ?begin }
# grab next edge
?midI :relatedTo ?midJ .
# get to the end of the path.
?midJ :relatedTo* ?end .
FILTER NOT EXISTS { ?end :relatedTo [] }
}
order by ?path
$ arq --data data.n3 --query query.sparql
--------------------------------------
| path | midI | midJ |
======================================
| "urn:ex:a--urn:ex:d" | :a | :b |
| "urn:ex:a--urn:ex:d" | :b | :c |
| "urn:ex:a--urn:ex:d" | :c | :d |
| "urn:ex:a--urn:ex:g" | :a | :e |
| "urn:ex:a--urn:ex:g" | :e | :f |
| "urn:ex:a--urn:ex:g" | :f | :g |
| "urn:ex:h--urn:ex:l" | :h | :i |
| "urn:ex:h--urn:ex:l" | :i | :j |
| "urn:ex:h--urn:ex:l" | :j | :k |
| "urn:ex:h--urn:ex:l" | :k | :l |
--------------------------------------
This same approach would let you do some interesting things like finding out how far separated certain nodes are:
prefix : <urn:ex:>
select ?begin ?end (count(*) as ?length) where {
# start a path
?begin :relatedTo* ?midI .
FILTER NOT EXISTS { [] :relatedTo ?begin }
# grab next edge
?midI :relatedTo ?midJ .
# get to the end of the path.
?midJ :relatedTo* ?end .
FILTER NOT EXISTS { ?end :relatedTo [] }
}
group by ?begin ?end
------------------------
| begin | end | length |
========================
| :a | :g | 3 |
| :a | :d | 3 |
| :h | :l | 4 |
------------------------
In the data I provided above, the paths happen to be in alphabetical order and so the sorting produces the edges in the right order. However, even if the edge nodes aren't in alphabetical, we can still print them in order by computing their position in the list. This query:
prefix : <urn:ex:>
select ?begin ?midI ?midJ (count(?counter) as ?position) ?end where {
?begin :relatedTo* ?counter .
?counter :relatedTo* ?midI .
FILTER NOT EXISTS { [] :relatedTo ?begin }
?midI :relatedTo ?midJ .
?midJ :relatedTo* ?end .
FILTER NOT EXISTS { ?end :relatedTo [] }
}
group by ?begin ?end ?midI ?midJ
----------------------------------
| begin | midI | midJ | .1 | end |
==================================
| :a | :a | :b | 1 | :d |
| :a | :b | :c | 2 | :d |
| :a | :c | :d | 3 | :d |
| :a | :a | :e | 1 | :g |
| :a | :e | :f | 2 | :g |
| :a | :f | :g | 3 | :g |
| :h | :h | :i | 1 | :l |
| :h | :i | :j | 2 | :l |
| :h | :j | :k | 3 | :l |
| :h | :k | :l | 4 | :l |
----------------------------------
We don't necessary need to see that count, but you could, instead of selecting the position, you could use it as a sorting condition:
prefix : <urn:ex:>
select ?begin ?midI ?midJ ?end
where {
?begin :relatedTo* ?counter .
?counter :relatedTo* ?midI .
FILTER NOT EXISTS { [] :relatedTo ?begin }
?midI :relatedTo ?midJ .
?midJ :relatedTo* ?end .
FILTER NOT EXISTS { ?end :relatedTo [] }
}
group by ?begin ?end ?midI ?midJ
order by ?begin ?end count(?counter)
and be guaranteed to get your edges in sequence.
Solution 2:
No, this is a limitation of the design of property paths.
Paths may either be used to compact more complex query patterns or they may be used to test for arbitrary length paths as in your example.
The former may be converted into a form that gives you the intermediate steps e.g.
SELECT * WHERE
{
?s <http://predicate>/<http://predicate> ?o
}
Can be converted to the following:
SELECT * WHERE
{
?s <http://predicate> ?intermediate .
?intermediate <http://predicate> ?o .
}
Unfortunately the same cannot be done for arbitrary length paths. However if you know what the upper bound on paths are you can rewrite your query like so:
SELECT *
WHERE
{
{
?s <http://predicate> ?step1 .
?step1 <http://predicate> ?o .
}
UNION
{
?s <http://predicate> ?step1 .
?step1 <http://predicate> ?step2 .
?step2 <http://predicate> ?o .
}
# Add additional UNION for each length of path you want up to your upper bound
}
Though as you can immediately see this makes things very verbose.