Nepomuk filters in KDE Dolphin

February 2, 2014

Finally worked this out: how to use SPARQL queries against Nepomuk to produce a file list.

In the “Places” tab of Dolphin you will see a “Search For” section on the left-hand side, with items like Documents, Images and Audio Files. Each of these contains a simple search query for Nepomuk. What do you do if you want something more complicated, such as a file type or a file name regexp? The former would cover not just file extensions, but anything that Nepomuk regards as a file that “is a” something, eg “is a” archive type. The latter needs to look at the filename label and filter on it.

After much messing about, trying to find the Nepomuk log, looking at the ontology in Protege etc, here's what I came up with. For each expression you want to use, right-click on “Search For” and choose “add entry”. Now give your expression a name, then in the Location field add the SPARQL query, using the Nepomuk query protocol prefix (nepomuksearch:/). For example:

Search for filenames that contain the word “dog” – in the Location field add the expression : nepomuksearch:/?sparql=select ?r where {?r nfo:fileName ?fname . ?fname bif:contains 'dog' . }

Search for Archive file types – in the Location field add the expression : nepomuksearch:/?sparql=select ?r where { ?r a nfo:Archive}

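Search for files ending with “pdf” – in the Location field add the expression : nepomuksearch:/?sparql=select ?r where {?r nfo:fileName ?fname . FILTER regex(?fname, '.*[.]pdf$', 'i') }

The trailing $ anchors the match to the end of the name; without it the pattern also matches names that merely contain “.pdf”, such as report.pdf.bak.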

This last one took a while to work out because of the wildcard and the literal dot; basic regexp syntax is easy to forget if you don’t use it every day 😉 I haven’t tried this with the bif syntax, but it should work if Virtuoso is doing its thing.
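The pattern can be sanity-checked outside Dolphin before pasting it into a Location field. The sketch below uses Python's re module as a stand-in, on the assumption that it treats these simple patterns the same way as the SPARQL regex() function (both match anywhere in the string unless anchored). The file names are made up for illustration.

```python
import re

# Hypothetical file names, for illustration only.
names = ["report.pdf", "SCAN.PDF", "report.pdf.bak", "notpdf", "pdf"]

def matching(pattern, candidates):
    """Return the names the pattern matches anywhere, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [n for n in candidates if rx.search(n)]

# Literal dot followed by "pdf", anywhere in the name.
unanchored = matching(r".*[.]pdf", names)
# Same, but the match must reach the end of the name.
anchored = matching(r".*[.]pdf$", names)
# Unescaped dot: matches any character before "pdf".
dot_as_wildcard = matching(r".pdf$", names)

print(unanchored)
print(anchored)
print(dot_as_wildcard)
```

Only the anchored pattern with the bracketed dot is limited to names that really end in “.pdf”; the other two let report.pdf.bak or notpdf through.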

The Nepomuk ontology is not too complicated once you remember the basics of SPARQL and regex, but it would also be good to have a Nepomuk SPARQL console in KDE to help set these things up and get feedback. I found one Nepomuk log (~/.kde/share/apps/nepomuk/repository/main/data/virtuosobackend/soprano-virtuoso.log), but I’m not sure it’s what would be needed in this case: it seemed to contain some syntax error messages that would have helped me, but it also didn’t seem to update in real time.
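Without a console to hand, a small script can at least assemble the Location strings consistently. This is only a sketch: the helper name location_for is made up, and it assumes Dolphin accepts the query as plain text after the prefix, as in the examples above (no percent-encoding is applied).

```python
# Protocol prefix used in the Location field examples above.
PREFIX = "nepomuksearch:/?sparql="

def location_for(query):
    """Collapse a multi-line SPARQL query to one line and prepend the prefix.

    Note: this also collapses runs of whitespace inside string literals,
    which is fine for the simple queries shown here.
    """
    one_line = " ".join(query.split())
    return PREFIX + one_line

archive_query = """
select ?r where {
    ?r a nfo:Archive
}
"""

print(location_for(archive_query))
```

Writing the query over several lines and collapsing it afterwards makes it easier to spot a missing brace or dot before it reaches Dolphin.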

Another good reference is the KDE Nepomuk Manual blog: http://kdenepomukmanual.wordpress.com/. This explains Nepomuk integration in KDE and how it can be used, including the tag:/ protocol. Also check out the Nepomuk tag manager at http://techbase.kde.org/Projects/Nepomuk/Repositories#Tag_Manager. (Further browsing there reveals a SPARQL console too: the Nepomuk Shell.)

Speaking of protocols, you can also use timeline:/ to get a historical inventory view of changes in Dolphin (http://www.chakra-project.org/wiki/index.php?title=Nepomuk). For a list of protocols, clear Dolphin’s search bar and expand the down-arrow control at its far left: timeline:/ and nepomuksearch:/ are listed under “other”.

Categories: semantic, technology

Aperture Nepomuk queries

February 22, 2011

Having crawled an IMAP store (ie Google Mail), I now need to query the results to see what’s what, who’s who, and how they are connected, if at all.

These are the namespace prefixes used in the queries

Prefix URI
nie http://www.semanticdesktop.org/ontologies/2007/01/19/nie#
nco http://www.semanticdesktop.org/ontologies/2007/03/22/nco#
nfo http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#
nmo http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#
sesame http://www.openrdf.org/schema/sesame#
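If the queries are run somewhere that does not predefine these prefixes, each one needs a PREFIX declaration at the top of the query. A minimal sketch that generates that header from the table above; the helper name with_prefixes is made up for illustration.

```python
# Namespace prefixes copied from the table above.
PREFIXES = {
    "nie": "http://www.semanticdesktop.org/ontologies/2007/01/19/nie#",
    "nco": "http://www.semanticdesktop.org/ontologies/2007/03/22/nco#",
    "nfo": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#",
    "nmo": "http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#",
    "sesame": "http://www.openrdf.org/schema/sesame#",
}

def with_prefixes(query, prefixes=PREFIXES):
    """Prepend one SPARQL PREFIX declaration per namespace to the query."""
    header = "\n".join(
        f"PREFIX {name}: <{uri}>" for name, uri in prefixes.items()
    )
    return header + "\n" + query

print(with_prefixes("select ?s { ?s a nmo:Email }"))
```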

And these are the queries. Note that each message is in its own graph, and references the folder in which it rests, eg <imap://youraddress@imap.yourprovider.com/INBOX;TYPE=LIST>. This folder in turn nie:isPartOf another folder, which isn’t itself nie:isPartOf a parent folder.

An IMAP store has a username and password etc, but doesn’t have an associated email address. A folder may contain messages to the owner at an email address the server accepts, but may also contain messages to other addresses if the CC list contains the owner’s address.

Folder | Relationship | Purpose | Query

inbox | direct | Basic: find list of emails, with sender email address

    select distinct ?subject ?from ?address {
        ?s nmo:from ?o .
        ?o nco:fullname ?from .
        ?o nco:hasEmailAddress ?e .
        ?e nco:emailAddress ?address .
        ?s nmo:messageSubject ?subject .
        ?s a nmo:Email
    }


Note: with a Jena TDB dataset, use

    select distinct ?subject ?from ?address {
        graph ?g {
            ?s nmo:from ?o .
            ?o nco:fullname ?from .
            ?o nco:hasEmailAddress ?e .
            ?e nco:emailAddress ?address .
            ?s nmo:messageSubject ?subject .
            ?s a nmo:Email
        }
    }

inbox | direct | Find emails, distinguish replies (and what was replied to), and CC addresses

    select distinct ?s ?subject ?r ?to ?refid ?from ?address {
        ?s nmo:from ?o .
        ?s nmo:messageId ?sid .
        ?o nco:fullname ?from .
        ?o nco:hasEmailAddress ?e .
        ?e nco:emailAddress ?address .
        ?s nmo:messageSubject ?subject .
        ?s a nmo:Email
        optional {
            ?s nmo:inReplyTo ?r .
            ?r nmo:messageId ?mid .
        }
        optional {
            ?s nmo:to ?toid .
            ?toid nco:fullname ?to .
        }
        optional {
            ?s nmo:cc ?ccid .
            optional {
                ?ccid nco:fullname ?ccto .
            }
        }
        optional {
            ?s nmo:references ?refid .
        }
    }
    order by ?subject

Note: nco:fullname is optional, as you may not know the email addressee’s name.
Note: as with the basic query above, when using a Jena Dataset you need a graph selector in the where clause, eg

    select * { graph ?g {?s ?p ?o}}

inbox | direct | most messages direct to you

    select (count(?from) as ?count) ?from ?address {
        graph ?g {
            ?s nmo:from ?o .
            ?o nco:fullname ?from .
            ?o nco:hasEmailAddress ?e .
            ?e nco:emailAddress ?address .
            ?s a nmo:Email
        }
    }
    group by ?from ?address
    order by desc(?count)

inbox | direct | most messages CC to you | Not so easy: where you are a CC recipient, it’s not possible to match on the to: field, or with any metadata on the IMAP server.
inbox | direct | fastest replies
inbox | direct | most replies
inbox | | contacts and counts by mail domain
inbox | indirect | messages to others on CC list (may not be known to you, but sender knows)
outbox | direct | recipients (to, cc, bcc)
outbox | direct | replies
outbox | direct | most replied to
outbox | direct | most sent to
outbox | direct | fastest replied to (by message, by recipient)
outbox | direct | fastest sent to (by message, by recipient)

Things get more interesting when more than one mailbox is available for analysis… but I’m going to need Sesame3, or revert to Jena, because Sesame2 doesn’t do aggregate functions like count. 2 steps forward, 1 step back. Jena support in Aperture is minimal and old: it cannot make use of graphs, TDB or SDB (though the libraries are up to date), and it doesn’t support Datasets or Named Graphs in Jena. So I’m adding ModelSet (the RDF2Go adapter type needed), Dataset and Named Graph support, in TDB to begin with. This involves updating the Aperture Jena adapter. There doesn’t seem to be any activity on the Aperture mailing list though, as I got zero response to a question about updating the Jena support. Is Aperture another nice-but-dead Semantic Web technology?
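While aggregates are unavailable, one possible workaround is to run the query without count and group by, returning one row per message, and do the counting client-side. A minimal sketch, assuming the rows have already been fetched as (from, address) tuples; the rows below are invented sample data, and fetching them from the store is not shown.

```python
from collections import Counter

# Invented sample rows, one per email, as (sender name, sender address).
rows = [
    ("Alice", "alice@example.org"),
    ("Bob", "bob@example.org"),
    ("Alice", "alice@example.org"),
    ("Alice", "alice@example.org"),
    ("Bob", "bob@example.org"),
    ("Carol", "carol@example.org"),
]

def count_by_sender(rows):
    """Client-side stand-in for: group by ?from ?address, order by desc(?count)."""
    return Counter(rows).most_common()

for (name, address), count in count_by_sender(rows):
    print(count, name, address)
```

The query feeding this must not use distinct on the sender columns alone, otherwise repeated senders collapse to a single row before they can be counted.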