---
author: einar
comments: true
date: 2011-06-29 19:27:42+00:00
layout: page
slug: pykde4-queries-with-nepomuk
title: 'PyKDE4: Queries with Nepomuk'
wordpress_id: 924
categories:
- KDE
- Linux
tags:
- KDE
- Linux
- pykde
- python
---
In one of my previous blog posts I dealt with [tagging files and resources with Nepomuk]({{ site.url }}/2010/10/pykde4-tag-and-annotate-files-using-nepomuk). But Nepomuk is not only about storing metadata: it is also about _retrieving_ and _interrogating_ data. Normally, this would mean querying the metadata database directly, using queries written in SPARQL. But this is not intuitive, can be inefficient (if you do things the wrong way) and is error-prone (oops, I messed up a parameter!).
Fortunately, the Nepomuk developers have come up with a high-level API to query already stored metadata, and today's post will deal with querying tags in Nepomuk. As with the past tutorials, the full source code is available [in the kdeexamples module](https://projects.kde.org/projects/kde/kdeexamples/repository/revisions/master/changes/bindings/python/nepomuk/nepomuk_tag_query_example.py).
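To give an idea of what the high-level API spares us, here is a rough sketch of assembling such a SPARQL query by hand in plain Python. The helper name `build_tag_sparql` is made up for this post, and matching the tag through `nao:prefLabel` is an assumption about how the tag is stored, so take it as an illustration rather than the query Nepomuk actually generates:

```python
# Namespace of the Nepomuk Annotation Ontology (NAO).
NAO = "http://www.semanticdesktop.org/ontologies/2007/08/15/nao#"


def build_tag_sparql(tag_label):
    """Return a SPARQL query selecting resources tagged with tag_label.

    Hypothetical helper: assumes the tag is matched by its nao:prefLabel.
    """
    # Escape backslashes and double quotes so the label cannot
    # terminate the string literal early.
    escaped = tag_label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX nao: <{ns}>\n"
        "SELECT DISTINCT ?r WHERE {{\n"
        "  ?r nao:hasTag ?tag .\n"
        "  ?tag nao:prefLabel \"{label}\" .\n"
        "}}"
    ).format(ns=NAO, label=escaped)


if __name__ == "__main__":
    print(build_tag_sparql("holiday"))
```

Even in this tiny case you have to remember the namespace, the escaping and the brace syntax yourself, which is exactly the kind of detail that is easy to get wrong.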
Let's start off with the basic imports:
{% highlight python %}
import sys
import PyQt4.QtCore as QtCore
import PyKDE4.kdecore as kdecore
import PyKDE4.kdeui as kdeui
from PyKDE4.kio import KIO
from PyKDE4.nepomuk import Nepomuk
from PyKDE4.soprano import Soprano
{% endhighlight %}
Then let's create a simple class that will be used for the rest of this exercise:
{% highlight python %}
class NepomukTagQueryExample(QtCore.QObject):

    def __init__(self, parent=None):
        super(NepomukTagQueryExample, self).__init__(parent)
{% endhighlight %}
`__init__` is just used to construct the instance, nothing more. The bulk of the work is in the query_tag() method, which we'll take a look at in parts.
{% highlight python %}
    def query_tag(self, tag):
        """Query for a specific tag."""
        tag = Nepomuk.Tag(tag)
{% endhighlight %}
First of all we convert the tag we want to query into a proper Nepomuk.Tag() instance. Of course we should use an already existing tag: although Nepomuk.Tag() automatically creates new tags, it makes little sense to query for a newly created one, doesn't it?
For our job, we need to use _properties_ which define the terms of our query. As we're looking for tags, we'll use Soprano.Vocabulary.NAO.hasTag():
{% highlight python %}
        soprano_term_uri = Soprano.Vocabulary.NAO.hasTag()
        nepomuk_property = Nepomuk.Types.Property(soprano_term_uri)
{% endhighlight %}
The first call generates a URI pointing to a specific RDF resource for this specific term, which is then wrapped as a Nepomuk.Types.Property in the second call. While the C++ API docs don't show this, I found it to be necessary, or the Python interpreter would raise a TypeError. Notice that this is not the only term we can use: aside from tags, there are a lot of other URIs we can use for querying, [listed in the Soprano API docs](http://api.kde.org/kdesupport-api/kdesupport-apidocs/soprano/html/namespaceSoprano_1_1Vocabulary_1_1NAO.html).
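If you are wondering what such a URI looks like, it is simply the NAO namespace followed by the term name. The snippet below is a plain-Python illustration, without Soprano: `nao_term` is a made-up helper, and the term names are ones I expect to find in the NAO vocabulary page linked above.

```python
# Namespace of the Nepomuk Annotation Ontology (NAO).
NAO_NAMESPACE = "http://www.semanticdesktop.org/ontologies/2007/08/15/nao#"


def nao_term(name):
    """Build the full URI string for a NAO term (hypothetical helper)."""
    return NAO_NAMESPACE + name


if __name__ == "__main__":
    # hasTag is the term used in this post; the others are listed
    # only to show that they share the same namespace.
    for name in ("hasTag", "prefLabel", "numericRating"):
        print(nao_term(name))
```

In the real code you would keep using the Soprano.Vocabulary.NAO functions, which return proper URL objects rather than strings.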
Once we have our property set up, it's time to define which kind of query we're going to use. In this case, since we want to check for the presence of tags, we use a Nepomuk.Query.ComparisonTerm, which is a query term used to match values of specific properties (in our case, tags):
{% highlight python %}
        comparison_term = Nepomuk.Query.ComparisonTerm(
            nepomuk_property, Nepomuk.Query.ResourceTerm(tag))
{% endhighlight %}
Our tag is wrapped in a ResourceTerm, which exists exactly for this purpose. Now we make the proper query: in this specific case, we want to look up tagged _files_, so we use a FileQuery. We could also get other items, such as mails (in Akonadi): in that case we could use a Nepomuk.Query.Query():
{% highlight python %}
        query = Nepomuk.Query.FileQuery(comparison_term)
{% endhighlight %}
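It may help to picture what we have built so far as a small tree: a query holding a comparison term, which in turn holds a resource term. The following is only a model of that nesting in plain Python. The three namedtuples are stand-ins I made up that borrow the Nepomuk class names, not the real classes, and the strings replace the actual property and tag objects.

```python
from collections import namedtuple

# Stand-in types that only mirror the *shape* of the Nepomuk query terms.
ResourceTerm = namedtuple("ResourceTerm", ["resource"])
ComparisonTerm = namedtuple("ComparisonTerm", ["prop", "subterm"])
FileQuery = namedtuple("FileQuery", ["term"])


def describe(node, depth=0):
    """Return an indented description of the tree, one node per line."""
    pad = "  " * depth
    if isinstance(node, FileQuery):
        return pad + "FileQuery\n" + describe(node.term, depth + 1)
    if isinstance(node, ComparisonTerm):
        return (pad + "ComparisonTerm(" + node.prop + ")\n"
                + describe(node.subterm, depth + 1))
    if isinstance(node, ResourceTerm):
        return pad + "ResourceTerm(" + node.resource + ")"
    raise TypeError("unknown node type: %r" % (node,))


# Same nesting as in query_tag(): query -> comparison -> resource.
query = FileQuery(ComparisonTerm("nao:hasTag", ResourceTerm("holiday")))

if __name__ == "__main__":
    print(describe(query))
```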
Lastly, we want to get some _results_ out of this query. There are different methods, but for this tutorial we'll use the tried-and-tested KIO technology:
{% highlight python %}
        search_url = query.toSearchUrl()
        search_job = KIO.listDir(kdecore.KUrl(search_url))
        search_job.entries.connect(self.search_slot)
        search_job.result.connect(search_job.entries.disconnect)
{% endhighlight %}
First we convert the query to a nepomuksearch:// URL, which we then pass to KIO.listDir to list the entries. Unlike the job in [my previous post on KIO]({{ site.url }}/2011/01/pykde4-retrieve-data-using-kio), this one emits entries() every time a batch of results is found, so we connect that signal to our search_slot method. We also connect the job's result() signal so that the entries() connection is dropped once the job is over.
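If the signal juggling looks confusing, this is the pattern in miniature. Everything here is a stand-in written for this post: `Signal` and `FakeListJob` are toy classes, not PyQt or KIO, and the file names are invented. The only point is to show entries arriving in several batches (some possibly empty) and the connection being dropped when the job reports its result.

```python
class Signal(object):
    """Tiny stand-in for a Qt signal (not the PyQt implementation)."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def disconnect(self):
        # Drop every connected slot.
        del self._slots[:]

    def emit(self, *args):
        # Iterate over a copy so slots may disconnect while we emit.
        for slot in list(self._slots):
            slot(*args)


class FakeListJob(object):
    """Toy job: emits entries once per batch, then result at the end."""

    def __init__(self, batches):
        self.entries = Signal()
        self.result = Signal()
        self._batches = batches

    def start(self):
        for batch in self._batches:
            self.entries.emit(self, batch)
        self.result.emit(self)


received = []


def search_slot(job, data):
    # Empty batches are possible, so skip those.
    if not data:
        return
    received.extend(data)


job = FakeListJob([["a.txt", "b.txt"], [], ["c.txt"]])
job.entries.connect(search_slot)
job.result.connect(lambda finished: finished.entries.disconnect())
job.start()

if __name__ == "__main__":
    print(received)
```

Unlike this toy, the real KIO job runs asynchronously inside the Qt event loop, so the slot is called whenever results become available rather than from a plain loop.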
Finally, let's take a look at the search_slot function:
{% highlight python %}
    def search_slot(self, job, data):
        # We may get invalid entries, so skip those
        if not data:
            return
        for item in data:
            # Print the display name (here, the file name) of each entry
            print(item.stringValue(KIO.UDSEntry.UDS_DISPLAY_NAME))
{% endhighlight %}
Entries are emitted as [UDSEntries](http://api.kde.org/4.x-api/kdelibs-apidocs/kio/html/classKIO_1_1UDSEntry.html): to get something at least understandable, we turn them into the file name, which is obtained by the stringValue() call using KIO.UDSEntry.UDS_DISPLAY_NAME.
That's it. As you can see, it was pretty easy. Of course there's more to it than that. For further reading, take a look at [Nepomuk's Query API docs](http://api.kde.org/4.x-api/kdelibs-apidocs/nepomuk/html/namespaceNepomuk_1_1Query.html) and the [Query Examples](http://api.kde.org/4.x-api/kdelibs-apidocs/nepomuk/html/examples.html#examples_query). Bear in mind, however, that to the best of my knowledge the "fancy operators" mentioned there will not work with Python.
Happy Nepomuk querying!