PyKDE4: Queries with Nepomuk

In one of my previous blog posts I dealt with tagging files and resources with Nepomuk. But Nepomuk is not only about storing metadata, it is also about retrieving and interrogating data. Normally, this would mean querying the metadata database directly, using queries written in SPARQL. But this is not intuitive, can be inefficient (if you do things the wrong way) and error prone (oops, I messed up a parameter!). 

Fortunately, the Nepomuk developers have come up with a high level API to query already stored metadata, and today’s post will deal with querying tags in Nepomuk. As per the past tutorials, the full source code is available in the kdeexamples module.

Let’s start off with the basic imports:

import sys

import PyQt4.QtCore as QtCore

import PyKDE4.kdecore as kdecore
import PyKDE4.kdeui as kdeui
from PyKDE4.kio import KIO
from PyKDE4.nepomuk import Nepomuk
from PyKDE4.soprano import Soprano

Then let’s create a simple class that will be used for the rest of this exercise:

class NepomukTagQueryExample(QtCore.QObject):

    def __init__(self, parent=None):

        super(NepomukTagQueryExample, self).__init__(parent)

__init__ is just used to construct the instance, nothing more. The bulk of the work is in the query_tag() function, which we’ll take a look at in parts.

    def query_tag(self, tag):

        """Query for a specific tag."""

        tag = Nepomuk.Tag(tag)

First of all we convert the tag we want to query into a proper Nepomuk.Tag() instance. Of course we should use an already existing tag: although Nepomuk.Tag() automatically creates new tags, it makes little sense to query for a newly created tag, doesn’t it?

For our job, we need to use properties which define the terms of our query. As we’re looking for tags, we’ll use Soprano.Vocabulary.NAO.hasTag():

        soprano_term_uri = Soprano.Vocabulary.NAO.hasTag()
        nepomuk_property = Nepomuk.Types.Property(soprano_term_uri)

The first call generates a URI pointing to the RDF resource for this specific term, which is then wrapped as a Nepomuk.Types.Property in the second call. While the C++ API docs don’t show this, I found it to be necessary, or the Python interpreter would raise a TypeError. Notice that this is not the only term we can use: aside from tags, there are a lot of other URIs we can use for querying, listed in the Soprano API docs.

Once we have our property set up, it’s time to define which kind of query we’re going to use. In this case, since we want to check for the presence of tags, we use a Nepomuk.Query.ComparisonTerm, which is a query term used to match values of specific properties (in our case, tags):

        comparison_term = Nepomuk.Query.ComparisonTerm(nepomuk_property,
                Nepomuk.Query.ResourceTerm(tag))

Our tag is wrapped in a ResourceTerm, which exists exactly for this purpose. Now we build the actual query: in this specific case we want to look up tagged files, so we use a FileQuery. We could also get other items, such as mails (in Akonadi): in that case we could use a Nepomuk.Query.Query():

        query = Nepomuk.Query.FileQuery(comparison_term)

Lastly, we want to get some results out of this query. There are different methods, but for this tutorial we’ll use the tried-and-tested KIO technology:

        search_url = query.toSearchUrl()
        search_job = KIO.listDir(kdecore.KUrl(search_url))
        search_job.entries.connect(self.search_slot)
        search_job.result.connect(search_job.entries.disconnect)

First we convert the query to a nepomuksearch:// URL, which we then pass to KIO.listDir to list the entries. Unlike the job in my previous post on KIO, this job emits entries() every time new entries are found, so we connect the signal to our search_slot method. We also connect the job’s result() signal so that the entries() connection is dropped once the job is over.

Finally, let’s take a look at the search_slot function:

    def search_slot(self, job, data):

        # We may get invalid entries, so skip those
        if not data:
            return

        for item in data:
            print item.stringValue(KIO.UDSEntry.UDS_DISPLAY_NAME)

Entries are emitted as UDSEntries: to get something at least understandable, we turn them into the file name, which is obtained by the stringValue() call using KIO.UDSEntry.UDS_DISPLAY_NAME.
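Since entries() can fire several times for a single job, printing is not always what you want. Below is a minimal sketch of a slot that accumulates the names instead. EntryNameCollector and FakeEntry are hypothetical names made up for this illustration (they are not part of PyKDE4), and the stand-in entry only mimics the stringValue() call used above so the snippet can be tried without KDE.

```python
class EntryNameCollector(object):
    """Hypothetical helper: gather entry names over several emissions."""

    def __init__(self, name_field):
        # With real KIO this would be KIO.UDSEntry.UDS_DISPLAY_NAME
        self.name_field = name_field
        self.names = []

    def handle_entries(self, job, entries):
        # Same signature as search_slot; an empty batch adds nothing
        for entry in entries:
            self.names.append(entry.stringValue(self.name_field))


class FakeEntry(object):
    """Stand-in for a UDSEntry, only for trying the collector out."""

    def __init__(self, name):
        self._name = name

    def stringValue(self, field):
        return self._name


collector = EntryNameCollector("display-name")
collector.handle_entries(None, [FakeEntry("a.txt"), FakeEntry("b.txt")])
collector.handle_entries(None, [FakeEntry("c.txt")])
print(", ".join(collector.names))
```

In the real program you would presumably connect search_job.entries to collector.handle_entries and read collector.names once result() has been emitted.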

That’s it. As you can see, it was pretty easy. Of course there’s more than that. For further reading, take a look at Nepomuk’s Query API docs, and Query Examples. Bear in mind however that to the best of my knowledge, the “fancy operators” mentioned there will not work with Python.

Happy Nepomuk querying!

Taking video snapshots quickly: KDE VLC Snapper

Some of the oldest readers of this blog are well aware of a certain hobby of mine. Over the years I’ve always wanted to write more about it, including the stuff I’m viewing nowadays, but I found it a hassle to collect snapshots from videos / DVDs, select them, and so on.

Recently I learnt that VLC has some rather complete Python bindings, and I thought, why not make the process automated? Yesterday I had some free time on my hands and a quick session of hacking brought some results already.
As the code is somewhat past the prototype stage, I thought I would push it somewhere for others to use. Lo and behold, I present to you KDE VLC Snapper.
As you can see, it’s a minimal dialog: just select your source video file (any file supported by VLC will do), the number of screencaps, the destination directory, and the program will do the rest. Currently it works somewhat OK (see caveats below) and is good enough for my use cases.

How do I get it?

Just clone this repository:
git clone http://git.gitorious.org/kde-vlc-snapper/kde-vlc-snapper.git

followed by

sudo python setup.py install

You can then invoke the program with

kdevlcsnapper
Requirements include PyKDE4 (tested on KDE Dev Platform 4.6), numpy (just for its “linspace” function, alternatives are welcome) and VLC installed (you don’t need the bindings, however: I provide a local copy).
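On the subject of alternatives to numpy: for evenly spaced values, a few lines of plain Python may be enough. This is only a sketch of a possible replacement, not what the program currently does; the function below is my own stand-in that mirrors the basic behaviour of numpy’s linspace (endpoints included), and how the snapper actually maps the values to snapshot positions is not shown here.

```python
def linspace(start, stop, num):
    """Return `num` evenly spaced floats from start to stop, inclusive."""
    if num < 1:
        return []
    if num == 1:
        return [float(start)]
    step = (stop - start) / float(num - 1)
    values = [start + i * step for i in range(num)]
    # Pin the last value to the endpoint to avoid rounding drift
    values[-1] = float(stop)
    return values


# Five evenly spaced points between 0 and 10
print(linspace(0, 10, 5))
```

This drops the dependency at the cost of returning a plain list instead of a numpy array, which should be fine if the values are only iterated over.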
What about bugs? Well, currently there are two issues that I’m unsure how to fix: the first is a crash on exit, the second is that certain media files make VLC crash in the background when called from the bindings.
In any case, if you try it out, let me know what you think in the comments!

PyKDE4: Retrieve data using KIO

One of the greatest strengths of KDE is undoubtedly its asynchronous and network-transparent I/O access, provided by the so-called “I/O slaves” that are part of KIO. If you are developing an application that requires file or network access, the KIO classes make things incredibly simple, and they don’t freeze your GUI while an operation is in progress.

In this post I’ll show how to use KIO to retrieve files from network resources using PyKDE4. The whole example is also available in the kdeexamples module.

Our first step is to create a simple UI to show how KIO works. It will be a text edit along with two buttons to retrieve and clear items. Here’s how it looks in Designer (the ui file and its compiled Python version are available at the above link):

Image of the example form

Once this is done, we turn our attention to the code. We start with the customary imports:

#!/usr/bin/env python
import sys
import PyQt4.QtCore as QtCore
import PyQt4.QtGui as QtGui
import PyKDE4.kdecore as kdecore
import PyKDE4.kdeui as kdeui
from PyKDE4.kio import KIO

These will provide for everything we need. Then we set up our widget:


from ui_textbrowser import Ui_Form

class TextArea(QtGui.QWidget, Ui_Form):

    """Example class used to show how KIO works."""

    def __init__(self, parent=None):

        super(TextArea, self).__init__(parent)
        self.setupUi(self)

        self.downloadButton.clicked.connect(self.start_download)
        self.clearButton.clicked.connect(self.textWidget.clear)

Nothing strange in the initializer here. We simply make two connections: one from the clear button to the clear() slot of the text widget, and the other from the download button to the slot that starts the KIO process, that is, the retrieval of the index page from www.kde.org. Let’s take a look at the start_download slot:

    def start_download(self):
        kdeui.KMessageBox.information(self.parent(),
                                      "Now data will be retrieved from "
                                      "www.kde.org using KIO")

        # KIO wants KUrls
        data_url = kdecore.KUrl("http://www.kde.org")
        retrieve_job = KIO.storedGet(data_url, KIO.NoReload, KIO.HideProgressInfo)
        retrieve_job.result.connect(self.handle_download)

What do we do here? We show a KMessageBox, just for informational purposes. Once this is done, we prepare the actual KIO job. KIO wants KUrls, so first of all we wrap the URL we want to download from in one. Then we create the actual job: in this case it’s KIO.storedGet, which retrieves the data in full from our URL and stores it in a QByteArray. This is a common use case, but keep in mind that for large files this may be impractical. In such a case, we would be better off using KIO.get and connecting to its “data” signal, to get the data in chunks.
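To give an idea of the chunked approach, here is a rough sketch. ChunkCollector and start_chunked_download are hypothetical names made up for this example; the collector itself is plain Python, so it can be tried without KDE, while the hookup function assumes PyKDE4 is installed and that the job’s data signal passes the job and a QByteArray. As far as I know KIO may signal the end of the data with an empty chunk, but that is not guaranteed, so completion should still be detected through the result signal.

```python
class ChunkCollector(object):
    """Hypothetical helper: accumulate chunks from a job's data signal."""

    def __init__(self):
        self.chunks = []

    def handle_data(self, job, chunk):
        # QByteArray.data() returns the raw bytes; plain bytes are also
        # accepted so the class can be tried without KDE
        raw = chunk.data() if hasattr(chunk, "data") else bytes(chunk)
        if raw:  # skip empty chunks, such as an end-of-data marker
            self.chunks.append(raw)

    def total_size(self):
        return sum(len(piece) for piece in self.chunks)

    def content(self):
        return b"".join(self.chunks)


def start_chunked_download(url, collector):
    """Connect the collector to a KIO.get job (requires PyKDE4)."""
    import PyKDE4.kdecore as kdecore
    from PyKDE4.kio import KIO

    job = KIO.get(kdecore.KUrl(url), KIO.NoReload, KIO.HideProgressInfo)
    job.data.connect(collector.handle_data)
    return job


# Trying the collector on its own, feeding it chunks by hand
collector = ChunkCollector()
collector.handle_data(None, b"<html>")
collector.handle_data(None, b"</html>")
collector.handle_data(None, b"")
print(collector.total_size())
```

A real application would more likely write each chunk to a file as it arrives rather than keep everything in memory, which is the whole point of not using storedGet for large downloads.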

A KIO job can take several flags: here we pass KIO.HideProgressInfo to hide the progress information, so that you won’t get a notification in the Plasma notifier. For small operations, this flag should always be set. For longer downloads, hiding the progress is likely not a good idea. More information is available in the KIO namespace page (C++ version).

As a last step, we connect the result signal (emitted when the job is complete) to a slot to handle the download. This is what makes KIO useful: because it’s asynchronous, you can perform long downloads without blocking the user interface of your program.

Lastly, we see the “handle_download” slot:


    def handle_download(self, job):

        # Bail out in case of errors
        if job.error():
            return

        print "This slot has been called. The job has finished its operation."

        data = job.data()
        self.textWidget.setPlainText(QtCore.QString(data))

This slot’s signature includes a KJob instance, which is what we’ll use to get the data. In fact, using the data() function we can obtain the QByteArray containing what we have retrieved. Then, in this case we simply use setPlainText to put the downloaded data into the text edit.

What if something goes wrong? job.error() returns a non-zero error code if the job failed: in that case we can perform recovery, or simply tell our user that something went wrong. Especially with networked resources, this check should always be present in your code.
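Instead of silently bailing out, the slot could build a message for the user. The following is a small sketch of that idea: describe_failure and FakeJob are hypothetical names used only for illustration, and the stand-in job merely mimics the error() and errorString() methods of KJob so the snippet runs without KDE. With a real job, errorString() may return a QString that needs converting before it is formatted.

```python
def describe_failure(job):
    """Return an error message for a failed job, or None on success."""
    code = job.error()  # 0 means the job finished without errors
    if code:
        return "Download failed (error %d): %s" % (code, job.errorString())
    return None


class FakeJob(object):
    """Stand-in for a KJob, only for trying the function out."""

    def __init__(self, code=0, text=""):
        self._code = code
        self._text = text

    def error(self):
        return self._code

    def errorString(self):
        return self._text


print(describe_failure(FakeJob(42, "something went wrong")))
```

In handle_download, the returned message could then be shown with something like kdeui.KMessageBox.error() before returning.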

So that’s all for now. As you can see, it was pretty simple, and also very effective.