My Akademy talk proposal was not accepted, but the organizers were kind enough to offer me the chance to hold a BoF on the same subject. I bet you’re wondering what I’m going to discuss, and I think the title already gives you an idea:
KDE and bioinformatics: the missing link
Although in the KDE community we have our fair share of scientists (hey there, Stuart!), my BoF will focus on the adoption of KDE in the field of bioinformatics (my day job, not-so-by-chance) on the "outsiders" front and how to improve the current situation. To elaborate further, bioinformatics is a rather broad field where biological data are treated with computational methods. The oldest and most famous branch of bioinformatics is sequence analysis and related fields, where sequences of DNA are analyzed, for example, to find common ancestors among several species, or to reconstruct the genome of an organism by comparing it to a related species. Another, more recent, example is related to high-throughput technologies, which produce huge amounts of data from a very small number of experiments ("ultramassive sequencing" and DNA microarrays are examples of such technologies).
Either way, bioinformaticians have to deal with large amounts of data all the time, and usually there’s no "shrink-wrap" solution to the problems they have to face, software-wise. That’s because we do research, so we need to find something new. The solution is often to write algorithms, or to re-implement existing ones in a form that is suited to the task at hand. So bioinformaticians also write software, although they’re (usually) by no means professional coders: some have a mathematical or statistical background, others (like me) come from the lab bench. What kind of programs do bioinformaticians write? Normally scripts and small stuff, but in certain cases even full-blown algorithms and applications. Some become so famous that they even turn into trend-setters.
Which brings us to the heart of the matter: how does KDE stand in all of this? Sadly, not too well. I’ve done some research in the published literature, but there’s just one relevant hit: a KDE application for neuroscience (based on the 3.5.x Development Platform) published in 2008. I know that big research places like CERN use KDE, but to my knowledge smaller outfits such as research groups code, in the majority of cases, for Windows or for web-based solutions. Given that at least a significant portion of bioinformaticians use UNIX-like operating systems, the question we need to answer is: why?
The first and foremost problem is related to market share. Research groups don’t even know that KDE exists, so it’s unlikely they will develop something using the Development Platform (even now that it’s becoming more cross-platform). This is where some promo efforts could help. Secondly, the problem lies in the "difficulty" (notice the quotes!) of developing with the KDE Development Platform: most bioinformaticians, as I wrote, are not professional coders, and few of them know C++. The most used languages in bioinformatics are Perl and Java (with some Python and Ruby thrown into the mix). Thus, the need for proper bindings. The bindings are there, thanks to the excellent work of the kde-bindings team, but documentation is still lacking (mainly in the examples department, but also in tutorials and getting-started guides that aren’t aimed at C++). Some documentation is auto-generated, and while the KDE API docs are usually not too hard to read, they can still scare off newcomers. Of course this is not the fault of the kde-bindings team: they simply need more help.
Promo efforts and better bindings are the keys to spreading KDE further in the field of bioinformatics. This is what my BoF is about, plus an informal discussion on the use of FOSS in academia and related matters.
Interested? If you are, you can come to the BoF, which will be on Tuesday, 6th July at 15.00 in Area 2 of the main room at Demola.
I’ll also be around later, until the following morning (sadly, two days is the best I can do to attend), in case you’re interested in a chat.

Funny. I never knew that to become a professional coder you need to know C++. I think you will find that a lot of “professional” coding is done in languages other than C or C++. So, don’t feel ashamed when you code in another language :-)
Apart from that, I think that it would really help if the BoF could focus on answering a few questions: what problem will KDE help solve for bioinformatics researchers? Why would it be better than other solutions that are already used? How much effort would it be to switch? Without answers to these questions it will be very hard to get people to switch.
Well, I did not mean that pros only use C++. ;) The real explanation is a bit more complex, but the entry was already getting too long…
As for your other questions, they are very valid points. The answer to some of these lies in the (very complex!) interaction between the bioinformatician and the biologist, a love-hate relationship.
Also, the use of an established framework helps to solve a lot of problems related to software in research. You don’t get funds for maintaining software, just for creating it, and established frameworks limit the effort. Secondly, there’s a massive NIH syndrome in bioinformatics, unless you’re truly a trend-setter. KDE already offers a lot as a development platform, and would reduce such problems (also, excellent applications like rkward or Cantor show how good KDE can be for science…).
Anyway, you gave me good food for thought to explore this even further. Thanks.
I do not think that KDE will assist bioinformatics research. The need to use assembler language (a second-generation computer language) can be daunting for non-programmers. That same need would divert effort from the sole purpose of research in bioinformatics. I do not know of any computing machine that can write algorithms, but for data, I suggest mainframe computers because they surpass the volume of personal computing machines.
This can’t really work for small research groups, which do not have the resources for such an infrastructure. Plus, you don’t need huge computing power for everything. Lastly, the use of a platform like KDE may help make some of the routine tasks more easily available.
I run a proteomics facility and use a fair amount of bioinformatics tools. They tend to come in two forms: Windows-based apps, which are commercial, often very buggy, and more focused on small facilities; and open source tools, which are more suitable, generally command-line driven, and have the huge advantage that they can be supported in a computing cluster.
I’m not sure what role KDE could play in all this. Most work on developing front ends to make open source tools easier to use is targeting web-based apps, so KDE would not play a role here.
@Edwin: The vast majority of web-based applications are non-Free, and this causes huge problems for their maintainability, especially when the services shut down for lack of funding (this has happened to me twice).
A FOSS, desktop-based application is certainly better because it does not force you to depend on someone else’s systems, and does not force you to have a network connection. There are good examples of applications like this, such as Cytoscape.
Also, web-based front ends don’t work that well if host != localhost when you deal with data on the scale of hundreds of microarray files or high-throughput proteomics experiments. I would certainly not use HTTP for experiments of the size I’m working on at the moment, for example.
(gosh, it appears I have a follower. I’m honoured)
I do agree that mainframes are not the right solution for everything. I know of a university here in .nl (Wageningen UR) where there are several clusters running. All of these have been tuned to fix a different problem: some of the programs are CPU bound, others are memory bound, others are disk bound, so they all need a different solution or else you will be wasting time.
But, that’s a bit of a side issue. Another thing I was thinking about is that it would be good to investigate what is currently a big problem in bioinformatics and how we could remedy that. For example, what is the biggest user interface element or class they could benefit from? I wouldn’t be surprised if it would be some component which can visualize HDF data, or which could visualize things like molecules in an easy way. With something like that you have a real selling point.
Plotting and visualization widgets would be best, as they’re essential when you display results to a biologist (who is the one that will ultimately decide whether the results are meaningful or not). For my own field, good widgets would be ones that display box-plots or heat-maps, as they’re widely used.
For chemistry and the like, Avogadro is already a great start (and it’s Qt-based, even!).
Looking forward to discussing KDE and science* at Akademy
* in the general sense – what I know about bioinformatics is probably too small to be detected by the most advanced scientific techniques :-)