<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>dennogumi.org &#187; Science</title>
	<atom:link href="http://www.dennogumi.org/category/science/feed" rel="self" type="application/rss+xml" />
	<link>http://www.dennogumi.org</link>
	<description>On the web since 1999</description>
	<lastBuildDate>Sat, 06 Mar 2010 09:19:00 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>DataMatrix 0.8 is finally out</title>
		<link>http://www.dennogumi.org/2009/06/datamatrix-08-is-finally-out</link>
		<comments>http://www.dennogumi.org/2009/06/datamatrix-08-is-finally-out#comments</comments>
		<pubDate>Sat, 13 Jun 2009 13:29:40 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[datamatrix]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/2009/06/datamatrix-08-is-finally-out</guid>
		<description><![CDATA[At last, after months of inactivity, I pushed out a new release of DataMatrix. Although the version bump is small (0.8) there are a lot of changes since the last release. The most notable include:


Ability to apply functions to elements of the matrix
Ability to filter rows by column contents
Ability to transpose rows with columns
An option to [...]]]></description>
			<content:encoded><![CDATA[<p>At last, after months of inactivity, I pushed out a new release of <a href="http://www.dennogumi.org/projects/datamatrix" title="DataMatrix"><em>DataMatrix</em></a>. Although the version bump is small (0.8) there are a lot of changes since the last release. The most notable include:</p>
<ul>
<li>Ability to apply functions to elements of the matrix</li>
<li>Ability to filter rows by column contents</li>
<li>Ability to transpose rows with columns</li>
<li>An option to load text files produced by R (which are, by design, broken)</li>
<li>Removed the getter for columns, using dictionary-like syntax directly</li>
<li>A lot of bug fixes</li>
</ul>
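<p>To give a rough idea of what the first three items mean in practice, here is a minimal sketch in plain Python. It deliberately does <em>not</em> use DataMatrix's own API: the sample rows and the helper names (<code>apply_to_column</code>, <code>filter_rows</code>, <code>transpose</code>) are made up for illustration only.</p>

```python
# Illustrative only: plain Python lists and dicts, NOT DataMatrix's API.
# The sample rows and all helper names are hypothetical.
rows = [
    {"gene": "AKT3", "score": "2.5"},
    {"gene": "TP53", "score": "0.4"},
    {"gene": "MYC", "score": "3.1"},
]


def apply_to_column(table, column, func):
    """Return a copy of the table with func applied to one column."""
    return [{**row, column: func(row[column])} for row in table]


def filter_rows(table, column, predicate):
    """Keep only the rows whose value in `column` satisfies predicate."""
    return [row for row in table if predicate(row[column])]


def transpose(table):
    """Swap rows with columns (assumes a non-empty table)."""
    return {name: [row[name] for row in table] for name in table[0]}


numeric = apply_to_column(rows, "score", float)
high = filter_rows(numeric, "score", lambda value: value > 1.0)
by_column = transpose(high)
print(by_column)
```

<p>The real module wraps this kind of operation behind its dictionary-like syntax; see the documentation linked below for the actual method names.</p>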
<p>The download links on <a href="http://www.dennogumi.org/projects/datamatrix" title="Project page">the project page</a> have been updated, along with <a href="http://www.dennogumi.org/doc/datamatrix/" title="Documentation">the documentation</a>. There is one more change: from now on the official Git repository <a href="http://gitorious.org/datamatrix/datamatrix" title="Web interface on gitorious.org">is hosted on gitorious.org</a> and no longer on github, because gitorious (the software) is free, while github.com&#8217;s is not. It&#8217;s mainly a philosophical issue (the same one that prompted me to switch from twitter to identi.ca).</p>
<p>From today <em>DataMatrix</em> is also officially hosted on the <a href="http://pypi.python.org/pypi/datamatrix/0.8" title="Page on PyPI">Python Package Index</a> (under the name &#8220;datamatrix&#8221;), meaning that you can use easy_install to quickly install it.</p>
<p>If you use this module, let me know what you think (including any bugs you find).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2009/06/datamatrix-08-is-finally-out/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Gene search applet: suggestions and code review needed</title>
		<link>http://www.dennogumi.org/2009/03/gene-search-applet-suggestions-and-code-review-needed</link>
		<comments>http://www.dennogumi.org/2009/03/gene-search-applet-suggestions-and-code-review-needed#comments</comments>
		<pubDate>Tue, 31 Mar 2009 17:33:09 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[KDE]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[bioinformatics]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/?p=594</guid>
		<description><![CDATA[For the past few months I&#8217;ve wanted to write a small Plasma applet to aid me in some boring tasks as a bioinformatician. One example (for the non-scientific crowd out there) is when my analysis work turns up a specific gene that I want to take a look at. I am often lazy, [...]]]></description>
			<content:encoded><![CDATA[<p>For the past few months I&#8217;ve wanted to write a small Plasma applet to aid me in some boring tasks as a bioinformatician. One example (for the non-scientific crowd out there) is when my analysis work turns up a specific gene that I want to take a look at. I am often lazy, so instead of firing up the browser to look at the online resources, I wanted to write something which could access said resources programmatically.</p>
<p><span id="more-594"></span></p>
<p>I found a way thanks to the <a href="http://biopython.org" title="The Biopython project">Biopython project</a>, which offers a Python module to access the resources of the <a href="http://www.ncbi.nlm.nih.gov" title="NCBI">National Center for Biotechnology Information (NCBI)</a> by providing an interface to their <a href="http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html" title="EUtils web page">EUtils</a>. Since the back-end was already (almost) taken care of, I set out to write a small Plasma applet, which is what I&#8217;m presenting today. It&#8217;s written in Python, and uses the Python ScriptEngine to work. Currently, it searches the &#8220;Gene&#8221; database at NCBI by &#8220;Entrez Gene ID&#8221; (a numerical ID that uniquely identifies a gene record) and returns the name, official symbol, organism, and a description if one is present. It does not support anything else (see below).</p>
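<p>For readers curious about what such a lookup boils down to, here is a minimal sketch that uses only the Python standard library instead of Biopython. It is not the applet's code: the helper names are made up, and it assumes the ESummary endpoint of EUtils and its JSON output mode. Only the URL is built when the script runs; <code>fetch_summary</code> is the part that would need network access.</p>

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for document summaries (assumed here).
EUTILS_ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def build_summary_url(gene_id):
    """Build an ESummary URL for one Entrez Gene ID (hypothetical helper)."""
    query = urlencode({"db": "gene", "id": str(gene_id), "retmode": "json"})
    return f"{EUTILS_ESUMMARY}?{query}"


def fetch_summary(gene_id):
    """Download and decode the summary; this call needs network access."""
    with urlopen(build_summary_url(gene_id)) as response:
        return json.load(response)


# Only build the URL here, so the sketch runs offline.
print(build_summary_url(10000))
```

<p>Biopython hides this plumbing and also parses the answer, which is why the applet relies on it rather than on raw HTTP requests.</p>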
<p>The code lives in <a href="http://github.com/cswegger/plasma-genesearch/tree/master" title="Code repository">a git repository at github</a>. <strong>WARNING: </strong>The code may be a complete mess (I&#8217;m not too well versed in GUI stuff, I mostly do text file manipulation). If you are so daring, you can obtain and install it in a very simple manner:</p>
<p>
<pre class="brush: bash;">git clone git://github.com/cswegger/plasma-genesearch.git
cd plasma-genesearch
zip -r ../plasma-genesearch.plasmoid *
plasmapkg -i ../plasma-genesearch.plasmoid</pre>
</p>
<p>After that you will see an &#8220;Entrez Gene Searcher&#8221; in your add applets dialog. Once added, it&#8217;ll look like this:</p>
<p align="center"><img src="http://www.dennogumi.org/wp-content/uploads/2009/03/plasma-genesearch1.png" title="Gene searcher" alt="Gene searcher image" /></p>
<p align="left">Pretty horrible, isn&#8217;t it? Well, once you get past that, you can input an ID (only IDs will work for now) in the text field (which doesn&#8217;t clear the text: see further on) and push &#8220;Go!&#8221;. The following is an example with ID 10000, which corresponds to the human gene <em>AKT3</em>:</p>
<p align="center"><img src="http://www.dennogumi.org/wp-content/uploads/2009/03/plasma-genesearch2.png" title="Gene search results" alt="Gene search results image" /></p>
<p align="left">&#8220;Search again&#8221; will bring you back to the search form.</p>
<p align="left">Now, what does this have to do with Planet KDE? Well, I&#8217;m asking the community for some code review, if possible, and for suggestions to improve the horrid default look. I am especially interested in layouts, since I do not quite understand how they work: I mean, my layout code should not work and yet it <em>does&#8230;.</em> </p>
<p align="left">Other things that need to be improved are:</p>
<ul>
<li align="left">The Plasma.TextEdit is not cleared upon clicking. Is there a signal I can catch for that, so I can connect it to clear()?</li>
<li align="left">Proper searching. Bio.Entrez already does this: what I need is a way to display the records properly.</li>
<li align="left">A way to link the names to URLs, and have them open in Konqueror. </li>
</ul>
<p>That should be it. I hope to work on it some more next weekend&#8230;.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2009/03/gene-search-applet-suggestions-and-code-review-needed/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Moving on</title>
		<link>http://www.dennogumi.org/2009/02/moving-on</link>
		<comments>http://www.dennogumi.org/2009/02/moving-on#comments</comments>
		<pubDate>Fri, 27 Feb 2009 16:30:57 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/?p=569</guid>
		<description><![CDATA[Some say that all good things must come to an end. I&#8217;m not entirely sure that this is a universal truth, but I can say that at some point in life there are decisions that need to be taken.
In this case I made my own: today was the last day in Dr. Cristina Battaglia&#8217;s laboratory, a [...]]]></description>
			<content:encoded><![CDATA[<p>Some say that all good things must come to an end. I&#8217;m not entirely sure that this is a universal truth, but I can say that at some point in life there are decisions that need to be taken.</p>
<p>In this case I made my own: today was the last day in <a href="http://www.centro-cisi.com/microarray.htm">Dr. Cristina Battaglia&#8217;s laboratory</a>, a place where I spent my three-year Ph.D. course and one year as a post-doc research fellow.</p>
<p>Those four years were not bad at all. They were interesting, and provided a good learning experience. I think I owe quite a bit to that place, especially because I was able to learn and improve my skills alongside the analysis and research work. So my thanks go to my former supervisor (Dr. Cristina Battaglia) and all my colleagues. It&#8217;s been a fun ride.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2009/02/moving-on/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Science and KDE: kile</title>
		<link>http://www.dennogumi.org/2009/02/science-and-kde-kile</link>
		<comments>http://www.dennogumi.org/2009/02/science-and-kde-kile#comments</comments>
		<pubDate>Sun, 22 Feb 2009 20:49:20 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[KDE]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[latex]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/?p=551</guid>
		<description><![CDATA[During the course of my research work, I may obtain results that are worthy of publication in scientific journals. Since my master&#8217;s thesis I&#8217;ve been using LaTeX as my writing platform, mainly because I can concentrate on content rather than presentation (I also find it useful for writing non-scientific stuff). Also, I can [...]]]></description>
			<content:encoded><![CDATA[<p>During the course of my research work, I may obtain results that are worthy of publication in scientific journals. Since my master&#8217;s thesis I&#8217;ve been using <a title="LaTeX web page" href="http://latex-project.org">LaTeX</a> as my writing platform, mainly because I can concentrate on content rather than presentation (I also find it useful for writing non-scientific stuff). Also, I can handle bibliographies (essential for a scientific publication) very well without using expensive proprietary applications (such as Endnote).</p>
<p>In my early days I used kLyX first, then <a title="LyX" href="http://www.lyx.org">LyX</a>, but I found the platform to be too limited for my tastes, and also LaTeX errors were difficult to diagnose. I needed a proper editor, and that&#8217;s when I heard of <a title="Kile's web page" href="http://kile.sourceforge.net">kile, a KDE front-end for LaTeX</a>. Kile is currently at version 2.0.2 and is a KDE 3 application. However, in KDE SVN work is ongoing to produce a KDE4 version (2.1) and that&#8217;s what I&#8217;ll look at in this entry.</p>
<p><span id="more-551"></span></p>
<p><strong>Obtaining kile 2.1</strong></p>
<p>First and foremost, a disclaimer. kile 2.1 has not been released yet in any form, and so should be considered unstable and crash-prone. That said, it runs more or less well on my platform.</p>
<p>The first thing to do is to grab the sources from SVN:</p>
<p><code>svn checkout svn://anonsvn.kde.org/home/kde/trunk/extragear/office/kile</code></p>
<p>That will put kile&#8217;s sources in a directory called &#8220;kile&#8221;. The next step is to compile it (as usual, you need KDE4 development packages/files installed):</p>
<p><code>cd kile<br />
mkdir build; cd build<br />
cmake -DCMAKE_INSTALL_PREFIX=`kde4-config --prefix` ../<br />
make</code></p>
<p>Followed by the usual <code>make install</code> as root or using <code>sudo</code>.</p>
<p><strong>kile 2.1 at a glance</strong></p>
<p>This is how kile looks when loaded on my system:</p>
<p style="text-align: center;"><a class="shutterset_" title="Kile at startup" href="http://www.dennogumi.org/wp-content/gallery/screenshots/kile1.png"><img class="ngg-singlepic ngg-none" src="http://www.dennogumi.org/wp-content/gallery/screenshots/thumbs/thumbs_kile1.png" alt="kile1.png" /></a></p>
<p style="text-align: left;">(For the inquisitive people, it&#8217;s not a scientific work, rather a sci-fi like book I&#8217;m writing).</p>
<p style="text-align: left;">Kile uses the katepart for editing, which means all the goodies that come with Kate can be used, including the recently-added vim input mode. Aside from editing and LaTeX syntax highlighting, kile offers configurable LaTeX command completion, as this screenshot shows:</p>
<p style="text-align: center;"><a class="shutterset_" title="Command completion" href="http://www.dennogumi.org/wp-content/gallery/screenshots/kile4.png"><img class="ngg-singlepic ngg-none" src="http://www.dennogumi.org/wp-content/gallery/screenshots/thumbs/thumbs_kile4.png" alt="kile4.png" /></a></p>
<p style="text-align: left;">From the toolbars and the menus you can insert almost every LaTeX command known to mankind. For people less familiar with LaTeX, kile offers a series of wizards to ease the creation of figures, tables and even complete documents. The one I&#8217;m showing here is the Quick Start wizard, which enables you to select document classes, add packages, and add information like author and date. As I was saying earlier, kile 2.1 is still a work in progress, and that explains why the dialog is still a little unrefined.</p>
<p style="text-align: center;"><a class="shutterset_" title="Quick start wizard" href="http://www.dennogumi.org/wp-content/gallery/screenshots/kile2.png"><img class="ngg-singlepic ngg-none" src="http://www.dennogumi.org/wp-content/gallery/screenshots/thumbs/thumbs_kile2.png" alt="kile2.png" /></a></p>
<p style="text-align: left;">As with its KDE3 counterpart, kile offers the possibility of using &#8220;projects&#8221;, which means you can collect LaTeX documents, bib files, and so on, and associate them together. You can also set a master document, so that even if you are editing other files (included in the master document), when you build your LaTeX file the compilation runs on the master document. Here too, a wizard helps in creating a project and the master document.</p>
<p style="text-align: center;"><a class="shutterset_" title="New project" href="http://www.dennogumi.org/wp-content/gallery/screenshots/kile3.png"><img class="ngg-singlepic ngg-none" src="http://www.dennogumi.org/wp-content/gallery/screenshots/thumbs/thumbs_kile3.png" alt="kile3.png" /></a></p>
<p style="text-align: left;">Lastly, kile has a plethora of other options, including customizing what you can use to build LaTeX files and view them (DVI, PS, PDF&#8230;), as shown in this screenshot.</p>
<p style="text-align: center;"><a class="shutterset_" title="Build options" href="http://www.dennogumi.org/wp-content/gallery/screenshots/kile5.png"><img class="ngg-singlepic ngg-none" src="http://www.dennogumi.org/wp-content/gallery/screenshots/thumbs/thumbs_kile5.png" alt="kile5.png" /></a></p>
<p style="text-align: left;"><strong>Conclusions</strong></p>
<p style="text-align: left;">I have merely scratched the surface of this application, which is extremely powerful and can help anyone with their LaTeX needs. While the many options may be confusing, I think that this application is already geared towards a technically-inclined userbase and so it doesn&#8217;t matter much. kile 2.1 is still unstable but extremely promising, and I&#8217;m looking forward to its release.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2009/02/science-and-kde-kile/feed</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Science and KDE: rkward</title>
		<link>http://www.dennogumi.org/2009/02/science-and-kde-rkward</link>
		<comments>http://www.dennogumi.org/2009/02/science-and-kde-rkward#comments</comments>
		<pubDate>Sat, 07 Feb 2009 18:55:53 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[KDE]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[rkward]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/?p=533</guid>
		<description><![CDATA[I try to use FOSS extensively for my scientific work. In fact, when possible, I use only FOSS tools. Among these there is the R programming language. It&#8217;s a Free implementation of the S language, and it&#8217;s mainly aimed at statistics and mathematics. As the people who read my scientific posts know, I don&#8217;t like [...]]]></description>
			<content:encoded><![CDATA[<p>I try to use FOSS extensively for my scientific work. In fact, when possible, I use <em>only</em> FOSS tools. Among these there is the R programming language. It&#8217;s a Free implementation of the S language (the one also implemented by the commercial S-PLUS), and it&#8217;s mainly aimed at statistics and mathematics. As the people who read my scientific posts know, I don&#8217;t like R much. But sometimes it&#8217;s the only alternative.</p>
<p>Well, what does R have to do with KDE? With this post I&#8217;d like to start a series (hopefully) of articles that deal with KDE programs used for scientific purposes. In this particular entry, I&#8217;ll focus on rkward, a GUI front-end for R.<br />
<span id="more-533"></span><br />
<strong>Introduction</strong></p>
<p>Although R is a programming language, it&#8217;s mainly used in an interactive session, started from the terminal. The standard installation can be improved by the use of add-on packages, <em>libraries</em> in R-speak, which can be installed from the Internet (Comprehensive R Archive Network or CRAN) or from local files. One of the most famous third party repositories is the Bioconductor project, which hosts a lot of packages used by life scientists who do bioinformatics.</p>
<p>The Windows version of R has a GUI (Rgui) which provides extra functionality, such as package management and loading, and other goodies. Although there were plans for a GTK+ frontend for Linux, the project is (as far as I know) stuck in limbo.</p>
<p>That&#8217;s where rkward comes to the rescue. It&#8217;s a GUI front-end for R for KDE4, which aims to provide a graphical shell for many R commands and environments (especially its publication-quality plotting).</p>
<p><strong>Getting rkward</strong></p>
<p>rkward is available from <a title="rkward main page" href="http://rkward.sourceforge.net/">Sourceforge.net</a>. Unfortunately, if you use a recent (&gt;=2.8) version of R, the released version won&#8217;t compile, due to changes in R itself. In that case, you need to download the sources directly from SVN with a command like this:</p>
<pre class="brush: cpp;">

svn co https://rkward.svn.sourceforge.net/svnroot/rkward/trunk/rkward/
</pre>
<p>Either way, the sources are compiled the usual way, that is:</p>
<pre class="brush: cpp;">

cd rkward-xxx # Your rkward source dir
mkdir build; cd build
cmake  -DCMAKE_INSTALL_PREFIX=`kde4-config --prefix` ../
make
</pre>
<p>Followed by <code>make install</code> as root or using sudo, depending on your distribution.</p>
<p><strong>rkward at a glance</strong></p>
<p><strong>
<a href="http://www.dennogumi.org/wp-content/gallery/screenshots/rkward1.png" title="" class="shutterset_singlepic263" >
	<img class="ngg-singlepic ngg-center" src="http://www.dennogumi.org/wp-content/gallery/cache/263__320x240_rkward1.png" alt="rkward1.png" title="rkward1.png" />
</a>
</strong></p>
<p>This is how rkward looks when loading it up (yes, it&#8217;s in Italian because that is my own locale). You have the R console (which I brought up) and then an output window which is used to display results. There is also another tab called &#8220;mio.dataset&#8221; (my.dataset) which keeps data, in a spreadsheet-like form. This is useful when you want to create your own datasets from scratch, or if you want to inspect one you have loaded.</p>
<p>So how do you start coding? You can create a new script using the &#8220;Script File&#8221; button. That way, you can input R commands and then execute them all at once, or just the current line. If you prefer interactive work, you can use the R command line (shown in the screenshot).</p>

<a href="http://www.dennogumi.org/wp-content/gallery/screenshots/rkward2.png" title="" class="shutterset_singlepic264" >
	<img class="ngg-singlepic ngg-center" src="http://www.dennogumi.org/wp-content/gallery/cache/264__320x240_rkward2.png" alt="rkward2.png" title="rkward2.png" />
</a>

<p>You can also use rkward to import data: R provides a series of functions (like <code>read.table</code>) to load data sets (usually comma- or tab-delimited text files). rkward provides a complete GUI to those functions, which is shown in the screenshot above. Note that, in order to work, it requires PHP (the command-line version).</p>

<a href="http://www.dennogumi.org/wp-content/gallery/screenshots/rkward5.png" title="" class="shutterset_singlepic266" >
	<img class="ngg-singlepic ngg-center" src="http://www.dennogumi.org/wp-content/gallery/cache/266__320x240_rkward5.png" alt="rkward5.png" title="rkward5.png" />
</a>

<p>Ok, we have data loaded. Now we may want to do some operations: rkward provides front-ends to many of R&#8217;s statistical functions. In the screenshot, we can see the GUI for a two-variable t-test. Notice how it also shows the code, so more experienced R users can see exactly what it does.</p>
<p>As with statistics, R has powerful support for graphics, and here too rkward offers some frontends, for example for histograms, boxplots, and scatter plots. You can also plot all kinds of distributions.</p>

<a href="http://www.dennogumi.org/wp-content/gallery/screenshots/rkward3.png" title="" class="shutterset_singlepic265" >
	<img class="ngg-singlepic ngg-center" src="http://www.dennogumi.org/wp-content/gallery/cache/265__320x240_rkward3.png" alt="rkward3.png" title="rkward3.png" />
</a>

<p>Lastly, rkward can manage your R packages (R package management is akin to that of a Linux distribution), and also your package sources. You can install or upgrade packages, and select where they&#8217;ll get installed.</p>
<p><strong>Conclusions</strong></p>
<p>rkward is a nice frontend for the R programming language, which adds a GUI with the power of KDE to R. Unfortunately the program is still somewhat unstable (as a warning tells you when you run it) and its main developer currently has very little time to work on it. If you want to help, you can hop over to the <a title="rkward-devel" href="http://sourceforge.net/mailarchive/forum.php?forum_name=rkward-devel">rkward-devel mailing list</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2009/02/science-and-kde-rkward/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Published! (and it matters more)</title>
		<link>http://www.dennogumi.org/2009/01/published-and-it-matters-more</link>
		<comments>http://www.dennogumi.org/2009/01/published-and-it-matters-more#comments</comments>
		<pubDate>Tue, 06 Jan 2009 17:39:39 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[pathway analysis]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/?p=489</guid>
		<description><![CDATA[Finally I can lift the curtain of silence and tell the reason why I&#8217;ve been very busy before Christmas: it all lies in the publication of a paper, &#8220;Using Pathway Signatures as Means of Identifying Similarities among Microarray Experiments&#8221;, which is finally out on this week&#8217;s issue of PLoS ONE. It&#8217;s different from the previous [...]]]></description>
			<content:encoded><![CDATA[<p>Finally I can lift the curtain of silence and tell the reason why I&#8217;ve been very busy before Christmas: it all lies in the publication of a paper, &#8220;Using Pathway Signatures as Means of Identifying Similarities among Microarray Experiments&#8221;, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0004128">which is finally out on this week&#8217;s issue of <em>PLoS ONE</em></a>. It&#8217;s different from <a href="http://www.dennogumi.org/2008/01/phd">the previous paper I mentioned</a> (which was not my first publication, either), for two main reasons:</p>
<ul>
<li>It&#8217;s a bioinformatics paper;</li>
<li>I am <strong>first author</strong> there.</li>
</ul>
<p>The second point is very important because it is usually more difficult for a person doing bioinformatics to end up as first author of a paper, since most of what we do is &#8220;something in the middle&#8221; like data analysis. Therefore, this paper is quite important for me. Also, it deals with an interest of mine, namely the analysis of biological networks using high-throughput platforms such as microarrays. Actually I&#8217;m also interested in network <em>reconstruction</em>, but for that I need to study far more than I&#8217;m doing right now. </p>
<p>In any case, let&#8217;s hope this is the first of a (hopefully long) series!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2009/01/published-and-it-matters-more/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DataMatrix 0.7 has been released</title>
		<link>http://www.dennogumi.org/2008/12/datamatrix-07-has-been-released</link>
		<comments>http://www.dennogumi.org/2008/12/datamatrix-07-has-been-released#comments</comments>
		<pubDate>Sat, 27 Dec 2008 15:33:07 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[datamatrix]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/2008/12/datamatrix-07-has-been-released</guid>
		<description><![CDATA[Finally a new entry! I&#8217;ve been extremely busy with other things, which is why I did not have time to write more. One of the main reasons is related to an important landmark in my professional career, but I&#8217;ll write more about it after January 1st (hint: those who follow my Twitter updates may have [...]]]></description>
			<content:encoded><![CDATA[<p>Finally a new entry! I&#8217;ve been <strong>extremely</strong> busy with other things, which is why I did not have time to write more. One of the main reasons is related to an important landmark in my professional career, but I&#8217;ll write more about it after January 1st (hint: those who follow my Twitter updates may have already understood).</p>
<p>As a nice way to break the hiatus, I&#8217;m releasing a new version of DataMatrix, my implementation of R&#8217;s data.frame in Python. Although the version bump is small, there are loads of improvements. First of all, there is proper support for file-like objects, as well as support for appending and inserting both rows and columns. writeMatrix has been substantially improved and now writes files correctly, and I have added (experimental) support for a DataMatrix object that does not require files &#8211; EmptyMatrix. Also, there is now <a href="http://www.dennogumi.org/doc/datamatrix/">proper documentation</a>. Last but not least, unit tests have been added, a good way to watch out for regressions in the code.</p>
<p>Finally, this version marks the entrance of <a href="http://bioinfoblog.it">dalloliogm</a> as contributor to the code. He gave quite a number of helpful hints, especially with regards to unit tests.</p>
<p>I&#8217;m quite satisfied with how DataMatrix behaves &#8211; as a matter of fact I use it extensively on a number of internal projects.</p>
<p>You can grab DataMatrix 0.7 as a <a href="http://www.dennogumi.org/files/datamatrix-0.7.tar.gz">source package</a> or as <a href="http://www.dennogumi.org/files/datamatrix-0.7.win32.exe">a Windows installer</a>.  Comments are welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2008/12/datamatrix-07-has-been-released/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The plague of cross-database annotations</title>
		<link>http://www.dennogumi.org/2008/11/the-plague-of-cross-database-annotations</link>
		<comments>http://www.dennogumi.org/2008/11/the-plague-of-cross-database-annotations#comments</comments>
		<pubDate>Sun, 02 Nov 2008 14:15:20 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[annotation]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[microarray]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/?p=470</guid>
		<description><![CDATA[Recently I had to annotate a large (10,000+) number of genes identified by Entrez Gene IDs. My goal was to avoid &#8220;annotation files&#8221; (basically CSV files) that part of the wet lab group likes, because I wanted to stay up-to-date without having to remember to update them. So the obvious solution was to use a [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I had to annotate a large (10,000+) number of genes identified by Entrez Gene IDs. My goal was to avoid &#8220;annotation files&#8221; (basically CSV files) that part of the wet lab group likes, because I wanted to stay up-to-date without having to remember to update them. So the obvious solution was to use a service available on the web, and in an automated way. For reference, I was only trying to attach the gene symbol, gene name, chromosome and cytoband.<br />
I tried many services:</p>
<ul>
<li><strong><a href="http://genome.ucsc.edu">UCSC Genome Browser</a></strong>: it has a MySQL server but it&#8217;s rather slow and I did not want to clog it up. Using their tables and .sql files I managed to get a first shot at annotation, but about 2,000 genes were without annotation!</li>
<li><strong><a href="http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene">NCBI&#8217;s own Entrez Gene</a></strong>: This needs EUtils, and Biopython has no parser for Entrez Gene XML entries. I had to scrap the idea because I did not have time.</li>
<li><strong><a href="http://www.ensembl.org">Ensembl</a></strong>: I decided to use the <a href="http://www.biomart.org">Biomart</a> service, through Rpy. There were missing genes, and sometimes the IDs were &#8220;converted&#8221; into something else (I had no time to figure out what was happening). Also some perfectly valid genes (in Entrez Gene) were not present in Ensembl.</li>
</ul>
<p>In the end I just grabbed <a href="http://www.bioconductor.org/packages/2.3/data/annotation/html/org.Hs.eg.db.html">Bioconductor&#8217;s &#8220;org.Hs.eg.db&#8221; package</a> and used its SQLite gene database (from Entrez Gene) to annotate the list, with only 97 missing IDs (mostly genes that had changed identifiers). However, this effort revealed a problem: <em>the annotations are not consistent between databases</em>. This is a real pain when doing microarray-based analysis, because you often have a large number of genes, and a perceived lack of annotation might lead to a number of them getting discarded.</p>
<p>I thought the situation was better than this. If I annotate genes in different databases with the same ID, I expect to get identical results. I mean, it&#8217;s not like Gene or Ensembl have few resources&#8230; or am I wrong?</p>
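<p>For illustration, here is a minimal Python sketch of that kind of SQLite lookup. It does not reproduce the real package: the table and column names are assumptions loosely based on how such Entrez Gene dumps tend to be laid out, the rows are toy data in an in-memory database, and <code>annotate</code> is a hypothetical helper.</p>

```python
import sqlite3

# Toy in-memory database. The schema is an assumption for illustration;
# the tables and columns in the real org.Hs.eg.db file may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genes (_id INTEGER PRIMARY KEY, gene_id TEXT UNIQUE);
CREATE TABLE gene_info (_id INTEGER, symbol TEXT, gene_name TEXT);
CREATE TABLE cytogenetic_locations (_id INTEGER, cytogenetic_location TEXT);
INSERT INTO genes VALUES (1, '7157'), (2, '672');
INSERT INTO gene_info VALUES (1, 'TP53', 'tumor protein p53'),
                             (2, 'BRCA1', 'BRCA1 DNA repair associated');
INSERT INTO cytogenetic_locations VALUES (1, '17p13.1'), (2, '17q21.31');
""")


def annotate(conn, entrez_ids):
    """Return (annotations, missing) for a list of Entrez Gene IDs."""
    query = """
        SELECT g.gene_id, i.symbol, i.gene_name, c.cytogenetic_location
        FROM genes AS g
        LEFT JOIN gene_info AS i ON i._id = g._id
        LEFT JOIN cytogenetic_locations AS c ON c._id = g._id
        WHERE g.gene_id = ?
    """
    annotations, missing = {}, []
    for gene_id in entrez_ids:
        row = conn.execute(query, (gene_id,)).fetchone()
        if row is None:
            # Keep track of IDs the database does not know about,
            # instead of silently dropping them.
            missing.append(gene_id)
        else:
            annotations[gene_id] = row[1:]
    return annotations, missing


annotations, missing = annotate(conn, ["7157", "672", "99999999"])
```

<p>Collecting the unknown IDs in a separate list makes it easy to count them and to compare what each database fails to annotate.</p>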
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2008/11/the-plague-of-cross-database-annotations/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DataMatrix 0.5</title>
		<link>http://www.dennogumi.org/2008/09/datamatrix-05</link>
		<comments>http://www.dennogumi.org/2008/09/datamatrix-05#comments</comments>
		<pubDate>Fri, 19 Sep 2008 19:54:58 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[datamatrix]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/?p=451</guid>
		<description><![CDATA[At last, since it&#8217;s been like ages, I decided to put out a new version of DataMatrix. For those who haven&#8217;t seen my previous post, DataMatrix is a Pythonic implementation of R&#8217;s data.frame. It enables you to manipulate a text file by columns or rows, to your liking, using a dictionary-like syntax. 
In this new [...]]]></description>
			<content:encoded><![CDATA[<p>At last, since it&#8217;s been like ages, I decided to put out a new version of DataMatrix. For those who haven&#8217;t seen my previous post, DataMatrix is a Pythonic implementation of <a href="http://stat.ethz.ch/R-manual/R-patched/library/base/html/data.frame.html">R&#8217;s data.frame</a>. It enables you to manipulate a text file by columns or rows, to your liking, using a dictionary-like syntax. </p>
<p>This new version brings a few improvements, fixes for a couple of bugs (for example, saveMatrix did not really save) and the start (only a stub at the moment) of an append function to add more columns (I&#8217;ll also think about a function to add rows).</p>
<p>DataMatrix is licensed under the GNU GPL, version 2 only. You can download <a href="http://www.dennogumi.org/files/datamatrix-0.5.win32.exe">the installer</a> (Windows) or <a href="http://www.dennogumi.org/files/datamatrix-0.5.tar.gz">the source distribution</a> (Linux and other *nixes). The only requirement is Python 2.5 or later installed on your system.</p>
<p> The README currently is a stub, but you can <a href="http://www.dennogumi.org/files/datamatrix.html">browse the pydoc generated documentation</a>, which details how to instantiate and use DataMatrix objects (or <a href="http://www.dennogumi.org/2008/06/dataframes-in-python-datamatrix">you can turn to my older post</a>). </p>
<p>Also, since git is the new &#8220;cool feature of the day&#8221;, DataMatrix is hosted on GitHub, and you can grab the source with </p>
<pre class="brush: cpp;">
git clone git://github.com/cswegger/datamatrix.git
</pre>
<p>Comments and suggestions are welcome. I&#8217;ll be putting a static page on DataMatrix tomorrow, if time permits.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2008/09/datamatrix-05/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>data.frames in Python &#8211; DataMatrix</title>
		<link>http://www.dennogumi.org/2008/06/dataframes-in-python-datamatrix</link>
		<comments>http://www.dennogumi.org/2008/06/dataframes-in-python-datamatrix#comments</comments>
		<pubDate>Sun, 29 Jun 2008 08:13:55 +0000</pubDate>
		<dc:creator>Einar</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.dennogumi.org/?p=405</guid>
		<description><![CDATA[For a long time I have tried to handle text files in Python in the same way that R&#8217;s data.frame does &#8211; that is, direct access to columns and rows of a loaded text file. As I don&#8217;t like R at all, I struggled to find a Pythonic equivalent, and since I found none, I [...]]]></description>
			<content:encoded><![CDATA[<p>For a long time I have tried to handle text files in Python in the same way that R&#8217;s <a title="R Data Frames" href="http://pbil.univ-lyon1.fr/library/base/html/data.frame.html">data.frame</a> does &#8211; that is, direct access to columns and rows of a loaded text file. As I don&#8217;t like R at all, I struggled to find a Pythonic equivalent, and since I found none, I decided to eat my own food and write an implementation, which is what you&#8217;ll find below.</p>
<p><span id="more-405"></span></p>
<p>The idea is to store the values of the text file as a dictionary of columns, each of which holds a list of (row name, row value) tuples. This way you can access the columns by name (I need to see if it&#8217;s workable to also use numbers), or you can view specific rows, including all or a subset of the columns. It&#8217;s decently fast and it allows for non-sequential access, which you can&#8217;t do when reading a file (or a file-like structure).</p>
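<p>To make the layout concrete, here is a minimal sketch of such a dictionary of columns. It is not the DataMatrix code itself: <code>load_columns</code>, <code>get_row</code> and the <code>name_index</code> parameter are hypothetical names, values are kept as strings, and <code>get_row</code> relies on dictionaries preserving insertion order (Python 3.7 or later).</p>

```python
import csv
import io


def load_columns(handle, name_index=0, delimiter="\t"):
    """Read a delimited file into {column name: [(row name, value), ...]}."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    # One list per data column; the row-name column is not stored on its own.
    columns = {name: [] for i, name in enumerate(header) if i != name_index}
    for fields in reader:
        row_name = fields[name_index]
        for i, value in enumerate(fields):
            if i != name_index:
                columns[header[i]].append((row_name, value))
    return columns


def get_row(columns, index):
    """Rebuild one row (by position) from the per-column lists."""
    return [cells[index][1] for cells in columns.values()]


text = "GeneID\tGreat_Exp1\tGreat_Exp2\nGene1\t56.34\t4.56\nGene2\t2.55\t7.10\n"
columns = load_columns(io.StringIO(text))
```

<p>With this layout a whole column is a single dictionary lookup, while a row has to be reassembled by taking the same position from every column.</p>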
<p><strong>Requirements</strong></p>
<p>I have tested this on Python 2.5.1. Older versions may or may not work. All modules called by this one should be shipped with Python itself.</p>
<p><strong>Download and installation<br />
</strong></p>
<p><a title="Download" href="http://www.dennogumi.org/files/datamatrix.py">Download the py file directly</a>. Currently there is no installation mechanism, so copy it wherever Python can find it.  There&#8217;s <a title="Documentation" href="http://www.dennogumi.org/files/datamatrix.html">some API documentation</a> generated with pydoc.</p>
<p>This module is licensed under the GNU General Public License, version 2.</p>
<p><strong>Usage</strong></p>
<p>First of all, import the module</p>
<pre class="brush: python;">

import datamatrix</pre>
<p>Then open a file and instantiate a DataMatrix object</p>
<pre class="brush: python;">

fh = open(&quot;somefile.txt&quot;)
data = datamatrix.DataMatrix(fh)</pre>
<p>By default no column with row names is specified, so if you have one, you have to specify it:</p>
<pre class="brush: python;">
data = datamatrix.DataMatrix(fh, row_names=1)
</pre>
<p>More options are in the documentation.</p>
<p>Once the DataMatrix is initialized, you can list its columns, access a column by name, and view rows with the getRow method:</p>
<pre class="brush: python;">

&gt;&gt;&gt; data.columns
[&quot;GeneID&quot;,&quot;Great_Exp1&quot;,&quot;Great_Exp2&quot;]

&gt;&gt;&gt; data[&quot;Great_Exp1&quot;]
[(&quot;Gene1&quot;,56.34),
...
]

&gt;&gt;&gt; data.getRow(5)
[&quot;NOT_EXISTENT&quot;,&quot;56.545&quot;,&quot;4.56&quot;]
</pre>
<p>Sometimes you&#8217;d want to get only the column without the row identifier, and that&#8217;s where getColumn comes in:</p>
<pre class="brush: python;">

&gt;&gt;&gt; data.getColumn(&quot;Great_Exp1&quot;)
[56.34, 2.55, ...]
</pre>
<p>Should you want to save a DataMatrix instance, you can use the writeMatrix function:</p>
<pre class="brush: python;">

datamatrix.writeMatrix(data,fname=&quot;/path/to/somewhere/file.txt&quot;)
</pre>
<p>That&#8217;s all. Questions and suggestions, especially on coding and improvements, are very welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dennogumi.org/2008/06/dataframes-in-python-datamatrix/feed</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>
