<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Echo.2 &#187; programming</title>
	<atom:link href="http://blogs.infoecho.net/echo/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.infoecho.net/echo</link>
	<description>ping and pong in the ocean of network</description>
	<lastBuildDate>Tue, 27 Mar 2012 05:49:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Can Not Resist Hacking IPython + d3.js, Another Force Layout Demo for Word Ladder Game</title>
		<link>http://blogs.infoecho.net/echo/2012/03/26/can-not-resist-hacking-ipython-d3-js-another-force-layout-demo-for-word-ladder-game/</link>
		<comments>http://blogs.infoecho.net/echo/2012/03/26/can-not-resist-hacking-ipython-d3-js-another-force-layout-demo-for-word-ladder-game/#comments</comments>
		<pubDate>Tue, 27 Mar 2012 05:45:30 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=443</guid>
		<description><![CDATA[I posted a video visualizing the neighbors of English words in the word ladder game (http://en.wikipedia.org/wiki/Word_ladder). It shows what can be done now with minimal changes to the IPython 0.13-dev source plus some simple monkey patches. I will write down where I think we can go from here later. The IPython notebook can be downloaded from here. The monkey patches [...]]]></description>
			<content:encoded><![CDATA[<p>I posted a <a href="http://dl.dropbox.com/u/69208751/ipython_d3_word_ladder.mov" title="ipython_d3_word_ladder">video</a> visualizing the neighbors of English words in the word ladder game (http://en.wikipedia.org/wiki/Word_ladder). It shows what can be done now with minimal changes to the IPython 0.13-dev source plus some simple monkey patches.</p>
<p>I will write down where I think we can go from here later. The IPython notebook can be downloaded from <a href="https://github.com/cschin/IPython-Notebook---d3.js-mashup/blob/master/Word_Ladder_network_vis.ipynb" title="Word_Ladder_network_vis.ipynb">here</a>. The monkey patches that make this work can also be downloaded from my fork of the IPython source code on GitHub.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2012/03/26/can-not-resist-hacking-ipython-d3-js-another-force-layout-demo-for-word-ladder-game/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://dl.dropbox.com/u/69208751/ipython_d3_word_ladder.mov" length="6310525" type="video/quicktime" />
		</item>
		<item>
		<title>Yet another ipython + d3.js example: motion chart</title>
		<link>http://blogs.infoecho.net/echo/2012/02/26/yet-another-ipython-d3-js-example-motion-chart/</link>
		<comments>http://blogs.infoecho.net/echo/2012/02/26/yet-another-ipython-d3-js-example-motion-chart/#comments</comments>
		<pubDate>Mon, 27 Feb 2012 01:11:41 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=413</guid>
		<description><![CDATA[I gave a lightning talk at a recent Bay Area Python meet-up. I went over some of my recent hacks combining ipython notebook and d3.js. What I wanted to show was how to mix python code and javascript code to create a dynamic programming/data analysis notebook. I created yet another example to demonstrate the [...]]]></description>
			<content:encoded><![CDATA[<p>I gave a lightning talk at <a href="http://www.meetup.com/silicon-valley-python/events/53418592/">a recent Bay Area Python meet-up</a>. I went over some of my recent hacks combining ipython notebook and d3.js. What I wanted to show was how to mix python code and javascript code to create a dynamic programming/data analysis notebook. I created yet another example to demonstrate the great potential of combining these powerful tools.</p>
<p>If you are interested, you can try this ipython <a href="https://github.com/cschin/IPython-Notebook---d3.js-mashup/blob/master/GDP_CO2_Example.ipynb" title="notebook">notebook</a>. You will need to download the development branch of ipython v0.13 to view the notebook. The notebook itself includes some explanation of how to run it and how it works. I did not spend much time polishing the code and the motion chart, but it has the basic ingredients. If you want a peek, here is a short screen recording that shows what it looks like.</p>
<div class="hvlog"> <a href="https://github.com/cschin/IPython-Notebook---d3.js-mashup/raw/master/images/ipy_nb_d3_movable_chart.m4v" rel="enclosure"><br />
<img src="https://github.com/cschin/IPython-Notebook---d3.js-mashup/raw/master/images/ipy_nb_d3_movable_chart.jpg" width=500> <br />(click to download the movie)<br />
</a> </div>
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2012/02/26/yet-another-ipython-d3-js-example-motion-chart/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Experimenting with ipython notebook bi-directional communication</title>
		<link>http://blogs.infoecho.net/echo/2012/02/21/experimenting-with-ipython-notebook-bi-directional-communication/</link>
		<comments>http://blogs.infoecho.net/echo/2012/02/21/experimenting-with-ipython-notebook-bi-directional-communication/#comments</comments>
		<pubDate>Wed, 22 Feb 2012 05:37:38 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=407</guid>
		<description><![CDATA[Thanks to Brian Granger for pointing out how to do bi-directional communication from javascript in an ipython-notebook front-end to the back-end ipython kernel using the existing websocket/zmq channel architecture in ipython (see the thread). I have been hacking around to see how to do it. I needed to modify a few lines of the ipython-notebook javascript [...]]]></description>
			<content:encoded><![CDATA[<p>Thanks to Brian Granger for pointing out how to do bi-directional communication from javascript in an ipython-notebook front-end to the back-end ipython kernel using the existing websocket/zmq channel architecture in ipython (see the <a href="http://mail.scipy.org/pipermail/ipython-dev/2012-February/008778.html">thread</a>). I have been hacking around to see how to do it. I needed to modify a few lines of the ipython-notebook javascript to make it work (see my <a href="https://github.com/cschin/ipython/commit/3a34d3b0c4d42bb1ef7b42660b12d429936cb287">github commit</a>). I wrote an example to show how it works (the <a href="https://github.com/cschin/IPython-Notebook---d3.js-mashup/blob/master/BidirectionalComm.ipynb">ipython notebook</a> and a <a href="https://github.com/cschin/IPython-Notebook---d3.js-mashup/blob/master/images/screenshot_bi_widget.jpg">screen shot</a>). Pretty cool that it works. It seems that one could develop a widget library to avoid hand-crafting both the javascript and python code for such communication. All right, one more small step toward building some cool interactive visualization / analysis tools with ipython.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2012/02/21/experimenting-with-ipython-notebook-bi-directional-communication/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>iPython Notebook / d3.js mashup</title>
		<link>http://blogs.infoecho.net/echo/2012/02/05/ipython-notebook-d3-js-mashup/</link>
		<comments>http://blogs.infoecho.net/echo/2012/02/05/ipython-notebook-d3-js-mashup/#comments</comments>
		<pubDate>Mon, 06 Feb 2012 04:19:54 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=402</guid>
		<description><![CDATA[While I have been using ipython for a long time, I never really used it for more than checking whether some code snippets worked as expected. (Well, I tried to play with the parallel computing framework in ipython, but I never put it into production.) Just recently, I started to look into the ipython web-based notebook [...]]]></description>
			<content:encoded><![CDATA[<p>While I have been using <code>ipython</code> for a long time, I never really used it for more than checking whether some code snippets worked as expected. (Well, I tried to play with the parallel computing framework in <code>ipython</code>, but I never put it into production.) Just recently, I started to look into the <code>ipython</code> web-based notebook feature more carefully. It is great and makes me think that <code>ipython</code> will make a python programmer, or someone who uses python for data analysis, much more productive. (I used to envy &#8220;RStudio&#8221; in R-lang land; now we python programmers finally have something competitive.) </p>
<p>The cool thing about using a web page as the front-end is that there is a lot of potential to use the web interface for visualization. I played with protovis.js a while ago. Recently, I went to a visualization meet-up where d3.js was mentioned a number of times. Then the idea came to my mind: &#8220;is it possible to combine the best of two worlds, python and d3.js?&#8221; After consulting some more experienced users on the ipython-dev mailing list to see what is possible, I decided to spend some of my weekend time hacking on it. In the meantime, I got the chance to play with tornado, zero-mq and websocket, all the fun stuff these days. In the end, I was able to pass javascript code written within the ipython notebook to the browser, which executes it and shows some animation with <code>d3.js</code>. This makes it possible to create fancier visualizations interactively, all in a browser.</p>
<p>My weekend hacking results are hosted at <a href="https://github.com/cschin/IPython-Notebook---d3.js-mashup" title="github repository "> github </a>. I think there is great potential to make things like this work better. (For example, can we have a pythonic backend for d3.js? <img src='http://blogs.infoecho.net/echo/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  )  It is definitely worth messing around with to see more uses like this.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2012/02/05/ipython-notebook-d3-js-mashup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Some thought on the interview puzzles from a dot-com/DNA sequencing data processing company</title>
		<link>http://blogs.infoecho.net/echo/2011/10/22/some-thought-on-the-interview-puzzles-from-a-dot-comdna-sequencing-data-processing-company/</link>
		<comments>http://blogs.infoecho.net/echo/2011/10/22/some-thought-on-the-interview-puzzles-from-a-dot-comdna-sequencing-data-processing-company/#comments</comments>
		<pubDate>Sat, 22 Oct 2011 19:09:23 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[comment]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=372</guid>
		<description><![CDATA[I ran into this earlier today. If you are interested in solving computational puzzles, I think they are good ones. It is tempting to write some real code to solve them, but it is more interesting and important for me to spend my time analyzing some real data to solve real scientific puzzles this weekend. Anyway, [...]]]></description>
			<content:encoded><![CDATA[<p>I ran into <a title="DNANexus Puzzle" href="https://dnanexus.com/careers/puzzles">this</a> earlier today. If you are interested in solving computational puzzles, I think they are good ones. It is tempting to write some real code to solve them, but it is more interesting and important for me to spend my time analyzing some real data to solve real scientific puzzles this weekend. Anyway, thinking through and finding the clues to the answers is straightforward, especially during and after my morning shower time.</p>
<p>Despite the verbose descriptions of the questions, the route to solving each one is more or less straightforward. A real implementation might be slightly more complicated. The following are my tips on &#8220;how to solve&#8221; these puzzles. (I could be giving wrong answers/tips. If you want a job from <a href="https://dnanexus.com/" title="DNAnexus">DNAnexus</a>, you are on your own.)</p>
<p>(1) Insane in the Membrane</p>
<p>Obviously, the question has nothing really to do with any biological membrane. It is actually more related to a &#8220;maze solving algorithm&#8221; or &#8220;finding the shortest path&#8221; in a graph. One way to solve the problem is to convert the lattice to a graph where each node represents an empty space (&#8220;o&#8221;). The edges connect all pairs of nodes satisfying the constraint &#8220;Each successive position is only 1 nm away from the one before it&#8221;. You can start the search with a seed node that has only one edge. Then, use the standard <a title="BFS algorithm" href="http://en.wikipedia.org/wiki/Breadth-first_search">BFS algorithm</a> to find the longest path that connects to the seed node. (Strictly speaking, BFS yields shortest paths from the seed, so finding a long non-repeating path may require a depth-first search with backtracking instead.) If there is no node with a single edge attached, one can pick any node that was not visited during the graph search as the new seed node. If you find a path that is longer than &#8220;Danny Dendrite&#8217;s genome&#8221;, output the solution. If not, try other seed nodes until you have tested them all. If you cannot find any path from any seed node that is longer than &#8220;Danny Dendrite&#8217;s genome&#8221;, output &#8220;impossible&#8221;.</p>
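<p>A minimal sketch of the graph construction and BFS step described above. The lattice format (a list of strings where &#8220;o&#8221; marks an empty space), the 4-neighbor adjacency rule, and the names <code>build_graph</code> and <code>bfs_distances</code> are my assumptions for illustration, not part of the puzzle statement. Note that BFS gives shortest-path distances from the seed, so the farthest distance is only a lower bound on the longest non-repeating path.</p>

```python
from collections import deque


def build_graph(lattice):
    # Map every empty cell ('o') to the list of its empty 4-neighbors.
    # The 4-neighbor rule is an assumed reading of the "1 nm away" constraint.
    cells = set()
    for r, row in enumerate(lattice):
        for c, ch in enumerate(row):
            if ch == 'o':
                cells.add((r, c))
    graph = {}
    for (r, c) in cells:
        candidates = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        graph[(r, c)] = [n for n in candidates if n in cells]
    return graph


def bfs_distances(graph, seed):
    # Standard breadth-first search: shortest hop count from seed to each node.
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Hypothetical toy lattice: 'o' is empty, 'x' is blocked.
lattice = ['oxo',
           'oxo',
           'ooo']
graph = build_graph(lattice)
# Seed candidates are the nodes with exactly one edge.
seeds = sorted(n for n, nbrs in graph.items() if len(nbrs) == 1)
dist = bfs_distances(graph, seeds[0])
print(seeds)               # [(0, 0), (0, 2)]
print(max(dist.values()))  # 6
```

<p>In this toy lattice the empty cells form a single U-shaped corridor, so the farthest BFS distance (6 steps) happens to coincide with the longest path. On a lattice with loops that would not hold in general.</p>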
<p>(2) Hungry Hungry Coders</p>
<p>Think of the &#8220;enjoyment values&#8221; as an M by N matrix. What you are trying to do is find the maximum sum when picking a single element per row such that no two picked elements share a column. One obvious initial state is to pick the maximum value in each row. If no columns overlap, you are done. If such an assignment is not possible, one needs to find another assignment that minimizes the reduction of the sum. I have not encountered such a problem in my work or research yet, although I can see such an algorithm being super useful for solving practical resource allocation problems. A quick search shows the &#8220;correct&#8221; solution is probably the &#8220;<a href="http://en.wikipedia.org/wiki/Hungarian_algorithm" title="Hungarian algorithm">Hungarian algorithm</a>&#8221;. There are several different variants for solving such problems. It would be interesting to know which one is the most efficient, especially when there might be a lot of degenerate solutions. Also, it might be interesting to see whether it really works sociologically to ask engineers to write down 10 numbers for 10 menu items for every lunch. It seems quite a waste of time. <img src='http://blogs.infoecho.net/echo/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
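<p>To make the assignment problem concrete, here is a brute-force sketch that tries every way of giving each row its own column. This is not the Hungarian algorithm; it is only practical for tiny matrices, and it assumes there are at least as many columns as rows. The matrix values and the name <code>best_assignment</code> are made up for illustration.</p>

```python
from itertools import permutations


def best_assignment(enjoyment):
    # Try every assignment of distinct columns to rows and keep the one
    # with the largest total. Exhaustive search, not the Hungarian algorithm.
    n_rows = len(enjoyment)
    n_cols = len(enjoyment[0])

    def total(cols):
        return sum(enjoyment[r][c] for r, c in enumerate(cols))

    best_cols = max(permutations(range(n_cols), n_rows), key=total)
    return total(best_cols), best_cols


# Hypothetical enjoyment matrix: rows are coders, columns are menu items.
enjoyment = [[9, 8, 1],
             [9, 2, 1],
             [3, 4, 5]]
print(best_assignment(enjoyment))  # (22, (1, 0, 2))
```

<p>Picking the row maximum fails here because the first two rows both want column 0. The exhaustive search settles on columns (1, 0, 2) instead, giving up one point in the first row to keep the 9 in the second.</p>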
<p>(3) Genome Search</p>
<p>The major constraint of the problem is &#8220;You must not load the entire genome into memory; furthermore, you may read through the genome sequence only once.&#8221; Namely, you have to stream the reference genome and do the sequence comparison at the same time. As with all good exact sequence matching problems, using hash values is the way to go. For a streaming approach, I think the answer is some sort of <a title="rolling hash" href="http://en.wikipedia.org/wiki/Rolling_hash">rolling hash</a>. One can convert the &#8220;K sequences, each of length M&#8221;, into K hash values with one of the rolling hash algorithms. Then, calculate the rolling hash values of the 670 Gbp sequence over windows matching the length of the K sequences while streaming the 670 Gbp sequence file through memory. If there is a matching hash value, double check that the strings do match and that the match is not due to a hash collision. By the way, good luck sequencing and assembling the 670 Gbp, potentially highly repetitive genome.</p>
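<p>The streaming idea can be sketched as a Rabin-Karp style search. This is only a sketch under assumptions: all K patterns share one length M, the genome arrives as an iterable of single characters, and the base and modulus are arbitrary choices of mine. Only the current window of M characters is held in memory.</p>

```python
from collections import deque

BASE = 257           # arbitrary polynomial base (assumption)
MOD = 2 ** 61 - 1    # large prime modulus (assumption)


def poly_hash(s):
    # Polynomial hash of a whole string.
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h


def stream_search(genome_stream, patterns):
    # Yield (offset, pattern) for every exact match while reading the
    # stream once. All patterns are assumed to have the same length.
    m = len(patterns[0])
    by_hash = {}
    for p in patterns:
        by_hash.setdefault(poly_hash(p), []).append(p)
    top = pow(BASE, m - 1, MOD)  # weight of the character leaving the window
    window = deque()
    h = 0
    for i, ch in enumerate(genome_stream):
        if len(window) == m:
            old = window.popleft()
            h = (h - ord(old) * top) % MOD
        window.append(ch)
        h = (h * BASE + ord(ch)) % MOD
        if len(window) == m and h in by_hash:
            text = ''.join(window)
            for p in by_hash[h]:
                if p == text:  # confirm the match is not a hash collision
                    yield i - m + 1, p


# Hypothetical toy input standing in for the genome file.
hits = list(stream_search(iter('ACGTACGTTAGC'), ['ACGT', 'TTAG']))
print(hits)  # [(0, 'ACGT'), (4, 'ACGT'), (7, 'TTAG')]
```

<p>For a real file, the stream could be fed from chunked reads so that the genome is never held in memory as a whole.</p>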
<p>Oh, well, while it could be fun to implement these algorithms to see that I indeed get the &#8220;correct&#8221; answer, I would be more interested to see how they are used in solving real genomics problems. While doing exact string matching in a smart way is cool, building a computing infrastructure that can do inexact matches over and over again is way more useful, e.g., NCBI&#8217;s BLAST server. I assume that is one of DNAnexus&#8217; future directions. I do hope there will indeed be a good HPC computation platform to help the scientific community solve large data / large analysis problems. In the meantime, I also believe that innovation in DNA detection/sequencing technologies remains one of the most important forces driving biological/medical research to solve important problems. Maybe large data analysis with well-known analytics is only part of the equation. Using data and good analytics to solve basic technology, chemistry, and signal processing problems, and to understand the nature of different kinds of DNA sequence data, is really fun and important.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2011/10/22/some-thought-on-the-interview-puzzles-from-a-dot-comdna-sequencing-data-processing-company/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DiggsInABox&#8230;</title>
		<link>http://blogs.infoecho.net/echo/2011/04/10/diggsinabox/</link>
		<comments>http://blogs.infoecho.net/echo/2011/04/10/diggsinabox/#comments</comments>
		<pubDate>Mon, 11 Apr 2011 04:45:56 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=348</guid>
		<description><![CDATA[William McVey left a message about my code for the &#8220;DiggsInAbox&#8221; and this. I never really intended to release the code. I don&#8217;t mind sharing; it is just that I have never really been motivated to polish the code so that it can be released as professionally written code. Anyway, I am pasting my code [...]]]></description>
			<content:encoded><![CDATA[<p>William McVey left <a href="http://blogs.infoecho.net/echo/about-2/comment-page-1/#comment-129">a message</a> about my code for the &#8220;<a href="http://blogs.infoecho.net/echo/2007/02/12/diggs-in-a-box/">DiggsInAbox</a>&#8221; and <a href="http://infoecho.net/Sandbox/DiggsInABox.py">this</a>. I never really intended to release the code. I don&#8217;t mind sharing; it is just that I have never really been motivated to polish the code so that it can be released as professionally written code. Anyway, I am pasting my code here in the hope that it can be useful to someone. I wrote this code for fun and to learn the algorithm for generating a Treemap. There are a lot of Treemap implementations these days. You might be able to find better code.</p>
<p>DiggsInABox.py</p>
<pre class="brush: python; title: ; notranslate">
#!/usr/bin/env python

&quot;&quot;&quot;
Copyright 2007-2011 Jason Chin, All rights reserved
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice,
  this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS'
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.&quot;&quot;&quot;

from Treemap import Treemap, Node

class diggTreemap(Treemap):

    def __init__(self, rootNode):
        self.rootNode = rootNode;
        self.setWidthHeight(300,200)
        self.setPosition(0,0)
        self._cMap = self._colorMap()

    def _colorMap(self):
        cMap = [&quot;#fc0&quot;,&quot;#cc0&quot;,&quot;#f63&quot;,&quot;#39c&quot;,&quot;#696&quot;,&quot;#f93&quot;,
                &quot;#6f0&quot;,&quot;#c69&quot;,&quot;#cf9&quot;,&quot;#36c&quot;,&quot;#393&quot;,&quot;#c0f&quot;]
        i = 0
        while True:
            yield cMap[i]
            i = i+1
            if i &gt; len(cMap) - 1: i = i % len(cMap)

    def writeCSS(self):
        return u&quot;&quot;&quot;\
&lt;style type=&quot;text/css&quot; media=&quot;screen&quot;&gt;
body {
background: black;
}

div.group2 {
position:absolute;
overflow:hidden;
margin: 3px;
z-index: 1;
}

div.leave {
position:absolute;
text-align:center;
overflow:hidden;
vertical-align:middle;
opacity: 1;
border: 1px outset #000;
margin: 3px;
z-index:2;
}

div.leave &gt; a &gt; div {
    display: table-cell;
    position: static;
    vertical-align: middle;
    padding:3px;
    color:#000000;
}

div.leave &gt; a {
text-decoration:none;
}
div.leave &gt; a:link &gt; div  {
color: black;
}
div.leave &gt; a:visited &gt; div {
color: #bcc;
}

div.index {
position:absolute;
left: 810px;
width: 90px;
text-align:center;
overflow:hidden;
vertical-align:middle;
opacity: 1;
border: 2px #fff outset;

}

div.index &gt; a {
text-decoration:none;
}
div.index &gt; a:link &gt; div  {
color: black;
}
div.index &gt; a:visited &gt; div {
color: black;
}

div.index &gt; a &gt; div {
    display: table-cell;
    position: static;
    vertical-align: middle;
    padding:3px;
    color:#000000;
    width: 90px;
    -moz-user-select: none;
}

&lt;/style&gt;\n&quot;&quot;&quot;

    def writeJS(self):
        return &quot;&quot;&quot;\
&lt;script language='javascript'&gt;

function hiliteBlock(blockId) {
    elm = document.getElementById(blockId);
    elm.style.width = parseInt(elm.style.width) - 6 + 'px';
    elm.style.height = parseInt(elm.style.height) - 6 + 'px';
    elm.style.borderWidth = '3px';
    elm.style.borderStyle = 'dashed';
    elm.style.borderColor = '#fff';

    return true;
}

function removeHilite(blockId) {
    elm = document.getElementById(blockId);
    elm.style.width = parseInt(elm.style.width) + 6 + 'px';
    elm.style.height = parseInt(elm.style.height) + 6 + 'px';
    elm.style.borderWidth = '0px';
    elm.style.borderStyle = '';
    elm.style.borderColor = '';
    return true;
}

function showSummary(event,leaveId) {
    if (document.getElementById('summaryPopup')) {
        popup = document.getElementById('summaryPopup');
        document.body.removeChild(popup);
    } 

    //elm = document.getElementById(leaveId);
    popup = document.createElement('div');
    popup.id = 'summaryPopup';
    x = event.pageX;
    y = event.pageY;
    document.body.appendChild(popup);

    popup.style.position = 'absolute';
    popup.style.left = x + 'px';
    popup.style.top = y + 'px';
    popup.style.width = '250px';
    popup.style.height = '150px';
    popup.style.zIndex = '100';
    popup.style.background = '#cec';

    //still working on this function, ajax maybe needed

    return true;
}

function adjustFont() {
    tmElm = document.getElementById('treemap');

    for (idx=0; idx &lt; tmElm.childNodes.length; idx++) {
        elm = tmElm.childNodes[idx];
        if (elm.className != 'leave' &amp;&amp; elm.className != 'index') {
            continue;
        }
        while ( elm.scrollWidth &gt; elm.clientWidth ||
            elm.scrollHeight &gt; elm.clientHeight) {
            curFontsize =  parseInt(elm.style.fontSize);
            newFontSizeInPx = parseInt(elm.style.fontSize) - 1;
            if (newFontSizeInPx &lt;= 2) {
                newFontSizeInPx = 2;
                elm.style.fontSize = newFontSizeInPx + 'px';

                break;
             }
            elm.style.fontSize = newFontSizeInPx + 'px';

        }
    }
}

&lt;/script&gt;
&quot;&quot;&quot; 

    def writeAll(self):
        outStr = u&quot;&quot;
        outStr += u'&lt;html&gt;&lt;head&gt;'
        outStr += self.writeCSS()
        outStr += self.writeJS()
        outStr += u&quot;&lt;/head&gt;&lt;body onload='adjustFont();'&gt;&lt;div style='font-size:30px;color:white;margin-left:40px'&gt;Diggs in a Box&lt;br&gt; &lt;span style='font-size:0.3em;'&gt;(v. 0.02 by Chen-Shan Chin)&lt;/span&gt;&lt;/div&gt;&quot;

        outStr += u&quot;&lt;div id='treemap' style='position:relative;left:40px; top:5px;width:%dpx;height:%dpx;'&gt;&quot; % (self.width+5, self.height+5)

        #write feed navigator
        rssMapLabel = ['All','Technology','Science','Business','Sports',
                       'Entertainment','Gamming']
        outStr += &quot;&lt;div id='nav1' style='position:absolute;right:0px;top:-30px;height:30px;width:600px'&gt;&quot;
        outStr += &quot;&lt;table align=right&gt;&lt;tr&gt;&quot;
        for label in rssMapLabel:
            col = '#fff'
            if self.rootNode.name == label:
                col = '#fff'
            else:
                col = '#888'
            outStr += &quot;&lt;td&gt;&lt;a href='http://infoecho.net/Sandbox/DiggsInABox.py?feed=%s' style='text-decoration:none;'&gt;&lt;div style='color:%s;border:1px #888 solid;align:right;padding:3px;text-decoration:none;'&gt;%s&lt;/div&gt;&lt;/a&gt;&lt;/td&gt;&quot; % (label,col,label)

        outStr += &quot;&lt;/table&gt;&lt;/div&gt;&quot;

        #write index navigator
        y = 0;
        cMap = self._colorMap()
        totalWeight = 0

        for node in self.rootNode.children:
            if node.weight &gt; 10:
                totalWeight += node.weight
            else:
                totalWeight += 10
        dhMap = {}
        dhSum = 0

        for node in self.rootNode.children:
            dh =  1.0 * node.weight / totalWeight * (self.height+3)
            if dh &gt; 20:
                dhMap[node] = dh
            else:
                dhMap[node] = 20
            dhSum += dhMap[node]

        for node in self.rootNode.children:
            dh = 1.0 * dhMap[node] / dhSum * (self.height+3)
            fs = 12
            if dh &gt; 24:
                fs *= 1.8
            if max([len(w) for w in node.name.split(' ')]) &gt; 10:
                fs = (1.0*max([len(w) for w in node.name.split(' ')]) / 10)
            if len(node.name.split(' ')) &gt; 1 and dh &lt; 48:
                fs *= 0.7
            if fs &lt; 12: fs = 12
            if fs &gt; 24: fs = 24
            outStr += u&quot;&lt;div id='%s_index' class='index' \
style='top:%dpx; height:%dpx; background:%s;font-size:%dpx' onmouseover='hiliteBlock(\&quot;%s\&quot;);'  onmouseout='removeHilite(\&quot;%s\&quot;);'&gt;&lt;a target='_blank' href='%s'&gt;&lt;div style='height:%dpx;'&gt;%s&lt;/div&gt;&lt;/a&gt;&lt;/div&gt;&quot;\
% (node.name, y, dh-2, next(cMap), fs,  node.name, node.name, node.properties['link'], dh-2,  node.name)
            y += dh

        #write all nodes
        outStr += self.writeNodes(self.rootNode)
        outStr += u&quot;&lt;/div&gt;&lt;/body&gt;&lt;/html&gt;&quot;
        return outStr

    def writeNodes(self, node):
        outStr = self.writeNode(node)
        for n in node.children:
            outStr += self.writeNodes(n)
        return outStr

    def writeNode(self, node):

        outStr = u&quot;&quot;

        if &quot;group2&quot; in node.properties:
            x,y,dw,dh = node.rect
            x = int(round(x))
            y = int(round(y))
            dw = int(round(dw))
            dh = int(round(dh))
            color = next(self._cMap)
            outStr += u&quot;&lt;div id='%s' class='group2' \
style='left:%dpx; top:%dpx; width:%dpx; height:%dpx; background:%s;'&gt;&lt;/div&gt;&quot;\
% (node.name, x, y, dw, dh, color)

            return outStr

        elif node.properties['is_leave'] == False:
            return outStr

        x,y,dw,dh = node.rect
        x = int(round(x+3))
        y = int(round(y+3))
        dw = int(round(dw-9))
        dh = int(round(dh-9))
        label = node.properties['data']['title'].strip()
        if len(label) &gt; 60:
            label = label[:60]+&quot; ...&quot;

        if dw &gt; 20 and dh &gt;20:
            fs = node.area**0.5 / 7;
            if len(label) &gt; 40:
                fs *= 0.75
            if len(label) &lt; 20:
                fs *= 1.25
            if max([len(w) for w in label.split(' ')]) * fs * 0.7 &gt; dw:
                fs = 2 * dw / (max([len(w) for w in label.split(' ')]))
            fs = int(fs)
            outStr += u&quot;&lt;div id='%s' class='leave' \
style='left:%dpx; top:%dpx; width:%dpx; height:%dpx;\
 font-size:%dpx;'&gt;&quot; % (node.name, x, y, dw, dh, fs)
            outStr += u&quot;&lt;a target='_blank' href='%s'&gt;&lt;div style='width:%dpx;height:%dpx;'&gt;%s&lt;/div&gt;&lt;/a&gt;&lt;/div&gt;&quot; % (node.properties['data']['link'], dw, dh, label)

        else:

            outStr += u&quot;&lt;a target='_blank' href='%s'&gt;&lt;div id='%s' class='leave' \
style='left:%dpx; top:%dpx; width:%dpx; height:%dpx;'&gt;&lt;/div&gt;&lt;/a&gt;&quot; % (node.properties['data']['link'], label, x, y, dw, dh)

        return outStr

###########################################################
import cgitb; cgitb.enable()
import cgi

import feedparser

form = cgi.FieldStorage()
print &quot;Content-Type: text/html&quot;
print

feed = &quot;&quot;
if 'feed' in form:
    feed = form['feed'].value.strip()

rssMap = {'All':'http://digg.com/rss/index.xml',
          'Technology':'http://digg.com/rss/containertechnology.xml',
          'Science':'http://digg.com/rss/containerscience.xml',
          'Business':'http://digg.com/rss/containerworld_business.xml',
          'Sports':'http://digg.com/rss/containersports.xml',
          'Entertainment':'http://digg.com/rss/containerentertainment.xml',
          'Gamming':'http://digg.com/rss/containergaming.xml'}

if feed not in rssMap:
    feed = 'All'

data = feedparser.parse(rssMap[feed])
entries = data['entries']
term2entries = {}
term2link = {}
for e in entries:
    term = e['digg_category']
    if term not in term2entries:
        term2entries[term] = []
    term2entries[term].append( {'id':e['id'],
                                'diggcount':int(e['digg_diggcount']),
                                'link':e['link'],'title':e['title'],
                                'commentcount':int(e['digg_commentcount']),
                                'summary':e['summary']} )
    if term not in term2link:
        term2link[term] = &quot;/&quot;.join(e['link'].split('/')[:-1])

rootNode = Node(feed)

for term in term2entries:

    n2 = Node(term)
    n2.properties['is_leave'] = False
    n2.properties['group2'] = True
    n2.properties['link'] = term2link[term]
    n2.weight = 0
    for entry in term2entries[term]:
        n3 = Node(entry['id'].split(&quot;/&quot;)[-1])
        n3.properties['is_leave'] = True
        n3.properties['data'] = entry
        n3.weight = entry['diggcount']
        n2.addAChild(n3)
        n2.weight = n2.weight + n3.weight
    rootNode.addAChild(n2)
    rootNode.weight = rootNode.weight + n2.weight
    rootNode.properties['is_leave'] = False

#for n in rootNode.children:
#    print n.name, [n2.name for n2 in n.children]

rootNode.sortChildrenByWeight()
TM = diggTreemap(rootNode)
TM.setWidthHeight(800,540)
TM.layout()
outStr = TM.writeAll()
print outStr.encode('utf-8')
</pre>
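<p>The grouping step in the script above can be tried without a network connection. The following is a minimal sketch, assuming a hand-made list of entries in place of a parsed Digg feed; the field names, ids and counts are invented for illustration and the helper names are hypothetical:</p>

```python
from collections import defaultdict

# Hypothetical stand-ins for parsed feed entries; all values are made up.
entries = [
    {"category": "Science", "id": "a1", "diggcount": 120},
    {"category": "Technology", "id": "b7", "diggcount": 300},
    {"category": "Science", "id": "c3", "diggcount": 45},
]

def group_by_category(items):
    """Map each category to its list of entries, as term2entries does."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["category"]].append(item)
    return dict(grouped)

def category_weights(grouped):
    """Sum the digg counts per category; these become the group weights."""
    return dict((term, sum(e["diggcount"] for e in members))
                for term, members in grouped.items())

grouped = group_by_category(entries)
weights = category_weights(grouped)
print(sorted(weights.items()))
```

<p>The summed counts play the role of n2.weight in the script, so a category with more diggs gets a larger box in the treemap.</p>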
<p>Treemap.py</p>
<pre class="brush: python; title: ; notranslate">
#!/usr/bin/env python

&quot;&quot;&quot;
Copyright 2007-2011 Jason Chin, All rights reserved
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice,
  this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS'
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.&quot;&quot;&quot;

import sys

class Node:

    def __init__(self, name, weight=1):
        self.name = name
        self.properties = {}
        self.children = []
        self.parentNode = []
        self.area = 1.0 #need to normalized such that sum(children.area) = the area assigned
        self.rect = []
        self.weight = weight

    def addAChild(self, aNode):
        self.children.append(aNode)
        aNode.parentNode.append(self)    

    def addChildren(self, Nodes):
        for n in Nodes:
            self.children.append(n)
            n.parentNode.append(self) 

    def numOfChildren(self):
        return len(self.children)

    def addAProperty(self, pk, pv):
        self.properties[pk] = pv

    def sortChildrenByWeight(self):
        if len(self.children) == 0:
            return
        tmpNodes = [ [-c.weight, c] for c in self.children]
        tmpNodes.sort()
        self.children = [c[1] for c in tmpNodes]
        for c in self.children:
            c.sortChildrenByWeight()

    def normalizeChildrenArea(self, totalArea):
        sw = 1.0 * sum([n.weight for n in self.children])
        for n in self.children:
            n.area = n.weight / sw * totalArea;

class Treemap:

    def __init__(self, rootNode):
        self.rootNode = rootNode
        self.setWidthHeight(300,200)
        self.setPosition(0,0)
        pass

    def setWidthHeight(self, width,height):
        self.height = height
        self.width = width

    def setPosition(self,left,top):
        self.left = left
        self.top = top

    def _worst(self, rowOfNodes, mw):
        rowBlockAreas = [n.area for n in rowOfNodes];
        s = sum(rowBlockAreas)
        rmin = min(rowBlockAreas)
        rmax = max(rowBlockAreas)
        return max([ (1.0*mw*mw * rmax)/(s*s), (1.0*s*s)/(mw*mw * rmin)] )    

    def _squarified(self, nodes, rowOfNodes, top, left, mw, mh, layoutDir=1):

        if mw &gt; mh:
            mw, mh = mh, mw
            layoutDir = -layoutDir

        nodesToPlot = nodes;
        nodesInRow = []

        while nodesToPlot:

            c = nodesToPlot[0];
            if not nodesInRow:
                nodesInRow.append(c)
                nodesToPlot = nodesToPlot[1:]
                continue

            if self._worst(nodesInRow, min([mw,mh])) &gt;= self._worst(nodesInRow+[c], min([mw,mh])):
                nodesInRow = nodesInRow+[c]
                nodesToPlot = nodesToPlot[1:]
                continue
            else:
                dh = 1.0*sum([n.area for n in nodesInRow])/mw;

                self._layoutRowOfNodes(nodesInRow, left, top, dh, layoutDir)

                if layoutDir == 1:
                    top = top + dh
                else:
                    left = left + dh
                mh = mh - dh;

                if mh &lt; mw:
                    mw, mh = mh, mw
                    layoutDir = -layoutDir

                nodesInRow = []

        dh = 1.0*sum([n.area for n in nodesInRow])/mw;
        self._layoutRowOfNodes(nodesInRow, left, top, dh, layoutDir)      

    def _layoutRowOfNodes(self,rowOfNodes,left,top,mh,ld):

        x = left
        y = top

        for n in rowOfNodes:
            r = n.area
            if ld == 1:
                dw = 1.0 * r / mh
                dh = 0
                n.rect = [x,y,dw,mh]
                #print &quot;Rect(%f,%f,%f,%f);&quot; % (x,y,dw,mh)
            else:
                dw = 0
                dh = 1.0 * r / mh
                n.rect = [x,y,mh,dh]
                #print &quot;Rect(%f,%f,%f,%f);&quot; % (x,y,mh,dh)
            x = x + dw
            y = y + dh    

    def _layoutANode(self, aNode, left, top, w, h):
        if len(aNode.children)==0: return

        aNode.normalizeChildrenArea(w*h)
        #print aNode,[n for n in aNode.children]

        self._squarified(aNode.children, [], top, left, w, h)
        for n in aNode.children:
            x,y,w,h = n.rect
            self._layoutANode(n, x, y, w, h)

    def layout(self):
        w = self.width
        h = self.height
        self.rootNode.area = w * h;
        self.rootNode.rect = [0, 0, w, h];
        #self.rootNode.normalizeChildrenArea(w*h)
        self._layoutANode(self.rootNode, 0, 0, w, h)  

    def writeAll(self, outputStream=sys.stdout):
        self.outputStream = outputStream
        self.writeNodes(self.rootNode)
        pass

    def writeNodes(self, node):
        self.writeNode(node)
        for n in node.children:
            self.writeNodes(n)

    def writeNode(self, node):
        self.outputStream.write(node.name)
        self.outputStream.write(&quot;\n&quot;)
        self.outputStream.write(str(node.rect))
        self.outputStream.write(&quot;\n&quot;)
        pass

class CanvasTreemap(Treemap):
    import random
    def __init__(self, rootNode):
        self.rootNode = rootNode;
        self.setWidthHeight(300,200)
        self.setPosition(0,0)

    def printAll(self):
        print &quot;function plotCanvas(cId){&quot;
        print &quot;ctx = document.getElementById(cId).getContext('2d');&quot;
        print &quot;ctx.globalAlpha=0.2;&quot;
        self.printNodes(self.rootNode);
        print &quot;}&quot;

    def printNodes(self, node):
        self.printNode(node)
        for n in node.children:
            self.printNodes(n)        

    def printNode(self,node):
        x,y,dw,dh = node.rect
        x = x+1
        y = y+1
        dw = dw-1
        dh = dh-1
        level = node.properties['level']
        style = {0:'rgb(255,0,0)', 1:'rgb(255,255,0)', 2: 'rgb(0,255,0)', 3: 'rgb(0,255,255)', 4: 'rgb(255,0,255)'}

        print &quot;ctx.strokeStyle = '%s';&quot; % style[int(random.uniform(0,4.9))]
        print &quot;ctx.fillStyle = '%s';&quot; % style[int(random.uniform(0,4.9))]
        print &quot;ctx.lineCap='round';&quot;
        print &quot;ctx.lineWidth = %d;&quot; % (10-2*int(random.uniform(0,4)))
        print &quot;ctx.beginPath();&quot;
        print &quot;ctx.moveTo(%.1f,%.1f);&quot; % (x,    y);
        print &quot;ctx.lineTo(%.1f,%.1f);&quot; % (x+dw, y);
        print &quot;ctx.lineTo(%.1f,%.1f);&quot; % (x+dw, y+dh);
        print &quot;ctx.lineTo(%.1f,%.1f);&quot; % (x,    y+dh);
        print &quot;ctx.lineTo(%.1f,%.1f);&quot; % (x,    y);
        if random.uniform(0,1) &lt; 0.5:
            print &quot;ctx.stroke();&quot;
        else:
            print &quot;ctx.fill();&quot;
            #print &quot;ctx.stroke();&quot;

class DivTreemap(Treemap):

    def __init__(self, rootNode):
        self.rootNode = rootNode;
        self.setWidthHeight(300,200)
        self.setPosition(0,0)

    def printAll(self):
        print &quot;&lt;html&gt;&quot;
        print '&lt;head&gt;&lt;style type=&quot;text/css&quot; media=&quot;screen&quot;&gt;'
        print &quot;&quot;&quot;div.node {
            position:absolute;
            text-align:center;border:2px solid #bbaaaa;
            overflow:hidden;
            vertical-align:middle;
            background:#f0ffff
            }&quot;&quot;&quot;
        print '&lt;/style&gt;&lt;/head&gt;'
        print &quot;&lt;body&gt;This is a test.&lt;br/&gt;&quot;
        print &quot;&lt;div id='treemap' style='position:relative;left:30px; top:60px;width:%dpx;height:%dpx;'&gt;&quot; % (self.width, self.height)
        self.printNodes(self.rootNode);
        print &quot;&lt;/div&gt;&lt;/body&gt;&lt;/html&gt;&quot;

    def printNodes(self, node):
        self.printNode(node)
        for n in node.children:
            self.printNodes(n)        

    def printNode(self,node):
        if node.name == &quot;root&quot;: return
        x,y,dw,dh = node.rect
        level = node.properties['level']
        fs = node.area**0.5 / 100;
        if fs &lt; 0.75: fs = 0.75
        print &quot;&lt;div id='%s' class='node' \
        style='left:%dpx; top:%dpx; width:%dpx; height:%dpx;\
        line-height:%dpx; font-size:%fem;' onclick='alert(\&quot;%s\&quot;)'&gt;%s&lt;/div&gt;&quot;\
         % (node.name, x, y, dw-5, dh-5, dh-5, fs, \
         &quot;Do you like to eat &quot;+node.name+&quot;?&quot;, node.name)

import random
def testCanvas():
    nodes = []
    for i in range(0,5):
        n1 = Node('node:%d' % i)
        n1.properties['level'] = 1
        n1.weight = random.uniform(1,20)
        for j in range(0,5):
            n2 = Node('node:%d-%d' %(i,j))
            n2.properties['level'] = 2
            n2.weight = random.uniform(1,20)
            for k in range(0,10):
                n3 = Node('node:%d-%d-%d' % (i,j,k))
                n3.properties['level'] = 3
                n3.weight = random.uniform(1,20)
                for l in range(0,50):
                    n4 = Node('node:(%d-%d-%d-%d)' % (i,j,k,l))
                    n4.properties['level'] = 4
                    n4.weight = random.uniform(1,20)
                    n3.addAChild(n4)
                n2.addAChild(n3)
            n1.addAChild(n2)
        nodes.append(n1)

    root = Node('root')
    root.addChildren(nodes)
    root.sortChildrenByWeight()
    root.properties['level']=0
    TM = CanvasTreemap(root)
    TM.setWidthHeight(800,800)
    TM.layout()
    print &quot;&lt;html&gt;&lt;head&gt;&quot;
    print &quot;&lt;script&gt;&quot;
    TM.printAll()
    print &quot;&lt;/script&gt;&lt;/head&gt;&quot;
    print &quot;&quot;&quot;&lt;body onload=&quot;plotCanvas('c')&quot;&gt;
    &lt;canvas id=&quot;c&quot; width=810 height=810&gt;&lt;/canvas&gt;&lt;/body&gt;&lt;/html&gt;&quot;&quot;&quot;

def testDiv():
    tagArray = {&quot;apples&quot;: 12,
	            &quot;oranges&quot;: 38,
	            &quot;pears&quot; : 10,
	            &quot;mangos&quot; : 24,
	            &quot;grapes&quot; : 18,
	            &quot;bananas&quot; : 56,
	            &quot;watermelons&quot; : 80,
	            &quot;lemons&quot; : 12,
	            &quot;limes&quot; : 12,
	            &quot;pineapples&quot; : 15,
	            &quot;strawberries&quot; : 20,
	            &quot;coconuts&quot; : 43,
	            &quot;cherries&quot; : 20,
	            &quot;raspberries&quot; : 8,
	            &quot;peaches&quot; : 25
                }
    nodes = []
    for tag in tagArray:
        n = Node(tag)
        n.weight = tagArray[tag]
        n.properties['level']=1
        nodes.append(n)
    root = Node('root')
    root.addChildren(nodes)
    root.properties['level']=0
    root.sortChildrenByWeight()
    TM = DivTreemap(root)
    TM.setWidthHeight(800,250)
    TM.layout()
    TM.printAll() 

if __name__ == '__main__':

    #testCanvas();
    testDiv();
</pre>
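<p>The row-building test inside _squarified can be looked at in isolation. Here is a small sketch, assuming a flat list of areas rather than Node objects; worst_ratio uses the same expression as Treemap._worst above, while first_row and the example areas are hypothetical names and values for illustration only:</p>

```python
def worst_ratio(areas, side):
    """Worst aspect ratio of a row of rectangles laid along an edge of length side.

    Same expression as Treemap._worst: with s = sum(areas), the result is the
    larger of side^2 * max(areas) / s^2 and s^2 / (side^2 * min(areas)).
    """
    s = float(sum(areas))
    return max(side * side * max(areas) / (s * s),
               (s * s) / (side * side * min(areas)))

def first_row(areas, side):
    """Greedily grow a row while adding the next area does not worsen the ratio."""
    row = [areas[0]]
    for a in areas[1:]:
        if worst_ratio(row + [a], side) > worst_ratio(row, side):
            break
        row.append(a)
    return row

# Illustrative areas, sorted in descending order, placed in a 6 x 4 rectangle;
# the shorter side (4) is the edge the first row is laid along.
areas = [6, 6, 4, 3, 2, 2, 1]
print(first_row(areas, 4))
```

<p>A row is closed as soon as one more rectangle would make its worst aspect ratio larger, which is why the children are sorted by weight before the layout starts.</p>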
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2011/04/10/diggsinabox/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to implement the Needleman–Wunsch alignment algorithm without using a single loop in Python</title>
		<link>http://blogs.infoecho.net/echo/2011/04/10/how-to-implement-the-needleman%e2%80%93wunsch-alignment-algorithm-without-using-a-single-loop-in-python/</link>
		<comments>http://blogs.infoecho.net/echo/2011/04/10/how-to-implement-the-needleman%e2%80%93wunsch-alignment-algorithm-without-using-a-single-loop-in-python/#comments</comments>
		<pubDate>Mon, 11 Apr 2011 04:11:46 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=338</guid>
		<description><![CDATA[I am still fascinated by the programming style using co-routines. Actually, it is possible to implement the Needleman–Wunsch alignment algorithm purely by message passing. The following code shows how to implement the algorithm using co-routines again. I modified the code from my previous post such that the alignment array itself is also generated dynamically. [...]]]></description>
			<content:encoded><![CDATA[<p>I am still fascinated by the programming style that co-routines allow.  It is actually possible to implement the <a href="http://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm">Needleman–Wunsch alignment algorithm</a> purely by message passing. The following code shows how to implement the algorithm using co-routines again.  I modified the code from <a href="http://blogs.infoecho.net/echo/2011/03/24/yet-another-python-coroutine-fun-stuff/">my previous post</a> so that the alignment array itself is also generated dynamically, which removes the set-up loops completely.  The code is annotated to show how it is done. If any reader is interested and has comments, I would like to hear them.</p>
<pre class="brush: python; title: ; notranslate">
# @author Jason Chin
#
# Copyright (C) 2011 by Jason Chin
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the &quot;Software&quot;), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.

&quot;&quot;&quot;

This is an example of implementing the Needleman-Wunsch sequence alignment algorithm
using python's co-routines. One of the most interesting aspects of this implementation
is that there is no explicit loop. You can find neither the &quot;for&quot; nor the &quot;while&quot; keyword
in this code.  Each alignment cell is a co-routine, and the calculation of the alignment
score and the backtracking that generates the alignment string are done in a message
passing fashion.  The alignment cells are also generated dynamically.  A banded
alignment can be done by not generating the whole alignment array but
only the banded part of it.

Is it useful? I am not sure, but it is definitely fun to show it is possible.

--Jason Chin, Apr. 10, 2011

&quot;&quot;&quot;

### Set up the alignment score scheme
matchScore, mismatchScore, gapScore = 4, -3, -4

### Two test strings for alignment
seq1 = &quot;TTAAGTGTAGCCTTGTGTGACATGTATTTTTAT&quot;
seq2 = &quot;TTTCTAGGTAGTTGTGGTGAGTTTAGTTGATAT&quot;

### cellMap is a dictionary that maps integer pairs to the co-routines
cellMap = {}

### For tracking the global best alignment cell
globalBestCellScore = [None, -100000]

def getAnAlignCell(x, y, seq1, seq2):
    &quot;&quot;&quot;
    This function returns a co-routine that represents an alignment cell at position
    x and y.  The alignment strings are passed explicitly for simplicity.
    &quot;&quot;&quot;

    def alnCell():

        &quot;&quot;&quot;
        This is the co-routine for an alignment cell. An alignment cell co-routine is
        executed in roughly two stages. In the first stage it collects the alignment scores
        from the cells at (x-1,y-1), (x-1,y), and (x, y-1) and calculates the best
        alignment score. Depending on the alignment path through the cell, a new
        alignment score is generated and passed to the cells at (x+1, y+1),
        (x+1,y), and (x, y+1). If any of those cells has not been generated yet, it will
        generate the co-routine and register it with the cellMap dictionary. After
        this it waits for the backtracking calculation.  If a cell is on the best alignment
        path, it will pass the best alignment pair to the next cell on the best alignment
        path.
        &quot;&quot;&quot;

        global globalBestCellScore
        global cellMap

        b1, b2 = seq1[x], seq2[y]
        mx, my = len(seq1), len(seq2)

        cellData = []

        # if the cell is on the top or the left side of the alignment, they only have
        # to wait for one other cell to pass in the alignment score. Otherwise, they
        # need to collect three messages from those (x-1,y), (x,y-1), and (x-1, y-1)
        # before they can do any calculation.
        if x == 0 or y == 0:
            cellId, s = yield
            cellData.append( (cellId, s) )
        else:
            cellId, s = yield
            cellData.append( (cellId, s) )
            cellId, s = yield
            cellData.append( (cellId, s) )
            cellId, s = yield
            cellData.append( (cellId, s) )

        # find the best cell that gives the best alignment score
        cellData.sort( key=lambda x: -x[1] )
        bestCell, bestScore = cellData[0]

        if bestScore &gt; globalBestCellScore[1]:
            globalBestCellScore = [ (x,y), bestScore ]

        # pass the new alignment score to (x+1, y+1)
        if x+1 &lt; mx and y+1 &lt; my:
            # generate the cell at (x+1, y+1) if necessary
            if (x+1, y+1) not in cellMap:
                cellMap[ (x+1, y+1) ] = getAnAlignCell( x+1, y+1, seq1, seq2 )()
                cellMap[ (x+1, y+1) ].next()
            if b1 == b2: # a match, seq1[x] == seq2[y], new_score = bestScore + matchScore
                cellMap[ (x+1, y+1) ].send( ((x,y), bestScore + matchScore) ) # pass the new score to cell (x+1, y+1)
            else: # a mismatch, seq1[x] != seq2[y], new_score = bestScore + mismatchScore
                cellMap[ (x+1, y+1) ].send( ((x,y), bestScore + mismatchScore) ) # pass the new score to cell (x+1, y+1)
        # pass the new alignment score to (x+1, y), namely, the base seq1[x] is aligned to a gap
        if x+1 &lt; mx:
            # generate the cell at (x+1, y) if necessary
            if (x+1, y) not in cellMap:
                cellMap[ (x+1, y) ] = getAnAlignCell( x+1, y, seq1, seq2 )()
                cellMap[ (x+1, y) ].next()
            cellMap[ (x+1, y) ].send( ((x,y), bestScore + gapScore) )
        # pass the new alignment score to (x, y+1), namely, the base seq2[y] is aligned to a gap
        if y+1 &lt; my:
            # generate the cell at (x, y+1) if necessary
            if (x, y+1) not in cellMap:
                cellMap[ (x, y+1) ] = getAnAlignCell( x, y+1, seq1, seq2 )()
                cellMap[ (x, y+1) ].next()
            cellMap[ (x, y+1) ].send( ((x,y), bestScore + gapScore) )

        path = yield # wait, if the cell is on the best path, the co-routine will resume 

        # generate the alignment pair according to the best aligned cell
        if bestCell[0] &gt;= 0 and bestCell[1] &gt;=0 :
            if path == None:
                path = []

            if bestCell[0] - x == 0:
                c1 = &quot;-&quot;
            else:
                c1 = seq1[x-1]
            if bestCell[1] - y == 0:
                c2 = &quot;-&quot;
            else:
                c2 = seq2[y-1]
            path.extend( [ (c1, c2) ] )

            # send the partial path calculated so far to the best-scoring predecessor of this cell
            cellMap[ bestCell ].send(  path   )

        # return the best path if bestCell[0] = -1 or bestCell[1] = -1
        yield path

    return alnCell

# initialize the cell at (0,0)
cellMap[ (0,0) ] = getAnAlignCell( 0, 0, seq1, seq2 )()
# prime it
cellMap[(0,0)].next()
# start the whole execution by sending in the initial score to cell at (0,0)
cellMap[(0,0)].send( ( (-1, -1), 0 ) )

# get the best global cell
bestCell = globalBestCellScore[0]

# continue to execute the best cell co-routine to get the alignment path
bestPath = cellMap[bestCell].next()
bestPath.reverse()

# some simple machinery to print out the alignment path
alnRes = zip(*bestPath)
print &quot;&quot;.join(alnRes[0])
print &quot;&quot;.join(alnRes[1])
</pre>
<p>The result:</p>
<pre class="brush: plain; title: ; notranslate">
$ python coAlign_v2.py
-TT-AAGTGTAGCCTTGT-GTGACATGTA-TTTTTA
TTTCTAG-GTAG--TTGTGGTGA-GTTTAGTTGATA
</pre>
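<p>The prime-then-send pattern that every cell above relies on can be seen in a much smaller setting. This is only a sketch under stated assumptions: the cell function, its arguments and the sender labels are hypothetical, it uses an ordinary while loop for brevity, and it calls the built-in next() (available in Python 2.6 and later) instead of the .next() method used in the listing:</p>

```python
def cell(name, inbox_size, log):
    """A coroutine that waits for inbox_size scores, then records the best one."""
    received = []
    while len(received) < inbox_size:
        sender, score = yield          # suspend until someone calls send()
        received.append((score, sender))
    best_score, best_sender = max(received)
    log.append((name, best_sender, best_score))
    yield best_score                   # hand the result back to the last sender

log = []
c = cell("c", 3, log)
next(c)                                # prime: run up to the first yield
c.send(("left", -4))
c.send(("up", -4))
result = c.send(("diag", 4))           # third message triggers the computation
print(result)
```

<p>A generator has to be advanced to its first yield before it can accept a value, which is why each newly created cell in the listing is primed right after it is registered in cellMap.</p>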
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2011/04/10/how-to-implement-the-needleman%e2%80%93wunsch-alignment-algorithm-without-using-a-single-loop-in-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Yet Another Python Coroutine Fun Stuff</title>
		<link>http://blogs.infoecho.net/echo/2011/03/24/yet-another-python-coroutine-fun-stuff/</link>
		<comments>http://blogs.infoecho.net/echo/2011/03/24/yet-another-python-coroutine-fun-stuff/#comments</comments>
		<pubDate>Fri, 25 Mar 2011 07:14:37 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=330</guid>
		<description><![CDATA[It might be a totally useless python hack. Yes, it is possible to implement dynamic programming using message-passing-style python co-routines with the enhanced python generator. Here is the code. I will write some details about how this piece of code works. However, the main idea is simple (although you might need some background knowledge [...]]]></description>
			<content:encoded><![CDATA[<p>It might be a totally useless python hack, but yes, it is possible to implement dynamic programming with message-passing-style python co-routines built on the enhanced python generator. Here is the code.  I will write up some details about how this piece of code works later. The main idea is simple, although you might need some background knowledge of sequence alignment algorithms.  We create a co-routine for each alignment cell. The alignment score is computed by passing the best score on to the neighboring cells. The backtracking is also implemented by passing messages backward.</p>
<pre class="brush: python; title: ; notranslate">
matchScore, mismatchScore, gapScore = 4, -5, -3
seq1 = &quot;AGTGTAGTTGTGTGAATGTATTTTTAT&quot;
seq2 = &quot;AGGTAGTTGTGGTGATTTAGTTGATAT&quot;

cellMap = {}
globalBestCellScore = [None, -100]

def getAnAlignCell(x, y, p):
    def f():
        global globalBestCellScore
        global cellMap
        b1, b2 = p
        cell1Id, s1 = yield
        cell2Id, s2 = yield
        cell3Id, s3 = yield
        cellData = [ (cell1Id, s1), (cell2Id, s2), (cell3Id, s3) ]
        cellData.sort( key=lambda x: -x[1] )
        bestCell, bestScore = cellData[0]
        if bestScore &gt; globalBestCellScore[1]:
            globalBestCellScore = [ (x,y), bestScore ]
        if x+1 &lt; len(seq1) and y+1 &lt; len(seq2):
            if b1 == b2:
                cellMap[ (x+1, y+1) ].send( ((x,y), bestScore + matchScore) )
            else:
                cellMap[ (x+1, y+1) ].send( ((x,y), bestScore + mismatchScore) )
        if x+1 &lt; len(seq1):
            cellMap[ (x+1, y) ].send( ((x,y), bestScore + gapScore) )
        if y+1 &lt; len(seq2):
            cellMap[ (x, y+1) ].send( ((x,y), bestScore + gapScore) )

        path = yield
        if bestCell[0] &gt;= 0 and bestCell[1] &gt;=0 :
            if path == None:
                path = []
            path.extend( [ (x,y) ] )

            cellMap[ bestCell ].send(  path   )
        yield path
    return f

for x in range(len(seq1)):
    for y in range(len(seq2)):
        cellMap[ (x,y) ] = getAnAlignCell( x, y, (seq1[x], seq2[y]) )()
        cellMap[ (x,y) ].next()

for x in range(len(seq1)):
    cellMap[ (x,0) ].send( ( (x, -1), 0 ) )
    cellMap[ (x,0) ].send( ( (x-1, -1), 0 ) )

for y in range(len(seq2)):
    if y != 0:
        cellMap[ (0,y) ].send( ( (-1, y), 0 ) )
        cellMap[ (0,y) ].send( ( (-1, y-1), 0 ) )

cellMap[(0,0)].send( ( (-1, -1), 0 ) )

bestCell = globalBestCellScore[0]
bestPath = cellMap[bestCell].next()
bestPath.reverse()

s1 = []
s2 = []
px, py = bestPath[0]
for x,y in bestPath[1:]:
    if x - px != 0:
        s1.append(seq1[px])
    else:
        s1.append(&quot;-&quot;)
    if y - py != 0:
        s2.append(seq2[py])
    else:
        s2.append(&quot;-&quot;)
    px, py = x, y
print &quot;&quot;.join(s1)
print &quot;&quot;.join(s2)
</pre>
<p>The result seems to be correct</p>
<pre class="brush: plain; title: ; notranslate">
$ python coAlign.py
GTGTAGTTGTGTGAATGTATTT--TT-A
G-GTAGTTGTG-G--TG-ATTTAGTTGA
</pre>
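<p>For comparison, here is a sketch of the conventional loop-based recurrence, using the same match, mismatch and gap scores as defaults. It is the textbook global-alignment version, not a translation of the co-routine script: the script seeds its boundary cells with 0 and tracks the best-scoring cell, so its numbers need not agree with this function. The name nw_score is hypothetical:</p>

```python
def nw_score(a, b, match=4, mismatch=-5, gap=-3):
    """Global alignment score by the textbook Needleman-Wunsch recurrence."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

print(nw_score("AGT", "AGT"))
print(nw_score("AGT", "AT"))
```

<p>Each cur[j] depends on exactly the three neighbors that send messages to a cell in the co-routine version: the diagonal, the cell above and the cell to the left.</p>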
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2011/03/24/yet-another-python-coroutine-fun-stuff/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Python Generator Fun</title>
		<link>http://blogs.infoecho.net/echo/2011/03/23/python-generator-fun/</link>
		<comments>http://blogs.infoecho.net/echo/2011/03/23/python-generator-fun/#comments</comments>
		<pubDate>Thu, 24 Mar 2011 04:45:52 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=319</guid>
		<description><![CDATA[The following python code generates 100 by 100 = 10,000 generators and uses them to simulate a 100-step random walk 500 times. Not a particularly useful thing, but it was fun to find out you can simulate a random walk differently. I will probably try to write some dynamic programming code using the enhanced generator in python [...]]]></description>
			<content:encoded><![CDATA[<p>The following python code creates 100 by 100 = 10,000 generators and uses them to simulate a 100-step random walk 500 times. It is not a particularly useful thing, but it was fun to find out that you can simulate a random walk differently.  I will probably try to write some dynamic programming code using the enhanced generators in python (a co-routine-like construct) if I find some time to work on it. </p>
<pre class="brush: python; title: ; notranslate">

import random

maxStep = 100
fmap = {}
def getFun(i,j):
    def f():
        path = [(i,j)]
        while 1:
            if i &lt; maxStep - 1:
                path.extend( fmap[ (i+1, j+1) ].next() if random.uniform(0,1) &gt; 0.5 else fmap[ (i+1, j) ].next() )
            yield path
            path = [(i,j)]
    return f

for i in range(maxStep):
    for j in range(maxStep):
        f = getFun(i,j)()
        fmap[ (i,j) ] = f

for i in range(500):
    print i, [ x[1] for x in fmap[ (0,0) ].next() ]
</pre>
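<p>The same idea at a smaller scale, as a sketch that is easier to step through. It assumes a seeded random.Random instance so that repeated runs agree, and it uses the built-in next() rather than the .next() method; build_walkers and walker are hypothetical names:</p>

```python
import random

def build_walkers(max_step, rng):
    """Create one generator per (step, position) cell, as in the script above."""
    walkers = {}

    def walker(i, j):
        while True:
            path = [(i, j)]
            if i + 1 < max_step:
                # move up by one or stay, then let that cell extend the path
                nxt = (i + 1, j + 1) if rng.random() > 0.5 else (i + 1, j)
                path.extend(next(walkers[nxt]))
            yield path

    for i in range(max_step):
        for j in range(max_step):
            walkers[(i, j)] = walker(i, j)
    return walkers

rng = random.Random(0)           # fixed seed so repeated runs agree
walkers = build_walkers(5, rng)
walk = next(walkers[(0, 0)])
print([j for _, j in walk])
```

<p>Because every generator only ever asks a generator from the next step, no generator is resumed while it is already running, so the chain can be reused for as many walks as needed.</p>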
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2011/03/23/python-generator-fun/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Review on Mitchell Model&#8217;s book, &#8220;Bioinformatics Programming Using Python&#8221;</title>
		<link>http://blogs.infoecho.net/echo/2010/02/15/review-on-mitchell-models-book-bioinformatics-programming-using-python/</link>
		<comments>http://blogs.infoecho.net/echo/2010/02/15/review-on-mitchell-models-book-bioinformatics-programming-using-python/#comments</comments>
		<pubDate>Mon, 15 Feb 2010 16:07:37 +0000</pubDate>
		<dc:creator>Jason Chin</dc:creator>
				<category><![CDATA[comment]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blogs.infoecho.net/echo/?p=300</guid>
		<description><![CDATA[I am helping a local Python interests group for a review of the book &#8220;Bioinformatics Programming Using Python&#8221; by Mitchell Model. Here is my review. &#8212; Comparing to Perl, Python has a quite lagged adoption as the scripting language of choice in the field of bioinformatics, although it is getting some momentum recently.   If you [...]]]></description>
			<content:encoded><![CDATA[<p>I am helping a local Python interest group by reviewing the book &#8220;Bioinformatics Programming Using Python&#8221; by Mitchell Model. Here is my review.</p>
<p>&#8212;</p>
<p>Comparing to Perl, Python has a quite lagged adoption as the scripting language of choice in the field of bioinformatics, although it is getting some moment recently.   If you read job descriptions for bioinformatics engineer or scientist positions a few year back, you barely saw Python mentioned, even as a &#8220;nice to have optional skill&#8221;.  One of the reasons is probably lacking of good introductory level bioinformatics books in Python so there are, in general, less people thinking Python as a good choice for bioinformatics.   The book &#8220;Beginning Perl for Bioinformatics&#8221; from O Reilly was published in 2001.  Almost one decade later, we finally get the book &#8220;Bioinformatics Programming Using Python&#8221; from Mitchell Model to fill the gap.</p>
<p>When I first skimmed the book &#8220;Bioinformatics Programming Using Python&#8221;, I got the impression that this book was more like &#8220;learning python using bioinformatics as examples&#8221; and felt a little bit disappointed as I was hoping for more advanced content.  However, once I went through the book, reading the preface and everything else chapter by chapter, I understood the main target audiences that author had in mind and I thought the author did a great job in fulfilling the main purpose.</p>
<p>In modern biological research, scientists can easily generate large amount of data where Excel spreadsheets that most bench scientists use to process limiting amount of data is no longer an option.  I personally believe that the new generation of biologists will have to learn how to process and manage large amount inhomogeneous data to make new discovery out of it.  This requires general computational skill beyond just knowing how to use some special purpose applications that some software vendor can provide.  The book gives good introduction about practical computational skills using Python to process bioinformatics data.  The book is very well organized for a newbie who just wants to start to process the raw data their own and get into a process of learning-by-doing to become a Python programmer.</p>
<p>The book starts with an introduction on the primitive data types in Python and moves toward the flow controls and collection data type with emphasis on, not surprisingly, string processing and file parsing, two of most common tasks in bioinformatics. Then, the author introduces the object-oriented programming in Python. I think a beginner will also like those code templates for different patterns of data processing task in Chapter 4.  They summarize the usual flow structure for common tasks very well.</p>
<p>After covering the basic concepts of programming with Python, the author focuses on other utilities that are very useful in the day-to-day work of gathering, extracting, and processing data from different sources. For example, the author discusses how to explore and organize files with Python at the OS level, how to use regular expressions to extract data from complicated text files, XML processing, web programming for fetching online biological data and sharing data through a simple web server, and, of course, how to use Python to interact with a database. Each of these topics could deserve its own book if covered in depth. The author does a good job of covering all of them concisely. This helps readers see what can be done easily with Python and, if they want, learn more about any of those topics from other resources.  The final touch of the book is on structured graphics. This is a very wise choice, since most bioinformatics data is likely to end up as graphs in presentations and publications.  Again, there are many other Python packages that can help scientists generate nice graphs, but the author focuses on one or two of them to show readers how to generate some graphs, and readers can learn the rest from there.</p>
<p>One thing I wish the author had also covered, at least at a beginner level, is the numerical and statistical side of bioinformatics computing with Python.  For example, NumPy and SciPy are very useful for generating statistics and evaluating the significance of results, especially when processing large amounts of data for which native Python objects are no longer efficient enough.  The numerical computation side of bioinformatics is basically missing from the book.  The other thing that might be desirable in such a book is to show that Python is a great tool for prototyping bioinformatics algorithms.  This is probably my own personal bias, but I do think it would be nice to show implementations of some basic bioinformatics algorithms in Python. This would help readers understand a little more about some of the common algorithms used in the field and get a taste of slightly more advanced programming.</p>
<p>Overall, I would not hesitate to recommend this book to anyone who would like to start processing biological data on their own with Python. Moreover, it can actually serve as a good introductory book on Python, regardless of its focus on bioinformatics examples. The book covers most basic day-to-day bioinformatics tasks and shows that Python is a great tool for them.  I think some slightly more advanced topics, especially basic numerical and statistical computation, would also help the target audience. Unfortunately, none of those topics is covered in the book. That said, even if you are an experienced Python programmer in bioinformatics, the book&#8217;s focus on Python 3 and its many useful templates might serve well as a quick reference when you need to do something you have no direct experience with.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.infoecho.net/echo/2010/02/15/review-on-mitchell-models-book-bioinformatics-programming-using-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

