Archive for category comment

Review of Mitchell Model’s book, “Bioinformatics Programming Using Python”

I am helping a local Python interest group by reviewing the book “Bioinformatics Programming Using Python” by Mitchell Model. Here is my review.

Compared with Perl, Python has been slow to gain adoption as the scripting language of choice in bioinformatics, although it has been gaining momentum recently. If you read job descriptions for bioinformatics engineer or scientist positions from a few years back, you rarely saw Python mentioned, even as a “nice to have” optional skill. One reason is probably the lack of good introductory bioinformatics books based on Python, so, in general, fewer people have thought of Python as a good choice for bioinformatics. The book “Beginning Perl for Bioinformatics” from O’Reilly was published in 2001. Almost a decade later, we finally get “Bioinformatics Programming Using Python” by Mitchell Model to fill the gap.

When I first skimmed “Bioinformatics Programming Using Python”, I got the impression that it was more like “learning Python using bioinformatics as examples”, and I felt a little disappointed because I was hoping for more advanced content. However, once I went through the book, reading the preface and everything else chapter by chapter, I understood the main target audience that the author had in mind, and I think the author did a great job of serving it.

In modern biological research, scientists can easily generate amounts of data so large that the Excel spreadsheets most bench scientists use to process limited amounts of data are no longer an option. I personally believe that the new generation of biologists will have to learn how to process and manage large amounts of inhomogeneous data in order to make new discoveries from it. This requires general computational skills beyond just knowing how to use the special-purpose applications that software vendors provide. The book gives a good introduction to practical computational skills, using Python to process bioinformatics data. It is very well organized for a newbie who wants to start processing raw data on their own and to become a Python programmer through learning by doing.

The book starts with an introduction to the primitive data types in Python and moves on to flow control and collection data types, with an emphasis on, not surprisingly, string processing and file parsing, two of the most common tasks in bioinformatics. Then the author introduces object-oriented programming in Python. I think a beginner will also like the code templates for different patterns of data-processing tasks in Chapter 4. They summarize the usual flow structure for common tasks very well.

After covering the basic concepts of programming with Python, the author focuses on other utilities that are very useful in day-to-day work for gathering, extracting, and processing data from different sources. For example, the author discusses how to explore and organize files with Python at the OS level, how to use regular expressions to extract data from complicated text files, XML processing, web programming for fetching online biological data and sharing data through a simple web server, and, of course, how to use Python to interact with a database. Each of these topics, treated in depth, might deserve its own book. The author does a good job of covering all of them concisely. This will help people see what can be done very easily with Python and, if they want, learn more about any of those topics from other resources. The final part of the book is on structured graphics. This is a very wise choice, since most bioinformatics data is likely to end up as graphs used in presentations and publications. Again, many Python packages can help scientists generate nice graphs, but the author focuses on one or two of them to show readers how to generate some graphs, and the reader may be able to learn others from there.

One thing I wish the author had also covered, at least at a beginner level, is the numerical and statistical side of bioinformatics computing with Python. For example, NumPy and SciPy are very useful for processing large amounts of data, generating statistics, and evaluating the significance of results, especially when native Python objects are no longer efficient enough. The numerical computation aspect of bioinformatics is basically missing from the book. The other thing that might be desirable in such a book is to show that Python is a great tool for prototyping bioinformatics algorithms. This is probably my own personal bias, but I do think it would be nice to show implementations of some basic bioinformatics algorithms in Python. This would help readers understand a little more about some of the common algorithms used in the field and get a taste of slightly more advanced programming.
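
To illustrate the kind of example I have in mind, here is a minimal sketch of my own (it is not taken from the book): a global alignment score in the style of Needleman-Wunsch, written in plain Python. The function name `global_alignment_score` and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen only for illustration, and the sketch returns the score only, not the alignment itself.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of sequences a and b.

    Hypothetical helper for illustration; the scoring scheme is an
    assumption (simple match/mismatch values and a linear gap penalty).
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    # Row 0: aligning an empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or put a gap in b (up) or a gap in a (left).
            up = prev[j] + gap
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

Keeping only two rows of the dynamic programming table is enough when only the score is needed; recovering the alignment itself would require keeping the full table or traceback pointers.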

Overall, I would not hesitate to recommend this book to anyone who would like to start processing biological data on their own with Python. Moreover, it can actually serve as a good introductory book on Python in general, despite its focus on bioinformatics examples. The book covers most basic day-to-day bioinformatics tasks and shows that Python is a great tool for them. I think some slightly more advanced topics, especially basic numerical and statistical computation, would also help the target audience; unfortunately, none of them are covered in the book. That said, even if you are an experienced Python programmer in bioinformatics, the book’s focus on Python 3 and its many useful templates might serve well as a quick reference when you are looking for something you have no direct experience with.

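A Remarkable Piece of Writing to Share

Today, in the twenty-first century, an editorial in the overseas edition of one of Taiwan’s major newspaper groups contains the following sentences:

“In this ‘century-old shop’, the 58-year-old Ma Ying-jeou is a rising star in his prime, like the sun at high noon.”

“He led the Kuomintang to rise again after its fall, rallied the people of Taiwan, especially the younger generation, and ended the pro-independence regime; this is the very model of ‘youth creating the era’.”

“Einstein’s ‘theory of relativity’ transformed the mysteries of science over the past century, while the teachings of Confucius and Mencius have shaped society and people’s hearts for more than a thousand years, so much so that even followers of Marx cannot help but believe in them.”

“Encouraging young people to learn from Ma Ying-jeou is by no means some kind of ‘idol worship’, nor is it meant to create a ‘new god’. Rather, it takes an example close at hand, using facts that everyone can see, to urge the young generation that carries on the past and opens up the future to forge and temper themselves well, and to live up to the expectation of ‘youth creating the era’.”

Having lived abroad for a long time, I do not know much about Ma Ying-jeou and have no opinion of him. But after reading this piece, I could not help recalling the era when even elementary school compositions had to be about “rescuing our mainland compatriots from deep water and scorching fire” and “unifying China under the Three Principles of the People”. Perhaps, just perhaps, a certain sage ruler can avoid kowtowing to dictatorship and complete the great cause of opposing communism and recovering the nation. Then there would be no need to go and pay respects every year.

Disclaimer: When I was young and ignorant, I probably wrote plenty of such remarkable pieces myself, to pass exams or to wangle an excused absence, but that was the shared historical baggage of the last century!

One of the original sources of the piece

Where I first found the piece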


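My one-day trip to Lugradio, San Francisco, 2008

I heard from PingYeh that Lugradio was being held in San Francisco this weekend. On a whim, I decided to ask my wife and daughter for a day off to go and see what was going on.

Although there were a few years, after I installed Linux with kernel version 0.97 in 1993 or 1994, when I used nothing but Linux, this was my first time attending a Linux / open source community event. By the time open source events started to take off in Taiwan, I was no longer there, and while in the US, between my studies and my own laziness, I never looked into whether there were any fun events to attend. So for me, joining in this time felt very fresh.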

Img 4957

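I arrived at the venue a little after eleven, paid ten US dollars on the spot, registered, picked up my name badge and a small bag of sponsor giveaways, and went in to look around. The venue was CITY VIEW, on the top floor of the Metreon theater complex in San Francisco. On my earlier visits to the Metreon, I had never heard that there was a place there where events could be held, and with a nice city view too. Three talks ran at the same time, so you could pick the one that interested you most. If you did not feel like listening, you could browse the vendor exhibits. I sat in on a few talks at random: someone from Second Life talked about their open source strategy, someone from Bungee Connect demonstrated their development platform, someone from VMware showed virtual machine streaming, and Aza Raskin of Humanized discussed user interfaces, among others. Most were quite interesting, but there was not much technically deep discussion; most of it stayed at a fairly abstract level. That was fine too. Judging from the questions from the audience, many attendees care a great deal about the development of open source.

As for the other vendor exhibits, I went around collecting quite a few live CDs offered by Linux vendors. Among all the exhibits, though, the most interesting to me were two from hardware vendors. One was TI’s beagleboard, a single-chip computer that can run Linux. Once TI releases it in June, I probably will not be able to resist spending some money on one to play with.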

Img 4954

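The other interesting thing was that I finally saw the legendary OLPC. It really is a cute thing that you cannot help playing with. Unfortunately, I could only look at this fun laptop.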

Img 4955

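The last talk of the day ran until five. I had planned to skip the evening party and was about to head home. Before I left the venue, a software engineer with a great interest in amateur biotech learned that I work at a biotech company and excitedly discussed with me whether there were any biotech projects that could be done at home. We chatted for an hour before he let me go home.

Lugradio has another full day of events tomorrow, but I have other things to do and cannot go. Still, today’s one-day trip was very rewarding. I unexpectedly came away with plenty of inspiration that I cannot get at work or at home, and a San Francisco city view that I do not often get to see!!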
Img 4960

ad$ense or ad$pam?

200804022149

I wish I had a little more virtual memory so I could convert virtual money into real money.

Tomorrow

Tomorrow, a new day with new challenges! What a great feeling to be entering the next stage of my career! Although I had to make a tough decision, it is indeed time to move on. I am really feeling the excitement of a new environment and a new career path now.


Maker Faire 2007

I went to Maker Faire yesterday. I have been a Make magazine subscriber since its debut, and I like to hack and make things. However, over the last year I worked so hard on a project for my job (in systems biology) that I did not feel like doing anything else. Since the project is near its end and I am looking for new opportunities in my life, it might be a good time to revive the hacker/maker spirit in my heart. So I could not miss the second Maker Faire.

My family and I arrived at the faire around 10 o’clock. We first spent some time at the Robot Pavilion looking at quite a few interesting robots. Some of them were quite extraordinary, e.g., robots driven by steam engines, a self-balancing robot, and a home-made Segway.

Although there was a schedule of presentations and demonstrations, with a toddler less than two years old we just did a random walk around the fairground afterward.

The largest Maker demonstration was at ___. Half of the hall was for “Maker” and half was for “Crafter”. Joy joked that most women could stay in the crafter part and most men could stay in the maker part, which made the “Maker/Crafter Faire” a great event for families. One could find some of the projects from Make demonstrated in the maker part. There was also a DIY lab for making a “Ybox”, provided by Yahoo as a sponsor. It was interesting to see that Yahoo seemed to be putting more effort into this “Maker” event. Compared to Yahoo, Google only set up a few computers demonstrating Sketcher. On the other hand, Microsoft did demonstrate some of their cool lab technologies. I talked to an engineer who showed me their new web mash-up tool, Popfly. It seemed quite interesting, but I have yet to try it out myself. Well, at least my daughter liked their “ducks”.

In the main Make demonstration, what I found quite interesting was the 3D metal sculptures by Bathsheba. This was the first time I had seen the output of a 3D printer that uses metal as the material. It is just amazing to see new technology combined with the beauty of math and science. Although I had seen some of these mathematical surfaces on a computer screen, it was much more exciting to see them as “real things”. It was very tempting to buy one of the sculptures, and I will probably order one eventually, or make one myself in the future. I think I could do that by joining TechShop and using their 3D ABS printer to construct some 3D sculptures too.

There were also all kinds of bikes and human-powered machines, and quite a few DIY activities for kids and adults. My one-and-a-half-year-old daughter, Emily, was still too small for most of those activities, but she seemed to like the faire a lot. Oh, she liked the Wii-remote-controlled RC cars. Ha, this might be a good excuse to buy a Wii, for her and for me (to hack).

Later on, I found the booth where Sun was demonstrating “Sunspot”, their Java-powered gadget for embedded system development. The “Sunspot” seemed a cute and powerful little machine for all sorts of purposes, but it was a little bit expensive.

Overall, my family and I had a fun and inspiring day. All I hope is that I will be able to find time to develop some cool projects myself in the near future.

What was interesting:

IMG_3502.JPG

3D Metal sculptures by Bathsheba: exploring how math, science and sculpture meet (web site). ProMetal is the company that makes the sculptures, using a technology that can “print” metal layer by layer.

TechShop: a club whose members can use its machines for prototyping and DIY. The membership costs $100 per month. I find it very attractive. If I decide to make something cool, no doubt I will get a membership and learn to use all the cool machines they have. They have a 3D ABS printer (from Scicon?) that can be used for fast prototyping.

Lots of robots.

Sunspot

Popfly

Moto labs

STL file format for 3D sculpture/prototype.

Pick the nose!!
IMG_3511.JPG

Yahoo, Ybox.tv

puzzle maker

some pictures


Viacom vs. google

Well, actually, I just want to test video embedding. This clip is pretty
funny anyway.


The Chariman of Google

They must have forgotten to google “Chariman Schmidt”.
See also “Look up Chariman”.

This video about “googleTube” is fun too.


A loop

Today, I was googling a term from stochastic processes used in finance. After a few link jumps, I found a recent talk by my former adviser about the research we did a few years ago. I took a look at what he said in the talk. It was kind of a strange feeling. Recently, I have started to seriously consider leaving academia. I don’t feel that I am good enough to work in academia.


A Mess

This past year, my life has been a mess.
