Monthly Archive for December, 2007

NumPy arrays vs Matlab matrices

NumPy has the same functionality as Matlab in terms of arrays (maybe a little more) but there are some syntax differences in creating and indexing arrays that confused me at first when switching from Matlab to NumPy. Continue reading ‘NumPy arrays vs Matlab matrices’

Check SSH login attempts

cat /var/log/auth.log | grep sshd

To check the zipped ones, use

zcat /var/log/auth.log.1.gz | grep sshd
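If you would rather do the same filtering from a Python script, here is a minimal sketch (written for Python 3). The function name `sshd_lines` is my own invention for illustration; it simply picks `gzip.open` or the built-in `open` based on the file extension, so the plain and zipped logs can be read the same way.

```python
import gzip


def sshd_lines(path):
    """Yield every line that mentions sshd from a plain or gzipped log file."""
    # Rotated logs end in .gz, so choose the opener from the extension.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        for line in handle:
            if "sshd" in line:
                yield line.rstrip("\n")
```

Looping over `sshd_lines("/var/log/auth.log")` and `sshd_lines("/var/log/auth.log.1.gz")` should give the same lines as the two commands above, assuming you have permission to read those files.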

Sending command line options to Python scripts

If you have some different options in your program and you want to turn them on or off, or feed your functions different arguments, then you can specify all of this from the command line.

You can read about the details of the optparse module here, but here are the basics: Continue reading ‘Sending command line options to Python scripts’
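As a rough idea of what those basics look like, here is a small sketch using the standard library's `optparse`. The option names (`--verbose`, `--number`) and the `parse_args` helper are made up for illustration and are not taken from the full post; the code is written for Python 3.

```python
from optparse import OptionParser


def parse_args(argv):
    """Parse a list of command line arguments (without the program name)."""
    parser = OptionParser(usage="usage: %prog [options] infile")
    # An on/off switch: present means True, absent means False.
    parser.add_option("-v", "--verbose", action="store_true",
                      dest="verbose", default=False,
                      help="print progress messages")
    # An option that takes a value and converts it to an int.
    parser.add_option("-n", "--number", type="int",
                      dest="number", default=10,
                      help="how many records to process [default: %default]")
    options, args = parser.parse_args(argv)
    return options, args


options, args = parse_args(["-v", "--number", "5", "data.txt"])
print(options.verbose, options.number, args)  # True 5 ['data.txt']
```

In a real script you would call `parser.parse_args()` with no arguments so that it reads `sys.argv[1:]`; passing an explicit list here just makes the example easy to run anywhere.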

Test your code: so easy there’s no excuse!

I had heard of unittest and how I really needed to use it to make sure my code is doing what I expect . . . but it just seemed so clunky. Plus, a program would have to reach a certain threshold of complexity before I would make the effort to test with unittest.

Then I ran across doctest. And it is so astoundingly easy to use that I might start writing tests even for one-line scripts. Continue reading ‘Test your code: so easy there’s no excuse!’
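To give a feel for how little is involved, here is a minimal sketch (Python 3). The function `gc_content` is a hypothetical example of mine, not something from the full post. The tests are just an interactive session pasted into the docstring, and `doctest.testmod()` re-runs them and complains only if the output differs.

```python
import doctest


def gc_content(seq):
    """Return the fraction of G and C bases in a DNA string.

    >>> gc_content("GGCC")
    1.0
    >>> gc_content("ATGC")
    0.5
    >>> gc_content("")
    0.0
    """
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Silent when every example passes; prints a report for each failure.
    doctest.testmod()
```

Running the file with `python -m doctest -v yourfile.py` is another way to get a line-by-line report of which examples were tried.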

A quick codon table

Sometimes it’s nice to have a codon table handy. Rather than typing one out by hand, the Translate module contains the codon table of your choice in dictionary form: Continue reading ‘A quick codon table’
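If BioPython is not installed, a dictionary of the same shape can be built in a few lines of plain Python. This is a stand-in sketch, not the BioPython module's own code: it assumes the standard genetic code (NCBI translation table 1), with `*` marking stop codons, and the names `codon_table` and `translate` are my own.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# with the third base changing fastest. '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
codon_table = dict(zip(codons, AMINO_ACIDS))


def translate(dna):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(codon_table[dna[i:i + 3]] for i in range(0, usable, 3))


print(codon_table["ATG"])      # M
print(translate("ATGGCCTAA"))  # MA*
```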

Read FASTA files with BioPython

Here are several ways to parse a FASTA file into BioPython Seq objects.

Getting a FASTA file into Python is as simple as importing the necessary functions from BioPython, opening the file, and calling a parser on the file. Then you have sequences in BioPython that can be readily used. There are a couple of different ways to parse the file, depending on your preference. Choices are: Continue reading ‘Read FASTA files with BioPython’

Reformat HTML with Tidy

I had some HTML from a Joomla! site that I wanted to import to this site. The editor I had set up put all the HTML into one huge line, but I wanted to make it nicer to look at so I could maintain it. After a quick look around, I found tidy (official page or old but well-written documentation).

It was available in the Ubuntu (Feisty) repositories, so that’s how I installed it. There’s no GUI; it’s all command line. Here are the options I’ve been using with it:


tidy -f err.txt -w 1000 -m test.html

where:

  • -f err.txt writes any errors to the file err.txt
  • -w 1000 wraps lines at 1000 characters (a quick look for an option to turn off line wrap yielded nothing, but this works just as well)
  • -m reformats the input file, here test.html, in place.

So I copy the poorly-formatted code into a file called test.html, run tidy on it, and the same file is then reformatted nicely.

Software list

I use a lot of open source software every day. Here’s what I use the majority of the time:
Python 2.5, with the modules

  • matplotlib for plotting and Matlab-like functionality
  • NumPy for fast math
  • RPy for using R’s extremely powerful statistics functions from within Python scripts
  • BioPython for manipulation of genomic data and running alignments, BLAST, and NCBI searches from within Python scripts
  • MySQLdb for interfacing with MySQL databases

gvim as a text editor for writing scripts
IPython as an interactive shell.

Personal reasons why I like Python

  • Makes good use of my time — I can code quickly in it with a surprisingly small number of bugs
  • Interactive debugger for squashing those bugs (pdb)
  • Python has everything I like about Perl (text processing, quickly write code, free) without the syntax overhead ($, @, %, {}, $_, ; )
  • Python has everything I like about Matlab (plotting, fast math, interactive interpreter) without the cost ($100s with toolboxes)
  • Quick enough for small scripts but deep enough for larger programs, GUI and all
  • Syntax that reads like pseudocode
  • IPython

Store Python objects so you can use them later with “shelve”

I had a couple of Python lists that I needed to use in another script on another computer. The annoying way would be to write them out to a tab-delimited file, but luckily the shelve module (standard with Python) makes things much easier. It’s the equivalent of Matlab’s .mat files, where you store variables for later use. Here’s how to use it: Continue reading ‘Store Python objects so you can use them later with “shelve”’
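In outline, a shelf behaves like a dictionary that lives on disk. Here is a minimal sketch for Python 3; the file name `mydata`, the keys, and the two lists are placeholders I made up, and the temporary directory is only there so the example runs anywhere.

```python
import os
import shelve
import tempfile

# Hypothetical data and location, chosen only for this example.
gene_names = ["geneA", "geneB", "geneC"]
expression = [1.5, 0.2, 7.0]
path = os.path.join(tempfile.mkdtemp(), "mydata")

# Store: assign to keys just as you would with a dictionary.
with shelve.open(path) as db:
    db["gene_names"] = gene_names
    db["expression"] = expression

# Later, possibly in another script: open the same path and read the keys back.
with shelve.open(path) as db:
    restored_names = db["gene_names"]
    restored_expression = db["expression"]

print(restored_names)
print(restored_expression)
```

One caveat when moving shelves between computers: the on-disk format depends on which dbm backend Python finds on each machine, so it is worth checking that the file opens on the second computer before relying on it.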

Working with BioPython Sequences

Much of BioPython uses Seq objects for dealing with sequences of all kinds. Here’s how to create, get the complement, transcribe, and translate sequences, either from scratch or from a FASTA or GenBank file. Continue reading ‘Working with BioPython Sequences’