Archive for the 'Python' Category

Minimal Sphinx setup for autodocumenting Python modules

Here’s how to get nice automatic documentation of your Python code using Sphinx. Sphinx can automagically slurp in all your docstrings, format them nicely, and render them as HTML or PDF output. Your docstrings end up looking so nice that sometimes it makes you want to write more of them (which of course would result in better-documented code)!
Continue reading ‘Minimal Sphinx setup for autodocumenting Python modules’
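
As a rough sketch of what such a setup boils down to, the fragment below shows the two settings a conf.py typically needs for autodoc. The directory layout (a docs/ folder sitting next to the code) and the module name mymodule mentioned afterwards are assumptions for illustration, not details taken from the full post.

```python
# conf.py (fragment), assuming it lives in docs/ with the code one level up
import os
import sys

# Put the project root on sys.path so autodoc can import the modules
sys.path.insert(0, os.path.abspath('..'))

# Enable the extension that pulls docstrings into the documentation
extensions = ['sphinx.ext.autodoc']
```

With that in place, a directive such as ".. automodule:: mymodule" with the ":members:" option in one of the .rst files asks Sphinx to document the members of the (hypothetical) mymodule.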

Write Excel files with Python using xlwt

In a previous post (which turned out to be pretty popular) I showed you how to read Excel files with Python. Now for the reverse: writing Excel files.

First, you’ll need to install the xlwt package by John Machin.

The basics

In order to write data to an Excel spreadsheet, first you have to initialize a Workbook object and then add a Worksheet object to that Workbook. It goes something like this:

import xlwt
wbk = xlwt.Workbook()
sheet = wbk.add_sheet('sheet 1')

Now that the sheet is created, it’s very easy to write data to it.

# indexing is zero based, row then column
sheet.write(0,1,'test text')

When you’re done, save the workbook (you don’t have to close it like you do with a file object):

wbk.save('test.xls')

Digging deeper

Overwriting cells

Worksheet objects, by default, raise an exception when you try to overwrite a cell:

sheet.write(0,0,'test')
sheet.write(0,0,'oops')

# raises:
# Exception: Attempt to overwrite cell: sheetname=u'sheet 1' rowx=0 colx=0

To change this behavior, use the cell_overwrite_ok=True kwarg when creating the worksheet, like so:

sheet2 = wbk.add_sheet('sheet 2', cell_overwrite_ok=True)
sheet2.write(0,0,'some text')
sheet2.write(0,0,'this should overwrite')

Now you can overwrite sheet 2 (but not sheet 1).

More goodies

# Initialize a style
style = xlwt.XFStyle()

# Create a font to use with the style
font = xlwt.Font()
font.name = 'Times New Roman'
font.bold = True

# Set the style's font to this new one you set up
style.font = font

# Use the style when writing
sheet.write(0, 0, 'some bold Times text', style)

xlwt allows you to format your spreadsheets on a cell-by-cell basis or by entire rows; it also allows you to add hyperlinks or even formulas. Rather than recap it all here, I encourage you to grab a copy of the source code, in which you can find the examples directory. Some highlights from the examples directory in the source code:

  • dates.py, which shows how to use the different date formats
  • hyperlinks.py, which shows how to create hyperlinks (hint: you need to use a formula)
  • merged.py, which shows how to merge cells
  • row_styles.py, which shows how to apply styles to entire rows.

Non-trivial example

Here’s an example of some data where the dates are not formatted well for easy import into Excel:

20 Sep, 263, 1148,   0,   1,   0,   0,   1,   12.1,   13.9, 1+1, 19.9
 20 Sep, 263, 1118,   0,   1,   0, 360,   0,   14.1,   15.3, 1+1, 19.9
 20 Sep, 263, 1048,   0,   1,   0,   0,   0,   14.2,   15.1, 1+1, 19.9
 20 Sep, 263, 1018,   0,   1,   0, 360,   0,   14.2,   15.9, 1+1, 19.9
 20 Sep, 263, 0948,   0,   1,   0,   0,   0,   14.4,   15.3, 1+1, 19.9

The first column has the day and month separated by a space. The second column is year-day, which we’ll ignore. The third column has the time. The data we’re interested in is in the 9th column (temperature). The goal is to have a simple Excel file where the first column is date, and the second column is temperature.
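
To make the plan concrete, here is a sketch of the conversion applied to just the first sample line. The year 2009 is an arbitrary stand-in, since the data does not carry a year, and the variable names are only illustrative.

```python
from datetime import datetime

# First line of the sample data shown above
line = "20 Sep, 263, 1148,   0,   1,   0,   0,   1,   12.1,   13.9, 1+1, 19.9"

# Split on commas and strip the padding around each field
fields = [field.strip() for field in line.split(',')]

day_month = fields[0]            # '20 Sep'
clock = fields[2]                # '1148'
temperature = float(fields[8])   # 9th column

# Assumed year (not present in the data)
year = 2009

# Glue the pieces together and parse them in one go.
# %b expects an English month abbreviation under the default locale.
stamp = datetime.strptime('%s-%s-%d' % (day_month, clock, year),
                          '%d %b-%H%M-%Y')
```

The resulting datetime object and float are the kind of values that make Excel treat the cells as a date and a number rather than as text.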

Here’s a [heavily commented] script to do just that. It assumes that you have the data saved as weather.data.example.

'''
Script to convert awkwardly-formatted weather data
into an Excel spreadsheet using Python and xlwt.
'''

from datetime import datetime
import xlwt

# Create workbook and worksheet
wbk = xlwt.Workbook()
sheet = wbk.add_sheet('temperatures')

# Set up a date format style to use in the
# spreadsheet
excel_date_fmt = 'M/D/YY h:mm'
style = xlwt.XFStyle()
style.num_format_str = excel_date_fmt

# Weather data has no year, so assume it's the current year.
year = datetime.now().year

# Convert year to a string because we'll be
# building a date string below
year = str(year)

# The format of the date string we'll be building
python_str_date_fmt = '%d %b-%H%M-%Y'

row = 0  # row counter
f = open('weather.data.example')
for line in f:
    # separate fields by commas
    L = line.rstrip().split(',')

    # skip this line if all fields not present
    if len(L) < 12:
        continue

    # Fields have leading spaces, so strip 'em
    date = L[0].strip()
    time = L[2].strip()

    # Datatypes matter. If we kept this as a string
    # in Python, it would be a string in the Excel sheet.
    temperature = float(L[8])

    # Construct a date string based on the string
    # date format  we specified above
    date_string = date + '-' + time + '-' + year

    # Use the newly constructed string to create a
    # datetime object
    date_object = datetime.strptime(date_string,
                                    python_str_date_fmt)

    # Write the data, using the style defined above.
    sheet.write(row,0,date_object, style)
    sheet.write(row,1,temperature)

    row += 1

wbk.save('reformatted.data.xls')

Still curious? Other questions? Check out the python-excel Google group! Also check out xlutils for more functionality, which I plan to play around with next.

Python script to package Latex projects for distribution

This is probably one of those scripts that will evolve over time, but I’m posting it now in case someone can get some use out of it. My problem was this:

I had many, many figures in my working directory, but I didn’t use all of them in the Latex document. I was trying to figure out a way to send the source files — *.tex, *.cls, *.bst, *.bib, etc., plus only the image files that were actually in the document — to someone else so they could edit and compile on their own. I didn’t want to set up version control (SVN, etc.); I just wanted a tar file.

After some poking around I couldn’t find anything already made that would do this (Kile has an Archive menu item, but this doesn’t include figures). It was easy enough to get a Python script going.

This script parses an input file, looks at the various documents and figures that are included, and archives them in a tar.gz file which can then be sent to someone. Note that as it stands, it only looks two levels deep for \include tags. If I use this more I’ll have to make it recursive (it’s not obvious to me how to do that; I haven’t used recursion much before).

Consider this script a rough draft. It worked perfectly for me, but your mileage may vary.


"""
This script gathers the necessary images and files (from
an arbitrarily large number of unneeded figures) and
puts it all in a tarball for distribution.

Usage: latexpackager.py main.tex dissertation.tar.gz
"""

import sys
import re
import os
import tarfile

def find_references(f):
    r'''Returns a list of Latex files that f refers to,
    by parsing \include, \bibliography, \bibliographystyle,
    \input, etc.

    If nothing was found, returns an empty list.'''

    with open(f) as handle:
        s = handle.read()

    # Find the .tex files.  [^}]* stops at the first closing brace,
    # so two commands on one line are not merged into one match.
    texs = []
    for i in re.finditer(r"""[^%]\\include\{([^}]*)\}""", s):
        texs.append(i.groups()[0]+'.tex')

    # Find the .bib files.
    bibs = []
    for i in re.finditer(r"""[^%]\\bibliography\{([^}]*)\}""", s):
        bibs.append(i.groups()[0]+'.bib')

    # Find the styles.
    styles = []
    for i in re.finditer(r"""[^%]\\bibliographystyle\{([^}]*)\}""", s):
        styles.append(i.groups()[0]+'.bst')

    # Find the document class description file
    docclass = []
    for i in re.finditer(r"""[^%]\\documentclass\{([^}]*)\}""", s):
        docclass.append(i.groups()[0]+'.cls')

    # Look for any inputs.
    inputs = []
    for i in re.finditer(r"""[^%]\\input\{([^}]*)\}""", s):
        inputs.append(i.groups()[0]+'.tex')

    # Here is everything that was referenced in f:
    return texs + bibs + styles + docclass + inputs

def find_figures(f):
    '''Returns a list of figures found in the file.  Only
    looks in .tex files.  If not a .tex file or no figures found,
    returns an empty list.'''

    # Short circuit if not a .tex file.
    if f[-4:] != '.tex':
        return []

    includegraphics = r"""[^%].*\\includegraphics\[.*\]\{([^\}]*)\}"""
    figures = []
    with open(f) as handle:
        s = handle.read()
    matches = re.finditer(includegraphics, s)

    for match in matches:
        basename = match.groups()[0]
        if os.path.splitext(basename)[1]:
            # that is, it has an extension already.
            # This is for things like .png images.
            figures.append(basename)
        else:
            figures.append(basename + '.pdf')
            figures.append(basename + '.eps')

    return figures

main = sys.argv[1]
tarfn = sys.argv[2]

projectdir, main = os.path.split(main)
if projectdir == '':
    projectdir = os.getcwd()

keepers = find_references(main)

# Don't forget to add the main .tex file.
keepers.append(main)

# For each of those that main.tex referenced, look for more.
# These are files referenced two levels deep.

for f in keepers:
    if f[-4:] != '.tex':
        continue
    keepers.extend(find_references(f))

# Now look for graphics.

figures = []
for f in keepers:
    figures.extend(find_figures(f))

#paths = [os.path.join(projectdir, i) for i in keepers + figures]
paths = keepers + figures

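tarball = tarfile.open(tarfn, 'w:gz')
for path in paths:
    # Each extensionless figure was listed as both .pdf and .eps;
    # skip whichever candidate is not actually on disk.
    if not os.path.exists(path):
        continue
    print(path)
    tarball.add(path)
tarball.close()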
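
The two-level limit mentioned earlier could be lifted with a recursive helper. What follows is only a sketch of one way to do it, not part of the script above: the function name collect_tex_files, the root argument, and the simplified comment handling (everything after the first % on a line is ignored, even an escaped \%) are all assumptions made for the example.

```python
import os
import re

# Matches \include{name} and \input{name}; the group captures the name
INCLUDE_RE = re.compile(r'\\(?:include|input)\{([^}]*)\}')

def collect_tex_files(name, root='.', seen=None):
    """Return the set of .tex files reachable from name, at any depth.

    name is a path relative to root.  The seen set doubles as the
    result and as protection against files that include each other.
    """
    if seen is None:
        seen = set()
    if name in seen:
        return seen          # already visited: stop the recursion here
    path = os.path.join(root, name)
    if not os.path.isfile(path):
        return seen          # referenced but missing: nothing to follow
    seen.add(name)
    with open(path) as handle:
        for line in handle:
            code = line.split('%', 1)[0]   # drop the comment part
            for ref in INCLUDE_RE.findall(code):
                if not ref.endswith('.tex'):
                    ref += '.tex'
                # The recursive step: treat each reference like a new main file
                collect_tex_files(ref, root, seen)
    return seen
```

Calling collect_tex_files('main.tex') would then return every .tex file the main document pulls in, however deeply nested, and the result could be used in place of the two explicit passes over keepers.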

RPy: statistics in R from Python

R is a free, open source statistics package written by statisticians, for statisticians. Python on the other hand lacks a comprehensive statistics package. RPy allows you to combine the power of Python with the power of R for an unbeatable combination in data analysis.

Note that in order to use R from Python, you need to know a little of both . . . so the learning curve can be steep. You also need to have a feel for what would be easy in R and what would be easy in Python.

There are some detailed examples below if you want to skip right to ‘em.

I use Python for most tasks, but when I need high-powered stats, I embed R code in my Python scripts to perform the analysis.

Disclaimer: I figured all of this stuff out by trial and error. The RPy documentation, while complete, was difficult for me to make sense of when I was learning. If there’s a better way to do things, please let me know! For the details that I don’t cover here, check the online documentation.

Why use R?

You’ll need R if you want to do any sort of sophisticated (or even not-so-sophisticated) statistical analysis. There are no solid statistics libraries that I’ve come across for Python . . . but maybe that’s because R is the best possible statistics library there could be.

Be warned however that accessing R from Python can get tricky at times. I’ve tried to outline some of what I’ve learned here to make it easier for others.

Why use RPy instead of writing files out to R, then using R scripts to deal with it? I did this for a little while and found that it was too much work to maintain two separate code bases . . . one for Python, then one for R. If I changed anything in the output of a Python script, I’d have to fire up R and open my R scripts to modify and debug them. I’ve found that using RPy lets me put all my code in one spot, resulting in fewer bugs and less maintenance.

R and Python are separate . . .

I found that the easiest way to think about this is to think about doing things “inside R” or “inside Python”. Things that are to be done inside R are typically wrapped in a string (a Python string). For example, this creates a variable inside R called x with a value of 5.

from rpy import *
r('x=5')

Assuming this was typed into a fresh Python session, Python has no idea about the existence of the variable x! It works in reverse, too: R has no idea about what’s in the Python namespace. So you can do this in Python:

x = "I'm a Python string"

and the variable x inside R is still the same:

r('print(x)')  # still 5

. . . but they can talk to each other

RPy does some automatic conversions:

x_from_R = r('x')  # 5

What happened here is that RPy looked at what x was inside R, saw that it was a number, and returned that number to Python, which assigned it to the Python variable x_from_R. So that’s how you get data from R to Python: by sending a string (the name of the variable you want to retrieve from R) to the r object.

At first you might think this is how you send data from Python to R:

r('x_from_python') = x
#SyntaxError: can't assign to function call

Nope. Turns out you have to use the r.assign() function to do that:

r.assign('x_from_python', x)
r('print(x_from_python)')  # 'I'm a Python string'

So that’s how you get data from Python to R: by using the r.assign() function, first giving the name of the variable you want to be assigned in R followed by the Python object to be sent to R.

Other data types

OK, so you can get numbers back from R. And as you can imagine, strings work the same way. But what about more complex data types? This list of conversions tells you which R objects will be converted into which Python objects. It’s pretty intuitive: a string becomes a string, a list becomes a list, etc.

But then there are things like data frames in R, which have row names and column names.

It’s not on that list linked above, but an R data frame is converted to a Python dictionary. For example, the Motor Trend car data set, which comes standard in R, is a data frame.

from rpy import *
r('print(head(mtcars))') # print just the first 6 lines.  Note the variable names.

# Returns:
#                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Now send the whole thing to Python and check the keys of the dictionary that is created:

mt = r('mtcars')
mt.keys()

Note that the keys are the same as the variable names in the dataframe.

Just like you get a Python dictionary from a dataframe, you can send a dictionary to R:

r.assign('df', dict(a=1, b=2, c=3))
r('print(df)')
r('names(df)')

You may have to convert it into a dataframe once inside R though:

r('df = data.frame(df)')

R functions

So far, with the exception of r.assign(), we’ve just been sending strings to the r object. But the r object also has methods. Unfortunately, you can’t see them all using IPython’s introspection. Personally, I don’t use this functionality that much (I use r.assign() to get the data into R and then operate on it in there), but here it is for completeness.

There is a trick here. Remember, before we were sending a string to the r object and it was executing the code inside R:

r('x=5')

But when you use a method of the r object, you pass it raw Python objects. For example, you can plot a Python list in R using the plot() method of the r object:

x = [1,2,3]
r.plot(x)

There are some slight name changes though. R tends to use a “.” as a spacer in function names, like “_” tends to be used in Python. The “.” however is special in Python, so in method names of the r object, “.” is converted to “_”. For example, R’s t.test() function becomes r.t_test().

These methods of the r object are what Python sees, so that’s why their names have to be changed. On the other hand, you call an R function by its true name when you send the r object a string, like we were doing before. So both of these refer to the same underlying t-test function in R:

r.t_test
r('t.test')

This next one is tricky. First, since print is a reserved word in Python 2, it needs to have a slightly different name when you want to use the version in R. So an underscore is added to the end. Second, what’s in the parentheses is a Python string. So all that will get printed is the string, ‘x’ . . . not 5, or “I’m a Python string” or anything else.

r.print_('x') # 'x'

In practice though, if I want to print something I’ll either use Python’s print or if I want to print something from R, I’ll do this:

r('print(x)')  # prints 5
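
The two naming rules described in this section can be summed up in a few lines of plain Python. This is only an illustration of the rules as stated here; r_name_for is a made-up helper, not a function that RPy provides, and RPy’s own implementation may differ in its details.

```python
def r_name_for(attribute):
    """Translate a Python-side attribute name into the R name it stands for."""
    # A single trailing underscore only exists to dodge a Python reserved word
    if attribute.endswith('_'):
        attribute = attribute[:-1]
    # The remaining underscores stand in for R's dots
    return attribute.replace('_', '.')

# Attribute names used on the r object in this post, mapped to R names
examples = dict((name, r_name_for(name))
                for name in ('t_test', 'print_', 'plot'))
```

Here examples maps 't_test' to 't.test', 'print_' to 'print', and leaves 'plot' unchanged.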

Plotting examples

Here are a few examples of creating a plot. In each case a plot is created of the list 1,2,3. These are trivial examples, but they illustrate different ways of getting data to and from R.

Option 1: Do everything in R

You can execute arbitrary R commands by sending them as a string to the r object. Here, everything is done in R: a list is created and plotted. In this example, the variable y is never seen by Python.

from rpy import *
r("""
y = c(1,2,3)
plot(y)
""")

Note that you can send many R commands in a multi-line string.

Option 2: Use a method of the r object

Here, we start with a Python list, and then send it as the argument to the r.plot() method.

from rpy import *
y = [1,2,3]
r.plot(y)

Option 3: Get a list from R and plot it with matplotlib in Python

This is trivial because you don’t gain anything from making a list in R instead of Python, but it shows that you can send data both ways.

from rpy import *
import pylab as p
y = r('c(1,2,3)')
p.plot(y)
p.show()

Option 4: Use r.assign() to get data to R, then call it inside R

I tend to use this method a lot with large data sets. The idea is to pass the data into R once, then you can use it from inside R. The trick is to use the r.assign() method.

from rpy import *
y = [1,2,3]
r.assign('Y', y)
r('plot(Y)')

Getting help on R functions

Use the r.help() function. For example, to view the help on anova:

r.help('anova')

This displays the help on screen; it doesn’t return a string.

Non-trivial examples

Plotting and printing things are not what you’d want to use R and RPy for. Instead, you’d want to use them for things that you can’t do in available packages for Python.

Here are some examples where R can really fill in the gaps in Python’s statistical functionality. Anything you can do in R, you can do from Python. Given the wide variety of packages available for R, this is some stupendous power at your fingertips. Now to learn how to wield it!

Linear models in R

Say I have a Python script already up and running, and it returns some data . . . and I want to know if the slope of two variables is significant. I haven’t found any statistics libraries for Python, but in R this kind of functionality comes standard, in the function lm().

Viewing the help for lm(), you can see that it takes a model specification, like “y~x” which means “y on x”. Now, the components of this model specification, y and x, can either refer to variables in the R workspace (which is separate from Python, remember) or they can be variables in a dataframe which is supplied in an optional argument to lm().

So there are three steps: send the data to R, run the linear regression (which should be trivial), and get the results back out.

First, let’s set up some test data in Python:

import numpy as npy
x = npy.arange(10)
y = npy.arange(10) + npy.random.standard_normal(x.shape)

Now send it to R:

r.assign('x',x)
r.assign('y',y)

(exercise for the reader: instead of assigning x and y individually, how would you get them into R as a dataframe?)

In R, run the linear model and save it as a variable in R. Here, I’m simultaneously saving it as a Python dictionary (sneaky!)

LM = r('linear_model = lm(y~x)')

OK, here’s where it takes a little exploring. The dictionary you get back may take some navigating. Looking at it for a little bit, you might notice the ‘coefficients’ key of the dictionary LM, which in turn has two more keys: ‘(Intercept)’ and ‘x’.

{'assign': [0, 1],
 'call': <Robj object at 0xb7d3e790>,
 'coefficients': {'(Intercept)': 0.28490682478866736,
                  'x': 0.86209804871669171},
 'df.residual': 8,
 'effects': array([-13.16882479,   7.83039439,   1.22245056,   0.18398967,
         0.51108108,   0.8141431 ,  -0.45120018,  -1.1985602 ,
         1.54636612,   0.51341949]),
 'fitted.values': array([ 0.28490682,  1.14700487,  2.00910292,  2.87120097,  3.73329902,
        4.59539707,  5.45749512,  6.31959317,  7.18169121,  8.04378926]),
 'model': {'x': array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.]),
           'y': array([-0.64212347,  1.39389811,  3.06676323,  2.84957073,  3.99793052,
        5.12226093,  4.67818603,  4.7520944 ,  8.3182891 ,  8.10661086])},
 'qr': {'pivot': [1, 2],
        'qr': array([[ -3.16227766, -14.23024947],
       [  0.31622777,   9.08295106],
       [  0.31622777,   0.15621147],
       [  0.31622777,   0.0461151 ],
       [  0.31622777,  -0.06398128],
       [  0.31622777,  -0.17407766],
       [  0.31622777,  -0.28417403],
       [  0.31622777,  -0.39427041],
       [  0.31622777,  -0.50436679],
       [  0.31622777,  -0.61446316]]),
        'qraux': [1.316227766016838, 1.2663078500948464],
        'rank': 2,
        'tol': 9.9999999999999995e-08},
 'rank': 2,
 'residuals': array([-0.92703029,  0.24689324,  1.05766031, -0.02163025,  0.2646315 ,
        0.52686386, -0.77930909, -1.56749877,  1.13659789,  0.0628216 ]),
 'terms': <Robj object at 0xb7d3e780>,
 'xlevels': {}}

So if all we were after were the slope and intercept, then

slope = LM['coefficients']['x']
intercept = LM['coefficients']['(Intercept)']

But what about a P-value for the slope? It’s nowhere to be seen in that dictionary. Turns out, you need the summary() function in R, and it takes as its input a linear model (among other possible inputs, but here we’re just using a linear model). So save it in R (just in case) and simultaneously save it in Python:

summary = r('LM_summary = summary(linear_model)')

Hmm.

{'adj.r.squared': 0.88847497651170382,
 'aliased': {'(Intercept)': False, 'x': False},
 'call': <Robj object at 0xb7d3e770>,
 'coefficients': array([[  2.84906825e-01,   5.39776217e-01,   5.27823968e-01,
          6.11943659e-01],
       [  8.62098049e-01,   1.01109349e-01,   8.52639301e+00,
          2.75251311e-05]]),
 'cov.unscaled': array([[ 0.34545455, -0.05454545],
       [-0.05454545,  0.01212121]]),
 'df': [2, 8, 2],
 'fstatistic': {'dendf': 8.0, 'numdf': 1.0, 'value': 72.699377758431851},
 'r.squared': 0.90086664578818121,
 'residuals': array([-0.92703029,  0.24689324,  1.05766031, -0.02163025,  0.2646315 ,
        0.52686386, -0.77930909, -1.56749877,  1.13659789,  0.0628216 ]),
 'sigma': 0.9183712712215929,
 'terms': <Robj object at 0xb7d3e7c0>}

There’s the r-squared and adjusted r-squared,

R_squared = summary['adj.r.squared']

but no P value. What gives? Turns out Python can’t convert everything perfectly, and a little more exploration is in order. Try printing the summary from R:

r('print(LM_summary)')

Well, that makes more sense, and you can see the P value for the slope is 2.75E-5. But how to extract it from Python?

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-1.5675 -0.5899  0.1549  0.4613  1.1366 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.2849     0.5398   0.528    0.612
x             0.8621     0.1011   8.526 2.75e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.9184 on 8 degrees of freedom
Multiple R-squared: 0.9009,	Adjusted R-squared: 0.8885
F-statistic:  72.7 on 1 and 8 DF,  p-value: 2.753e-05

The trick is to match output from the summary printout in R with the dictionary returned to Python. Here, the key ‘coefficients’ in the summary dictionary holds the same table as the Coefficients block of the printout, so the P value for the slope is in the 2nd row, 4th column (indices are zero-based in Python):

P = summary['coefficients'][1,3]
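
One way to avoid miscounting columns is to attach the printed row and column labels to the numbers. This sketch uses plain lists in place of the array RPy returns, with the values copied from the output above; the variable names are only illustrative.

```python
# Column and row labels, in the order they appear in R's printout
columns = ['Estimate', 'Std. Error', 't value', 'Pr(>|t|)']
rows = ['(Intercept)', 'x']

# The numbers from summary['coefficients'], one inner list per row
coefficients = [
    [2.84906825e-01, 5.39776217e-01, 5.27823968e-01, 6.11943659e-01],
    [8.62098049e-01, 1.01109349e-01, 8.52639301e+00, 2.75251311e-05],
]

# Build {row label: {column label: value}} so lookups are by name
table = dict((row, dict(zip(columns, values)))
             for row, values in zip(rows, coefficients))

p_slope = table['x']['Pr(>|t|)']
t_slope = table['x']['t value']
```

Looking values up by label makes it obvious that the slope’s P value is the last entry of the second row, and its t value the one just before it.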

Whew, and there you have it. It takes some digging around to get what you need, but since I’ve done the work for you, you can now do linear regressions from Python. All together it looks like this (it can be wrapped in a function or class for your own reuse):

r.assign('x', x)
r.assign('y', y)
LM = r('linear_model = lm(y~x)')
summary = r('LM_summary = summary(linear_model)')
slope = LM['coefficients']['x']
intercept = LM['coefficients']['(Intercept)']
P = summary['coefficients'][1,3]

Redundancy analysis

OK, say you have this data set to perform redundancy analysis (RDA) on. First, you need the package vegan installed, which is fantastic for multivariate stats. It’s probably best to fire up R proper (from a command line, or the GUI if you have it in Windows or OSX) and run

install.packages("vegan", dep=T)

Here’s a heavily commented script, rpy-demo.py, that will:

  • load and format the data included in the script
  • send the data to R
  • perform an RDA in R
  • plot the ordination
  • save the ordination as a PNG
  • print the variance explained by constrained and unconstrained axes as well as each RDA axis.

If you have RPy installed and the vegan package installed, you should be able to just run this Python script.

Often-run analyses that you need R for can be wrapped in a class or module to encapsulate your data analysis needs, so you don’t need to clutter your code with it. Once things are set up that way, it would be as easy as

from myRstuff import lm, rda
results = lm(x,y)
ordination = rda(data)
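
A minimal sketch of what the lm half of such a module might contain, built only from the calls used earlier in this post. The module name myRstuff comes from the example above; the optional r parameter and the keys of the returned dictionary are assumptions added here, so that the function can be tried with a stand-in object when R is not installed.

```python
def lm(x, y, r=None):
    """Regress y on x in R and return slope, intercept and P value."""
    if r is None:
        # Normal use: fall back to the real r object from RPy
        from rpy import r

    # Send the data to R, fit the model, and summarize it
    r.assign('x', x)
    r.assign('y', y)
    model = r('linear_model = lm(y~x)')
    summary = r('LM_summary = summary(linear_model)')

    # Row 1 is the slope; column 3 is Pr(>|t|).
    # [1][3] works for nested lists as well as for arrays.
    return {
        'slope': model['coefficients']['x'],
        'intercept': model['coefficients']['(Intercept)'],
        'p_value': summary['coefficients'][1][3],
    }
```

Keeping the R strings inside one function means the rest of the code only ever sees ordinary Python values.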

For much, much more see the online documentation for RPy, but hopefully I gave you enough to at least get started.

Polar bar plot in Python

Here’s how to create a polar bar plot in matplotlib.


The trick is just to specify that you want polar coordinates when you create the axis. Then create a bar plot as normal.

from matplotlib.pyplot import figure, show
from math import pi

fig = figure()
ax = fig.add_subplot(111, polar=True)
x = [30,60,90,120,150,180]
x = [i*pi/180 for i in x]  # convert to radians

ax.bar(x,[1,2,3,4,5,6], width=0.4)
show()

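Note that in the above example the “right” or “clockwise-most” edge of each bar is lined up with the specified x value. (That was matplotlib’s default when this was written; since matplotlib 2.0, bars are centered on their x values by default.) You can change this by subtracting width / 2 from each of the x values to center the bars on the x-values, like this: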

from matplotlib.pyplot import figure, show
from math import pi

width = 0.4  # width of the bars (in radians)

fig = figure()
ax = fig.add_subplot(111, polar=True)
x = [30,60,90,120,150,180]

# Convert to radians and subtract half the width
# of a bar to center it.
x = [i*pi/180 - width/2 for i in x]
ax.bar(x,[1,2,3,4,5,6], width=width)
show()

Get funky . . .

The following is slightly modified from the matplotlib examples:

import numpy as npy
import matplotlib.cm as cm
from matplotlib.pyplot import figure, show, rc

# force square figure and square axes (looks better for polar, IMHO)
fig = figure(figsize=(8,8))
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8], polar=True)

N = 20
theta = npy.arange(0.0, 2*npy.pi, 2*npy.pi/N)  # evenly spaced angles
radii = 10*npy.random.rand(N)  # random bar heights
width = npy.pi/4*npy.random.rand(N) # random widths

# Create the bar plot
bars = ax.bar(theta, radii, width=width, bottom=0.0)

# Step through bars (a list of Rectangle objects) and
# change color based on its height and set its alpha transparency
# to 0.5

for r,bar in zip(radii, bars):
    bar.set_facecolor( cm.jet(r/10.))
    bar.set_alpha(0.5)

show()


Interactive subplots: make all x-axes move together

It’s very easy to make subplots that share an x-axis, so that when you pan and zoom on one axis, the others automatically pan and zoom as well. The key to this functionality is the sharex keyword argument, which is used when creating an axis. Here’s some example code and a video of the resulting interaction. Continue reading ‘Interactive subplots: make all x-axes move together’

Calculate sunrise and sunset with PyEphem

PyEphem (from the Greek word ephemeris) is the way to calculate the positions of all sorts of astronomical bodies in Python. Continue reading ‘Calculate sunrise and sunset with PyEphem’

Use Sphinx for documentation

Update: After some folks requested it in the comments, I wrote another post, A minimal Sphinx setup for autodocumenting Python modules. You might want to check this out if you’re specifically interested in automatically documenting your code with Sphinx.

I’ve been doing quite a bit of code documentation lately, and I decided to try and figure out the best tool to use. I found it. It’s called Sphinx, and you can see what the documentation looks like by checking out the documentation for Python itself (v. 2.6 and 3.0).
Here’s how to get started using Sphinx. Continue reading ‘Use Sphinx for documentation’

Insert content into TiddlyWikis with this Python script

I’ve been generating many figures, and I want to be able to find them again and browse them easily. Organizing them on disk just isn’t cutting it. My solution for now is to use a local TiddlyWiki as the glue for my figures, since I can embed figures in tiddlers (the microcontent entries that are the bread and butter of TiddlyWikis), and tag and search those entries. Bonus: I can zip everything up and send TiddlyWiki + images to my advisor so he can browse and search them as well.

Try this Python script, addtiddler.py, to insert tiddlers into an existing TiddlyWiki. You can optionally specify an image name (relative to the output file, see the documentation in the source code) to be embedded. You can use this script from the command line using options, or import it into another script.

I tried to add lots of comments so you can modify it for your own needs. Let me know if you find bugs so I can fix them.

Advanced sorting: sorting by key

The sort() method of list objects in Python is quite flexible. By default, it sorts on the first thing in each item of the list, which is exactly what you would expect. For example, a list of strings is sorted by the first letter of each string. What if you wanted to sort by the second letter of each string? Or sort a list of people’s names by last name? Continue reading ‘Advanced sorting: sorting by key’
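
As a quick sketch of the idea, the key argument takes a function that is called on each item to produce the value to sort by. The sample words and names below are made up for illustration.

```python
words = ['banana', 'apple', 'cherry']

# Sort by the second letter of each string: 'a', 'p', 'h'
by_second_letter = sorted(words, key=lambda word: word[1])

names = ['Grace Hopper', 'Alan Turing', 'Ada Lovelace']

# Sort by last name: the final whitespace-separated piece of each name
by_last_name = sorted(names, key=lambda name: name.split()[-1])

# list.sort() accepts the same key argument and sorts in place
names.sort(key=lambda name: name.split()[-1])
```

sorted() returns a new list and leaves the original alone, while the sort() method reorders the list it is called on.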