Sort one list by another list
Here are three ways of sorting one list by another list in Python. The first uses plain ol’ Python, and the other two use NumPy.
In each case imagine we want to sort a list of people’s names by their ages.
Method 1
Zip the lists together, making sure that the one to sort by is passed first to zip(). The result is a sequence of tuples. When you sort tuples, Python compares the first item in each tuple and only looks at the later items to break ties. Then use the zip(*...) trick to unzip the now-sorted tuples into separate variables.
people = ['Jim', 'Pam', 'Micheal', 'Dwight']
ages = [27, 25, 4, 9]
agesAndPeople = list(zip(ages, people))  # list() is needed on Python 3, where zip() returns an iterator
agesAndPeople.sort()
sortedAges, sortedPeople = zip(*agesAndPeople)
Note that if you want to sort in reverse, you can use agesAndPeople.sort(reverse=True).
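On Python 3, zip() returns an iterator, which has no .sort() method. One alternative is the built-in sorted(), which accepts any iterable and returns a new list. Here is a minimal sketch of the same idea (the snake_case variable names are only for this example):

```python
people = ['Jim', 'Pam', 'Micheal', 'Dwight']
ages = [27, 25, 4, 9]

# sorted() accepts any iterable, so the zip object can be passed in directly.
ages_and_people = sorted(zip(ages, people))

# Unzip the sorted pairs; zip(*...) yields tuples, not lists.
sorted_ages, sorted_people = zip(*ages_and_people)

print(sorted_ages)    # (4, 9, 25, 27)
print(sorted_people)  # ('Micheal', 'Dwight', 'Pam', 'Jim')

# Oldest first: pass reverse=True to sorted().
oldest_first = [name for age, name in sorted(zip(ages, people), reverse=True)]
print(oldest_first)   # ['Jim', 'Pam', 'Dwight', 'Micheal']
```

Wrap the unzipped results in list() if you need lists rather than tuples.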
Method 2
This method uses NumPy, and you don’t have to convert the lists into arrays yourself. The argsort() function doesn’t return the sorted ages. Instead, it returns the indices that would put the ages in sorted order, so the first index points to the smallest age (try it to see what I mean). take() is a way of applying NumPy-style indexing to a plain list; note that it returns a NumPy array rather than a list. See the next example for something that might be more straightforward for MATLAB users.
people = ['Jim', 'Pam', 'Micheal', 'Dwight']
ages = [27, 25, 4, 9]
import numpy
inds = numpy.argsort(ages)
sortedPeople = numpy.take(people, inds)
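To see what argsort() actually hands back, it can help to print the indices. This is a small sketch using the same data; the tolist() calls are only there to convert the NumPy results back to plain Python lists:

```python
import numpy

people = ['Jim', 'Pam', 'Micheal', 'Dwight']
ages = [27, 25, 4, 9]

# argsort() returns positions in the original list, ordered from the
# smallest age to the largest.
inds = numpy.argsort(ages)
print(inds.tolist())  # [2, 3, 1, 0]

# take() picks the items at those positions and returns a NumPy array;
# tolist() converts it back to a plain list.
sorted_people = numpy.take(people, inds).tolist()
print(sorted_people)  # ['Micheal', 'Dwight', 'Pam', 'Jim']
```

Index 2 comes first because ages[2] is 4, the smallest age.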
Method 3
This method also uses NumPy, but first it converts the lists into arrays. Then it uses the argsort() of one to index into the other.
people = ['Jim', 'Pam', 'Micheal', 'Dwight']
ages = [27, 25, 4, 9]
import numpy
people = numpy.array(people)
ages = numpy.array(ages)
inds = ages.argsort()
sortedPeople = people[inds]
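Because both lists are now arrays, the same index array can reorder either of them, and reversing it gives the opposite order. A short sketch, assuming the same data as above (oldest_first is a name made up for this example):

```python
import numpy

people = numpy.array(['Jim', 'Pam', 'Micheal', 'Dwight'])
ages = numpy.array([27, 25, 4, 9])

inds = ages.argsort()

# The same index array reorders both arrays consistently.
print(ages[inds].tolist())    # [4, 9, 25, 27]
print(people[inds].tolist())  # ['Micheal', 'Dwight', 'Pam', 'Jim']

# Reversing the index array gives oldest first.
oldest_first = people[inds[::-1]].tolist()
print(oldest_first)  # ['Jim', 'Pam', 'Dwight', 'Micheal']
```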
April 11th, 2008 at 9:42 pm
How isn’t the first way the best?
April 13th, 2008 at 1:49 am
Thanks, nice quick tutorial. I’ve really got to use Numpy more often, I’m sure it would make my life easier.
Just playing with the examples … here’s how to sort the list alphabetically by “people” rather than by age (for Method 1). Change the .sort() line to:
agesAndPeople.sort(key=lambda x: x[1])
The key argument takes a function that is called on each element of the list being sorted and returns the value to actually sort on. In this case, the key function is a quick on-the-fly lambda function which returns the second element of the tuple (rather than defaulting to the first element, as in Method 1 above), in effect sorting on the people’s names rather than their ages.
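For anyone who wants to try it, here is a self-contained sketch of the key-function idea written for Python 3. It uses sorted() instead of .sort(), and operator.itemgetter is a standard-library alternative to the lambda that I have swapped in for comparison; it is not part of the original example.

```python
from operator import itemgetter

people = ['Jim', 'Pam', 'Micheal', 'Dwight']
ages = [27, 25, 4, 9]

ages_and_people = list(zip(ages, people))

# Sort on the second element of each tuple (the name).
by_name = sorted(ages_and_people, key=lambda pair: pair[1])
print(by_name)
# [(9, 'Dwight'), (27, 'Jim'), (4, 'Micheal'), (25, 'Pam')]

# itemgetter(1) builds an equivalent key function.
by_name_itemgetter = sorted(ages_and_people, key=itemgetter(1))
print(by_name_itemgetter == by_name)  # True
```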
April 13th, 2008 at 5:03 pm
@Matt:
You got me thinking about whether or not the first way is best. Turns out sometimes the first method is best . . . but sometimes it’s not. For more: http://scienceoss.com/test-the-speed-of-your-code-interactively-in-ipython/.
@Andrew Perry:
Good call on using the key function to sort by. Now that I think about it, it might be worth a separate post with some more examples of using “sort(key=someFunc)”. Stay tuned.
April 14th, 2008 at 8:22 pm
Inspired by Andrew I learned some more about using a key function to sort by. My notes are here: http://scienceoss.com/advanced-sorting-sorting-by-key/