In the previous post I wrote about the measures of central tendency and how to calculate the mean in Python.
Now let’s see how doing it for the median (the number in the middle).

The core part is that the input list is sorted (using the standard list function called sorted) and then it is easy to get the number in the middle: newList = sorted(currentList)  creates a new list based on the input and sort it; this is useful if you want to re-use the input list later for other scopes. There is also a function called sort() which will do the sorting in place on the input list. But then the original list will be sorted …. [See also the note at the end of the post]
Here is the complete function, with initial description in a comment section. A couple of examples / unit tests follow.

def median(dataPoints):
 dataPoints: a list of data points, int or float
 returns: the middle number in the sorted list, a float or an int
 sortedPoints = sorted(dataPoints)
 mid = len(sortedPoints) / 2
 if (len(sortedPoints) % 2 == 0):
   # even: need to get the average of the two mid numbers
   return (sortedPoints[mid-1] + sortedPoints[mid]) / 2.0
   # odd: there is only one number 
   return sortedPoints[mid]

Unit tests 
X = [10.3, 4.1, 12, 15.5, 20.2, 5.8, 15.5, 4.1]
Y = [10, 4, 12, 15, 20, 5, 7]
print ("median X = ", median(X)) # returns 11.15 = mean(10.3, 12)
print ("median Y = ", median(Y)) # returns 10

Note: this is a somehow different topic but the behaviour above is due by how the arguments are evaluated when passed to a function. Usually languages (as C or Java) call by-value: the function receives a copy of the argument objects passed to it by the caller; other languages (as PHP or Perl) call by-reference:  inside the function scope, the argument is essentially a complete alias for the variable passed in by the caller. Python calls by-object reference: the function receives a reference to the same object as used by the caller; however, it does not receive the variable that the caller is storing this object in – as in pass-by-value, the function provides its own box and creates a new variable for itself. Wikipedia has more about it.


2 thoughts on “Median

  1. Pingback: Introduction to Python package NumPy | Look back in respect

  2. Pingback: Quartiles and summary statistics in Python | Look back in respect

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s