In the previous post I wrote about the measures of central tendency and how to calculate the mean in Python.
Now let’s see how doing it for the median (the number in the middle).
The core part is that the input list is sorted (using the standard list function called sorted) and then it is easy to get the number in the middle: newList = sorted(currentList) creates a new list based on the input and sort it; this is useful if you want to re-use the input list later for other scopes. There is also a function called sort() which will do the sorting in place on the input list. But then the original list will be sorted …. [See also the note at the end of the post]
Here is the complete function, with initial description in a comment section. A couple of examples / unit tests follow.
def median(dataPoints): """ dataPoints: a list of data points, int or float returns: the middle number in the sorted list, a float or an int """ sortedPoints = sorted(dataPoints) mid = len(sortedPoints) / 2 if (len(sortedPoints) % 2 == 0): # even: need to get the average of the two mid numbers return (sortedPoints[mid-1] + sortedPoints[mid]) / 2.0 else: # odd: there is only one number return sortedPoints[mid] ''' Unit tests ''' X = [10.3, 4.1, 12, 15.5, 20.2, 5.8, 15.5, 4.1] Y = [10, 4, 12, 15, 20, 5, 7] print ("median X = ", median(X)) # returns 11.15 = mean(10.3, 12) print ("median Y = ", median(Y)) # returns 10
Note: this is a somehow different topic but the behaviour above is due by how the arguments are evaluated when passed to a function. Usually languages (as C or Java) call by-value: the function receives a copy of the argument objects passed to it by the caller; other languages (as PHP or Perl) call by-reference: inside the function scope, the argument is essentially a complete alias for the variable passed in by the caller. Python calls by-object reference: the function receives a reference to the same object as used by the caller; however, it does not receive the variable that the caller is storing this object in – as in pass-by-value, the function provides its own box and creates a new variable for itself. Wikipedia has more about it.