Rounding error in Python

All the Python data analysis examples are stored on GitHub, in the repository called datascience.

The functions there are coded slightly differently because of how floating point works in Python (and in most programming languages). In short, computers store every number in a finite number of bits, and that introduces a rounding error. For the details, see "What Every Computer Scientist Should Know About Floating-Point Arithmetic", David Goldberg, March 1991.
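To make the rounding error concrete, here is a minimal sketch using plain Python floats. The variable name is only for illustration and does not come from the repository.

```python
# Decimal fractions such as 0.1 have no exact binary representation,
# so a tiny error shows up even in very simple arithmetic.
total = 0.1 + 0.2

print(total)            # 0.30000000000000004
print(total == 0.3)     # False
print(round(total, 3))  # 0.3
```

Rounding the result to a few digits hides the noise in the last bits, which is the idea behind the change described here.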

Therefore I introduced a “precision” argument with a default value. In Python, an argument can have a default value, which makes it optional: if you do not pass it when calling the function, the default value is used.
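As a small illustration of how default values behave, consider the sketch below. The function describe is a hypothetical helper made up for this example; it is not part of the repository.

```python
def describe(value, precision=3):
    """Hypothetical helper: format value with a fixed number of decimals."""
    return format(value, '.{}f'.format(precision))

print(describe(3.14159))               # precision omitted, default 3 is used
print(describe(3.14159, 1))            # precision passed by position
print(describe(3.14159, precision=4))  # precision passed by name
```

The three calls print 3.142, 3.1 and 3.1416 respectively.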

Let’s see how the function that calculates the mean changes once precision is added:

def mean(dataPoints, precision=3):
    # StatsError is assumed to be a custom exception defined elsewhere in the module
    try:
        return round(sum(dataPoints) / float(len(dataPoints)), precision)
    except ZeroDivisionError:
        raise StatsError('no data points passed')

The difference is the Python built-in function round(number, ndigits), which rounds a number to ndigits digits after the decimal point.
The number of digits can be passed through the extra argument “precision”; otherwise the default (3 digits) is used.
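The following sketch shows what round does with a few sample values. The values are chosen only for illustration.

```python
third = 1 / 3.0

print(round(third, 3))   # 0.333
print(round(third, 5))   # 0.33333

# round never pads with zeros: the result has at most the requested digits.
print(round(2.5, 3))     # 2.5

# round works on the stored binary value, so it can itself surprise you:
# 2.675 is stored slightly below 2.675 and therefore rounds down.
print(round(2.675, 2))   # 2.67
```

So the precision argument limits the number of digits shown, but it does not make floating point arithmetic exact.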

These are both valid ways to call the function:

mean(x)     # returns the mean with at most 3 digits after the decimal point
mean(x, 5)  # returns the mean with at most 5 digits after the decimal point
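For readers who want to try this without the repository, here is a self-contained sketch. The StatsError class below is an assumption: a plain Exception subclass standing in for whatever the repository defines. The sample data is also made up.

```python
class StatsError(Exception):
    """Stand-in for the repository's exception (assumed definition)."""


def mean(dataPoints, precision=3):
    """Return the arithmetic mean, rounded to `precision` digits."""
    try:
        return round(sum(dataPoints) / float(len(dataPoints)), precision)
    except ZeroDivisionError:
        raise StatsError('no data points passed')


x = [1, 2, 2]       # sample data, chosen for illustration

print(mean(x))      # 1.667
print(mean(x, 5))   # 1.66667

try:
    mean([])        # an empty list triggers the ZeroDivisionError branch
except StatsError as error:
    print(error)    # no data points passed
```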
