numpy.arange miscounts when step is large relative to value past stop #6238

kawochen · 2015-08-23T14:14:20Z

In [14]: len(numpy.arange(0, 10000000000000001, 100000000000000))
Out[14]: 101

In [15]: len(numpy.arange(0, 100000000000000000001, 1000000000000000000))
Out[15]: 100


jaimefrio · 2015-08-23T22:01:03Z

To calculate the number of items, a rounded-up division of (stop - start) / step is performed. This is done in floating point, and the loss of precision is what caused your missing last entry. If all three values are integers, we could instead take advantage of Python's arbitrary-precision integers and use the rounded-up integer division formula (stop - start - 1) // step + 1 (valid for a positive step), which should fix your problem.
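To make the difference concrete, here is a minimal pure-Python sketch of the two length calculations. It only models the arithmetic described above and is not the actual C code; the helper names `float_length` and `int_length` are hypothetical, and a positive step is assumed.

```python
import math


def float_length(start, stop, step):
    """Hypothetical helper: length via a rounded-up floating-point division."""
    # For Python ints, `/` returns the double nearest to the true quotient,
    # so a tiny remainder such as 1e-18 on top of 100 is rounded away.
    return math.ceil((stop - start) / step)


def int_length(start, stop, step):
    """Hypothetical helper: exact length for integers, assuming step > 0."""
    # Rounded-up integer division, carried out in arbitrary precision.
    return max(0, (stop - start - 1) // step + 1)


cases = [
    (0, 10**16 + 1, 10**14),   # first example from the report
    (0, 10**20 + 1, 10**18),   # second example from the report
]
for start, stop, step in cases:
    print(float_length(start, stop, step),
          int_length(start, stop, step),
          len(range(start, stop, step)))
```

Under these assumptions the first case gives 101 from all three calculations, while in the second case the floating-point version gives 100 and both the integer formula and the built-in range give 101.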

kawochen · 2015-08-24T02:53:17Z

Thanks. I suspected it was some kind of floating point arithmetic at work as well, but couldn't find the source code for that function. Can you point me to it? The doc says when all three are integers it does the same thing as the built-in range, but in this case it's different.

jaimefrio · 2015-08-24T03:13:10Z

It's written in C. The function that ultimately creates the arange array is PyArray_ArangeObj, but the length is calculated by the auxiliary function _calc_length called by it.

julianeagu · 2016-11-30T03:43:34Z

I am getting a (similar?) bug:

For example, I get an extra number in this case:

x = np.arange(0.75,0.8,0.01) 
print x
[ 0.75  0.76  0.77  0.78  0.79  0.8 ]

but this works fine:

x = np.arange(0.75,1.0,0.01)
print x
[ 0.75  0.76  0.77  0.78  0.79  0.8   0.81  0.82  0.83  0.84  0.85  0.86  0.87
  0.88  0.89  0.9   0.91  0.92  0.93  0.94  0.95  0.96  0.97  0.98  0.99]

I'm using numpy version 1.11.2.

charris · 2016-11-30T04:07:23Z

The problem is that 0.8 and 0.01 are not exactly representable in binary floating point, so the numbers actually used are not quite what they look like. For instance:

In [2]: print "%25.20f" % .8
   0.80000000000000004441

The way to do this sort of thing is to use linspace, or to use integers and divide:

In [1]: linspace(.75, .8, 5, endpoint=False)
Out[1]: array([ 0.75,  0.76,  0.77,  0.78,  0.79])

In [2]: arange(75, 80)/100.
Out[2]: array([ 0.75,  0.76,  0.77,  0.78,  0.79])
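The extra element can be reproduced without NumPy. The sketch below assumes, as described earlier in this thread, that the length is a rounded-up double-precision division of (stop - start) / step; the helper name `float_length` is hypothetical. It also shows the integers-then-divide workaround using the built-in range.

```python
import math
from decimal import Decimal

# Decimal(float) exposes the exact binary value stored for a float literal.
print(Decimal(0.8))    # slightly above 0.8
print(Decimal(0.01))   # slightly above 0.01


def float_length(start, stop, step):
    """Hypothetical helper modelling a rounded-up floating-point division."""
    return math.ceil((stop - start) / step)


# The stored 0.8 is a little too large, so the quotient lands just above 5
# and rounding up produces a sixth element.
extra = float_length(0.75, 0.8, 0.01)

# Here the quotient rounds to exactly 25.0, so the count comes out as expected.
fine = float_length(0.75, 1.0, 0.01)
print(extra, fine)

# Workaround: count on an exact integer grid, then scale each value.
values = [i / 100 for i in range(75, 80)]
print(values)
```

Counting with integers keeps the number of elements exact; floating-point rounding then only affects the individual values, not how many there are.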

kawochen mentioned this issue Aug 24, 2015

BUG: GH10885 where an edge case in date_range produces an extra point pandas-dev/pandas#10887

Merged

charris closed this as completed Nov 30, 2016
