groupby in combination with rolling provides unintuitive errors #26462
Comments
pls show. self contained reproduced example |
@jreback I updated my issue description. |
pls dont use a remote link |
I've encountered a similar error, so to help out here's a simple reproducible example. Thanks @jreback and @WillAyd for looking at this.
import numpy as np
import pandas as pd
df = pd.DataFrame({'groups': ['g']*10,
'data': np.sin(np.arange(10))})
groups = df[['data', 'groups']].groupby('groups')
# Rolling mean with uniform weights
df['uniform'] = df['data'].rolling(4, win_type=None).mean()
df['grp_uniform'] = groups.rolling(4, win_type=None).mean().values
# Rolling mean with blackman window
df['blackman'] = df['data'].rolling(4, win_type='blackman').mean()
df['grp_blackman'] = groups.rolling(4, win_type='blackman').mean().values
assert df['uniform'].equals(df['grp_uniform'])
assert df['blackman'].equals(df['grp_blackman'])
print(df)
The last assert statement fails. In that last calculation, it looks like the win_type='blackman' parameter was ignored and it fell back to uniform weights. The last two columns ought to be identical:
  groups      data   uniform  grp_uniform  blackman  grp_blackman
0      g  0.000000       NaN          NaN       NaN           NaN
1      g  0.841471       NaN          NaN       NaN           NaN
2      g  0.909297       NaN          NaN       NaN           NaN
3      g  0.141120  0.472972     0.472972  0.875384      0.472972
4      g -0.756802  0.283771     0.283771  0.525209      0.283771
5      g -0.958924 -0.166327    -0.166327 -0.307841     -0.166327
6      g -0.279415 -0.463506    -0.463506 -0.857863     -0.463506
7      g  0.656987 -0.334539    -0.334539 -0.619170     -0.334539
8      g  0.989358  0.102001     0.102001  0.188786      0.102001
9      g  0.412118  0.444762     0.444762  0.823172      0.444762
|
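A possible workaround until the grouped path honours win_type is to apply the window weights explicitly through rolling().apply(). This is only a sketch, not the pandas fix: it assumes the symmetric window from np.blackman matches what win_type='blackman' uses, which is consistent with the ungrouped column above (0.875384 at index 3 is the mean of 0.841471 and 0.909297, because the two end weights are numerically zero). The helper name weighted_mean is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"groups": ["g"] * 10, "data": np.sin(np.arange(10))})

# Assumption: a symmetric Blackman window of length 4 stands in for
# win_type='blackman'; the weights are computed with NumPy only.
weights = np.blackman(4)


def weighted_mean(values):
    """Weighted mean of one full window (hypothetical helper)."""
    return np.dot(values, weights) / weights.sum()


grouped = (
    df.groupby("groups")["data"]
    .rolling(4)
    .apply(weighted_mean, raw=True)
)

# The grouped result is indexed by (group, original index); drop the group
# level so the assignment aligns on the original row labels.
df["grp_blackman"] = grouped.reset_index(level=0, drop=True)
print(df)
```

Aligning on the index rather than using .values keeps the rows in the right place when there is more than one group.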
@Connossor that looks different from what @bgruening reported. I think the minimal example here is
Can you confirm @bgruening? Apparently, passing std is required.
In [40]: df.rolling(3, win_type='gaussian').mean(std=1)
Out[40]:
Out[40]:
A
0 NaN
1 NaN
2 1.0
3 2.0
4 3.0
5 4.0
6 5.0
7 6.0
8 7.0
9 8.0
10 9.0
11 10.0
An example in the docs, and / or a better error message would be nice.
pandas/pandas/core/window/rolling.py Line 569 in 32b4710
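To see why the output above is simply 1.0, 2.0, ..., the Gaussian weights can be written out by hand. This sketch assumes the frame holds a single column A with the integers 0 to 11 (which matches the output shown, but the original construction is not reproduced here) and uses the standard Gaussian window formula exp(-0.5 * (n / std)**2) centred on the window. The function gaussian_weights is a hypothetical helper, not a pandas API.

```python
import numpy as np
import pandas as pd

# Assumption: the input is a single column with the values 0..11.
df = pd.DataFrame({"A": range(12)})


def gaussian_weights(width, std):
    """Symmetric Gaussian window weights (hypothetical helper)."""
    n = np.arange(width) - (width - 1) / 2.0
    return np.exp(-0.5 * (n / std) ** 2)


weights = gaussian_weights(3, std=1)

# Weighted rolling mean with explicit weights; the first two rows are NaN
# because the window of 3 is not yet full.
result = df["A"].rolling(3).apply(
    lambda values: np.dot(values, weights) / weights.sum(), raw=True
)
print(result)
```

Because the weights are symmetric and the data is linear, each weighted mean equals the middle value of its window, whatever std is chosen. The shape of the window still depends on std, which is why the parameter has to be supplied.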
|
I confirm :) Thanks @TomAugspurger! |
@TomAugspurger should I make a separate issue for the broken "groupby" behaviour? |
Probably best to search, and if you don't find something then yes.
|
I think the minimal example here is
Can you confirm @bgruening?
Apparently, passing std is required. An example in the docs, and / or a better error message would be nice.
pandas/pandas/core/window/rolling.py Line 569 in 32b4710
original example below
Code Sample, a copy-pastable example if possible
Problem description
According to https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html, win_type='gaussian' needs a std. The above example without the groupby raises an error if no std is given, and this is the expected behaviour. However, the first example with the groupby does not error. I'm not even sure what happens in the first case.
Expected Output
The groupby example should give an error.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Linux
OS-release: 5.0.0-15-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: de_DE.UTF-8
pandas: 0.24.2
pytest: None
pip: 19.1
setuptools: 41.0.1
Cython: None
numpy: 1.16.3
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None