
Groupby Agg not working if different partials with same name on the same column #28570


Open
charlesdong1991 opened this issue Sep 22, 2019 · 4 comments · May be fixed by #57706
Labels: Apply (Apply, Aggregate, Transform, Map), Bug, Groupby

Comments

@charlesdong1991
Member

charlesdong1991 commented Sep 22, 2019

Related to #28426, from PR #28428:

from functools import partial

import pandas as pd
import numpy as np

quant50 = partial(np.percentile, q=50)
quant70 = partial(np.percentile, q=70)

test = pd.DataFrame({'col1': ['a', 'a', 'b', 'b', 'b'], 'col2': [1,2,3,4,5]})
test.groupby('col1').agg({'col2': [quant50, quant70]})

However, the quant50 result is overwritten by the quant70 result, so both columns show the same output.

output:

col1  percentile  percentile
a            1.7         1.7
b            4.4         4.4

expected output:

col1  percentile  percentile
a            1.5         1.7
b            4.0         4.4
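
One plausible reading of the output above: a `functools.partial` object has no `__name__`, so the column label appears to come from the wrapped function, and both partials end up labelled `percentile`. The sketch below reproduces that collision with the standard library only. `percentile` is a stand-in for `np.percentile`, and `label_for` is a hypothetical helper that assumes this fallback rule; it is not pandas' actual code.

```python
from functools import partial


def percentile(values, q):
    """Stand-in for np.percentile (linear interpolation), stdlib only."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def label_for(func):
    """Hypothetical labelling rule: __name__ if present, else the wrapped function's."""
    if hasattr(func, "__name__"):
        return func.__name__
    if isinstance(func, partial):
        return label_for(func.func)
    return type(func).__name__


quant50 = partial(percentile, q=50)
quant70 = partial(percentile, q=70)

# Neither partial carries its own __name__, so both get the same label.
labels = [label_for(f) for f in (quant50, quant70)]
print(labels)  # ['percentile', 'percentile']

# Storing results keyed by label alone keeps only the last function's result.
group_a = [1, 2]
results = {}
for func in (quant50, quant70):
    results[label_for(func)] = func(group_a)
print(len(results))  # 1
```

If results are keyed this way, the q=50 value for group `a` is replaced by the q=70 value, which matches the duplicated columns reported above.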
@rileypeterson

This works fine for me:

from functools import partial

import pandas as pd
import numpy as np

quant50 = partial(np.percentile, q=50)
quant50.__name__ = "quant50"
quant70 = partial(np.percentile, q=70)
quant70.__name__ = "quant70"

test = pd.DataFrame({'col1': ['a', 'a', 'b', 'b', 'b'], 'col2': [1,2,3,4,5]})
test.groupby('col1').agg({'col2': [quant50, quant70]})
Out[110]: 
        col2        
     quant50 quant70
col1                
a        1.5     1.7
b        4.0     4.4
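
The `__name__` workaround above can be wrapped in a small helper so the name and the bound arguments are set in one place. `named_partial` is a hypothetical name, not a pandas or functools API; the demo uses `int` with `base` only to keep the sketch free of third-party imports.

```python
from functools import partial


def named_partial(func, name, /, **kwargs):
    """Return partial(func, **kwargs) with an explicit __name__ (hypothetical helper)."""
    wrapped = partial(func, **kwargs)
    # partial objects accept attribute assignment, as in the workaround above.
    wrapped.__name__ = name
    return wrapped


# With NumPy available, the thread's example would read (assumption, not run here):
#   quant50 = named_partial(np.percentile, "quant50", q=50)
#   quant70 = named_partial(np.percentile, "quant70", q=70)
from_binary = named_partial(int, "from_binary", base=2)
from_octal = named_partial(int, "from_octal", base=8)

print(from_binary.__name__, from_binary("101"))  # from_binary 5
print(from_octal.__name__, from_octal("17"))     # from_octal 15
```

Each partial now carries a distinct name, so two partials of the same function no longer share a label.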

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Darwin
OS-release: 18.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 3.3.2
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.2
lxml.etree: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

mroeschke added the "good first issue" and "Needs Tests" (unit test(s) needed to prevent regressions) labels on Jul 21, 2021
@aliraeisdanaei

take

@rhshadrach
Member

rhshadrach commented Feb 5, 2022

I view specifying the __name__ as a temporary workaround. I don't think the behavior identified in the OP is correct, and it needs to be fixed. It is quite subtle and silently leads to incorrect results.

@mroeschke for any thoughts.

@mroeschke
Member

That's a good point. Agreed it probably doesn't fix the original naming issue.

mroeschke added the Apply (Apply, Aggregate, Transform, Map) and Bug labels and removed the "good first issue" and "Needs Tests" (unit test(s) needed to prevent regressions) labels on Feb 5, 2022
mroeschke removed this from the Contributions Welcome milestone on Oct 13, 2022
rhshadrach linked a pull request on Mar 2, 2024 that will close this issue
6 tasks
7 participants