Groupby Agg not working if different partials with same name on the same column #28570

charlesdong1991 · 2019-09-22T20:09:15Z

related to #28426 from PR #28428

from functools import partial

import pandas as pd
import numpy as np

# Two partials of the same function; neither carries a __name__ of its own.
quant50 = partial(np.percentile, q=50)
quant70 = partial(np.percentile, q=70)

test = pd.DataFrame({'col1': ['a', 'a', 'b', 'b', 'b'], 'col2': [1,2,3,4,5]})
test.groupby('col1').agg({'col2': [quant50, quant70]})

However, the quant50 result is overwritten by the quant70 result, so both columns show the same values.

output:

col1	percentile	percentile
a	1.7	1.7
b	4.4	4.4

expected output:

col1	percentile	percentile
a	1.5	1.7
b	4.0	4.4
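Both output columns above are labelled `percentile`, which suggests the two partials cannot be told apart by name. The sketch below illustrates that point using only the standard library. The local `percentile` function is a hypothetical stand-in for `np.percentile`, and nothing here reproduces pandas' internal lookup, which this thread does not show.

```python
from functools import partial


def percentile(values, q):
    """Stand-in for np.percentile (linear interpolation), stdlib only."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


quant50 = partial(percentile, q=50)
quant70 = partial(percentile, q=70)

# A partial object carries no __name__ of its own ...
has_own_name = hasattr(quant50, "__name__")

# ... so the only name available is the wrapped function's, which is shared.
labels = [f.func.__name__ for f in (quant50, quant70)]

# The callables themselves still compute different values for group 'a'.
results = (quant50([1, 2]), quant70([1, 2]))
```

The two callables differ only in their bound `q`, so any bookkeeping keyed on the shared name would presumably treat them as the same aggregation.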

rileypeterson · 2019-10-01T04:53:09Z

This works fine for me:

from functools import partial

import pandas as pd
import numpy as np

quant50 = partial(np.percentile, q=50)
quant50.__name__ = "quant50"
quant70 = partial(np.percentile, q=70)
quant70.__name__ = "quant70"

test = pd.DataFrame({'col1': ['a', 'a', 'b', 'b', 'b'], 'col2': [1,2,3,4,5]})
test.groupby('col1').agg({'col2': [quant50, quant70]})

Out[110]: 
        col2        
     quant50 quant70
col1                
a        1.5     1.7
b        4.0     4.4
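The `__name__` assignment used above can be wrapped in a small helper so that the name is never forgotten. This is only a sketch: `named_partial` is a hypothetical helper, not part of pandas or functools, and `scale` is a toy function used to keep the example stdlib-only.

```python
from functools import partial


def named_partial(func, name, **kwargs):
    """Hypothetical helper: build a partial and give it an explicit __name__."""
    wrapped = partial(func, **kwargs)
    wrapped.__name__ = name  # partial objects accept arbitrary attributes
    return wrapped


def scale(value, factor):
    """Toy function used only to keep the sketch stdlib-only."""
    return value * factor


double = named_partial(scale, "double", factor=2)
triple = named_partial(scale, "triple", factor=3)

names = [double.__name__, triple.__name__]
values = [double(10), triple(10)]
```

Under the same assumption, `named_partial(np.percentile, "quant50", q=50)` would build the equivalent of the `quant50` object defined in this comment.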

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Darwin
OS-release: 18.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 3.3.2
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.2
lxml.etree: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

aliraeisdanaei · 2021-12-26T04:32:48Z

take

rhshadrach · 2022-02-05T20:59:27Z

I view specifying `__name__` as a temporary workaround. I don't think the behavior identified in the OP is correct, and it needs to be fixed. It is quite subtle and silently leads to incorrect results.

@mroeschke for any thoughts.
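Until the behavior is fixed, one way to avoid silently wrong results is to check for colliding names before calling `agg`. The guard below is a rough sketch, not pandas API: `callable_label` and `assert_unique_labels` are hypothetical helpers, and the label rule (own `__name__`, otherwise the wrapped function's name) is an assumption based on the `percentile` column labels in the OP.

```python
from collections import Counter
from functools import partial


def callable_label(func):
    """Hypothetical label lookup: own __name__, else the wrapped function's."""
    if isinstance(func, partial) and not hasattr(func, "__name__"):
        return callable_label(func.func)
    return getattr(func, "__name__", repr(func))


def assert_unique_labels(funcs):
    """Raise instead of letting two aggregations share one output label."""
    counts = Counter(callable_label(f) for f in funcs)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate aggregation names: {duplicates}")


low = partial(round, ndigits=1)
high = partial(round, ndigits=3)

try:
    assert_unique_labels([low, high])
    collision_detected = False
except ValueError:
    collision_detected = True

# Once each partial has a distinct __name__, the guard passes.
low.__name__, high.__name__ = "round1", "round3"
assert_unique_labels([low, high])
```

Calling the guard on the list passed to `agg` turns the subtle overwrite into an explicit error at the call site.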

mroeschke · 2022-02-05T23:20:23Z

That's a good point. Agreed it probably doesn't fix the original naming issue.

jbrockmendel added the Groupby label Oct 16, 2019

mroeschke added the good first issue and Needs Tests (Unit test(s) needed to prevent regressions) labels Jul 21, 2021

github-actions bot assigned aliraeisdanaei Dec 26, 2021

aliraeisdanaei mentioned this issue Dec 26, 2021: Groupby Agg not working if different partials with same name on the same column #45075 (Closed, 4 tasks)

jreback modified the milestones: 1.4, Contributions Welcome Dec 27, 2021

mroeschke added the Apply (Apply, Aggregate, Transform, Map) and Bug labels and removed the good first issue and Needs Tests (Unit test(s) needed to prevent regressions) labels Feb 5, 2022

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022

rhshadrach linked a pull request Mar 2, 2024 that will close this issue: BUG: groupby.agg should always agg #57706 (Open, 6 tasks)
