Commit 311a52e

DOC: Add details to DataFrame groupby transform
Add requirements for user function in groupby transform closes #13543 [skip ci]
1 parent 233d51d commit 311a52e

2 files changed: 24 additions & 5 deletions

doc/source/groupby.rst

Lines changed: 14 additions & 5 deletions
@@ -521,9 +521,17 @@ Transformation
 --------------
 
 The ``transform`` method returns an object that is indexed the same (same size)
-as the one being grouped. Thus, the passed transform function should return a
-result that is the same size as the group chunk. For example, suppose we wished
-to standardize the data within each group:
+as the one being grouped. The transform function must:
+
+* Return a result that is either the same size as the group chunk or
+  broadcastable to the size of the group chunk.
+* Operate column-by-column on the group chunk. A fast path is used if the
+  transform function also operates on the entire group chunk as a DataFrame.
+* Not perform in-place operations on the group chunk. Group chunks should
+  be treated as immutable, and changes to a group chunk may produce unexpected
+  results.
+
+For example, suppose we wished to standardize the data within each group:
 
 .. ipython:: python
 
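The requirements listed in this hunk can be illustrated with a small sketch. The frame, its column names, and the variable names below are hypothetical and are not part of the commit; the two functions show a same-size result and a scalar result that is broadcast across the group, and neither modifies the chunk it receives.

```python
import pandas as pd

# Hypothetical data: two groups, "a" and "b".
df = pd.DataFrame({"key": ["a", "a", "b", "b"],
                   "val": [1.0, 3.0, 10.0, 30.0]})
grouped = df.groupby("key")

# Same-size result: each value minus its group mean.
demeaned = grouped.transform(lambda x: x - x.mean())

# Scalar result per column: broadcast to every row of the group.
group_max = grouped.transform(lambda x: x.max())

print(demeaned["val"].tolist())
print(group_max["val"].tolist())
```

Both functions return new objects instead of assigning into `x`, which keeps the original frame unchanged.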
@@ -605,8 +613,9 @@ and that the transformed data contains no NAs.
 
 .. note::
 
-   Some functions when applied to a groupby object will automatically transform the input, returning
-   an object of the same shape as the original. Passing ``as_index=False`` will not affect these transformation methods.
+   Some functions when applied to a groupby object will automatically transform
+   the input, returning an object of the same shape as the original. Passing
+   ``as_index=False`` will not affect these transformation methods.
 
    For example: ``fillna, ffill, bfill, shift``.
 
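The note in this hunk can be checked with a short sketch. The data below is hypothetical; it applies two of the listed methods, `ffill` and `shift`, to a groupby created with `as_index=False` and confirms that the results keep the original row index. Which columns appear in the result may vary between pandas versions, so the sketch only inspects the `val` column.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value in each group.
df = pd.DataFrame({"key": ["a", "a", "b", "b"],
                   "val": [1.0, np.nan, 5.0, np.nan]})

grouped = df.groupby("key", as_index=False)

# Forward-fill within each group; the result has one row per input row.
filled = grouped.ffill()

# Shift within each group; the first row of every group becomes NaN.
shifted = grouped.shift(1)

print(filled["val"].tolist())
print(shifted["val"].tolist())
```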

pandas/core/groupby.py

Lines changed: 10 additions & 0 deletions
@@ -3507,6 +3507,16 @@ def transform(self, func, *args, **kwargs):
         Each subframe is endowed the attribute 'name' in case you need to know
         which group you are working on.
 
+        The current implementation imposes three requirements on f:
+
+        * f must return a value that either has the same shape as the input
+          subframe or can be broadcast to the shape of the input subframe.
+        * f must support application column-by-column in the subframe. An
+          optional fast path is used if f also supports application to the
+          entire subframe.
+        * f must not mutate subframes. Mutation is not supported and may
+          produce unexpected results.
+
         Examples
         --------
         >>> grouped = df.groupby(lambda x: mapping[x])
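The docstring example above relies on `df` and `mapping`, which are not defined in the visible hunk. A minimal runnable sketch follows, assuming `mapping` is a dict from index labels to group names; the labels, the column `x`, and the values are hypothetical. The function passed to `transform` standardizes each column within its group and returns a new object rather than mutating its input.

```python
import pandas as pd

# Hypothetical mapping from index label to group; not taken from the docstring.
mapping = {"r1": "g1", "r2": "g1", "r3": "g2", "r4": "g2"}
df = pd.DataFrame({"x": [1.0, 3.0, 10.0, 14.0]},
                  index=["r1", "r2", "r3", "r4"])

# groupby calls the function on each index label to find its group.
grouped = df.groupby(lambda label: mapping[label])

# Same-shape result: standardize within each group, without mutating col.
standardized = grouped.transform(lambda col: (col - col.mean()) / col.std())

print(standardized)
```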
