ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with Series and Dict #30922

proost · 2020-01-11T16:01:47Z

closes column-wise fillna with Series/dict NotImplemented #4514
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

description:
Accesses each column of the "DataFrame" and fills NA values using "Series.fillna".
I also found that when the DataFrame has duplicated labels, ".fillna" is not guaranteed to fill NA values, so I changed it so that it does.

Q. Why fill column by column rather than by index?
A. The problem with index-based ".fillna" is that it does not preserve each column's dtype, so I use the column-based approach.
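
To illustrate the dtype point, here is a minimal sketch (not the PR's code; the frame and variable names are made up for the example). It assumes a frame with one float and one bool column and compares a transpose round trip, which a row-oriented fill would need, with filling a single column as its own Series.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one float column with a missing value, one bool column.
df = pd.DataFrame({"num": [1.0, np.nan], "flag": [True, False]})

# Row-oriented route: transposing forces a common dtype across what used to be
# separate columns, so both columns come back as object after the round trip.
round_trip = df.T.T

# Column-oriented route: each column is filled as its own Series and keeps its dtype.
by_column = df.copy()
by_column["num"] = by_column["num"].fillna(0.0)

print(round_trip.dtypes)
print(by_column.dtypes)
```

The column-oriented result keeps float64 and bool, while the round-tripped frame holds object columns.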

Q. Why assign the new values to the column instead of using inplace=True?
A. To avoid chained indexing. If both the columns and the index are duplicated, chained indexing happens, and I want to avoid that situation.
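
A minimal sketch of the assignment pattern described in the answer above, assuming a frame whose column labels and index labels are both duplicated. The data and the "fill_values" dict are illustrative only; the positional read and write mirror the idea, not the exact code in this PR.

```python
import numpy as np
import pandas as pd

# Duplicated column labels ("B") and duplicated index labels (0).
df = pd.DataFrame(
    [[np.nan, 1.0, np.nan], [2.0, np.nan, 3.0]],
    columns=list("ABB"),
    index=[0, 0],
)
fill_values = {"A": 0.0, "B": 9.0}  # hypothetical fill values keyed by column label

result = df.copy()
for i, label in enumerate(result.columns):
    if label in fill_values:
        # One positional read and one positional write per column, so there is no
        # chained indexing and no ambiguity between the two "B" columns.
        # to_numpy() skips index alignment, which matters with a duplicated index.
        result.iloc[:, i] = result.iloc[:, i].fillna(fill_values[label]).to_numpy()

print(result)
```

Because the write goes through a single ".iloc" call on "result", the original "df" is left untouched.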

doc/source/whatsnew/v1.0.0.rst

WillAyd · 2020-01-13T17:41:33Z

pandas/core/generic.py

-                        "with dict/Series column "
-                        "by column"
-                    )
+                    for label in self.columns:


Accessing by label like this typically causes issues when users have duplicate label names - have you tested that case by chance?

@WillAyd
I didn't. Thanks for pointing that out. I changed the logic and added test cases.
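
For readers following the duplicate-label concern, a short sketch of why label access is fragile here (the frame is made up for the example): with duplicated column names, selecting by label returns a DataFrame rather than a Series, while positional access always returns exactly one column.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a duplicated column label.
df = pd.DataFrame([[1.0, np.nan, 3.0]], columns=list("ABB"))

by_label = df["B"]           # the label matches two columns, so this is a DataFrame
by_position = df.iloc[:, 1]  # exactly one column, so this is a Series

print(type(by_label).__name__, by_label.shape)
print(type(by_position).__name__)
```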

pandas/core/generic.py

pandas/tests/frame/test_missing.py

WillAyd · 2020-02-12T01:01:34Z

pandas/tests/frame/test_missing.py

+            [[100, 100, 3], [100, 5, 100], [7, 200, 200]],
+            columns=list("ABB"),
+            index=[0, 0, 1],
+            dtype="float64",


This should be integers, no?

@WillAyd
There may be disagreement, but what I intended is to keep the data type the same as before. Each column in df is float because of "np.nan", so if "fillna" changed the data type, it would mean that "fillna" both fills NA and changes the data type. I don't think that is right; if users want to change data types, that is their responsibility.
Nevertheless, if you think this must be integers, I will change it.
That is what the "downcast" parameter is for.
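
A small sketch of the behaviour being argued for, using made-up data: filling a float column with an integer value leaves the column as float64, and a user who wants integers opts in explicitly. The explicit "astype" call is an assumption chosen for illustration; it stands in for the opt-in step and is not what the PR itself does.

```python
import numpy as np
import pandas as pd

# Hypothetical column that is float only because it contains NaN.
df = pd.DataFrame({"A": [1.0, np.nan, 7.0]})

filled = df.fillna({"A": 100})   # fills NA, dtype stays float64
as_int = filled.astype("int64")  # the dtype change is a separate, explicit step

print(filled.dtypes)
print(as_int.dtypes)
```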

pep8speaks · 2020-02-21T13:53:37Z

Hello @proost! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-08-10 17:20:16 UTC

pandas/core/generic.py

jreback · 2020-02-23T16:09:50Z

pandas/core/generic.py

+
+                if axis == 1:
+                    result = result.T
+
                return result if not inplace else None


this doesn't make any sense if we have transposed

rather do

self._update_inplace(result._data)

@jreback
Okay, I changed the logic, but two ".T" operations are needed, because what I am trying to do is handle both axis==1 and axis==0. The current ".fillna" can't handle duplicated columns.
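
For illustration, a minimal sketch of the axis==1 case discussed in this thread, assuming fill values keyed by row label (here the row maximum, as in "df.max(1)" from the test). It compares the double-transpose route with "DataFrame.where" aligned along the index. The frame and variable names are made up, and neither route is claimed to be the PR's final implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical all-float frame with one missing value per row.
df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]}, index=["x", "y"])
row_max = df.max(axis=1)  # one fill value per row label

# Route 1: transpose so rows become columns, fill by (former row) label, transpose back.
via_transpose = df.T.fillna(row_max).T

# Route 2: keep the orientation and align the fill values along the index.
via_where = df.where(df.notna(), row_max, axis=0)

print(via_transpose)
print(via_where)
```

Both routes give the same values on an all-float frame; the second avoids the two transposes.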

jreback · 2020-02-23T16:10:00Z

pandas/tests/frame/test_missing.py

-        # disable this for now
-        with pytest.raises(NotImplementedError, match="column by column"):
-            df.fillna(df.max(1), axis=1)
+        expected = DataFrame(


make a new test

@jreback
I'm confused. Do you mean:

just delete the parts of the code that fail the test and make a new test, or

change this "test_fillna_dict_series" into a new test?

Please be more specific about what you meant.

pandas/tests/frame/test_missing.py

WillAyd · 2020-03-03T01:14:54Z

pandas/core/generic.py

@@ -6005,20 +6005,25 @@ def fillna(
                )

            elif isinstance(value, (dict, ABCSeries)):
+                new_data = self.copy()


copy / transpose are both potentially very expensive operations - is there a way to do this without requiring both of those?

@WillAyd
Okay. I made the copy conditional and removed the transpose.

pandas/core/generic.py

jreback · 2020-03-15T00:29:52Z

pandas/core/generic.py

-                    obj = result[k]
-                    obj.fillna(v, limit=limit, inplace=True, downcast=downcast)
-                return result if not inplace else None
+                    new_data.iloc[:, i] = new_data.iloc[:, i].fillna(


i would rather build up a list of the results here in a list comprehension
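
One way the list-comprehension suggestion could look, as a sketch only: "fillna_columns" is a hypothetical helper, not a pandas API and not the code that ended up in this PR. It assumes fill values keyed by column label and rebuilds the frame from the filled columns.

```python
import numpy as np
import pandas as pd


def fillna_columns(df, value):
    """Hypothetical helper: fill each column from a dict keyed by column label."""
    filled = [
        col.fillna(value[label]) if label in value else col
        for label, col in df.items()
    ]
    # Temporary unique keys avoid any collision between duplicated labels;
    # the original column labels are restored afterwards.
    result = pd.concat(filled, axis=1, keys=range(len(filled)))
    result.columns = df.columns
    return result


df = pd.DataFrame([[np.nan, 1.0, np.nan], [2.0, np.nan, 3.0]], columns=list("ABB"))
out = fillna_columns(df, {"B": 9.0})
print(out)
```

Only the two "B" columns are filled; "A" has no entry in the dict and keeps its NaN.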

jreback

pls merge master as well

jreback · 2020-06-14T22:44:51Z

pandas/core/generic.py

-                        "by column"
-                    )
+                    for i, item in enumerate(temp_data.items()):
+                        label, content = item


this doesn't make sense with the axis here; you are updating the same column whether axis==0 or 1

@jreback
Yes, but the fill value differs depending on whether axis==0 or 1. Also, 'downcast' works properly when the fill is executed column by column.

jreback

can you merge master and move the release note

doc/source/whatsnew/v1.1.0.rst

WillAyd · 2020-09-10T19:02:49Z

@proost can you fix merge conflict and try to fix CI failure?

arw2019 · 2020-11-21T03:38:35Z

@proost is this still active? If yes can you merge master and fix the CI failure?

arw2019 · 2020-12-08T04:28:33Z

Closing in favor of #38352

proost force-pushed the enh-column-wise-fillna branch from 2aa3738 to cc7037e Compare January 12, 2020 15:14

WillAyd requested changes Jan 13, 2020

gfyoung added Enhancement Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jan 15, 2020

proost force-pushed the enh-column-wise-fillna branch from 1b4a89e to ad97507 Compare January 19, 2020 12:14

proost changed the title ~~ENH:column-wise DataFrame.fillna with Series and Dict~~ ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with Series and Dict Jan 19, 2020

WillAyd requested changes Jan 20, 2020

proost force-pushed the enh-column-wise-fillna branch 4 times, most recently from 6f79400 to 6060212 Compare January 24, 2020 14:35

WillAyd requested changes Feb 12, 2020

proost force-pushed the enh-column-wise-fillna branch from 6bc8fe2 to 8a25980 Compare February 21, 2020 13:53

proost force-pushed the enh-column-wise-fillna branch 2 times, most recently from fde8087 to 8346780 Compare February 23, 2020 14:28

proost added a commit to proost/pandas that referenced this pull request Feb 23, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

ec012e9

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from 8346780 to ec012e9 Compare February 23, 2020 14:46

jreback requested changes Feb 23, 2020

proost added a commit to proost/pandas that referenced this pull request Feb 26, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

258659f

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from ec012e9 to 258659f Compare February 26, 2020 17:09

proost added a commit to proost/pandas that referenced this pull request Feb 26, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

a3245f9

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from 258659f to a3245f9 Compare February 26, 2020 17:12

WillAyd requested changes Mar 3, 2020

jreback requested changes Mar 15, 2020

proost added a commit to proost/pandas that referenced this pull request Mar 18, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

621d2a7

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from a3245f9 to 621d2a7 Compare March 18, 2020 13:54

WillAyd requested changes Apr 7, 2020

proost added a commit to proost/pandas that referenced this pull request Apr 8, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

36bfd78

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from 5e5d272 to 36bfd78 Compare April 8, 2020 15:40

proost added a commit to proost/pandas that referenced this pull request Apr 12, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

3763e31

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch 2 times, most recently from 387ca34 to 3763e31 Compare April 14, 2020 03:19

jbrockmendel reviewed Apr 16, 2020

proost added a commit to proost/pandas that referenced this pull request Apr 17, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

c434366

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from f66afef to c434366 Compare April 17, 2020 10:36

jbrockmendel reviewed Apr 17, 2020

proost added a commit to proost/pandas that referenced this pull request Apr 17, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

5b3c363

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from 0c64e73 to 5b3c363 Compare April 17, 2020 17:03

jreback requested changes Jun 14, 2020

proost added a commit to proost/pandas that referenced this pull request Jun 17, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

587f619

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from de6d4f7 to 587f619 Compare June 17, 2020 11:48

jreback requested changes Aug 7, 2020

ENH:column-wise DataFrame.fillna and duplicated DataFrame.fillna with…

38f3657

… Series and Dict (pandas-dev#30922)

proost force-pushed the enh-column-wise-fillna branch from 587f619 to 38f3657 Compare August 10, 2020 17:20

arw2019 added the Stale label Nov 21, 2020

arw2019 mentioned this pull request Dec 8, 2020

ENH: column-wise DataFrame.fillna with Series/Dict value #38352

Closed

5 tasks

arw2019 closed this Dec 8, 2020
