Converting from categorical to int ignores NaNs #28406
Comments
I would expect To get a nullable integer, |
It seems like the weird value gets introduced here: https://github.com/pandas-dev/pandas/blob/master/pandas/core/arrays/categorical.py#L523 . Would this be a
|
Well, but that's a bit sad because it means that once you convert to categoricals, you cannot get back :-( . |
Thanks for tracking it down. I think that's just NumPy's defined behavior. We might want
What do you mean? |
Well, I meant something like if you did I understand it's tricky, especially when dealing with nullable integers. It's true that raising is probably better than trying to be somehow clever. |
Sorry, I'm still not understanding. In your example, s = pd.Series([1, None]).astype(int) raises. The conversion from float to int is what raises, not to or from categorical. |
Sorry, I wrote it wrongly. I meant if you do:
and now wanted to get back to numerics, or more specifically e.g.
while it would be super cool to do just |
That should work just fine. As I said earlier
|
Seems that |
Hmm, OK. It's a bit unfortunate, but I think we'll need to include something like
    if is_integer_dtype(dtype):
        if self.isna().any():
            raise ValueError(...) |
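For illustration, the guard suggested above can be sketched as a standalone function built only on pandas' public API. The helper name `categorical_to_dtype` and the error message are hypothetical, and this is not the actual change to `Categorical.astype`. As far as I know, `is_integer_dtype` also returns True for nullable dtypes such as `'Int8'`, so a real fix would probably need to tell those apart.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype


def categorical_to_dtype(values: pd.Series, dtype) -> pd.Series:
    """Hypothetical helper: cast a categorical Series, refusing NaN -> integer."""
    # A plain integer dtype cannot represent NaN, so raise instead of
    # letting a meaningless integer value through.
    if is_integer_dtype(dtype) and values.isna().any():
        raise ValueError("Cannot convert categorical with missing values to integer")
    return values.astype(dtype)
```

With this sketch, casting a categorical that contains NaN to `int` raises, while casting it to `float` still goes through and keeps the NaN.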
Code Sample, a copy-pastable example if possible
Problem description
When converting a categorical series back into an Int column, it converts NaN to an incorrect negative integer value.
Expected Output
I would expect that NaN in category converts to NaN in IntX (nullable integer) or float.
When trying to use d.astype('Int8'), I get an error: dtype not understood
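A possible workaround, sketched here under assumptions: go through float first, then cast to a nullable integer dtype. The data below is illustrative and is not the code sample from this report; the name `s` and the choice of `Int64` are assumptions, and this presumes a pandas version that ships nullable integer dtypes.

```python
import pandas as pd

# Illustrative data, not the reporter's original sample.
s = pd.Series([1, None]).astype("category")

# Categorical -> float keeps the missing entry as NaN.
as_float = s.astype(float)

# float -> nullable integer keeps the entry missing instead of
# producing a bogus integer value.
as_nullable_int = as_float.astype("Int64")
```

The intermediate float step is what avoids a direct cast from categorical to the nullable integer dtype, which is the cast that fails above.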
Output of pd.show_versions()