BUG: fix Series.argsort
#42090
Closed
attack68 wants to merge 43 commits into pandas-dev:main from attack68:argsort_labelling_index_sorted
8061f94
BUG: reordering the index labels
attack68 f0c4d1a
BUG: reordering the index labels
attack68 fa848c0
amend tests
attack68 701d3c8
update method and amend tests
attack68 c86f78d
special if for SparseArray
attack68 2d92c34
mypy fix
attack68 9d4be70
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 3146640
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 482b8ad
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 228aa1f
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 2281d8d
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 28e0c5e
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 82b6092
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 94fefdc
whats new 1.4.0
attack68 944f08b
add possible return formats with "first" "last" and tests
attack68 372a8db
default
attack68 980d1d4
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 4f5777a
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 5585d01
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 1e2f0de
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 49f9137
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 b3ae107
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 da0dcf2
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 fdf1bac
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 c1a6e18
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 9db6f97
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 82fbb76
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 2552d8f
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 40c94e2
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 0e76ee3
whats new
attack68 eb6d499
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 a78b9ff
whats new
attack68 dfcdad7
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 0bcacb9
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 220fc67
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 560298f
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 5381a0f
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 30dee32
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 ec453af
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 5d3124e
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 c8e69d8
fix tests
attack68 4f4e8ce
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68 6004737
Merge remote-tracking branch 'upstream/master' into argsort_labelling…
attack68
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3649,12 +3649,12 @@ def sort_index( | |
key, | ||
) | ||
|
||
def argsort(self, axis=0, kind="quicksort", order=None) -> Series: | ||
def argsort(self, axis=0, kind="quicksort", order=None, na_position=None) -> Series: | ||
""" | ||
Return the integer indices that would sort the Series values. | ||
|
||
Override ndarray.argsort. Argsorts the value, omitting NA/null values, | ||
and places the result in the same locations as the non-NA values. | ||
Override ndarray.argsort. The index is also sorted so that index labels | ||
correspond to the integer indices. | ||
|
||
Parameters | ||
---------- | ||
|
@@ -3665,29 +3665,107 @@ def argsort(self, axis=0, kind="quicksort", order=None) -> Series: | |
information. 'mergesort' and 'stable' are the only stable algorithms. | ||
order : None | ||
Has no effect but is accepted for compatibility with numpy. | ||
na_position : {None, "first", "last"} | ||
Puts NaNs at the beginning if *first*; *last* puts NaNs at the end. | ||
Defaults to *None*, which puts NaNs at the end and gives them all a sorting | ||
index of '-1'. | ||
|
||
.. versionadded:: 1.4.0 | ||
|
||
Returns | ||
------- | ||
Series[np.intp] | ||
Positions of values within the sort order with -1 indicating | ||
nan values. | ||
Positions of values within the sort order with associated sorted index. | ||
|
||
See Also | ||
-------- | ||
numpy.ndarray.argsort : Returns the indices that would sort this array. | ||
""" | ||
values = self._values | ||
mask = isna(values) | ||
Series.idxmax : Return the row label of the maximum value. | ||
Series.idxmin : Return the row label of the minimum value. | ||
|
||
if mask.any(): | ||
result = np.full(len(self), -1, dtype=np.intp) | ||
notmask = ~mask | ||
result[notmask] = np.argsort(values[notmask], kind=kind) | ||
Examples | ||
-------- | ||
Argsorting a basic Series. | ||
|
||
>>> series = Series([30, 10, 20], index=["high", "low", "mid"], name="xy") | ||
>>> series.argsort() | ||
low 1 | ||
mid 2 | ||
high 0 | ||
Name: xy, dtype: int64 | ||
|
||
Argsorting a Series with null values. | ||
|
||
>>> series = Series([30, 10, np.nan, 20], name="xy", | ||
... index=["high", "low", "null", "mid"]) | ||
>>> series.argsort() | ||
low 1 | ||
mid 3 | ||
high 0 | ||
null -1 | ||
Name: xy, dtype: int64 | ||
|
||
Argsorting a Series using ``na_position`` | ||
|
||
>>> series.argsort(na_position="first") | ||
null 2 | ||
low 1 | ||
mid 3 | ||
high 0 | ||
Name: xy, dtype: int64 | ||
""" | ||
values = self.values | ||
na_mask = isna(values) | ||
n_na = na_mask.sum() | ||
if n_na == 0: # number of NaN values is zero | ||
res = np.argsort(values, kind=kind) | ||
res_ser = Series(res, index=self.index[res], dtype=np.intp, name=self.name) | ||
return res_ser.__finalize__(self, method="argsort") | ||
else: | ||
result = np.argsort(values, kind=kind) | ||
# GH 42090 | ||
Review comment: would be very clear that we are ordering nulls at the |
||
if isinstance(na_mask, pandas.core.arrays.sparse.SparseArray): | ||
# avoid RecursionError | ||
na_mask = np.asarray(na_mask) | ||
|
||
# Do the not_na argsort: | ||
# count the missing index values within arrays added to not_na results | ||
notna_na_cumsum = na_mask.cumsum()[~na_mask] | ||
# argsort the values excluding the nans | ||
notna_argsort = np.argsort(values[~na_mask]) | ||
# add to these the indexes where nans have been removed | ||
notna_argsort += notna_na_cumsum[notna_argsort] | ||
|
||
# Do the na argsort: | ||
if na_position is None: | ||
na_argsort = -1 | ||
elif na_position == "first" or na_position == "last": | ||
# count the missing index values within arrays added to na results | ||
na_notna_cumsum = (~na_mask).cumsum()[na_mask] | ||
# argsort the nans | ||
na_argsort = np.arange(n_na) | ||
# add to these the indexes where not nans have been removed | ||
na_argsort += na_notna_cumsum | ||
else: | ||
raise ValueError("`na_position` must be one of {'first', 'last', None}") | ||
|
||
res = self._constructor(result, index=self.index, name=self.name, dtype=np.intp) | ||
return res.__finalize__(self, method="argsort") | ||
# create and combine the Series: | ||
na_res_ser = Series( | ||
na_argsort, index=self.index[na_mask], dtype=np.intp, name=self.name | ||
) | ||
notna_res_ser = Series( | ||
notna_argsort, | ||
index=self.index[notna_argsort], | ||
dtype="int64", | ||
name=self.name, | ||
) | ||
from pandas.core.reshape.concat import concat | ||
|
||
concat_order = [notna_res_ser, na_res_ser] | ||
if na_position == "first": | ||
concat_order = [na_res_ser, notna_res_ser] | ||
ret_ser = concat(concat_order).__finalize__(self, method="argsort") | ||
assert isinstance(ret_ser, Series) # mypy: concat 2 Series so is OK | ||
return ret_ser | ||
|
||
def nlargest(self, n=5, keep="first") -> Series: | ||
""" | ||
|
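To make the behaviour change in the hunk above easier to follow, here is a minimal, dependency-free sketch of both code paths. It uses plain Python lists instead of NumPy arrays and pandas objects, and returns `(label, position)` pairs rather than a Series. The names `is_na`, `legacy_argsort` and `proposed_argsort` are hypothetical and do not appear in the PR; the logic is an approximation of the diff, not the pandas implementation. Note that this PR was closed, so the proposed behaviour should not be assumed to exist in any pandas release.

```python
import math
from itertools import accumulate


def is_na(value):
    """Treat None and float NaN as missing (a simplification of pandas.isna)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def legacy_argsort(values):
    """Removed code path: NA slots stay where they are and are marked -1.

    The non-NA slots are filled, in order, with the argsort of the NA-free
    values, so the positions refer to the filtered list, not the original.
    """
    kept = [v for v in values if not is_na(v)]
    order = iter(sorted(range(len(kept)), key=kept.__getitem__))
    return [-1 if is_na(v) else next(order) for v in values]


def proposed_argsort(values, labels, na_position=None):
    """Proposed code path: positions refer to the original list and the
    labels are reordered to match. Returns a list of (label, position)."""
    if na_position not in (None, "first", "last"):
        raise ValueError("`na_position` must be one of {'first', 'last', None}")

    na_mask = [is_na(v) for v in values]

    # Running count of NAs seen so far, kept only at the non-NA slots.
    # Adding it to a position in the NA-free list recovers the position
    # in the original list (the cumsum step in the diff).
    na_cumsum = [
        count
        for count, missing in zip(accumulate(map(int, na_mask)), na_mask)
        if not missing
    ]
    kept = [v for v, missing in zip(values, na_mask) if not missing]
    order = sorted(range(len(kept)), key=kept.__getitem__)
    notna_positions = [i + na_cumsum[i] for i in order]

    # Original positions of the NA entries, in their original order. This is
    # equivalent to arange(n_na) plus the count of preceding non-NA values.
    na_positions = [i for i, missing in enumerate(na_mask) if missing]

    notna_part = [(labels[p], p) for p in notna_positions]
    na_part = [
        (labels[p], -1 if na_position is None else p) for p in na_positions
    ]
    if na_position == "first":
        return na_part + notna_part
    return notna_part + na_part


values = [30, 10, float("nan"), 20]
labels = ["high", "low", "null", "mid"]
print(legacy_argsort(values))
print(proposed_argsort(values, labels))
print(proposed_argsort(values, labels, na_position="first"))
```

With the docstring's example data, the proposed path pairs `low`, `mid`, `high` with positions 1, 3, 0, which index into the original values, while the legacy path leaves the `-1` marker in the third slot and numbers the rest relative to the NA-free values.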