You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It would be nice if it were also supported for dataframes, returning indices to maintain order in a lexicographically sorted dataframe. This is supported by cuDF:
The addition of a searchsorted(value, ...) method to pandas.DataFrame, returning the indices to places the elements of values to maintain order in a dataframe lexicographically sorted by all columns.
API breaking implications
None that I can think of.
Describe alternatives you've considered
This could probably be accomplished by converting the dataframe and values to a series of tuples and then using Series.searchsorted, but I imagine there's a more performant way to do this.
Additional context
If this functionality were added, along with a multi-columnar quantiles (#43881), it would enable Dask dataframes to compute sort_values with multiple sort-by columns, using an algorithm roughly similar to that of dask-cudf.
The text was updated successfully, but these errors were encountered:
Is your feature request related to a problem?
Pandas currently only supports
searchsorted
for series:It would be nice if it were also supported for dataframes, returning indices to maintain order in a lexicographically sorted dataframe. This is supported by cuDF:
Describe the solution you'd like
The addition of a
searchsorted(value, ...)
method topandas.DataFrame
, returning the indices to places the elements ofvalues
to maintain order in a dataframe lexicographically sorted by all columns.API breaking implications
None that I can think of.
Describe alternatives you've considered
This could probably be accomplished by converting the dataframe and values to a series of tuples and then using
Series.searchsorted
, but I imagine there's a more performant way to do this.Additional context
If this functionality were added, along with a multi-columnar
quantiles
(#43881), it would enable Dask dataframes to computesort_values
with multiple sort-by columns, using an algorithm roughly similar to that of dask-cudf.The text was updated successfully, but these errors were encountered: