API: Implement new indexing behavior for intervals #27100
Merged
13 commits
1918d5f  API: Implement new interval behavior (jschendel)
cf433b3  add whatsnew (jschendel)
55cf803  additional fixed issues (jschendel)
667acfc  fix failed checks (jschendel)
50c257c  review edits (jschendel)
e713a7e  review edits 2 (jschendel)
7329730  review edits 3 (jschendel)
6c001d4  Merge remote-tracking branch 'upstream/master' into ii-new-behavior (jschendel)
10a7249  Merge remote-tracking branch 'upstream/master' into ii-new-behavior (jschendel)
339e394  Merge remote-tracking branch 'upstream/master' into ii-new-behavior (jorisvandenbossche)
091547e  Merge remote-tracking branch 'upstream/master' into ii-new-behavior (jorisvandenbossche)
9c2f3a9  remove try/except (jorisvandenbossche)
6226bdd  add note on version (jorisvandenbossche)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -484,6 +484,142 @@ This change is backward compatible for direct usage of Pandas, but if you subcla | |
Pandas objects *and* give your subclasses specific ``__str__``/``__repr__`` methods, | ||
you may have to adjust your ``__str__``/``__repr__`` methods (:issue:`26495`). | ||
|
||
.. _whatsnew_0250.api_breaking.interval_indexing: | ||
|
||
|
||
Indexing an ``IntervalIndex`` with ``Interval`` objects | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
Indexing methods for :class:`IntervalIndex` have been modified to require exact matches only for :class:`Interval` queries. | ||
Review comment: can you add a link to the docs you added above in indexing?
||
``IntervalIndex`` methods previously matched on any overlapping ``Interval``. Behavior with scalar points, e.g. querying | ||
with an integer, is unchanged (:issue:`16316`). | ||
|
||
|
||
.. ipython:: python | ||
|
||
ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)]) | ||
ii | ||
|
||
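As a hedged illustration of the unchanged scalar behavior, the sketch below assumes pandas 0.25 or later and queries a ``Series`` with plain numbers. The variable names ``at_two`` and ``at_six`` are illustrative only and do not appear in the pull request.

```python
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])
s = pd.Series(list('abc'), index=ii)

# A scalar point still selects every interval that contains it.
at_two = s.loc[2]  # 2 lies in both (0, 4] and (1, 5], so a Series comes back
at_six = s.loc[6]  # 6 lies only in (5, 8], so a single value comes back

print(at_two)
print(at_six)
```

Only ``Interval`` queries are affected by the stricter matching; point queries like these keep their containment semantics.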
The ``in`` operator (``__contains__``) now only returns ``True`` for exact matches to an ``Interval`` in the ``IntervalIndex``, whereas | |
it previously returned ``True`` for any ``Interval`` overlapping an ``Interval`` in the ``IntervalIndex``. | ||
|
||
*Previous behavior*: | ||
|
||
.. code-block:: python | ||
|
||
In [4]: pd.Interval(1, 2, closed='neither') in ii | ||
Out[4]: True | ||
|
||
In [5]: pd.Interval(-10, 10, closed='both') in ii | ||
Out[5]: True | ||
|
||
*New behavior*: | ||
|
||
.. ipython:: python | ||
|
||
pd.Interval(1, 2, closed='neither') in ii | ||
pd.Interval(-10, 10, closed='both') in ii | ||
|
||
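The following is a minimal sketch of what "exact match" means for membership tests, assuming pandas 0.25 or later. An exact match requires the same endpoints and the same ``closed`` side; the variable names are illustrative.

```python
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])

# Same endpoints and same closed side (the default is closed='right').
exact = pd.Interval(1, 5) in ii

# Overlaps (0, 4] and (1, 5], but is not itself an element of the index.
overlapping = pd.Interval(2, 3) in ii

# Same endpoints as (1, 5] but closed on both sides, so not an exact match.
wrong_closed = pd.Interval(1, 5, closed='both') in ii

print(exact, overlapping, wrong_closed)
```

Code that relied on ``in`` as an overlap test would need to switch to an explicit overlap check such as ``ii.overlaps(...).any()``.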
The :meth:`~IntervalIndex.get_loc` method now only returns locations for exact matches to ``Interval`` queries, as opposed to the previous behavior of | ||
returning locations for overlapping matches. A ``KeyError`` will be raised if an exact match is not found. | ||
|
||
*Previous behavior*: | ||
|
||
.. code-block:: python | ||
|
||
In [6]: ii.get_loc(pd.Interval(1, 5)) | ||
Out[6]: array([0, 1]) | ||
|
||
In [7]: ii.get_loc(pd.Interval(2, 6)) | ||
Out[7]: array([0, 1, 2]) | ||
|
||
*New behavior*: | ||
|
||
.. code-block:: python | ||
|
||
|
||
In [6]: ii.get_loc(pd.Interval(1, 5)) | ||
Out[6]: 1 | ||
|
||
In [7]: ii.get_loc(pd.Interval(2, 6)) | ||
--------------------------------------------------------------------------- | ||
KeyError: Interval(2, 6, closed='right') | ||
|
||
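Because a missing exact match now raises ``KeyError``, callers that previously inspected the returned array may want to handle the exception instead. A small sketch, assuming pandas 0.25 or later; ``exact_loc`` is a hypothetical helper, not part of pandas.

```python
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])


def exact_loc(index, interval):
    """Hypothetical helper: position of an exact match, or None if absent."""
    try:
        return index.get_loc(interval)
    except KeyError:
        return None


found = exact_loc(ii, pd.Interval(1, 5))    # (1, 5] is the element at position 1
missing = exact_loc(ii, pd.Interval(2, 6))  # only overlaps, so no exact match

print(found, missing)
```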
Likewise, :meth:`~IntervalIndex.get_indexer` and :meth:`~IntervalIndex.get_indexer_non_unique` will also only return locations for exact matches | ||
to ``Interval`` queries, with ``-1`` denoting that an exact match was not found. | ||
|
||
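The ``-1`` convention can be sketched as follows, assuming pandas 0.25 or later. The example uses a separate non-overlapping index for ``get_indexer``, on the assumption that this method rejects overlapping indexes, and the overlapping ``ii`` for ``get_indexer_non_unique``. All variable names are illustrative.

```python
import pandas as pd

# Non-overlapping index: (0, 2], (2, 4], (4, 6].
unique_ii = pd.IntervalIndex.from_breaks([0, 2, 4, 6])
targets = pd.IntervalIndex.from_tuples([(2, 4), (1, 3), (4, 6)])

# (1, 3] overlaps two intervals but matches none exactly, so its slot is -1.
unique_hits = unique_ii.get_indexer(targets)

# Overlapping index: use get_indexer_non_unique, which also reports
# the positions of the queries that had no exact match.
ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])
queries = pd.IntervalIndex.from_tuples([(1, 5), (2, 6)])
indexer, missing = ii.get_indexer_non_unique(queries)

print(unique_hits)
print(indexer, missing)
```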
These indexing changes extend to querying a :class:`Series` or :class:`DataFrame` with an ``IntervalIndex`` index. | ||
|
||
.. ipython:: python | ||
|
||
s = pd.Series(list('abc'), index=ii) | ||
s | ||
|
||
Selecting from a ``Series`` or ``DataFrame`` using ``[]`` (``__getitem__``) or ``loc`` now only returns exact matches for ``Interval`` queries. | ||
|
||
*Previous behavior*: | ||
|
||
.. code-block:: python | ||
|
||
In [8]: s[pd.Interval(1, 5)] | ||
Out[8]: | ||
(0, 4] a | ||
(1, 5] b | ||
dtype: object | ||
|
||
In [9]: s.loc[pd.Interval(1, 5)] | ||
Out[9]: | ||
(0, 4] a | ||
(1, 5] b | ||
dtype: object | ||
|
||
*New behavior*: | ||
|
||
.. ipython:: python | ||
|
||
s[pd.Interval(1, 5)] | ||
s.loc[pd.Interval(1, 5)] | ||
|
||
Similarly, a ``KeyError`` will be raised for non-exact matches instead of returning overlapping matches. | ||
|
||
*Previous behavior*: | ||
|
||
.. code-block:: python | ||
|
||
In [9]: s[pd.Interval(2, 3)] | ||
Out[9]: | ||
(0, 4] a | ||
(1, 5] b | ||
dtype: object | ||
|
||
In [10]: s.loc[pd.Interval(2, 3)] | ||
Out[10]: | ||
(0, 4] a | ||
(1, 5] b | ||
dtype: object | ||
|
||
*New behavior*: | ||
|
||
.. code-block:: python | ||
|
||
|
||
In [6]: s[pd.Interval(2, 3)] | ||
--------------------------------------------------------------------------- | ||
KeyError: Interval(2, 3, closed='right') | ||
|
||
In [7]: s.loc[pd.Interval(2, 3)] | ||
--------------------------------------------------------------------------- | ||
KeyError: Interval(2, 3, closed='right') | ||
|
||
The :meth:`~IntervalIndex.overlaps` method can be used to create a boolean indexer that replicates the | ||
previous behavior of returning overlapping matches. | ||
|
||
*New behavior*: | ||
|
||
.. ipython:: python | ||
|
||
idxr = s.index.overlaps(pd.Interval(2, 3)) | ||
idxr | ||
s[idxr] | ||
s.loc[idxr] | ||
|
||
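The same pattern should carry over to a ``DataFrame`` with an ``IntervalIndex``. The sketch below assumes pandas 0.25 or later; ``loc_overlapping`` is a hypothetical helper and the ``value`` column is made up for illustration.

```python
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])
df = pd.DataFrame({'value': [10, 20, 30]}, index=ii)


def loc_overlapping(frame, interval):
    """Hypothetical helper: rows whose index interval overlaps ``interval``."""
    return frame.loc[frame.index.overlaps(interval)]


# Exact match: a single row, returned as a Series.
exact_row = df.loc[pd.Interval(1, 5)]

# Overlap-based selection that mimics the pre-0.25 behavior.
overlap_rows = loc_overlapping(df, pd.Interval(2, 3))

print(exact_row)
print(overlap_rows)
```

Keeping the overlap logic in one helper makes it explicit at each call site whether exact or overlapping semantics are intended.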
.. _whatsnew_0250.api_breaking.deps: | ||
|
||
Increased minimum versions for dependencies | ||
|
@@ -686,7 +822,7 @@ Categorical | |
|
||
- Bug in :func:`DataFrame.at` and :func:`Series.at` that would raise exception if the index was a :class:`CategoricalIndex` (:issue:`20629`) | ||
- Fixed bug in comparison of ordered :class:`Categorical` that contained missing values with a scalar which sometimes incorrectly resulted in ``True`` (:issue:`26504`) | ||
- | ||
- Bug in :meth:`DataFrame.dropna` when the :class:`DataFrame` has a :class:`CategoricalIndex` containing :class:`Interval` objects incorrectly raised a ``TypeError`` (:issue:`25087`) | ||
|
||
Datetimelike | ||
^^^^^^^^^^^^ | ||
|
@@ -764,6 +900,7 @@ Interval | |
|
||
- Construction of :class:`Interval` is restricted to numeric, :class:`Timestamp` and :class:`Timedelta` endpoints (:issue:`23013`) | ||
- Fixed bug in :class:`Series`/:class:`DataFrame` not displaying ``NaN`` in :class:`IntervalIndex` with missing values (:issue:`25984`) | ||
- Bug in :meth:`IntervalIndex.get_loc` where a ``KeyError`` would be incorrectly raised for a decreasing :class:`IntervalIndex` (:issue:`25860`) | ||
- Bug in :class:`Index` constructor where passing mixed closed :class:`Interval` objects would result in a ``ValueError`` instead of an ``object`` dtype ``Index`` (:issue:`27172`) | ||
|
||
Indexing | ||
|
@@ -778,6 +915,7 @@ Indexing | |
- Fixed bug where assigning a :class:`arrays.PandasArray` to a :class:`pandas.core.frame.DataFrame` would raise error (:issue:`26390`) | ||
- Allow keyword arguments for callable local reference used in the :meth:`DataFrame.query` string (:issue:`26426`) | ||
- Bug which produced ``AttributeError`` on partial matching :class:`Timestamp` in a :class:`MultiIndex` (:issue:`26944`) | ||
- Bug in :class:`Categorical` and :class:`CategoricalIndex` with :class:`Interval` values when using the ``in`` operator (``__contains__``) with objects that are not comparable to the values in the ``Interval`` (:issue:`23705`) | |
- Bug in :meth:`DataFrame.loc` and :meth:`DataFrame.iloc` on a :class:`DataFrame` with a single timezone-aware datetime64[ns] column incorrectly returning a scalar instead of a :class:`Series` (:issue:`27110`) | ||
- | ||
|
||
|