BUG: asfreq silently drops rows when index is not sorted #39805

Closed
Gerenuk opened this issue Feb 14, 2021 · 3 comments · Fixed by #40384
Labels: Datetime (Datetime data dtype), Frequency (DateOffsets)

Comments


Gerenuk commented Feb 14, 2021

pd.DataFrame.asfreq seems to silently drop rows when the data is not sorted.

import pandas as pd

# Index deliberately out of order: Jan, Mar, Feb
idx = pd.to_datetime(["2021-01-01", "2021-03-01", "2021-02-01"])
d = pd.DataFrame(range(3), index=idx).asfreq("MS")
d
# output:
# 2021-01-01  0
# 2021-02-01  2

Silent data loss is very dangerous in data analysis as it's hard to detect.
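One way to surface this kind of loss is to compare the index labels before and after the call. The sketch below is only an illustration: `dropped_labels` is a hypothetical helper name, and the `reported` frame is built by hand with `reindex` to mirror the two-row output shown above, so that the example behaves the same on any pandas version.

```python
import pandas as pd


def dropped_labels(before, after):
    """Hypothetical helper: index labels present in `before` but absent from `after`."""
    return before.index.difference(after.index)


idx = pd.to_datetime(["2021-01-01", "2021-03-01", "2021-02-01"])
df = pd.DataFrame({"value": range(3)}, index=idx)

# Assumption: simulate the reported two-row result instead of calling asfreq,
# because the buggy behavior depends on the installed pandas version.
reported = df.reindex(pd.to_datetime(["2021-01-01", "2021-02-01"]))

lost = dropped_labels(df, reported)
print(lost)  # the 2021-03-01 row is missing from the result
```

Note that asfreq also drops timestamps that do not fall on the target frequency grid, so a non-empty result from such a check is a prompt to investigate rather than proof of a bug.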

Expected behavior:

  • an exception, because dropping rows is not useful
  • or automatic sorting (with possibly a parameter in asfreq to be explicit)
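Both options can be approximated in user code today. The following is a minimal sketch, not the eventual pandas fix: `asfreq_checked` and its `sort` parameter are hypothetical names introduced here for illustration. It raises on an unsorted index by default and sorts first when asked to.

```python
import pandas as pd


def asfreq_checked(obj, freq, sort=False):
    """Hypothetical wrapper: raise on an unsorted index, or sort it before asfreq."""
    if not obj.index.is_monotonic_increasing:
        if not sort:
            raise ValueError("index must be sorted before calling asfreq")
        obj = obj.sort_index()
    return obj.asfreq(freq)


idx = pd.to_datetime(["2021-01-01", "2021-03-01", "2021-02-01"])
df = pd.DataFrame({"value": range(3)}, index=idx)

# Explicit opt-in to sorting keeps all three rows.
fixed = asfreq_checked(df, "MS", sort=True)
print(fixed)
```

Calling `asfreq_checked(df, "MS")` without `sort=True` raises a `ValueError` instead of returning a shortened frame.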
INSTALLED VERSIONS
------------------
commit           : 9d598a5e1eee26df95b3910e3f2934890d062caa
python           : 3.8.5.final.0
pandas           : 1.2.1
numpy            : 1.19.2
Gerenuk added the Bug and Needs Triage labels on Feb 14, 2021
Contributor

nmay231 commented Feb 16, 2021

The source of the issue seems to be on this line.

I'm willing to take on this issue. I just need to know which behavior is preferred.

I would lean towards handling this implicitly. I don't see a need to raise an error or warning.

MarcoGorelli added the Datetime and Frequency labels and removed the Bug and Needs Triage labels on Feb 17, 2021
@MarcoGorelli
Member

Thanks @Gerenuk for the excellent report - I don't know what the correct solution would be, but can confirm it reproduces

Contributor

nmay231 commented Mar 11, 2021

take
