Skip to content

Re-implemented parametrization of test_frame_from_json_to_json #28510

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 20, 2019

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Sep 18, 2019

This piece was broken off of #27838 as it made the diff much larger, so hopefully easier to digest on its own.

As is, parametrization here has brought up a lot of rough edges which are responsible for some of the complexity. These are noted with TODOs and summarized as follows (save Py35 issues, which aren't worth addressing at this point):

  • Frame order is not maintained when numpy=False (default) and orient="index"
  • On windows or 32 bit platforms it appears that np.int64 roundtrips as np.int32 (maybe not an issue?)
  • orient="split" does not preserve strings in the index if those strings are numeric, though it should be able to
  • convert_axes may have surprising behavior when dealing with empty DataFrames
  • DTI seem to roundtrip as strings when written with epoch format for all but `orient="split"

Not all of these are the same priority, but figure worth leaving as follow ups

@WillAyd WillAyd added IO JSON read_json, to_json, json_normalize Testing pandas testing functions or related to the test suite labels Sep 18, 2019
@WillAyd WillAyd changed the title Reverted parametrization of test_frame_from_json_to_json Re-implemented parametrization of test_frame_from_json_to_json Sep 18, 2019
@jbrockmendel
Copy link
Member

I'll take a look. Thanks for separating this out.

)

expected = df.copy()
expected = expected.assign(**expected.select_dtypes("number").astype(np.int64))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont use assign or select_dtypes very often. Is this just casting to int64 for columns A and B?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's correct. Happy to change to another way of casting if you prefer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think df[["A", "B"]] = df[["A", "B"]].astype(np.int64) would be clearer. Might just be me.

if orient == "records" or orient == "values":
expected = expected.reset_index(drop=True)
if orient == "values":
expected.columns = range(len(expected.columns))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-iterating request to share code for ~344-360. As is I have to look at each version for "is this subtly different and if so why"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I don't know I fully understand what you are asking for. This particular test lines up pretty well to the left (lines 463 - 472) but is more explicit about the expected shape of the result. This idiom of resetting particular axes for values and records is used throughout the module (probably worth a dedicated function, but leaving to a follow up)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This idiom of resetting particular axes for values and records is used throughout the module

Yah, thats what I'm asking for. OK for follow-up.

self.empty_frame, check_index_type=False, check_column_type=False
@pytest.mark.parametrize("convert_axes", [True, False])
@pytest.mark.parametrize("numpy", [True, False])
def test_roundtrip_timestamp(self, orient, convert_axes, numpy):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, corresponds to 452-453


@pytest.mark.parametrize("convert_axes", [True, False])
@pytest.mark.parametrize("numpy", [True, False])
def test_roundtrip_empty(self, orient, convert_axes, numpy):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, corresponds to 447-449

@pytest.mark.parametrize("dtype", [False, np.int64])
@pytest.mark.parametrize("convert_axes", [True, False])
@pytest.mark.parametrize("numpy", [True, False])
def test_roundtrip_intframe(self, orient, convert_axes, numpy, dtype):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func LGTM. corresponds to 417-418

@pytest.mark.parametrize("dtype", [False, float])
@pytest.mark.parametrize("convert_axes", [True, False])
@pytest.mark.parametrize("numpy", [True, False])
def test_roundtrip_simple(self, orient, convert_axes, numpy, dtype):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func LGTM. corresponds to 413-415

@jbrockmendel
Copy link
Member

Thanks @WillAyd

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO JSON read_json, to_json, json_normalize Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants