Adding a new DataFrame row using dict() gives unexpected behaviour #17072

Closed
mp-v2 opened this issue Jul 25, 2017 · 5 comments
Labels
Bug, Duplicate Report, Indexing

Comments

mp-v2 commented Jul 25, 2017

Code Sample, a copy-pastable example if possible

Link to StackOverflow example

Using pd.Series works fine:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(1, 10, (3, 3)), index=['one', 'one', 'two'], columns=['col1', 'col2', 'col3'])
new_data = pd.Series({'col1': 'new', 'col2': 'new', 'col3': 'new'})
df.iloc[0] = new_data

# resulting df looks like:

#       col1    col2    col3
#one    new     new     new
#one    9       6       1
#two    8       3       7

But if I try to add a dictionary instead, I get this:

new_data = {'col1': 'new', 'col2': 'new', 'col3': 'new'}
df.iloc[0] = new_data
# resulting df looks like:
#         col1  col2    col3
#one      col2  col3    col1
#one      2     1       7
#two      5     8       6
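
A note on the symptom: the first row in the output above holds the dict's keys rather than its values. That would be consistent with the dict being consumed as a plain iterable, although this is only an assumption about what pandas does internally and is not confirmed anywhere in this report. The plain-Python sketch below (no pandas involved) just shows what iterating a dict yields. On Python 2.7, which this report uses, the key order of a dict is arbitrary, which could account for the col2, col3, col1 ordering.

```python
# Iterating over a dict yields its keys, not its values.
new_data = {'col1': 'new', 'col2': 'new', 'col3': 'new'}

keys_seen = list(new_data)             # what a generic "for x in obj" loop sees
values_seen = list(new_data.values())  # what the reporter wanted in the row

print(keys_seen)
print(values_seen)
```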

Problem description

Considering that a dataframe can be created using a dictionary, it seems odd that adding to a dataframe using a dictionary would result in shuffled columns when they have been explicitly labeled.

This behaviour is unexpected. Even if it is not strictly a bug, it does not seem like the best way for this assignment to work.

Expected Output

Adding a row to a dataframe using a dictionary keyed by column header should give the same result as adding the row using pd.Series.
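
One possible way to get the expected row without depending on how pandas interprets a dict is to look up the values by column label first and assign a plain list. This is a minimal sketch, not an official pandas recommendation. The name row_values is made up for illustration, and the object dtype is an assumption added here so that string values fit without a dtype change (the report used an integer frame).

```python
import numpy as np
import pandas as pd

# Object dtype is an assumption made here so that string values fit
# without any dtype change; the report used an integer frame.
df = pd.DataFrame(
    np.random.randint(1, 10, (3, 3)),
    index=['one', 'one', 'two'],
    columns=['col1', 'col2', 'col3'],
).astype(object)

new_data = {'col1': 'new', 'col2': 'new', 'col3': 'new'}

# Look up each value by column label so the row is filled in column order,
# regardless of the order in which the dict stores its keys.
row_values = [new_data[col] for col in df.columns]
df.iloc[0] = row_values

print(df)
```

Because the list is built from df.columns, a key missing from the dict raises a KeyError instead of silently misplacing values.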

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-83-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C
LANG: en_GB.UTF-8
LOCALE: None.None

pandas: 0.20.2
pytest: None
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: None
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.6.1
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: None
lxml: 3.7.3
bs4: None
html5lib: 0.999
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: None
pandas_gbq: None
pandas_datareader: None

mp-v2 changed the title from "Adding a new DataFrame row using dict() shuffles columns" to "Adding a new DataFrame row using dict() gives unexpected behaviour" Jul 25, 2017
Member

gfyoung commented Jul 25, 2017

@mat2py: Thanks for the issue! I tried your example using 0.20.2 on Windows, and I can't seem to reproduce your error (I just copied and pasted your code). Could you reconfirm that you get this error when you copy and paste the code you provided above?

Also, if anyone else can reproduce this (cc @jreback), that would be great to know.

gfyoung added the Can't Repro and Indexing labels Jul 25, 2017
Author

mp-v2 commented Jul 25, 2017

Yes, I can replicate it. However, the issue doesn't occur if you assign the dict to the df after having already assigned the pd.Series. You have to recreate the dataframe and assign only the dict.

What you probably did:

  • create df
  • test with pd.Series (works)
  • test with dict (works)

What doesn't work:

  • create df
  • test with dict (doesn't work)

Author

mp-v2 commented Jul 25, 2017

[image]

gfyoung added the Bug label and removed the Can't Repro label Jul 25, 2017
Member

gfyoung commented Jul 25, 2017

Not on my computer ATM, but can't argue with pictures. Something funny is going on there. PR to patch this is welcome!

Contributor

chris-b1 commented Jul 25, 2017

Thanks for the example. This is a duplicate of #16724.

chris-b1 marked this as a duplicate of #16724 Jul 25, 2017
chris-b1 added the Duplicate Report label Jul 25, 2017
chris-b1 added this to the No action milestone Jul 25, 2017
3 participants