Empty dataframe does not keep index name when index is used #26325

Closed
MarekHauzr opened this issue May 9, 2019 · 3 comments
Labels
Duplicate Report Duplicate issue or pull request

Comments

@MarekHauzr
Empty dataframe does not keep index name when index is used

import pandas as pd
df = pd.DataFrame({'a': [], 'b': [], 'c': []})
df = df.set_index('c')
print(df.index.name) # prints 'c'
# using index to generate new column
df['d'] = df.index
# looking at the name of the index
print(df.index.name) # shows None

Problem description

I tested this for pandas==0.24.2

When I generate a new column from the data in the index (not only a plain copy, but any transformation of the index), the index name is lost in the special case where the dataframe is empty.

This becomes a problem when I later reset the index: the resulting column is named index instead of c.
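A possible workaround, as a minimal sketch: remember the index name before the assignment and put it back afterwards. This does not fix the underlying behaviour; it only restores the name so that reset_index produces a column called c. The variable names saved_name and restored are just for illustration.

```python
import pandas as pd

# Same empty frame as in the report above.
df = pd.DataFrame({'a': [], 'b': [], 'c': []}).set_index('c')

# Remember the index name before touching the frame.
saved_name = df.index.name

# The assignment that drops the name on an empty frame in pandas 0.24.2.
df['d'] = df.index

# Put the name back; this is harmless if the name was never lost.
df.index = df.index.rename(saved_name)

# The index now comes back as a column named 'c' rather than 'index'.
restored = df.reset_index()
print(list(restored.columns))
```

Restoring the name unconditionally keeps the snippet independent of whether the installed pandas version still shows the problem.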

Expected Output

Expected output is 'c' in both cases (before and after using the index).

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-47-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.utf-8
LANG: en_US.utf-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 3.1.3
pip: 19.0.3
setuptools: 40.5.0
Cython: 0.29
numpy: 1.15.1
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.1.1
sphinx: 1.8.2
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.2.5
bs4: 4.6.3
html5lib: None
sqlalchemy: 1.2.12
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@TomAugspurger
Contributor

Duplicate of #17101

Somewhere in the DataFrame.__setitem__ call, we convert the length-zero Index to a RangeIndex. LMK if you're interested in fixing!
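To see what the assignment does on a given install, a small diagnostic sketch like the following compares the index class and name before and after the DataFrame.__setitem__ call. The helper index_summary is hypothetical, not a pandas function, and no expected output is listed because the result depends on the pandas version.

```python
import pandas as pd


def index_summary(frame):
    """Return (index class name, index name) for a before/after comparison."""
    return type(frame.index).__name__, frame.index.name


df = pd.DataFrame({'a': [], 'b': [], 'c': []}).set_index('c')

before = index_summary(df)
df['d'] = df.index  # goes through DataFrame.__setitem__
after = index_summary(df)

print('pandas', pd.__version__)
print('before:', before)
print('after: ', after)
```

If the two tuples differ, the index was replaced during the assignment rather than only updated in place.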

@TomAugspurger TomAugspurger added the Duplicate Report Duplicate issue or pull request label May 10, 2019
@TomAugspurger TomAugspurger added this to the No action milestone May 10, 2019
@MarekHauzr
Author

@TomAugspurger Yes, I'd be interested in fixing it. It would make my life a little easier, and I don't think I'm the only one. I will be available in a few days, so I can have a look at it then. Is there a standard process for contributing a fix?

@TomAugspurger
Contributor

TomAugspurger commented May 10, 2019 via email

2 participants