
read_csv : string type not used for multiindex column #9849


Closed
julienvienne opened this issue Apr 10, 2015 · 1 comment
Labels
Bug; Dtype Conversions (unexpected or buggy dtype conversions); IO CSV (read_csv, to_csv)

Comments

@julienvienne

Hello,
I have a CSV file like this:

POSTE;DATE;RR;TN;TX;SIGMA
06088001;20150316;8,1;7,4;14,1;11
06088001;20150317;0,6;9,8;15,6;0

The first column is an 8-character code that should be handled as a string.
The second column is a date. Both columns are designated as index columns with the index_col argument:

>>> from pandas import read_csv
>>> df = read_csv(filename, sep=";",
...               header=0,
...               parse_dates=[1],
...               decimal=",",
...               index_col=[0, 1],
...               dtype={"POSTE": str})
>>> df
                       RR    TN    TX  SIGMA
POSTE    DATE                               
6088001  2015-03-16   8.1   7.4  14.1     11
         2015-03-17   0.6   9.8  15.6      0

As you can see, the leading "0" is missing from the POSTE values because the column has been cast to integer, despite the dtype={"POSTE":str} argument. I also tried dtype={"POSTE":object}, as suggested on forums, but it gives the same result.

Apparently this occurs only when the column is used as an index column.
The column is not converted if I use only DATE for the index:

>>> df = read_csv(filename, sep=";",
...               header=0,
...               parse_dates=[1],
...               decimal=",",
...               index_col=[1],
...               dtype={"POSTE": str})
>>> df
            POSTE       RR    TN    TX  SIGMA
DATE                                         
2015-03-16  06088001   8.1   7.4  14.1     11
2015-03-17  06088001   0.6   9.8  15.6      0

I'm using pandas 0.14.1, so this may already have been fixed in a more recent version.
Can you confirm that this is not the intended behaviour?

Regards

@jreback
Contributor

jreback commented Apr 11, 2015

This is a duplicate of #9435; thanks for the report.

The workaround is to read the file without an index column, then call .set_index().
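A minimal sketch of that workaround, using the sample rows from the report. The in-memory `csv_text` string stands in for the reporter's file, and referring to the date column by name rather than by position is an assumption made for readability; neither appears in the original thread.

```python
import io

import pandas as pd

# Sample data from the report, held in memory instead of a file on disk.
csv_text = (
    "POSTE;DATE;RR;TN;TX;SIGMA\n"
    "06088001;20150316;8,1;7,4;14,1;11\n"
    "06088001;20150317;0,6;9,8;15,6;0\n"
)

# Step 1: read without index_col so that dtype={"POSTE": str} is applied
# to POSTE as an ordinary column.
df = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",
    header=0,
    parse_dates=["DATE"],
    decimal=",",
    dtype={"POSTE": str},
)

# Step 2: build the MultiIndex afterwards; POSTE keeps its string values.
df = df.set_index(["POSTE", "DATE"])

print(df.index.get_level_values("POSTE").tolist())
```

Because POSTE is already a string column when `set_index` is called, the leading zero should survive in the resulting index level.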

jreback closed this as completed Apr 11, 2015
jreback added the Bug, Dtype Conversions, and IO CSV labels Apr 11, 2015