In read_csv, setting index_col breaks functionality of converters when engine='python' #14379

Closed
gilwolff opened this issue Oct 8, 2016 · 1 comment
Labels: Bug; Duplicate Report (duplicate issue or pull request); IO CSV (read_csv, to_csv)

gilwolff commented Oct 8, 2016

I am experiencing an issue with read_csv.
When both index_col and a converters dict are set, the output with engine='c' is correct, but the output with engine='python' is wrong: the converter is not applied to the index column.

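import pandas as pd
from io import StringIO  # on Python 2, use: from StringIO import StringIO

def first(s):
    # Converter: keep only the first character of each field
    return s[0]

# Default C engine: the converter is applied to the index column
io = StringIO('col1,col2\n1_,a\n2_,b\n')
print(pd.read_csv(io, index_col=0, converters={0: first}))
     col2
col1
1       a
2       b

# Python engine: the converter is not applied to the index column
io = StringIO('col1,col2\n1_,a\n2_,b\n')
print(pd.read_csv(io, index_col=0, converters={0: first}, engine='python'))
     col2
col1
1_      a
2_      b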

Expected output: the second example should match the first.
Note: if I leave out index_col=0, the converter works as expected.
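
Since the converter reportedly works when index_col is omitted, one possible workaround is to read the file without index_col and set the index afterwards. The following is only a minimal sketch under that assumption, not a fix for the parser itself; the variable names are illustrative, and the behavior may differ between pandas versions.

```python
from io import StringIO

import pandas as pd


def first(s):
    # Converter: keep only the first character of each field
    return s[0]


csv_text = "col1,col2\n1_,a\n2_,b\n"

# Read with the Python engine but without index_col, so the converter
# is applied to column 0 as an ordinary data column.
df = pd.read_csv(StringIO(csv_text), converters={0: first}, engine="python")

# Promote the converted column to the index in a separate step.
df = df.set_index("col1")

print(df)
```

This keeps the converter and the index selection as two separate steps, so the result does not depend on how the parser combines index_col with converters.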

Output of pd.show_versions():

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.11-23.53.amzn1.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: None
pip: 8.1.2
setuptools: 12.2
Cython: None
numpy: 1.11.1

jreback added the Bug, Duplicate Report, and IO CSV labels on Oct 9, 2016
jreback added this to the No action milestone on Oct 9, 2016
jreback (Contributor) commented Oct 9, 2016

This is detailed in #9435 (see the last example, by @gfyoung). The logic for handling converters is a bit convoluted at the moment; pull requests to fix it would help.
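
For anyone preparing such a pull request, a small consistency check along the following lines might serve as a starting point. It is a hedged sketch, not part of the pandas test suite: the helper names read_with_engine and engines_agree are hypothetical, and whether the two engines agree depends on the installed pandas version.

```python
from io import StringIO

import pandas as pd

CSV_TEXT = "col1,col2\n1_,a\n2_,b\n"


def first(s):
    # Converter from the report: keep only the first character
    return s[0]


def read_with_engine(engine):
    # Hypothetical helper: same call as in the report, varying only the engine
    return pd.read_csv(
        StringIO(CSV_TEXT),
        index_col=0,
        converters={0: first},
        engine=engine,
    )


def engines_agree():
    # True when both engines produce identical frames, including the index
    c_result = read_with_engine("c")
    python_result = read_with_engine("python")
    return bool(c_result.equals(python_result))


result = engines_agree()
print("engines agree:", result)
```

On a pandas version affected by the behavior reported here, the check would be expected to report a mismatch; a regression test could assert that it returns True once the parser is fixed.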
