-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Can't handle duplicate column names in sql read #3487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@tavistmorph we just pushed updates for dup_columns. Can you try this wil latest master (I am not sure this is the issue, but let's see) |
Can't repro with 0.10.1 or 0.11.0 or master In [1]: import sqlite3
...:
...: import pandas.io.sql as psql
...: con = sqlite3.connect("a2.db")
...: c = con.cursor()
...: try:
...: c.execute('''CREATE TABLE tab (a,b,c);''')
...: except Exception as e:
...: pass
...: try:
...: c.execute("INSERT INTO tab VALUES (1,2,3);")
...: c.execute("INSERT INTO tab VALUES (4,5,6);")
...: c.commit()
...: except Exception as e:
...: pass
...: # print e
...: sql = "select a,b as a from tab where a=1"
...: #c.execute(sql).fetchall()
...: pandas.io.sql.read_frame(sql, con)
Out[1]:
a a
0 1 2
In [3]: print pd.__version__
0.10.1 |
Your code above works fine for me too. But I was running a different case than you. Try it this way and it gives an error for me on 0.11. (BTW -- not sure if it matters, but my database is sql server) import pandas Result: |
I can reproduce that error, but only when the select query returns 0 rows. import sqlite3
import pandas.io.sql as psql
con = sqlite3.connect("a9.db")
c = con.cursor()
try:
c.execute('''CREATE TABLE tab (a,b,c)''')
except:
pass
try:
c.execute("INSERT INTO tab VALUES (1,2,3)")
c.execute("INSERT INTO tab VALUES (4,5,6)")
c.commit()
except:
pass
sql = 'select a as a2 , b , c as b from tab where a="meow"'
pandas.io.sql.read_frame(sql, con)
/home/user1/src/pandas/pandas/core/internals.py in _verify_integrity(self)
1126
1127 if len(self.items) != tot_items:
-> 1128 raise AssertionError('Number of manager items must equal union of '
1129 'block items')
1130
AssertionError: Number of manager items must equal union of block items |
Works with pyodbc as well, unless the select query returns no rows. In [45]: import sqlite3
...: import pyodbc
...:
...: import pandas.io.sql as psql
...: con = sqlite3.connect("a.db")
...: c = con.cursor()
...: try:
...: c.execute('''CREATE TABLE tab (a,b,c)''')
...: except:
...: pass
...: try:
...: c.execute("INSERT INTO tab VALUES (1,2,3)")
...: c.execute("INSERT INTO tab VALUES (4,5,6)")
...: c.commit()
...: except:
...: pass
...: con.close()
...:
...: con = pyodbc.connect('DRIVER={SQLite3};DATABASE=a.db;')
...: sql = 'select a as a2 , b , c as b from tab'
...: pandas.io.sql.read_frame(sql, con)
Out[45]:
a2 b b
0 1 2 3
1 4 5 6 |
@y-p what is actually passed in the 0-case (to |
Interesting.
But this fails with the obscure message:
The main ctor doesn't let you construct an empty df with non-zero dims at all:
but allows it if there are no dims (I remember opening that issue, actually):
In any case, the original issue probably interpreted the behaviour on an empty queryset |
I think we can close this in deference to #3562. |
Possibly related to #3468
Using pandas 0.10.1.
It's legal for sql (at least Sql Server) to have queries that return two columns with the same name. And it's legal for pandas to have dataframes that have two columns with the same name. But pandas gives a very misleading error message when you try it with read_frame:
Query that has 2 columns with the same name:
sql = "select 1 as whatever, 2 as name, 3 as name where 1=0"
pos = pandas.io.sql.read_frame(sql, dbconn)
That gives very misleading error message
The text was updated successfully, but these errors were encountered: