pd.Index([ ('b', 'c'), 'a']).drop(['a', ('b', 'c')]) raises ValueError #18304

toobaz · 2017-11-15T14:17:53Z

Code Sample, a copy-pastable example if possible

pietro@debiousci:~$ PYTHONHASHSEED=5 python3 -c "import pandas as pd; s1 = pd.Series([0,1], name='a'); s2 = pd.Series([2,3], name=('b', 'c')); print(pd.crosstab(s1, s2))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/pietro/nobackup/repo/pandas/pandas/core/reshape/pivot.py", line 466, in crosstab
    dropna=dropna, **kwargs)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/frame.py", line 4462, in pivot_table
    margins_name=margins_name)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/reshape/pivot.py", line 82, in pivot_table
    agged = grouped.agg(aggfunc)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/groupby.py", line 4191, in aggregate
    return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/groupby.py", line 3632, in aggregate
    return self._python_agg_general(arg, *args, **kwargs)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/groupby.py", line 873, in _python_agg_general
    return self._wrap_aggregated_output(output)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/groupby.py", line 4254, in _wrap_aggregated_output
    agg_labels = self._obj_with_exclusions._get_axis(agg_axis)
  File "pandas/_libs/properties.pyx", line 39, in pandas._libs.properties.cache_readonly.__get__ (pandas/_libs/properties.c:1604)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/base.py", line 235, in _obj_with_exclusions
    return self.obj.drop(self.exclusions, axis=1)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/generic.py", line 2517, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/generic.py", line 2549, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/indexes/base.py", line 3750, in drop
    labels = _index_labels_to_array(labels)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/common.py", line 417, in _index_labels_to_array
    labels = _asarray_tuplesafe(labels)
  File "/home/pietro/nobackup/repo/pandas/pandas/core/common.py", line 386, in _asarray_tuplesafe
    result = np.asarray(values, dtype=dtype)
  File "/home/pietro/.local/lib/python3.5/site-packages/numpy/core/numeric.py", line 531, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence

Compare to:

pietro@debiousci:~$ PYTHONHASHSEED=6 python3 -c "import pandas as pd; s1 = pd.Series([0,1], name='a'); s2 = pd.Series([2,3], name=('b', 'c')); print(pd.crosstab(s1, s2))"
('b', 'c')  2  3
a               
0           1  0
1           0  1

Problem description

The above happens (pseudo-)randomly with Python 3 and, it seems, always with Python 2.
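One plausible reading of the seed dependence, offered as an assumption rather than something this report confirms: the labels reaching `drop` come from a hash-ordered container such as a `set`, so whether `'a'` or `('b', 'c')` comes first changes with `PYTHONHASHSEED`. The stdlib-only sketch below (the helper name `set_order` is hypothetical) shows only that ordering effect and does not call pandas at all.

```python
import os
import subprocess
import sys

# Child program: print the iteration order of a set holding a str and a tuple.
# Assumption: a set like this stands in for the labels handed to drop().
CHILD = "print(list({'a', ('b', 'c')}))"


def set_order(seed):
    """Return the set's printed iteration order under the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


# Collect the order seen for a handful of seeds; it is not the same for all seeds
# in general, because str hashes are randomized per seed.
orders = {seed: set_order(seed) for seed in range(10)}
for seed, order in orders.items():
    print(seed, order)
```

If that reading is right, the failing runs would be the ones where the string label is iterated first, which matches the order used in the reproducer in the issue title.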

Expected Output

The output of the PYTHONHASHSEED=6 case above.

Output of `pd.show_versions()`

In [2]: pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.22.0.dev0+131.g63e8527d3
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.7.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

toobaz · 2017-11-15T14:24:40Z

> The above happens (pseudo-)randomly with Python 3 and, it seems, always with Python 2.

It also sometimes works in Python 2.

toobaz changed the title ~~pd.crosstab on Series with tuple name randomly throws ValueError~~ pd.Index([ ('b', 'c'), 'a']).drop(['a', ('b', 'c')]) raises ValueError Nov 15, 2017

toobaz added a commit to toobaz/pandas that referenced this issue Nov 15, 2017

BUG: cast to correct dtype in Index.drop()

0336266

closes pandas-dev#18304

toobaz added a commit to toobaz/pandas that referenced this issue Nov 15, 2017

BUG: cast to correct dtype in Index.drop()

a599512

closes pandas-dev#18304

toobaz added a commit to toobaz/pandas that referenced this issue Nov 15, 2017

BUG: cast to correct dtype in Index.drop()

7614bf3

closes pandas-dev#18304

toobaz mentioned this issue Nov 15, 2017

BUG: cast to correct dtype in Index.drop() #18309

Merged

4 tasks

jreback added the Bug, Dtype Conversions (unexpected or buggy dtype conversions), and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Nov 16, 2017

jreback added this to the 0.22.0 milestone Nov 16, 2017

toobaz mentioned this issue Nov 16, 2017

pd.crosstab(s1, s2) keeps dummy MultiIndex as columns if both s1 and s2 have tuple name #18321

Closed

toobaz added a commit to toobaz/pandas that referenced this issue Dec 28, 2017

BUG: cast to correct dtype in Index.drop()

69cf672

closes pandas-dev#18304

jreback closed this as completed in #18309 Dec 29, 2017

jreback pushed a commit that referenced this issue Dec 29, 2017

BUG: cast to correct dtype in Index.drop() (#18309)

4883a43

closes #18304

hexgnu pushed a commit to hexgnu/pandas that referenced this issue Jan 1, 2018

BUG: cast to correct dtype in Index.drop() (pandas-dev#18309)

7c38c20

closes pandas-dev#18304
