BUG: Pandas crashes when asked to display a dataframe containing a column with specific content #49195

Closed
3 tasks done
perfectly-preserved-pie opened this issue Oct 20, 2022 · 5 comments
Labels: Bug; Needs Info (Clarification about behavior needed to assess issue); Output-Formatting (__repr__ of pandas objects, to_string)

Comments

@perfectly-preserved-pie

perfectly-preserved-pie commented Oct 20, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
df = pd.read_pickle(filepath_or_buffer='https://github.com/perfectly-preserved-pie/larentals/raw/master/dataframe.pickle')
df

Issue Description

I have a column that contains Dash by Plotly HTML code for a Dash Leaflet app. This column's rows each contain a nested list of lists and dictionaries. Here's an example:

>>> df.iloc[4].popup_html
[Div([A(children=Img(id='mls_photo_div', referrerPolicy='noreferrer', src='https://ik.imagekit.io/theoldesthouse/22208359.jpg?tr=h-300%2Cw-400&ik-sdk-version=python-2.2.8', style={'display': 'block', 'width': '100%', 'margin-left': 'auto', 'margin-right': 'auto'}), href='https://www.bhhscalifornia.com/listing-detail/225-w-kelso-street-7-inglewood-ca-90301_5416097', referrerPolicy='noreferrer', target='_blank')]), Table([Tbody([Tr([Td('Listed Date'), Td('2022-10-13')]), Tr([Td('Street Address'), Td('225 W KELSO ST   #7, Inglewood 90301')]), Tr([Td(A(children='Listing ID (MLS#)', href='https://github.com/perfectly-preserved-pie/larentals/wiki#listing-id', target='_blank')), Td(A(children='22208359', href='https://www.bhhscalifornia.com/listing-detail/225-w-kelso-street-7-inglewood-ca-90301_5416097', referrerPolicy='noreferrer', target='_blank'))]), Tr([Td('List Office Phone'), Td(A(children='310-251-6888', href='tel:310-251-6888'))]), Tr([Td('Rental Price'), Td('$1700')]), Tr([Td('Security Deposit'), Td('$1700')]), Tr([Td('Pet Deposit'), Td('Unknown')]), Tr([Td('Key Deposit'), Td('Unknown')]), Tr([Td('Other Deposit'), Td('Unknown')]), Tr([Td('Square Feet'), Td('800 sq. ft')]), Tr([Td('Price Per Square Foot'), Td('$2.12')]), Tr([Td(A(children='Bedrooms/Bathrooms', href='https://github.com/perfectly-preserved-pie/larentals/wiki#bedroomsbathrooms', target='_blank')), Td('1/1,0,0,0')]), Tr([Td('Garage Spaces'), Td('Unknown')]), Tr([Td('Pets Allowed?'), Td('No')]), Tr([Td('Furnished?'), Td('Furnished Or Unfurnished')]), Tr([Td('Year Built'), Td('1961')]), Tr([Td(A(children='Rental Terms', href='https://github.com/perfectly-preserved-pie/larentals/wiki#rental-terms', target='_blank')), Td('<NA>')]), Tr([Td(A(children='Physical Sub Type', href='https://github.com/perfectly-preserved-pie/larentals/wiki#physical-sub-type', target='_blank')), Td('CONDO')])])])]

When I try to view the dataframe, I get this:

>>> df.head()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 1063, in __repr__
    return self.to_string(**repr_params)
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 1244, in to_string
    return fmt.DataFrameRenderer(formatter).to_string(
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1136, in to_string
    string = string_formatter.to_string()
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/string.py", line 30, in to_string
    text = self._get_string_representation()
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/string.py", line 45, in _get_string_representation
    strcols = self._get_strcols()
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/string.py", line 36, in _get_strcols
    strcols = self.fmt.get_strcols()
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/format.py", line 617, in get_strcols
    strcols = self._get_strcols_without_index()
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/format.py", line 883, in _get_strcols_without_index
    fmt_values = self.format_col(i)
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/format.py", line 897, in format_col
    return format_array(
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1328, in format_array
    return fmt_obj.get_result()
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1359, in get_result
    fmt_values = self._format_strings()
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1422, in _format_strings
    fmt_values.append(f" {_format(v)}")
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1402, in _format
    return str(formatter(x))
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/printing.py", line 222, in pprint_thing
    result = _pprint_seq(
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/printing.py", line 119, in _pprint_seq
    r = [
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/printing.py", line 120, in <listcomp>
    pprint_thing(next(s), _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/printing.py", line 222, in pprint_thing
    result = _pprint_seq(
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/printing.py", line 119, in _pprint_seq
    r = [
  File "/home/straying/.local/lib/python3.10/site-packages/pandas/io/formats/printing.py", line 120, in <listcomp>
    pprint_thing(next(s), _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)
StopIteration

When I drop the column in question, the dataframe is returned successfully:

>>> df = df.drop('popup_html', axis=1)
>>> df.head()
     mls_number  subtype street_number       street_name           City  PostalCode      Br/Ba  DepositKey  ...    Latitude    Longitude  Bedrooms  Total Bathrooms  Full Bathrooms  Half Bathrooms  Three Quarter Bathrooms date_generated
992  TR22223368   RMRT/D          1735     Redgate CIR      Diamond Bar       91765  4/3,0,0,0          50  ...  33.9902729 -117.8321273         4                3               0               0                        0            NaN
993    22208909      APT          6373    Yucca ST   #18    Los Angeles       90028  0/1,0,0,0        <NA>  ...  34.1042777 -118.3289136         0                1               0               0                        0            NaN
994  SB22222749  CONDO/A        7765 W  91st ST   #A2126  Playa del Rey       90293  2/2,0,0,0           0  ...  33.9577482 -118.4317758         2                2               0               0                        0            NaN
995  SR22220740   RMRT/A         23340  Schoolcraft ST       West Hills       91307  1/1,0,0,0           0  ...  34.1959911 -118.6362073         1                1               0               0                        0            NaN
996    22208359    CONDO         225 W     KELSO ST   #7      Inglewood       90301  1/1,0,0,0        <NA>  ...  33.9590288 -118.3584763         1                1               0               0                        0            NaN

[5 rows x 34 columns]

Expected Behavior

The dataframe should be printed out to the terminal successfully without any errors.

I'm 95% sure it has something to do with the way the content in the 'popup_html' column is structured, but that content is critical to my app. Anyone have ideas on what I can do here?

Installed Versions

INSTALLED VERSIONS

commit : 91111fd
python : 3.10.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-128-generic
Version : #144-Ubuntu SMP Tue Sep 20 11:00:04 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.5.1
numpy : 1.23.2
pytz : 2022.2.1
dateutil : 2.8.2
setuptools : 65.3.0
pip : 22.2.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : 1.0.9
fastparquet : None
fsspec : 2022.8.2
gcsfs : None
matplotlib : None
numba : None
numexpr : 2.8.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 9.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : 3.7.0
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None

perfectly-preserved-pie added the Bug and Needs Triage (Issue that has not been reviewed by a pandas team member) labels on Oct 20, 2022
@mzeitlin11
Member

Thanks for reporting this @perfectly-preserved-pie! To investigate this further, it would be helpful to narrow down where things are going wrong. First, can you reproduce this in a more minimal example (without pickle, with as few rows/columns as possible, ideally just the column with rows that fail to display)? Does that column contain a dash/plotly specific type, or can it be reproduced without one?

It's possible that the issue is with printing whatever type is held in that column (does just displaying df["popup_html"] fail?) Also, does the error reproduce with each individual row of your dataframe? It could be that some specific rows somehow contain corrupted data that are raising an error when pandas iterates over them and calls __repr__ on each value.
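A minimal sketch of the row-by-row check suggested above, assuming the column is available as a pandas Series. The helper name `find_undisplayable` is hypothetical and not part of pandas; it simply renders each one-row slice with `Series.to_string()` and records the index labels that raise.

```python
import pandas as pd


def find_undisplayable(series):
    """Return the index labels whose one-row slice fails to render.

    Hypothetical helper for narrowing down which values break display.
    """
    bad_labels = []
    for pos in range(len(series)):
        try:
            # Render a single row; any formatting error surfaces here.
            series.iloc[[pos]].to_string()
        except Exception:
            bad_labels.append(series.index[pos])
    return bad_labels


# Ordinary values render without errors, so nothing is reported.
plain = pd.Series([1, "a", [1, 2]])
print(find_undisplayable(plain))
```

Calling it on the problematic column, for example `find_undisplayable(df["popup_html"])`, would show whether every row fails or only some of them.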

mzeitlin11 added the Output-Formatting (__repr__ of pandas objects, to_string) and Needs Info (Clarification about behavior needed to assess issue) labels and removed the Needs Triage (Issue that has not been reviewed by a pandas team member) label on Oct 20, 2022
@vamsi-verma-s
Contributor

vamsi-verma-s commented Oct 20, 2022

It seems to be caused by the Dash object's behavior. For the Dash object below, len(seq) is 2 even though the object has only one element, so StopIteration is raised on the second iteration.

ipdb> print(seq)
Div([A(children=Img(id='mls_photo_div', referrerPolicy='noreferrer', src='https://ik.imagekit.io/theoldesthouse/22208359.jpg?tr=h-300%2Cw-400&ik-sdk-version=python-2.2.8', style={'display': 'block', 'width': '100%', 'margin-left': 'auto', 'margin-right': 'auto'}), href='https://www.bhhscalifornia.com/listing-detail/225-w-kelso-street-7-inglewood-ca-90301_5416097', referrerPolicy='noreferrer', target='_blank')])
ipdb> len(seq)
2

I am not sure whether there is a better workaround, but converting the column to strings before printing works:

In [16]: df.popup_html = df.popup_html.astype('str')

In [17]: df
Out[17]: 

      mls_number  subtype street_number           street_name           City  ...  Full Bathrooms Half Bathrooms  Three Quarter Bathrooms                                         popup_html  date_generated
992   TR22223368   RMRT/D          1735         Redgate CIR      Diamond Bar  ...               0              0                        0  [Div([A(children=Img(id='mls_photo_div', refer...             NaN
993     22208909      APT          6373        Yucca ST   #18    Los Angeles  ...               0              0                        0  [Div([A(children=Img(id='mls_photo_div', refer...             NaN
994   SB22222749  CONDO/A        7765 W      91st ST   #A2126  Playa del Rey  ...               0              0                        0  [Div([A(children=Img(id='mls_photo_div', refer...             NaN
995   SR22220740   RMRT/A         23340      Schoolcraft ST       West Hills  ...               0              0                        0  [Div([A(children=Img(id='mls_photo_div', refer...             NaN
996     22208359    CONDO         225 W         KELSO ST   #7      Inglewood  ...               0              0                        0  [Div([A(children=Img(id='mls_photo_div', refer...             NaN
...          ...      ...           ...                   ...            ...  ...             ...            ...                      ...                                                ...             ...
2265    22204627      SFR        1718 N     Occidental BLVD      Los Angeles  ...               0              0                        0  [Div([A(children=Img(id='mls_photo_div', refer...           False
2266  WS22210260    SFR/D           125         Robinson ST      Los Angeles  ...               0              0                        0  [Div([A(children=Img(id='mls_photo_div', refer...           False
2267  SR22212601    SFR/A         18047            Erwin ST           Encino  ...               0              0                        0  [Div([A(children=Img(id='mls_photo_div', refer...           False
2268    22204091    CONDO          4310  CAHUENGA BLVD   #303    Toluca Lake  ...               0              1                        0  [Div([A(children=Img(id='mls_photo_div', refer...           False
2269    22203913      SFR          2646          Tilden AVE      Los Angeles  ...               0              1                        0  [Div([A(children=Img(id='mls_photo_div', refer...           False

[1278 rows x 35 columns]

@vamsi-verma-s
Contributor

I guess this needs to be fixed in Dash: __len__ does not match __iter__.

https://github.com/plotly/dash/blob/ac7b37d0a40ffd7e48780e3c84f645c0554eb26f/dash/development/base_component.py#L358-L384
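The mismatch described above can be sketched without Dash or pandas. The `Mismatched` class below is a hypothetical stand-in, not Dash's actual component, and `consume_by_len` only mirrors the shape of the comprehension shown in the traceback (calling `next()` once per `len(seq)`); it is not the pandas source.

```python
class Mismatched:
    """Hypothetical container whose __len__ disagrees with its __iter__."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        # Over-reports by one, mimicking the len(seq) == 2 observation.
        return len(self._items) + 1

    def __iter__(self):
        return iter(self._items)


def consume_by_len(seq):
    # Calls next() len(seq) times, trusting __len__ to match __iter__.
    s = iter(seq)
    return [next(s) for _ in range(len(seq))]


def consume_by_iter(seq):
    # Relies only on __iter__, so the mismatch is harmless.
    return list(seq)


seq = Mismatched(["only child"])
try:
    consume_by_len(seq)
    raised = False
except StopIteration:
    raised = True

print(raised)
print(consume_by_iter(seq))
```

Any consumer that trusts `len()` to bound the number of `next()` calls will hit the same error, which is why the failure surfaces inside pandas even though the inconsistency lives in the stored object.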

@perfectly-preserved-pie
Author

First, can you reproduce this in a more minimal example?

That 'popup_html' code is generated from a CSV spreadsheet like this one using this function. If I go line by line in dataframe.py (basically importing that spreadsheet and executing the code from line 1 to line 507), the error is still the same when I print the df. I can't give an example as I'm about to hit my Google Maps API quota for the month :(

Does that column contain a dash/plotly specific type or can it be reproduced without?

I don't believe so. It's just a list containing other lists and dicts. It's an object dtype.

does just displaying df["popup_html"] fail?

Yes.

Also, does the error reproduce with each individual row of your dataframe?

Also yes. I've spot checked single rows and used random .iloc ranges.

With all that being said, this might be more of a Dash issue than a pandas one, as @vamsi-verma-s said. If that's the case, I'm happy to close out this issue since I don't think it's something you can fix, and @vamsi-verma-s provided a nice workaround by casting the column to a string. That totally works for me; I just need to print the df when I'm debugging something. My webapp reads the whole dataframe just fine.

@huzecong

huzecong commented May 29, 2024

Hi, sorry for reopening an old issue, but I don't believe this was fixed. I actually have a minimal reproducible example of the same bug that doesn't involve any other library:

import pandas as pd
print(pd.__version__)
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df_outer = pd.DataFrame({"a": [{"x": df}]})
print(df_outer)

This is the stack trace I'm seeing, using pandas version 2.2.2:

Stack Trace

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Cell In[1], line 5
      3 df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
      4 df_outer = pd.DataFrame({"a": [{"x": df}]})
----> 5 print(df_outer)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/core/frame.py:1214, in DataFrame.__repr__(self)
   1211     return buf.getvalue()
   1213 repr_params = fmt.get_dataframe_repr_params()
-> 1214 return self.to_string(**repr_params)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/util/_decorators.py:333, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
    327 if len(args) > num_allow_args:
    328     warnings.warn(
    329         msg.format(arguments=_format_argument_list(allow_args)),
    330         FutureWarning,
    331         stacklevel=find_stack_level(),
    332     )
--> 333 return func(*args, **kwargs)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/core/frame.py:1394, in DataFrame.to_string(self, buf, columns, col_space, header, index, na_rep, formatters, float_format, sparsify, index_names, justify, max_rows, max_cols, show_dimensions, decimal, line_width, min_rows, max_colwidth, encoding)
   1375 with option_context("display.max_colwidth", max_colwidth):
   1376     formatter = fmt.DataFrameFormatter(
   1377         self,
   1378         columns=columns,
   (...)
   1392         decimal=decimal,
   1393     )
-> 1394     return fmt.DataFrameRenderer(formatter).to_string(
   1395         buf=buf,
   1396         encoding=encoding,
   1397         line_width=line_width,
   1398     )

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/format.py:962, in DataFrameRenderer.to_string(self, buf, encoding, line_width)
    959 from pandas.io.formats.string import StringFormatter
    961 string_formatter = StringFormatter(self.fmt, line_width=line_width)
--> 962 string = string_formatter.to_string()
    963 return save_to_buffer(string, buf=buf, encoding=encoding)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/string.py:29, in StringFormatter.to_string(self)
     28 def to_string(self) -> str:
---> 29     text = self._get_string_representation()
     30     if self.fmt.should_show_dimensions:
     31         text = f"{text}{self.fmt.dimensions_info}"

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/string.py:44, in StringFormatter._get_string_representation(self)
     41 if self.fmt.frame.empty:
     42     return self._empty_info_line
---> 44 strcols = self._get_strcols()
     46 if self.line_width is None:
     47     # no need to wrap around just print the whole frame
     48     return self.adj.adjoin(1, *strcols)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/string.py:35, in StringFormatter._get_strcols(self)
     34 def _get_strcols(self) -> list[list[str]]:
---> 35     strcols = self.fmt.get_strcols()
     36     if self.fmt.is_truncated:
     37         strcols = self._insert_dot_separators(strcols)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/format.py:476, in DataFrameFormatter.get_strcols(self)
    472 def get_strcols(self) -> list[list[str]]:
    473     """
    474     Render a DataFrame to a list of columns (as lists of strings).
    475     """
--> 476     strcols = self._get_strcols_without_index()
    478     if self.index:
    479         str_index = self._get_formatted_index(self.tr_frame)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/format.py:740, in DataFrameFormatter._get_strcols_without_index(self)
    736 cheader = str_columns[i]
    737 header_colwidth = max(
    738     int(self.col_space.get(c, 0)), *(self.adj.len(x) for x in cheader)
    739 )
--> 740 fmt_values = self.format_col(i)
    741 fmt_values = _make_fixed_width(
    742     fmt_values, self.justify, minimum=header_colwidth, adj=self.adj
    743 )
    745 max_len = max(*(self.adj.len(x) for x in fmt_values), header_colwidth)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/format.py:754, in DataFrameFormatter.format_col(self, i)
    752 frame = self.tr_frame
    753 formatter = self._get_formatter(i)
--> 754 return format_array(
    755     frame.iloc[:, i]._values,
    756     formatter,
    757     float_format=self.float_format,
    758     na_rep=self.na_rep,
    759     space=self.col_space.get(frame.columns[i]),
    760     decimal=self.decimal,
    761     leading_space=self.index,
    762 )

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/format.py:1161, in format_array(values, formatter, float_format, na_rep, digits, space, justify, decimal, leading_space, quoting, fallback_formatter)
   1145     digits = get_option("display.precision")
   1147 fmt_obj = fmt_klass(
   1148     values,
   1149     digits=digits,
   (...)
   1158     fallback_formatter=fallback_formatter,
   1159 )
-> 1161 return fmt_obj.get_result()

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/format.py:1194, in _GenericArrayFormatter.get_result(self)
   1193 def get_result(self) -> list[str]:
-> 1194     fmt_values = self._format_strings()
   1195     return _make_fixed_width(fmt_values, self.justify)

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/format.py:1259, in _GenericArrayFormatter._format_strings(self)
   1257 for i, v in enumerate(vals):
   1258     if (not is_float_type[i] or self.formatter is not None) and leading_space:
-> 1259         fmt_values.append(f" {_format(v)}")
   1260     elif is_float_type[i]:
   1261         fmt_values.append(float_format(v))

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/format.py:1239, in _GenericArrayFormatter._format_strings.<locals>._format(x)
   1236     return repr(x)
   1237 else:
   1238     # object dtype
-> 1239     return str(formatter(x))

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/printing.py:219, in pprint_thing(thing, _nest_lvl, escape_chars, default_escapes, quote_strings, max_seq_items)
    215     return str(thing)
    216 elif isinstance(thing, dict) and _nest_lvl < get_option(
    217     "display.pprint_nest_depth"
    218 ):
--> 219     result = _pprint_dict(
    220         thing, _nest_lvl, quote_strings=True, max_seq_items=max_seq_items
    221     )
    222 elif is_sequence(thing) and _nest_lvl < get_option("display.pprint_nest_depth"):
    223     result = _pprint_seq(
    224         thing,
    225         _nest_lvl,
   (...)
    228         max_seq_items=max_seq_items,
    229     )

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/printing.py:155, in _pprint_dict(seq, _nest_lvl, max_seq_items, **kwds)
    149     nitems = max_seq_items or get_option("max_seq_items") or len(seq)
    151 for k, v in list(seq.items())[:nitems]:
    152     pairs.append(
    153         pfmt.format(
    154             key=pprint_thing(k, _nest_lvl + 1, max_seq_items=max_seq_items, **kwds),
--> 155             val=pprint_thing(v, _nest_lvl + 1, max_seq_items=max_seq_items, **kwds),
    156         )
    157     )
    159 if nitems < len(seq):
    160     return fmt.format(things=", ".join(pairs) + ", ...")

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/printing.py:223, in pprint_thing(thing, _nest_lvl, escape_chars, default_escapes, quote_strings, max_seq_items)
    219     result = _pprint_dict(
    220         thing, _nest_lvl, quote_strings=True, max_seq_items=max_seq_items
    221     )
    222 elif is_sequence(thing) and _nest_lvl < get_option("display.pprint_nest_depth"):
--> 223     result = _pprint_seq(
    224         thing,
    225         _nest_lvl,
    226         escape_chars=escape_chars,
    227         quote_strings=quote_strings,
    228         max_seq_items=max_seq_items,
    229     )
    230 elif isinstance(thing, str) and quote_strings:
    231     result = f"'{as_escaped_string(thing)}'"

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/printing.py:120, in _pprint_seq(seq, _nest_lvl, max_seq_items, **kwds)
    118 s = iter(seq)
    119 # handle sets, no slicing
--> 120 r = [
    121     pprint_thing(next(s), _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)
    122     for i in range(min(nitems, len(seq)))
    123 ]
    124 body = ", ".join(r)
    126 if nitems < len(seq):

File ~/Library/Python/3.10/lib/python/site-packages/pandas/io/formats/printing.py:121, in <listcomp>(.0)
    118 s = iter(seq)
    119 # handle sets, no slicing
    120 r = [
--> 121     pprint_thing(next(s), _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)
    122     for i in range(min(nitems, len(seq)))
    123 ]
    124 body = ", ".join(r)
    126 if nitems < len(seq):

StopIteration:

My interpretation is that this happens because pandas treats the nested DataFrame as a normal sequence and tries to iterate over it, but for DataFrames len(df) != len(list(df)), because the former is the number of rows and the latter is the number of columns.
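The row/column mismatch in this interpretation can be checked directly. The sketch below reuses the frames from the example above and applies the string-casting workaround mentioned earlier in the thread. It deliberately avoids printing `df_outer` itself, since whether that raises depends on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# len() counts rows, while iteration yields column labels.
n_by_len = len(df)
items_by_iter = list(df)
print(n_by_len, items_by_iter)

# Nested frame from the example above.
df_outer = pd.DataFrame({"a": [{"x": df}]})

# Workaround from earlier in the thread: cast to strings before display,
# so the formatter never iterates over the nested DataFrame.
rendered = df_outer.astype(str).to_string()
print(rendered)
```

Here `len(df)` is 3 while `list(df)` has only two entries, so any code that calls `next()` three times on `iter(df)` runs out of items.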

EDIT: Seems like I can't reopen the issue, so I'll wait for someone with permissions to respond. Please also let me know if I should open a new issue instead.
