doc generation fails on windows #5142

jankatins opened this issue Oct 7, 2013 · 23 comments
Labels
Docs Windows Windows OS

@jankatins
Contributor

[...]
reading sources... [ 75%] io
Exception occurred while building, starting debugger:
Traceback (most recent call last):
  File "c:\portabel\python27\lib\site-packages\sphinx\cmdline.py", line 246, in main
    app.build(force_all, filenames)
  File "c:\portabel\python27\lib\site-packages\sphinx\application.py", line 212, in build
    self.builder.build_update()
  File "c:\portabel\python27\lib\site-packages\sphinx\builders\__init__.py", line 214, in build_update
    'out of date' % len(to_build))
  File "c:\portabel\python27\lib\site-packages\sphinx\builders\__init__.py", line 234, in build
    purple, length):
  File "c:\portabel\python27\lib\site-packages\sphinx\builders\__init__.py", line 134, in status_iterator
    for item in iterable:
  File "c:\portabel\python27\lib\site-packages\sphinx\environment.py", line 470, in update_generator
    self.read_doc(docname, app=app)
  File "c:\portabel\python27\lib\site-packages\sphinx\environment.py", line 621, in read_doc
    raise SphinxError(str(err))
SphinxError: 'utf8' codec can't decode byte 0x84 in position 25: invalid start byte

The problem is the following line in io.rst:

   data = 'word,length\nTr\xe4umen,7\nGr\xfc\xdfe,5'
@jreback
Contributor

jreback commented Oct 8, 2013

I have seen this error when the local encoding is used (and it is ASCII). We need to run the Sphinx build with UTF-8 encoding enabled... not sure how to do that?
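
On the question of how to run a build with UTF-8 enabled: one standard knob is the PYTHONIOENCODING environment variable, which sets the encoding of the interpreter's standard streams. The sketch below (written for Python 3; the helper name is made up for illustration) only shows that a child interpreter picks the setting up. Whether this is enough for the Sphinx build in this issue is not established in this thread.

```python
import os
import subprocess
import sys


def stdout_encoding_of_child(io_encoding):
    """Start a child interpreter with PYTHONIOENCODING set and
    return the value it reports for sys.stdout.encoding."""
    # Copy the current environment and override only the one variable.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return out.decode("ascii").strip()


if __name__ == "__main__":
    print(stdout_encoding_of_child("utf-8"))
```

The same `env` mapping could, in principle, be passed to whatever command starts the doc build.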

@cpcloud
Member

cpcloud commented Oct 8, 2013

@JanSchulz Can you post the output of

import locale
print(locale.getlocale())
print(locale.getdefaultlocale())

Thanks.
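
A slightly wider diagnostic can be sketched along the same lines. The function name and the choice of fields are assumptions for illustration, and the code targets Python 3 (where `locale.getdefaultlocale()` is deprecated, so it is left out here).

```python
import locale
import sys


def encoding_report():
    """Collect the encoding-related settings of the current interpreter."""
    return {
        "locale": locale.getlocale(),
        "preferred_encoding": locale.getpreferredencoding(False),
        # May be None when stdout is replaced by an object without an encoding.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print("{}: {}".format(name, value))
```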

@jorisvandenbossche
Member

I have the same problem, and it was also already discussed on the mailing list:
https://groups.google.com/forum/#!searchin/pydata/joris$20van$20den$20bossche/pydata/E90BXNLF-_E/DosVkLr_cNIJ
The conclusion from @jseabold was that there isn't really a solution.

@jreback
Contributor

jreback commented Oct 14, 2013

closing this as not a bug

@jreback jreback closed this as completed Oct 14, 2013
@jankatins
Contributor Author

Just read the mailing list discussion: pity :-(

@jorisvandenbossche
Member

@JanSchulz I have a standard change to io.rst (removing the line with the byte escapes) in my git stash that I can apply each time I work on the docs... Not a real solution, but it works.

@jankatins
Contributor Author

An idea would be to add this to html():

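import io
import os

# Raw strings: io.rst contains the escape sequences as literal text,
# so Python must not interpret the backslashes here.
_bad = r"'word,length\nTr\xe4umen,7\nGr\xfc\xdfe,5'"
_good = r"'word,length\nTraumen,7\nGruse,5'"


def _swap_in_io_rst(old, new, path="source/io.rst"):
    # 'rw' is not a valid mode, so read and write in two separate steps.
    # utf-8 is a superset of ascii, so an ASCII-only file reads the same.
    with io.open(path, 'r', encoding='utf-8') as f:
        io_doc = f.read()
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(io_doc.replace(old, new))


if os.name == 'nt':
    _swap_in_io_rst(_bad, _good)
# old stuff
if os.name == 'nt':
    _swap_in_io_rst(_good, _bad)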

@jankatins
Contributor Author

@jorisvandenbossche @jreback Would you accept such an addition to html() to make doc generation work on Windows?

@jorisvandenbossche
Member

@JanSchulz I would be interested, but I don't know if it could be included.

However, the above code snippet is not working for me.

@jankatins
Contributor Author

I think the real cause of this bug is in https://github.com/pydata/pandas/blob/master/doc/sphinxext/ipython_directive.py#L356

Surrounding that line with a try: ... except: print(output) results in some dataframe output with umlauts. So it seems that the problem is not the input but the output of that call in the doc.

On my Windows machine, sys.stdout.encoding is 'cp850' (when Python is started from cmd; every other environment replaces that...), and I have the feeling (IPython internals make me dizzy... :-) ) that the IPython interpreter uses it to encode the result and write it to the redirected sys.stdout (a cStringIO during code execution).
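
The mismatch can be reproduced in isolation. This is a minimal sketch in Python 3, not the directive's actual code path: it only shows that the cp850 byte for "ä" is the 0x84 named in the traceback, and that this byte cannot be decoded as UTF-8.

```python
text = "Träumen"

# What a cp850 console encoding produces for the umlaut.
raw = text.encode("cp850")
assert raw == b"Tr\x84umen"

# Decoding those bytes as UTF-8 fails: 0x84 cannot start a UTF-8 sequence.
message = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    message = str(err)
print(message)

# Decoding with the encoding that produced the bytes round-trips cleanly.
assert raw.decode("cp850") == text
```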

So changing that line to

-            ret.append(output.decode('utf-8'))
+            ret.append(output.decode(sys.stdout.encoding))

will compile the docs for me.

So if this is right, everybody who has sys.stdout.encoding == "utf-8" will be able to compile the docs and everybody with something else should not be able to compile the docs.

@jorisvandenbossche: could you try this patch?

On the other hand, the ipython_directive.py included with IPython has simply changed that line to ret.append(output) (https://github.com/ipython/ipython/blob/master/IPython/sphinxext/ipython_directive.py#L350). I wonder if it would be better to use this directly instead of a duplicated copy?

Anyway: could someone reopen this bug so that it can be fixed properly? @jreback

@jreback jreback reopened this Oct 21, 2013
@jorisvandenbossche
Member

@JanSchulz It does work for me on Windows with ret.append(output.decode(sys.stdout.encoding)); the plain ret.append(output), as in IPython, does not work.
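
A more defensive variant of the decode step could look like the sketch below. The helper name and the fallback policy are assumptions for illustration (they are not what the directive does), and the code is written for Python 3.

```python
import sys


def decode_captured(output, encoding=None):
    """Decode captured bytes with the console encoding, falling back to UTF-8."""
    if not isinstance(output, bytes):
        return output  # already text, nothing to decode
    # sys.stdout.encoding can be None when stdout is redirected.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    try:
        return output.decode(encoding)
    except UnicodeDecodeError:
        # Last resort: keep going and mark the undecodable bytes.
        return output.decode("utf-8", errors="replace")
```

Falling back with `errors="replace"` trades a failed build for a replacement character in the output, which may or may not be the right trade-off for documentation.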

About the duplicated copy, I opened an issue recently: #5221.

@jankatins
Contributor Author

It worked when I replaced the embedded copy with the one from the normal IPython installation (simply add IPython.sphinxext. in front of both the ipython lines in doc/source/conf.py and delete both ipython*.py files in doc/sphinxext).

So what's better: replacing the embedded copy (if it works for the rest as well) or just adding the above fix (if that works for all)?
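
For reference, the swap to the installed copy might look roughly like this in conf.py. The two module names are assumed from IPython's sphinxext package, and the first list entry is a placeholder for illustration rather than the project's actual configuration.

```python
# doc/source/conf.py (fragment)
extensions = [
    "sphinx.ext.autodoc",  # placeholder entry, for illustration only
    # Previously the bare names, resolved from the embedded doc/sphinxext copy.
    "IPython.sphinxext.ipython_directive",
    "IPython.sphinxext.ipython_console_highlighting",
]
```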

@jorisvandenbossche
Member

And you didn't get other warnings because of using IPython's own sphinxext? (then that would already go a long way to solving #5221 for ipython_directive)

@jankatins
Contributor Author

There were lots of warnings, but there were already lots of warnings when the embedded extension was used. The api doc seemed to be ok...

I will look at the diff of ipython_directive (between the current one from pandas and the current one in IPython) and see whether something should be upstreamed.

@jorisvandenbossche
Member

How is this coming? I think the better option would be to go with IPython's own sphinxext, but that may be some more work. So maybe your fix could be a quick temporary fix to at least fix the doc building on windows for now?

@jtratner
Contributor

jtratner commented Nov 6, 2013

I think the problem is actually numpydoc (i.e. that it's producing a malformed table). But we could follow this SO answer and prevent the building of any method called 'flags'.

@jorisvandenbossche
Member

@jtratner I think you are talking about #5331 instead of this one?

@ghost

ghost commented Jan 24, 2014

@jorisvandenbossche , are we good here now?

@jorisvandenbossche
Member

Yes, I was also thinking this could be closed. It's not fixed as such, since the output created on Windows is not 'nice', but the actual issue here (the doc build failing) is solved.
You said you were going to give it another look (#5925 (comment)), but for me it's not that important anymore (the docs look nice on Linux and on the website, and they build on Windows, so everybody is happy).

So I think we can say: closed by #5925

@ghost

ghost commented Jan 24, 2014

you have awesome collab powers @jorisvandenbossche, use them!

@ghost ghost closed this as completed Jan 24, 2014
@jorisvandenbossche
Member

hmm, @y-p and you can sometimes write such cryptic messages :-)

@ghost

ghost commented Jan 24, 2014

What I meant is: you can and should exercise your collab privileges, such as closing
issues and merging PRs. You have them because we trust your judgment. So close and
merge stuff. often.

@jorisvandenbossche
Member

ok!

Luisruizbcn pushed a commit to Luisruizbcn/Onerepositorio that referenced this issue May 3, 2024
The doc generation failed under windows due to problems with sphinx
and encoded umlauts in code (see links in
pandas-dev/pandas#5142).

The workaround is to replace the offending text with one which does not fail
(but which makes the example a bit pointless), build the docs and restore the
old text.
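
The replace, build, restore sequence described in this commit message can be sketched as a context manager, so that the original text comes back even if the build raises. This is an illustration in Python 3, not the code from the referenced commit; the function name and parameters are made up.

```python
import contextlib
import io


@contextlib.contextmanager
def temporarily_replaced(path, old, new, encoding="utf-8"):
    """Replace `old` with `new` in the file at `path`; restore the original on exit."""
    # newline="" keeps the file's line endings untouched on read and write.
    with io.open(path, "r", encoding=encoding, newline="") as f:
        original = f.read()
    with io.open(path, "w", encoding=encoding, newline="") as f:
        f.write(original.replace(old, new))
    try:
        yield
    finally:
        # Restore even if the code inside the `with` block raises.
        with io.open(path, "w", encoding=encoding, newline="") as f:
            f.write(original)
```

A caller would wrap the build step in `with temporarily_replaced("source/io.rst", bad_text, good_text): ...`, where `bad_text` and `good_text` stand for the offending line and its stand-in.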
This issue was closed.