My rst README is not formatted on pypi.python.org

Solution 1:

It turns out that the answer from @sigmavirus regarding the links was close. I started a discussion on the distutils mailing list and found out that in-page links (e.g. #minimum-cash) are not allowed by the PyPI reStructuredText parser and will invalidate the entire document.

It seems that PyPI uses a whitelist to filter link protocols (http vs. ftp vs. gopher) and treats '#' as an invalid protocol. This looks easy to fix on their end, but until then I'll be removing my in-page anchor links.
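
If you want to find such links before uploading, a rough sketch like the one below can list them. The helper name `find_in_page_links` and the sample text are made up for illustration, and the regex only covers inline links of the form `text <#anchor>`_, so treat it as a starting point rather than a full check:

```python
import re

# Matches inline hyperlinks such as `text <#anchor>`_ or `text <#anchor>`__
# and captures the "#anchor" target.
IN_PAGE_LINK = re.compile(r'`[^`<]*<(#[^>]*)>`__?')


def find_in_page_links(text):
    """Return (line_number, target) pairs for every in-page link target."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in IN_PAGE_LINK.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


# Hypothetical README content, for illustration only.
sample = "\n".join([
    "Intro",
    "See `Minimum cash <#minimum-cash>`_ for details.",
    "External `docs <http://example.com>`_ are fine.",
])

for lineno, target in find_in_page_links(sample):
    print("line %d: in-page link %s" % (lineno, target))
```

Running it on your own file is a matter of reading README.rst into a string and passing that in instead of `sample`.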

Solution 2:

  • You may use the collective.checkdocs package to detect invalid constructs:

    pip install collective.checkdocs
    python setup.py checkdocs

  • You may then use the following Python function to filter out Sphinx-only constructs (it might be necessary to add more regexes to match your content):

#!/usr/bin/python3
"""
Cleans up Sphinx-only constructs (e.g. from README.rst),
so that *PyPI* can format it properly.

To check for remaining errors, install ``sphinx`` and run::

        python setup.py --long-description | sed -f 'this_file.sed' | rst2html.py --halt=warning

"""

import re
import sys, io


def yield_sphinx_only_markup(lines):
    """
    :param lines:    an iterable of text lines, e.g. from ``readlines()``
    :return:         a generator of the lines with Sphinx-only markup replaced

    """
    substs = [
        ## Selected Sphinx-only Roles.
        #
        (r':abbr:`([^`]+)`',        r'\1'),
        (r':ref:`([^`]+)`',         r'`\1`_'),
        (r':term:`([^`]+)`',        r'**\1**'),
        (r':dfn:`([^`]+)`',         r'**\1**'),
        (r':(samp|guilabel|menuselection):`([^`]+)`',        r'``\2``'),


        ## Sphinx-only roles:
        #        :foo:`bar`   --> foo(``bar``)
        #        :a:foo:`bar` XXX afoo(``bar``)
        #
        #(r'(:(\w+))?:(\w+):`([^`]*)`', r'\2\3(``\4``)'),
        (r':(\w+):`([^`]*)`', r'\1(``\2``)'),


        ## Sphinx-only Directives.
        #
        (r'\.\. doctest',           r'code-block'),
        (r'\.\. plot::',            r'.. '),
        (r'\.\. seealso',           r'info'),
        (r'\.\. glossary',          r'rubric'),
        (r'\.\. figure::',          r'.. '),


        ## Other
        #
        (r'\|version\|',              r'x.x.x'),
    ]

    regex_subs = [ (re.compile(regex, re.IGNORECASE), sub) for (regex, sub) in substs ]

    def clean_line(line):
        for (regex, sub) in regex_subs:
            try:
                line = regex.sub(sub, line)
            except Exception:
                # Report which substitution failed, then re-raise unchanged.
                print("ERROR: regex(%s), sub(%s), line(%s)" % (regex.pattern, sub, line))
                raise

        return line

    for line in lines:
        yield clean_line(line)

and/or in your setup.py file, use something like this::

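import io, os

mydir = os.path.dirname(os.path.abspath(__file__))  # assumed: the folder holding setup.py

def read_text_lines(fname):
    with io.open(os.path.join(mydir, fname)) as fd:
        return fd.readlines()

readme_lines = read_text_lines('README.rst')
# No trailing comma here, otherwise long_desc becomes a 1-tuple instead of a string.
long_desc = ''.join(yield_sphinx_only_markup(readme_lines))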
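
To get a feel for what the substitutions do, here is a cut-down sketch that applies just two of the patterns from the function above to a sample line. The `clean` helper and the sample sentence are invented for this illustration; the two regexes are copied from the list above:

```python
import re

# Two of the (pattern, replacement) pairs from the full list, in the same order.
SUBSTS = [
    (re.compile(r':term:`([^`]+)`', re.IGNORECASE), r'**\1**'),
    (re.compile(r':(\w+):`([^`]*)`', re.IGNORECASE), r'\1(``\2``)'),
]


def clean(line):
    """Apply each substitution in turn, as the full function does."""
    for regex, sub in SUBSTS:
        line = regex.sub(sub, line)
    return line


sample = "See :term:`cash flow` and :class:`Portfolio`."
cleaned = clean(sample)
print(cleaned)
```

Order matters: the specific `:term:` rule has to run before the catch-all role rule, otherwise the catch-all would rewrite it first.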

Alternatively you can use the sed unix-utility with this file:

## Sed-file to clean-up README.rst from Sphinx-only constructs,
##   so that *PyPi* can format it properly.
##   To check for remaining errors, install ``sphinx`` and run:
##
##          sed -f "this_file.txt" README.rst | rst2html.py  --halt=warning
##

## Selected Sphinx-only Roles.
#
s/:abbr:`\([^`]*\)`/\1/gi
s/:ref:`\([^`]*\)`/`\1`_/gi
s/:term:`\([^`]*\)`/**\1**/gi
s/:dfn:`\([^`]*\)`/**\1**/gi
s/:\(samp\|guilabel\|menuselection\):`\([^`]*\)`/``\2``/gi


## Sphinx-only roles:
#        :foo:`bar` --> foo(``bar``)
#
s/:\([a-z]*\):`\([^`]*\)`/\1(``\2``)/gi


## Sphinx-only Directives.
#
s/\.\. \+doctest/code-block/i
s/\.\. \+plot/raw/i
s/\.\. \+seealso/info/i
s/\.\. \+glossary/rubric/i
s/\.\. \+figure::/../i


## Other
#
s/|version|/x.x.x/gi

Solution 3:

EDIT: You can use the following to find errors in your RST that will show up on PyPI:

twine check dist/*

You'll need twine version 1.12.0 or higher. If you don't have it you can install or update it using:

pip install --upgrade twine

Deprecated answer:

python setup.py check --restructuredtext

Solution 4:

The first thing that pops out at me (after a quick scan) is that in your Advanced Filters section you use two underscores after a link, e.g.,

`Link text <http://example.com>`__

Where it should be

`Link text <http://example.com>`_

It's odd that the reStructuredText checkers didn't catch that. If you have docutils installed as well, you can run rst2html.py README.rst and it should print out the HTML. If there are errors it will fail and tell you where the errors were.

Also, fair warning, lists should have no leading spaces, i.e., you have

 - foo
 - bar

Instead of

- foo
- bar

(To make it more visually clear)

- foo # correct
 - one too many for a regular list, it will show up as a quoted list
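
If your README is long, a small heuristic script can point out the suspicious bullets. This is only a sketch: the function name `find_indented_bullets` and the sample text are made up, and it simply flags an indented bullet whose nearest preceding non-blank line is neither a bullet nor indented, so it will miss some cases and may misjudge others:

```python
import re

# A bullet line: optional leading whitespace, then -, * or +, then a space.
BULLET = re.compile(r'^([ \t]*)[-*+] ')


def find_indented_bullets(text):
    """Return line numbers of indented bullets that start a new list."""
    flagged = []
    prev = ""  # nearest preceding non-blank line
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        match = BULLET.match(line)
        starts_list = not BULLET.match(prev) and not prev.startswith(" ")
        if match and match.group(1) and starts_list:
            flagged.append(lineno)
        prev = line
    return flagged


# Hypothetical README fragment, for illustration only.
sample = "\n".join([
    "Options:",
    "",
    " - foo",
    " - bar",
    "",
    "- baz",
    "",
    "  - nested",
])

print(find_indented_bullets(sample))
```

Only the first bullet of each indented list is reported, since fixing that one usually means fixing the whole list.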

Also, relative linking doesn't work like so: `Text to link <#link>`_. If you want to link to a separate section, you have to do the following:

Here's my `link <section_name_>`_ to the other section.

.. Other stuff here ...

.. _section_name:

Min/Max Investment Opportunities and Other Foo Biz Baz
------------------------------------------------------