Hi,
Some time ago, I introduced a dependency on the python-markdown2 module in Bodhi. [1][2]
Today, two bugs were opened against Bodhi, both related to the way the Markdown module parses the text we feed it:
- https://fedorahosted.org/bodhi/ticket/395
- https://fedorahosted.org/fedora-infrastructure/ticket/2033
The first one is caused by the fact that « _ » is interpreted by Markdown: _foo_ is translated to <em>foo</em> and __foo__ to <strong>foo</strong>.
This doesn't seem like a problem, except in cases like $SOME_LONG_VAR or http://foo-bar.com/foo_3_2.tar.gz, which are rather likely to appear in update descriptions.
I fixed it by disabling the interpretation of « _ » and « __ ». It's not a great solution though, as it means we no longer fully follow the Markdown syntax.
The second one, however, is much trickier, as it comes from the regexp used to parse the string to translate. Basically, a string like « ***** Important ***** » can be translated in several ways, depending on the regexp/machine state used:
1. <strong><em>* Important *</em></strong>
2. <strong><em></strong> Important <strong><em></strong>
3. <em></em><strong><em> Important **</em></strong>
The parser could even try to be smart and close open tags properly (Trac does something like that, in its own syntax which is close to Markdown):
4. <strong><em></em></strong> Important <strong><em></em></strong> (the </em> were added to close the tags)
The possibilities are basically endless.
Unfortunately, the python-markdown2 module does the second. The resulting HTML is thus not valid, so Kid gives an error 500.
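To make the difference between the four candidates concrete, here is a minimal sketch that checks each one for well-formedness with the standard library XML parser. It assumes Kid rejects the output for roughly the same reason a strict XML parser would; the helper name `is_well_formed` and the wrapping <div> are only there for illustration and are not part of Bodhi.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML once wrapped in a root element.

    Hypothetical helper: this only approximates what a strict XML-based
    template engine would accept.
    """
    try:
        ET.fromstring("<div>%s</div>" % fragment)
        return True
    except ET.ParseError:
        return False


# The four candidate translations of "***** Important *****" listed above.
candidates = {
    1: "<strong><em>* Important *</em></strong>",
    2: "<strong><em></strong> Important <strong><em></strong>",
    3: "<em></em><strong><em> Important **</em></strong>",
    4: "<strong><em></em></strong> Important <strong><em></em></strong>",
}

results = {n: is_well_formed(html) for n, html in candidates.items()}

for n, ok in sorted(results.items()):
    print(n, "well-formed" if ok else "NOT well-formed")
```

Only candidate 2 fails the check, because <strong> is closed while <em> is still open.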
I tried looking at the python-markdown2 module, but the fix won't be easy (at least not for me :). Also, I doubt it would be useful to report the bug and eventually submit a patch, as the project seems to have been dead upstream since December (no commits or mailing list discussions since then, and bugs go unanswered even with patches attached).
One thing I'm really sad about is that I had seen there was another Python Markdown module, python-markdown [3], but I chose the other one.
It seems to predate python-markdown2, yet it is still actively maintained (the last release was in April 2009, but the developer still answers mails and bugs, and development is active on Gitorious [4]).
Also, it has two nice side-effects:
1. it fixes our first issue in a much more elegant way, as it is able to translate _foo_ to <em>foo</em>, while recognizing that it shouldn't translate $SOME_LONG_VAR to $SOME<em>LONG</em>VAR (ignore underscores inside a word)
2. it interprets the string « ***** Important ***** » as « <strong><em>** test </em></strong>** », which, even if not really the prettiest translation, is valid HTML.
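The "ignore underscores inside a word" rule from the first point can be sketched with a single regexp. This is not python-markdown's actual implementation, only an illustration of the idea under my own assumptions: the names `EM_RE` and `emphasize` are made up, and the sketch handles single-underscore emphasis only (no « __ », no escaping).

```python
import re

# An underscore opens emphasis only if it is not preceded by a word
# character, and closes it only if it is not followed by one. Underscores
# inside a word (SOME_LONG_VAR, foo_3_2) therefore never match.
EM_RE = re.compile(r"(?<!\w)_(?!\s)(.+?)(?<!\s)_(?!\w)")


def emphasize(text):
    """Translate _foo_ to <em>foo</em>, leaving intra-word underscores alone."""
    return EM_RE.sub(r"<em>\1</em>", text)


samples = [
    "_foo_",
    "$SOME_LONG_VAR",
    "http://foo-bar.com/foo_3_2.tar.gz",
    "snake_case and _real_ emphasis",
]

for sample in samples:
    print(repr(sample), "->", repr(emphasize(sample)))
```

With a rule like this, the variable name and the URL pass through untouched, so there is no need to disable underscore handling altogether.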
All in all, I would be more confident with this module than with the current one (and I'm not even counting the fact that it has an extensive test suite).
Should we rebase Bodhi on this module? The port is trivial, python-markdown is available both in Fedora and EPEL, but I don't really like moving to another dependency just because we encounter bugs, especially since I should have more carefully chosen the module to use in the first place. :(
What do you think?
[1] https://fedorahosted.org/bodhi/ticket/286
[2] http://code.google.com/p/python-markdown2/
[3] http://www.freewisdom.org/projects/python-markdown/
[4] http://gitorious.org/python-markdown
---------- Mathieu Bridon
On Sat, Mar 13, 2010 at 00:58, Mathieu Bridon bochecha@fedoraproject.org wrote:
> All in all, I would be more confident with this module than with the current one (and I'm not even counting the fact that it has an extensive test suite).
> Should we rebase Bodhi on this module? The port is trivial, python-markdown is available both in Fedora and EPEL, but I don't really like moving to another dependency just because we encounter bugs, especially since I should have more carefully chosen the module to use in the first place. :(
> What do you think?
No reaction?
It's been almost 2 weeks already, so I'll move Bodhi HEAD to the python-markdown module, as I don't want to leave those bugs open, and for the reasons I mentioned in my previous email.
---------- Mathieu Bridon