This escapes all strings provided by add-ons server data to guarantee
they can't be used to get extraneous and potentially harmful HTML into
the generated web index.
However, because I don't have time to look into the dense regex
contained in the relevant code right now, it also removes the hidden
feature of linkifying any URLs found in add-on descriptions. It's a
small price to pay for our safety, really.
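A minimal sketch of the escaping step described above, assuming Python 3's
html.escape and a plain dict per add-on. The function name and the field
names are hypothetical and are not taken from the script itself.

```python
import html


def escape_addon_fields(addon):
    """Return a copy of the add-on record with every string value HTML-escaped.

    Non-string values (for example download counts) are passed through as-is.
    """
    return {
        key: html.escape(value, quote=True) if isinstance(value, str) else value
        for key, value in addon.items()
    }


# Hypothetical record, as it might arrive from the add-ons server.
record = {
    "title": "<b>Bold</b>",
    "description": 'See <script>alert("x")</script>',
    "downloads": 12,
}
safe = escape_addon_fields(record)
```

Escaping every string in one place means no field can slip through
unescaped when a new one is added to the index later.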
First, a couple of extraneous quotes were left in the second regex around
"</a>".
Second, it is possible that a period or question mark could be used to end
a sentence, rather than being part of the URL. So treat these characters as
part of the URL only when they are followed by an alphanumeric character.
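The rule above can be sketched with a lookahead. This is an illustration
only: the pattern below is a simplified stand-in for the script's actual
regex, and the set of characters accepted in a URL is an assumption.

```python
import re

# A "." or "?" counts as part of the URL only when the next character is
# alphanumeric; otherwise it is left outside the match as punctuation.
URL_RE = re.compile(
    r"https?://(?:[A-Za-z0-9/_~%#&=+:-]|[.?](?=[A-Za-z0-9]))+"
)

text = "See http://example.com/page.html. Is it at http://example.com/a?b=1?"
found = URL_RE.findall(text)
```

Here the period after "page.html" and the final question mark are left out
of the matches, while the dots inside the host name and the "?" before
"b=1" are kept.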
After my last change dealing with this issue, I noticed that descriptions
with <pre> had an extra blank line at the top. Adding a top margin to the
CSS file made this go away, but it also made the <br/> superfluous. Thus,
it is simpler just to have every description use <pre> instead of <br/>.
This should finally resolve everything having to do with the add-on
descriptions.
If I'd noticed that the re module hadn't been imported, I probably wouldn't
have considered URL linking to be important enough to do so. Since I've
already written the code, however, I'll keep it.
This is one source of missing-image results.
There remain other reasons for missing icons. The script doesn't find images
in add-ons. And when resources are moved or renamed, they are no longer found
by the script, even if they had been found before.
Also, capitalize a sentence.
After looking into it some more, I think I've figured out how to handle <pre>
in the CSS. So use that when a description has more than one line.
Also, switch to re.sub for turning URLs into links. The version of Python I was
testing my code on wasn't properly handling backreferences in the replacement
string when written as "\#", which led me to use finditer instead of sub. But
I've discovered that it does handle backreferences written as "\g<#>". So
use the much simpler re.sub code.
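A short sketch of the re.sub approach with "\g<#>" backreferences. The URL
pattern here is deliberately simplified and is an assumption, not the regex
used in the script; linkify is a hypothetical name.

```python
import re

# Simplified URL pattern for illustration; group 1 is the whole URL.
URL_RE = re.compile(r"(https?://[^\s<>\"]+)")


def linkify(text):
    """Wrap each URL in an anchor tag, using \\g<1> to refer to group 1."""
    return URL_RE.sub(r'<a href="\g<1>">\g<1></a>', text)


linked = linkify("Visit http://example.com now")
```

The "\g<1>" form is unambiguous even when the backreference is followed by
a digit, which is one reason to prefer it over "\1" in replacement strings.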
The description text does not get rendered very well on a webpage. One
solution might be to use pre-wrap/word-wrap in the CSS, but due to
differences between browsers, that's a can of worms (at least for me; I'm
not a web pro).
So, the not-so-elegant solution is to add <br/> to every line.
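A minimal sketch of the <br/> approach, assuming the description is a plain
string with newline-separated lines. The function name is hypothetical.

```python
def add_line_breaks(description):
    """Append <br/> to every line so browsers keep the original line breaks."""
    return "".join(line + "<br/>\n" for line in description.splitlines())


rendered = add_line_breaks("First line\nSecond line")
```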
URLs are also not linked in the plain text. Although in modern browsers
you can select the text and right-click, it's still convenient to turn
them into actual links.
...that creates problems on the server at the moment:
no teamcolorize for the icon list until the server is fixed again. What
should be done: serialize the teamcolorize call, so that the script
only starts one instance at a time.
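One possible way to serialize the call, sketched under several assumptions:
the script runs on a Unix-like server (fcntl is not available on Windows),
teamcolorize is started as an external process, and a lock file is an
acceptable coordination mechanism. The lock path and function names are
hypothetical.

```python
import fcntl
import subprocess
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the with-block."""
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def run_serialized(command, lock_path="/tmp/teamcolorize.lock"):
    """Run command while holding the lock, so only one instance runs at a time."""
    with exclusive_lock(lock_path):
        return subprocess.run(command, check=False).returncode
```

Every invocation that goes through run_serialized waits for the previous one
to finish, including invocations from separate runs of the script, as long
as they all use the same lock path.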
--tar      Causes tarballs to be generated for each addon newly downloaded
           with --download.
--url=...  Adds download links to the --html output, with the given
           base URL.