This will keep non-core usage types from being described as a "non-standard usage class" in recruitment error messages. These error messages will only be triggered if there is a mismatch between recruits and recruitment patterns, so this magic comment is low-priority.
It was suggested that wmllint could auto-detect new usage values in unit files, and automatically append them to the list of recognized usage types. This was rejected because of the possibility of adding misspelled or mistaken usage types.
When recruitment patterns include bogus usage classes, the consistency check prints a message. However, that message won't tell you whether the problem lies with the usage classes or with the recruits. For example, if you get a message that "no light fighter units are recruitable", your first thought might be that you didn't include a light fighter unit in the recruit list. The message gives no direct clue that the light fighter class simply doesn't exist in mainline.
Now a clause will be added to the message when non-mainline usage classes are involved. It alerts designers to possibly bogus classes, but is mild enough that it hopefully won't scare away authors legitimately seeking to use custom classes.
This replaces the old usage check. The next commit will add a magic comment for appending custom usage types.
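The idea of the extra clause can be sketched roughly as follows. This is only an illustration, not wmllint's actual code: the function name, the wording of the clause, and the contents of MAINLINE_USAGES are assumptions made for the example.

```python
# Assumed list of usage classes known to mainline; illustrative only.
MAINLINE_USAGES = ("scout", "fighter", "archer", "healer", "mixed fighter")


def missing_usage_messages(pattern, recruit_usages, known_usages=MAINLINE_USAGES):
    """Return one message per usage class in the recruitment pattern
    that no recruitable unit provides (hypothetical helper)."""
    messages = []
    for usage in pattern:
        if usage in recruit_usages:
            continue
        message = "no %s units are recruitable" % usage
        # The extra clause: hint that the class itself may be the problem.
        if usage not in known_usages:
            message += " (%s is not a mainline usage class)" % usage
        messages.append(message)
    return messages


if __name__ == "__main__":
    for msg in missing_usage_messages(["fighter", "light fighter"], {"fighter"}):
        print(msg)
```

An author who really does want a custom class sees only a parenthetical note, while an author who mistyped a class gets a pointer to the real cause.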
The convention that [textdomain] uses "/translations" is strongly established, and I can't think of a legitimate reason for an add-on not to be following it.
The binary path check is a crude test. The names that took hold for menu image directories are "/public" and variations of "/external*", so we look for those strings. It does not catch the worst case of all - when all binaries are outside the campaign define, not just a set-aside directory.
I first thought of these checks while brainstorming ways to use the in_textdomain and in_binary_path code in hack_syntax(). However, realizing that these checks did not really hack any syntax, I wanted to find someplace in the sanity checks where the code would fit. I finally found it.
Here's the rationale for these additions:
* There is so much focus on wmllint's role in conversion, that many people may not think of it as a validator also (I didn't). So often, stumped authors ask in the forums about problems that would have been fixed or pointed out if they'd run wmllint. I want to encourage awareness of wmllint as a validator.
* Folded a line to fit normal 80-width CLI.
* Help contained no mention of this rather redundant option.
* How many people don't realize that ESR's long introduction is there?
* Some users may not understand why they're being dumped back to wmllint's help.
I used "inconsistency" for the actual variable name, because "known" seems more likely to be accidentally reused.
I pondered whether to allow the scenario check to go forward, but decided to just make a clean break.
Note that this does not prevent any of the information-gathering for the consistency check, just the check itself.
Why would you want to use this option? Of course, you should run the consistency check at some point. But if you simply want to recheck if you've fixed all the bugs in your campaign, you might not want to have wmllint slog through data/core again.
According to the introduction, stringfreeze does *not* suppress the warning, and the code bears this out.
I wonder how often this option is actually used.
I realized that as it stood, my dictionary would linger, which is bad if wmllint is being run on multiple campaigns. A special unwho keyword, 'all', clears the dictionary.
Now that we see how the whopairs are recognized, we can see that the magic comment accepts a comma-separated list, for macros that deal with more than one character.
We also see that if it is necessary to remove a character who leaves the party, this can be done with another entry prefixed by double minuses.
The "recognize" magic comment only covers one scenario; but what about macros that are used in many scenarios?
This new magic comment creates dictionary pairs of macros with the characters they are associated with. If this is not yet clear, hopefully the following commits will show the full picture.
Rather than overwhelming users with verbiage, I will hide most of my explanation unless it's asked for. My message is still not particularly brief, but it's no longer insanely long.
groggy: The ifdef_stack = [None] assignment made wmllint crash upon nested
#if blocks. The following block of WML should suffice to make it crash. The
inner #endif deletes the data about encountered #ifs.
...
#ifver WESNOTH_VERSION >= 1.11.0
#ifhave ~add-ons/UMC_Music_Book_1/_main.cfg
[binary_path]
path=data/add-ons/UMC_Music_Book_1
[/binary_path]
#endif
#else
#ifhave ~add-ons/UMC_Music/_main.cfg
[binary_path]
path=data/add-ons/UMC_Music
[/binary_path]
#endif
#endif
...
I noted last time that my "fix" could not cover all possibilities. After further thought, I decided that the best thing to do in the hard cases is to sys.exit, and give users a clear explanation in stderr so they can re-enter their paths correctly. I may have gotten carried away, but given that many users nowadays are unfamiliar with the command line (even more so on Windows), I wanted to give them plenty of hand-holding.
I looked up this issue, and it turns out to be a Windows shell problem after all, not Python, which surprised me.
After more testing, I realized that I did not take into account the possibility that the wildcard pattern would not match anything. In that case, the following 'if not arguments' clause would run wmllint on the entire current directory - which could very well be something that you do not want!
Although the original purpose of the in_textdomain and in_binary_path code, an aborted effort to update their paths to "data/add-ons/", has been superseded by code that updates those paths on all lines, it can still be put to use.
Our first step is to move that section below the code that updated UMC paths, so our regexes won't have to deal with "@campaigns" and "data/campaigns" strings. Then we delete the 'if 0:' line that was neutralizing this section, as well as the obsolete path-changing code. The rest is de-indented one level.
Then we look for the use of "~" for userdata, which does not work for textdomains and binary paths.
Our regex object, 'tilde', is constituted thusly: (1) We make sure that the line starts with the "path" key. Any line we're interested in ought to start with this, and this will also keep this code from going wild on the campaign includes, if an author forgot a closing tag (no reset to False). (x) There shouldn't be any whitespace around the = sign, but we'll be kind. (2) On the value side, there shouldn't be anything before the tilde except perhaps a quote. Rather than underestimate the ingenuity of authors in coming up with weird code, however, I allow anything except a comment to match for a few characters. But if we haven't hit the '~' after five characters, I figure something's wrong, and bail. (3) Then we come to the tilde. Normally, it would be adjoining "add-ons/", but some authors interpolate a slash, or 'data/' (here represented as an optional string).
If we match, we rebuild the line, except 'data/add-ons' is substituted for group(3), and we log to stdout.
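A minimal sketch of a regex along the lines described above. The exact pattern in wmllint may differ. The name TILDE and the helper fix_tilde are assumptions, and the sketch ignores the in_textdomain/in_binary_path state that limits where the real check runs.

```python
import re

# group(1): the "path" key at the start of the line
# (uncaptured): optional whitespace around "="
# group(2): up to five non-comment characters before the tilde (e.g. a quote)
# group(3): the tilde, an optional "/" or "data/", then "add-ons"
TILDE = re.compile(r'(\s*path)\s*=\s*([^#]{0,5})(~/?(?:data/)?add-ons)')


def fix_tilde(line):
    """Replace a userdata tilde prefix with 'data/add-ons' (hypothetical helper)."""
    m = TILDE.match(line)
    if not m:
        return line
    # Rebuild the line, substituting only the span covered by group(3).
    return line[:m.start(3)] + "data/add-ons" + line[m.end(3):]


if __name__ == "__main__":
    print(fix_tilde('path="~add-ons/My_Campaign/images"'))
```

Lines that do not start with the path key, or whose tilde appears only after a "#", are returned unchanged.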
The Windows cmd shell does not expand wildcards by default, unlike UNIX shells. This imports glob.glob and runs arguments through it on Windows.
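In outline, the expansion might look like the sketch below. The function name and the platform parameter are assumptions added so the behavior can be tried on any system; the actual wmllint code works on its own argument list.

```python
import glob
import sys


def expand_arguments(arguments, platform=sys.platform):
    """Expand wildcards ourselves on Windows, where cmd leaves them alone."""
    if platform != "win32":
        # UNIX shells have already expanded any wildcards.
        return list(arguments)
    expanded = []
    for arg in arguments:
        # glob.glob returns an empty list when nothing matches.
        expanded.extend(glob.glob(arg))
    return expanded


if __name__ == "__main__":
    print(expand_arguments(sys.argv[1:]))
```

Note that a pattern matching nothing simply disappears from the list, so the caller has to decide what an empty result should mean.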
Frontported (in modified form) from my 1.4 work!
While testing my next commit, I discovered that EH's fix works when there is only one argument, or if the offender is the last argument, but doesn't work with multiple entries. His fix is meant to work on each argument, but the (unintentionally) escaped quote no longer serves to end the argument, causing following arguments to be considered part of the same argument.
Using split() allows us to break apart these misconjoined arguments. With rstrip(), we prevent an empty string that Windows will also complain it cannot find. However, if there are three or more arguments, there will still be lumped-together arguments unless all arguments up to the second from last also end with a backslash and quote. It is impossible to cover every possible case.
The re.sub handles the probably rare case where a backslash before a quote comes within the argument rather than at the end. However, it will only work if there is only one argument.
All this is unnecessary if the OS is not Windows (also, I haven't had the opportunity to test this on a non-Windows system to see if it has any side-effects there). So I've put it under a sys.platform condition.
This section assumed that "#ifdef" and "#endif" would come at the very start of a line. When an author would indent the #ifdef but not the #endif, ifdef_stack.pop() would kill the starting value of None, leaving an empty list. wmllint would then crash:
File "wmllint", line 1138, in global_sanity_check
recruit[ifdef_stack[-1]] = (i+1, map(lambda x: x.strip(), value.split(",")))
IndexError: list index out of range
Stripping the line not only stops the crashes, but allows wmllint to pick up #ifdefs that it wasn't before.
I then looked more closely at the pop(). #endif shouldn't just drop the last value in the stack, but reset the whole stack back to None. I realized that pop() was leading to wmllint occasionally assigning recruitment that wasn't inside an #ifdef to values from earlier #ifdef stacks, e.g.:
>> starting value: [None]
#ifdef EASY >> [None, 'EASY']
..
#else >> [None, 'EASY', '!EASY']
..
#endif >> pop(): [None, 'EASY']
..
recruit= >> ifdef_stack[-1]: EASY
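The difference between pop() and a full reset can be reproduced with a small simulation. This is a simplified stand-in for the real loop in global_sanity_check, covering only a single unnested block; the function name and the reset_on_endif switch are assumptions for illustration.

```python
def recruit_contexts(lines, reset_on_endif):
    """Return the #ifdef context that each 'recruit=' line is filed under."""
    ifdef_stack = [None]
    contexts = []
    for raw in lines:
        line = raw.strip()  # tolerate indented preprocessor directives
        if line.startswith("#ifdef"):
            ifdef_stack.append(line.split()[1])
        elif line.startswith("#else"):
            ifdef_stack.append("!" + str(ifdef_stack[-1]))
        elif line.startswith("#endif"):
            if reset_on_endif:
                ifdef_stack = [None]
            else:
                ifdef_stack.pop()
        elif line.startswith("recruit="):
            contexts.append(ifdef_stack[-1])
    return contexts


SAMPLE = ["  #ifdef EASY", "recruit=A", "#else", "recruit=B", "#endif", "recruit=C"]

if __name__ == "__main__":
    print(recruit_contexts(SAMPLE, reset_on_endif=False))
    print(recruit_contexts(SAMPLE, reset_on_endif=True))
```

With pop(), the final recruit line outside the block is still attributed to EASY; with the reset it falls back to None.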
wmllint crashes if the value for the map_data key consists of unfilled quotation marks (""), just as it does if there are no quotation marks:
----
Traceback (most recent call last):
File "data/tools/wmllint", line 1917, in translator
outmap = [outmap[0]] + outmap + [outmap[-1]]
IndexError: list index out of range
----
This simple fix checks to make sure that there are no side-by-side quotation marks on the line.
I also tidied up a double-space two lines down.
The format of this message appears to have been copied from the preceding "pango string" messages, which have a "%s" where "< or >" is. But it causes wmllint to crash with the traceback, "TypeError: not all arguments converted during string formatting".
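The failure mode is easy to reproduce in isolation. The templates and arguments below are illustrative stand-ins, not the exact strings in wmllint: the point is only that supplying more arguments than the template has conversion specifiers raises this TypeError.

```python
def format_warning(template, args):
    """Apply %-formatting and report a TypeError instead of crashing."""
    try:
        return template % args
    except TypeError as err:
        return "TypeError: %s" % err


# Hypothetical templates: the broken one has literal "< or >" where a
# "%s" used to be, yet it is still handed three arguments.
BROKEN = '"%s", line %d: < or > in pango string requires manual fix.'
FIXED_ARGS = ("example.cfg", 3)
BROKEN_ARGS = ("example.cfg", 3, "<")

if __name__ == "__main__":
    print(format_warning(BROKEN, BROKEN_ARGS))
    print(format_warning(BROKEN, FIXED_ARGS))
```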
With three new orc alternate portraits added during 1.9, it is no longer necessary to double-up any of the substitutions for the five old-style orcish warlord portraits.
In fact, with six portraits available, one has to draw the short straw, and I chose grunt-4. I prefer grunt-5 and grunt-6, and also find them more leaderly.
The existing error message is oblique enough that those developers who knew enough to understand what wmllint was driving at would normally know better than to use "die" for last words in the first place. While the message may help experienced developers porting the work of inexperienced authors, most developers getting this message are probably puzzled. A more direct message would be more likely to set newbie UMC authors on the right path.
Explanation:
x = 0: Starting point of value x; will be incremented upward with each replaced path.
for dc: With finditer, we turn up all paths in a line.
if dc.group(1): We want to skip campaigns that have a match in the tuple 'mainline'.
lines[i]: Otherwise, we reconstruct the line, substituting 'data/add-ons/' for 'data/campaigns/'. For lines where more than one path needs to be replaced, however, there is a complication. Since 'add-ons' is two characters shorter than 'campaigns', but start() and end() are based on the original location of the iteration, we have to offset start() and end() by two for each prior replacement.
x+2: Having made a replacement, we increment x by two.
print: Report the change to stdout.
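The offset bookkeeping described above can be sketched as follows. The tuple MAINLINE here is a short illustrative subset, and the pattern and function name are assumptions rather than the code as committed.

```python
import re

# Illustrative subset of mainline campaign directories (assumption).
MAINLINE = ("Heir_To_The_Throne", "The_South_Guard")
CAMPAIGN_PATH = re.compile(r"data/campaigns/(%s)?" % "|".join(MAINLINE))


def update_paths(line):
    """Rewrite 'data/campaigns/' to 'data/add-ons/' for non-mainline paths."""
    x = 0  # how far earlier replacements have shifted the text to the left
    # finditer walks the ORIGINAL string, so its offsets ignore our edits.
    for dc in CAMPAIGN_PATH.finditer(line):
        if dc.group(1):
            continue  # mainline campaign: leave the path alone
        start = dc.start() - x
        end = dc.end() - x
        line = line[:start] + "data/add-ons/" + line[end:]
        x += 2  # 'add-ons' is two characters shorter than 'campaigns'
    return line


if __name__ == "__main__":
    print(update_paths("a=data/campaigns/Foo/x.png,data/campaigns/Bar/y.png"))
```

Without the offset, the second replacement on a line would be applied two characters too far to the right.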
When pangoize detects an old-style color spec, it prints a message that it needs a "manual fix." Unfortunately, the old markup used decimal values while pango uses hexadecimal, and authors were left to do the conversion themselves.
My modification not only does the hex conversion, it provides pango code ready to copy and paste into the line.
Going over this:
rgb =: First step is to turn the original regular expression into a regex object. The one change is that later on, wmllint turns non-pango "<" and ">" into "&lt;" and "&gt;", so I have the regex match those too, in case we are dealing with a file that has already been through wmllint before.
if rgb: Having turned the original search into a regex object, we are ready for an if test again.
r, g, b =: We need Python to recognize these strings as numbers.
if > 255: At least one old campaign ("A Sortie") has color specs that include values over 255. Given the impossibility of deciphering what color the author may have intended, I think the proper thing to do is to print an error pointing to the problem.
else: This, of course, is the normal case.
hexed: Here we convert our numbers to hexadecimal, and back into a string. Because numbers up to 15 will only have one hex digit and we need two, we will leave a "0" when we remove the "0x" prefix; then we take the last two characters, lopping off the zero from the numbers greater than 15 that already have two digits.
print: The new error message. With the regex object, we can cite the color spec specifically, not just refer to it as being "in line". And at the end is pango code, ready to copy and paste.
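The conversion steps walked through above might look roughly like this. The regex, the helper names, and the wording of the returned strings are assumptions for the sketch; only the hex trick (replace "0x" with "0", keep the last two characters) follows the description directly.

```python
import re

# Match an old-style color spec, in raw or already-escaped form (assumed pattern).
RGB = re.compile(r"(?:<|&lt;)(\d+),(\d+),(\d+)(?:>|&gt;)")


def two_digit_hex(n):
    """hex(5) is '0x5' -> '05'; hex(255) is '0xff' -> '0ff' -> 'ff'."""
    return hex(n).replace("0x", "0")[-2:]


def suggest_pango(message):
    """Return ready-to-paste pango markup, an error string, or None."""
    rgb = RGB.search(message)
    if not rgb:
        return None
    r, g, b = (int(c) for c in rgb.groups())
    if r > 255 or g > 255 or b > 255:
        return "color spec %s has a value over 255" % rgb.group()
    hexed = two_digit_hex(r) + two_digit_hex(g) + two_digit_hex(b)
    return "<span color='#%s'>" % hexed


if __name__ == "__main__":
    print(suggest_pango("<255,0,15>Halt!"))
```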
The misguided authors who put userdata/ in their paths cause problems not just for non-Windows users, but fellow Windows users who chose not to put userdata in the install directory. This error can be removed by an approach similar to that just used to purge backslashes:
* if 'userdata': A basic filter to cut down the number of lines being run through complicated regex speeds up performance.
* while: It is possible, though rare, for a line to contain more than one path with userdata. Points about the regex: a) We continue to use precomment, though it would be well to correct commented-out old paths also, lest they mislead any more UMC writers. b) In case you're wondering why I made one 'data/' string optional, there's a set of add-ons in 1.4 that use "userdata/campaigns" instead of "userdata/data/campaigns". c) The '[ac]' at the end is something of an artifact of the time before I excluded comments, but it provides another safety measure ensuring that the string is actually a value.
* regex object: This splits precomment into groups. Notes: a) Some authors begin with an unnecessary "../", so we might as well get rid of it too. (As far as I can tell, this prefix has no effect anywhere I've seen it used, but I'd want to be positive that it ALWAYS does nothing before having wmllint replace it everywhere.) b) The first two groups have been made non-capturing; we will not need to refer to them. c) For reporting to stdout, group(1) is extended to the next '/', though this part of the match is optional, to ensure that there's no way to get trapped in a while loop.
* precomment: Here, we reconstruct precomment based on the regex object, except we simply drop what's outside group(1).
* print: In case designers don't get the point from seeing the elimination logged in stdout, I include an all-caps admonition against "userdata/". This is a really irritating bug.
* This code was inserted before the reconstruction of lines[i] from precomment and comment.
The task is to replace Windows backslashes in paths, without indiscriminately replacing backslashes in legitimate use as escapes (or bridge terrain).
Breaking this down:
* "no-syntax-rewrite": I don't think this is really necessary, but I will follow the practice of the hack_syntax section.
* if lines[i].lstrip().startswith("#"): Excludes lines that are only comments or defines.
* precomment: Originally, I simply excluded "#" during the while statement, but I realized that this could wrongly mistake a Pango color code (or old-style markup for green) as a comment. I now look for whitespace before "#", and rewrote this section to operate on precomment rather than the whole line.
* comment: Simply going with the second field would exclude the separator itself, so I use len.
* if '\\': Technically, this code worked going straight to the while statement, but running every line through the complicated regex made it sluggish. Faster to make sure the line meets a simple filter first.
* while: It is possible, though very uncommon, that a line could contain more than one file path. Looking at the regex test: a) Match the backslash itself, then be on the lookout for a character used to set apart a file path from other text. By excluding these, we ensure that there is an unbroken chain from the backslash to the file extension. b) Then we have a list of file extensions, bracketed on the left by a period and the right by \b, to make sure they do not coincidentally match a string. These are the file types that might be referenced by a value (for instance, translation files are not referred to directly, so their extensions are not included). As a practical matter, EVERY instance in the wild I know of involves png, not counting one commented-out path in an ancient campaign. c) File extensions can include capitals, particularly on Windows, where the effects of DOS unicase linger. So we make the search case-insensitive.
* regex object: This splits the line into groups. The differences from the regex used in the while statement: a) We also look for a non-pathbreaking string to the left of the backslash as well as the right. This means that group(1) will match the entire file path except the extension. b) The \b boundary has been made a non-consuming look-ahead assertion, to simplify future references to the regex object and its groups.
* fronted: The regex object, except all backslashes in group(1) are replaced by forward slashes.
* precomment: Here, we simply reconstruct precomment with the modified regex object.
* print: Besides reporting the substitution to stdout, I include a plea for cross-platform compatibility.
* lines[i]: Now it's time to rebuild the whole line.
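A condensed sketch of the same approach. It uses re.sub with a callback in place of the while loop, and the list of path-breaking characters and file extensions is an illustrative assumption, not the list wmllint uses.

```python
import re

# group(1): the path up to the extension, containing at least one backslash
# group(2): a known file extension, followed by a word boundary
BACKSLASH_PATH = re.compile(
    r'([^ \t"=,]*\\[^ \t"=,]*)(\.(?:png|jpg|ogg|wav|cfg|map)\b)',
    re.IGNORECASE)


def fix_backslashes(line):
    """Turn backslashes into forward slashes inside file paths only."""
    if line.lstrip().startswith("#"):
        return line  # comment-only or preprocessor line
    # A comment starts at whitespace followed by '#', so a pango color
    # code such as '#ff0000' inside a value is not mistaken for one.
    sep = re.search(r"\s+#", line)
    if sep:
        precomment, comment = line[:sep.start()], line[sep.start():]
    else:
        precomment, comment = line, ""
    if "\\" in precomment:  # cheap filter before the expensive regex
        precomment = BACKSLASH_PATH.sub(
            lambda m: m.group(1).replace("\\", "/") + m.group(2), precomment)
    return precomment + comment


if __name__ == "__main__":
    print(fix_backslashes(r"image=units\elves\fighter.png"))
```

Escaped quotes in messages are left alone, because nothing links them to a file extension.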
As part of a GSoC proposal I added a new aspect so a scenario editor can control advancements in two ways:
1. Define an aspect with a string value like "Swordsman, Knight", so the units of interest will always advance to this
2. Use the Lua engine and return a function of the form advance(x, y) which will itself return a string value
like "Swordsman, Knight". Every time an AI unit advances, advance(x, y) will be called.
The corresponding wikipage (http://wiki.wesnoth.org/AiWML) is going to be updated soon.