This compatibility behaviour implements the buggy behaviour of FOR /F
token parsing that can be observed in Windows' CMD, and that is tested
by the cmd_winetests.
It can be disabled at compile time via the MSCMD_FOR_QUIRKS define.
It fixes additional cmd_winetests, in concert with commit cb2a9c31.
Explanation of the implemented buggy behaviour
==============================================
In principle, the "tokens=x,y,m-n[*]" option describes a list of token
numbers (must be between 1 and 31) that will be assigned into variables.
Theoretically this option does not cumulate: only the latest 'tokens='
specification should be taken into account.
However things are not that simple in practice. First, not all of the
"tokens=" option state is reset when more than one specification is
provided. Second, when specifying a token range, e.g. "1-5", Windows'
CMD just ignores without error ranges that are not specified in
increasing order. Thus for example, a range "5-1" is ignored without
error. Then, token numbers strictly greater than 31 are just ignored,
and if they appear in a range, the whole range is ignored.
Another bug is the following one: suppose that the 'tokens'
specification reads:
"tokens=1-5,1-30" , or: "tokens=1-5,3" ,
i.e. more than one range, that overlap partially. Then the actual total
number of variables will not be of the larger range size, but will be
the sum, instead.
Thus, in the first example, a total of 5 + 30 == 35 variables (> 31) is
allocated, while in the second example, a total of 5 + 1 == 6 variables
is allocated, even if they won't all store data !!
In the first example, only the first 30 FOR variables will be used, and
the 5 others will contain an empty string. In the second example, only
the first 5 FOR variables will be used, and the other one will be empty.
We also see that due to that, the "Variables" buffer of fixed size
cannot always be used (since it can contain at most 32 variables).
Last but not least, when more than one "tokens=" specification is
provided, for example:
"tokens=1-31 tokens=1-20"
a total number of 31 FOR variables (because 31 is the max of 31 and 20)
is allocated, **but** only 20 are actually used, and the 11 others
return an empty string.
And in the specification: "tokens=1-31,* tokens=1-20", a total of
31 + 1 + 20 = 52 variables is initialized, but only the first 20 will
be used, and no "remaining-line" token (the '*' one) is used.
Suppose the following FOR-loop command, to be run from the command-line
(if using a batch file, double each percent '%' sign):
FOR %l IN ("a,b,c,d,e" "f,g,h,i,j") DO (
FOR /F "delims=, tokens=1-3*" %a IN (%l) DO @echo %a-%b-%c-%d
)
The outermost FOR-loop enumerates the two strings "a,b,c,d,e" and
"f,g,h,i,j" (placed in %l), and parse each of these in turn, splitting
them at each specified delimiter (here only one character) ',' and storing
the results in consecutive tokens %a, %b, %c, %d, with the last token %d
containing all the remaining string (non-split).
The expected result is:
a-b-c-d,e
f-g-h-i,j
However, due to the way the delimiters string specified by the "delims="
option is stored (no stack/heap duplication of the FOR-option substring,
but reading from it directly), during the first run of the innermost
FOR-loop, the option string "delims=, tokens=1-3*" was truncated to just
after the ',' due to the erroneous "delims=" parsing, so that when this
FOR-loop ran for a second time (to deal with the second string), the option
string was already erroneously truncated, without the "tokens=..." part,
so that the parsing results were not stored in the tokens and resulting in:
a-b-c-d,e
f-%b-%c-%d
instead. The solution is to save where the "delims=" string needs to be
cut, but wait until running the actual FOR-loop to terminate it (and
saving the original character too), run the FOR-loop body, and then
restore the original character where termination took place. This allows
having the FOR-loop option string valid for the next execution of the
FOR-loop.
CORE-13682
- Split SubstituteVars() into its main loop and a helper SubstituteVar()
that just substitutes only one variable.
- Use this new helper as the basis of the proper implementation of the
delayed expansion of variables.
- Fix a bug introduced in commit 495c82cc, when GetBatchVar() fails.
CORE-11857 CORE-13736
It will be followed with a separate fix for the FOR-loop code.
Fixes some cmd_winetests.
A NULL pointer can be returned for a valid existing batch/FOR variable,
in which case the enhanced-variable getter should return an empty string.
This situation can happen e.g. when forcing a FOR-loop to tokenize a
text line with not enough tokens in it.
They are currently specified for documentation purposes (i.e. what
Windows 8+ CMD.EXE can report) but not used yet, since ReactOS does not
support them.
- Display the names of the files being TYPEd only if more than one file
has been specified on the command-line, or if a file specification
(with wildcards) is present (even just for one).
These names are displayed on STDERR while the files are TYPEd on
STDOUT, therefore allowing concatenating files by just redirecting
STDOUT to a destination, without corrupting it with the displayed file
names. Also, add a /N option to force not displaying these file names.
- When file specifications (with wildcards) are being processed, silently
ignore any directories matching them. If no corresponding files have
been found, display a file-not-found error.
- When explicitly directory names are specified, don't do any special
treatment; the CreateFile() call will fail and return the appropriate
error.
- Fix the returned errorlevel values.
See https://ss64.com/nt/type.html for more information.
Fixes some cmd_winetests.
- When reading from a file, retrieve its original size so that
we can stop reading it once we are beyond its original ending.
This allows avoiding an infinite read loop in case the output of
the file is redirected back to it.
Fixes CORE-17208
- Move the FileGetString() helper to the only file where it is
actually used.
- Restore any truncated space in the name prefix, before displaying
any error message.
- When trimming the name prefix from "special" characters (spaces, comma
and semicolon), so that e.g. "set ,; ,;FOO" displays all the variables
starting by "FOO", save also a pointer to the original name prefix, that
we will use for variables lookup as well.
This is done, because the SET command allows setting an environment variable
whose name actually contains these characters (e.g. "set ,; ,;FOO=42"),
however, by trimming the characters, doing "set ,; ,;FOO" would not allow
seeing such variables.
With the fix, it is now possible to show them.
That's an actual fact, done on original MS-DOS COMMAND.COM, FreeCOM,
Windows' CMD.EXE, etc., but is strangely undocumented on MSDN documentation.
See https://www.dostips.com/forum/viewtopic.php?t=4436
Fixes some cmd_winetests.
This functionality is: case insensitivity comparisons (/I);
CMDEXTVERSION and DEFINED unary operators; EQU, NEQ, LSS, LEQ, GTR, GEQ
generic string comparators.
- First, the option and the APIs called by it can work directly on
paths relative to the current directory. So there is no need to
call GetFullPathName(), with the risk of going over MAX_PATH if the
current path is quite long (or nested) but the RMDIR is called on a
(short-length) relative sub-directory.
- Append a path-separator (backslash), only if the specified directory
does not have one already, and, that it does not specify a current
directory via the "drive-root" method, e.g. "C:" without any trailing
backslash.
- In case there are errors during deletion of sub-directories or
sub-files, print the error but continue deleting the other sub-dirs
or files.
- Monitor the Ctrl-C breaker as well, and stop deleting if it has been
triggered.
- When removing file/directory read-only attribute, just remove this
attribute, but keep the other ones.
- When deleting the directory, first try to do it directly; if it fails
with access denied, check whether it was read-only, and if so, remove
this attribute and retry deletion, otherwise fails.
- When recursively deleting a drive root directory, ultimately resolve
the dir pattern and check whether it's indeed a drive root, e.g.
"C:\\", and if so, just return success. Indeed, calling
RemoveDirectory() on such drive roots will return ERROR_ACCESS_DENIED
otherwise, but we want to succeed even if, of course, we won't
actually "delete" the drive root.
Use kernel32!lstrcmp(i) instead of CRT!_tcs(i)cmp, so as to use the correct
current thread locale information when comparing user-specific strings.
As a result, the following comparison: 'b LSS B' will return TRUE,
instead of FALSE as it would be by using the CRT functions (and by
naively considering the lexicographical order in ANSI).
This behaviour has been introduced in Windows 2000 onwards.
For MKDIR, also properly support the case of ERROR_FILE_EXISTS and
ERROR_ALREADY_EXISTS last-errors by displaying the standard error
"A subdirectory or file XXX already exists.\n"
CORE-10495 CORE-13672
- Fix how the ERRORLEVEL and the last returned exit code are set by
EXIT and CALL commands, when batch contexts terminate, and when CMD
runs in single-command mode (with /C).
Addendum to commit 26ff2c8e, and reverts commit 7bd33ac4.
See also commit 8cf11060 (r40474).
More information can be found at:
https://ss64.com/nt/exit.htmlhttps://stackoverflow.com/a/34987886/13530036https://stackoverflow.com/a/34937706/13530036
- Move the actual execution of the CMD command-line (in /C or /K
single-command mode) from Initialize() to _tmain(), to put it on par
with the ProcessInput() interactive mode.
- Make ProcessInput() also return the last command's exit code.
We note two things, when CMD searches for the corresponding label in the
batch file:
- the first character of the line is always ignored, unless it's a colon;
- the escape caret ^ is supported and interpreted.
Fixes some cmd_winetests.