Win Batch nested for returns "unexpected" error

【Question title】: Win Batch nested for returns "unexpected" error
【Posted】: 2020-09-02 23:02:52
【Question】:

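Win10, CMD.

I'm writing a long and complicated script and have hit a section that doesn't want to cooperate. Here is a snippet that illustrates the problem: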

set TESTSTR="abc def ghi jkl mno pqr stu vwx yz"
for /l %%a in (1,1,9) do (
    set I=%%a
    for /f "tokens=!I!" %%b in ("%TESTSTR%") do echo %%b
)

The expected result is

abc
def
ghi
jkl
mno
pqr
stu
vwx
yz

But all I get is this, nine times:

!I!" was unexpected at this time.

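I've tried several variations, including set VAR="tokens=%%a" followed by for /f !VAR! ... I put various echoes in and found that the variable %%a is incrementing correctly, and that any intermediate variables I tried to use were set correctly.

It seems that delayed expansion and loop variables simply can't be used in the options part of another loop. I have defined options ahead of a loop in a similar way before, and that generally works fine, but I have never tried it in a nested for like this, using the outer loop's variable as a parameter of the inner loop.

Example: this code snippet works, showing only the directories named "Update??" in the current directory (yes, I know it's more than is needed; I just grabbed it and repurposed it for this demo. The original code has a set instead of the echo, the purpose being to get the most recent directory):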

    set FORPARM="usebackq tokens=1"
    for /f %FORPARM% %%f in (`dir /b /a:d /o:d Update??`) do echo %%f

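This example is not inside a for or any other situation that requires delayed expansion, which I think is what causes the problem in the code above: the for command doesn't seem able to handle delayed expansion or for variables as part of its own options.

I suppose I could do this with a call, passing the first for's variable as a parameter to the call and having the called routine contain the inner for, but I'd rather not. The routine I extracted this from is already complicated enough: it reads a data file and populates several arrays. I have a working variant of the routine, but it's very kludgy and needs editing whenever the number of arrays in the data file changes. I'm trying to speed it up, make it neater, and make it not need code changes when the data changes.

Has anyone else seen this and gotten past it?

【Discussion】:

I don't understand why you don't just do this: set "TESTSTR=abc def ghi jkl mno pqr stu vwx yz" and a simple FOR command: FOR %%G IN (%TESTSTR%) DO ECHO %%G
@Squashman Because the above is an example to illustrate the problem. The actual code has a lot more than an echo in the inner for, and there is more inside the outer loop than just the inner for.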
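
The plain FOR suggested in the comment can still supply an index for each word if a counter is kept next to it. The following is only a sketch under that assumption: the counter name n is made up for illustration, and TESTSTR is set without embedded quotes. It uses no for /f options at all, so nothing has to be expanded inside them.

```batch
@echo off
setlocal enabledelayedexpansion
set "TESTSTR=abc def ghi jkl mno pqr stu vwx yz"
set "n=0"
rem A plain FOR splits the unquoted list on spaces, one word per iteration.
for %%G in (%TESTSTR%) do (
    set /a n+=1
    echo !n!: %%G
)
endlocal
```

One caveat with this form: a plain FOR also splits on commas, semicolons and equals signs, and treats words containing * or ? as file masks, so it only suits data that is free of those characters.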

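Tags: for-loop batch-file indexing nested delay

【Solution 1】:

Your code is very close. The inner for loop needs the updated value from the outer for loop in its for options. This can be done by moving the inner for loop into a subroutine and calling that subroutine. In the code below, I changed the inner for loop into a one-line subroutine (run through cmd/c). This keeps the code very close to yours.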

@echo off
set TESTSTR="abc def ghi jkl mno pqr stu vwx yz"
for /l %%a in (1,1,9) do (
    cmd/c "@echo off&for /f "tokens=%%a" %%b in (%TESTSTR%) do echo %%b"
)

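【Discussion】:

Yes, I know I can do that, although I would have to call within the same script rather than use a single-command cmd/c. As I told @Squashman above, the snippet above only illustrates the problem; the actual code has a lot more in it. I was hoping to avoid these approaches because that section is already complicated enough. I guess if I have to, I have to... Thanks for confirming!

【Solution 2】:

I don't really consider this the answer, but it is an answer, based on what I was trying to avoid (and on @OJBakker's answer). I'm putting it here rather than in a comment because comments just mangled the code section (I probably just don't know what I'm doing...).

Here is the code: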

:nested_for_test
    set TESTSTR=abc def ghi jkl mno pqr stu vwx yz
    for /l %%a in (1,1,9) do (
        call :nested_for_test_inner %%a
    )
    exit /b
:nested_for_test_inner
    for /f "tokens=%1" %%b in ("%TESTSTR%") do (
        echo %%b
    )
    exit /b

Yes, I know the "do" part doesn't need to be wrapped in parentheses; the "real" code has more than one line in each do...

The result is as expected:

Start:  8:28:05.98
abc
def
ghi
jkl
mno
pqr
stu
vwx
yz
Start:  8:28:05.98
Stop:  8:28:07.76
Elapsed: 1.78 seconds

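Yes, there is a timer routine in my test file. This shows it's slower than I'd like, but within reason given the extra calls.

If anyone knows how to do this without that call, please add it!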
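
One possible way to avoid the call, offered only as a sketch: keep the for /f options constant ("tokens=1*") and peel one word off the front of the string on each pass, so no per-iteration value ever has to appear inside the options. The helper variable rest is hypothetical, and the sketch assumes the tokens are read in order, as they are in this thread.

```batch
@echo off
setlocal enabledelayedexpansion
set "TESTSTR=abc def ghi jkl mno pqr stu vwx yz"
rem rest is a hypothetical helper holding the part of the string not yet consumed.
set "rest=%TESTSTR%"
for /l %%a in (1,1,9) do (
    rem The options are constant. Only the IN string uses delayed expansion, which FOR /F accepts.
    for /f "tokens=1*" %%b in ("!rest!") do (
        echo %%a: %%b
        set "rest=%%c"
    )
)
endlocal
```

With the sample string this should print 1: abc through 9: yz. Because delayed expansion is enabled, data containing exclamation marks would be altered, so this suits only data without them.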

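【Discussion】:

【Solution 3】:

OK... clarifying the problem. Here is the original "kludgey" code: the actual code, not the simplified illustrative example (I'll let my comments explain what it's doing):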

:: 'call :array_load pathname [arry:val]'
::  INPUT - 1-2 parameters: datafile path and optional condition
::  RESULT - sets up arrays & UBOUNDs (array[0])
:: Read a datafile (identified in the first parameter). This file has an array
:: definition line and a number of data lines. The array definition line starts
:: with 'arry ' and lists the arrays to be populated from the data statements.
:: Data statements list the information to be put into the arrays, in the same
:: order as the arrays listed in the arry statement.
:: An optional second parameter causes only lines that have the value given for
:: that array to be included (ex: 'ZIP:23456' causes only addresses in zipcode
:: 23456 to be read).
:: The arry statement can be anywhere in the file. The data statements are read
:: in the order of appearance.
:: The Upper boundary (UBOUND) is stored in [0]
:array_load
    set _DATAFILE=%1
    set _COND=%2
    if not exist %_DATAFILE% (
        echo [E] - {array_load} Array datafile %1 not found.  Aborting run.
        pause
        exit
    )
:: Detect and set up for use of the second parameter.
    if defined _COND for /f "tokens=1-2 delims=:" %%a in ("%_COND%") do (set _ARY=%%a&set _VAL=%%b)
:: Gets the list of arrays from the "arry" statement
    for /f "usebackq tokens=1,*" %%a in (`findstr "^arry " %_DATAFILE%`) do ( set "ARRAYLIST=%%b")
:: Set up the array names and prep for the second loop. The "tokens" var sets
:: up to read the columns in the data statements. Token 1 is always "data" and
:: is skipped.
    set ndx=1
    for %%a in (%ARRAYLIST%) do (
        set COL!ndx!=%%a
        set /a ndx+=1
    )
:: "TOKENS" is not a mathematical calculation - it is a range definition used
:: in the next FOR statement. Do not add /a to the set!
    set TOKENS=2-%ndx%
    set /a ARYCOUNT=ndx-1
:: Read the data statements into the arrays. The "if defined" increments the
:: index for the next read only if there is no condition or the condition is
:: met. The "for" inside this "if" exists because the condition can't be tested
:: without it.
    set ndx=1
    for /f "usebackq tokens=%TOKENS%" %%a in (`findstr "^data " %_DATAFILE%`) do (
        set !COL1![!ndx!]=%%a
        if %ARYCOUNT% GEQ 2 set !COL2![!ndx!]=%%b
        if %ARYCOUNT% GEQ 3 set !COL3![!ndx!]=%%c
        if %ARYCOUNT% GEQ 4 set !COL4![!ndx!]=%%d
        if %ARYCOUNT% GEQ 5 set !COL5![!ndx!]=%%e
        if %ARYCOUNT% GEQ 6 set !COL6![!ndx!]=%%f
        if %ARYCOUNT% GEQ 7 set !COL7![!ndx!]=%%g
        if defined _COND (
            for /f "usebackq" %%x in (`echo %_ARY%[!ndx!]`) do set RES=!%%x!
            if !RES!==%_VAL% set /a ndx+=1
        ) else (
            set /a ndx+=1
        )
        set /a !COL1![0]=!ndx!-1
        if %ARYCOUNT% GEQ 2 set /a !COL2![0]=!ndx!-1
        if %ARYCOUNT% GEQ 3 set /a !COL3![0]=!ndx!-1
        if %ARYCOUNT% GEQ 4 set /a !COL4![0]=!ndx!-1
        if %ARYCOUNT% GEQ 5 set /a !COL5![0]=!ndx!-1
        if %ARYCOUNT% GEQ 6 set /a !COL6![0]=!ndx!-1
        if %ARYCOUNT% GEQ 7 set /a !COL7![0]=!ndx!-1
    )
    exit /b

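This routine loads a set of index-related arrays. Here is a sample data file. I left the comments in the file to help explain what's going on. The routine actually uses only the "arry" and "data" lines.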

To be REFERENCED by :array_load function

The line starting with "arry " contains the names of the arrays to be
filled from the data provided. Columns do not need to line up; extra spaces
are ignored. There is a maximum of six arrays without revisiting the
:array_load functions. ANY change of the "arry" names requires revisiting
the code that uses them!

All Data lines must begin with "data ", and NO other lines can start that
way! The keywords "data" and "arry" are case-sensitive and must be followed
by a space.

All lines that do NOT start with "arry " or "data " are comments.

arry FIRST   LAST    PHONE        ZIP

data Joe     Wilson  202-417-2742 20122
data John    Doe     209-659-2482 10523
data Susan   Doe     209-659-2482 10523
data Bill    Johnson 619-384-2582 53737
data Cindy   Wahler  301-724-7496 20933
data Rebecca Tannis  410-473-2748 20536

This will result in FIRST[0]=6, FIRST[1]=Joe, LAST[0]=6, LAST[1]=Wilson, etc
on up to FIRST[6]=Rebecca ... ZIP[6]=20536  (all the [0]'s = 6, as the array size)

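The code above works (and is fairly fast), except when the last line is ruled out by the condition check, in which case the last line is not actually excluded. It also has to be edited if the number of arrays exceeds the number of if ... geq .. %%... statements.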
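
One detail of the routine above could be trimmed. The for /f "usebackq" around an echo command starts a child command processor for every data line just to build a variable name. A commonly used alternative, sketched here with made-up sample values, copies the index into a FOR variable so that it can sit inside a single pair of exclamation marks. The ZIP values and the loop bounds below are assumptions for the demonstration only.

```batch
@echo off
setlocal enabledelayedexpansion
rem Hypothetical sample data standing in for arrays filled by the routine.
set "ZIP[1]=20122"
set "ZIP[2]=10523"
set "_ARY=ZIP"
for /l %%n in (1,1,2) do (
    set "ndx=%%n"
    rem The FOR variable is substituted before delayed expansion runs, so the name is complete by then.
    for %%i in (!ndx!) do set "RES=!%_ARY%[%%i]!"
    echo element !ndx! of %_ARY% is !RES!
)
endlocal
```

This relies on _ARY being set before the loop block is parsed, as it is in the routine above.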

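I want to: 1) fix the last-line exclusion bug, 2) get rid of the many IFs to allow flexibility in the array count, and 3) make it faster.

This is what I'm tinkering with now. It does do #2, and I'm hoping it will do #1 by the time I've finished that part, but it fails on #3: it is FAR slower. On this computer, processing this list of just 6 values for 4 arrays takes 20-30 seconds; my actual data files have about 110 lines of 7 values each, and 21 lines of 5 each.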

:array_load
    set _DATAFILE=%1
    set _FILTER=
    if not exist %_DATAFILE% (
        echo [E] - {array_load} Array datafile %1 not found.  Aborting run.
        pause
        exit /b
    )
    if not "%2"=="" set _FILTER=%2
:: Gets the list of arrays from the "arry" statement
    for /f "usebackq tokens=1,*" %%a in (`findstr "^arry " %_DATAFILE%`) do ( set "ARRAYLIST=%%b")
:: Set up the array names and prep for the second loop.
    set ndx=1
    for %%a in (%ARRAYLIST%) do (
        set COL[!ndx!]=%%a
        set /a ndx+=1
    )
    set /a ARYCNT=ndx-=1
:: Token 1 is always "data" and is skipped.
    set ndx=1
    for /f "usebackq tokens=1,*" %%a in (`findstr "^data " %_DATAFILE%`) do (
        set _LN=%%b
        for /l %%x in (1,1,%ARYCNT%) do (
            set ARY=!COL[%%x]!
            call :array_load_inner %%x
        )
        set /a ndx+=1
    )
    exit /b
:array_load_inner
    for /f "tokens=%1" %%z in ("!_LN!") do set !ARY![!ndx!]=%%z
    set /a !ARY![0]=!ndx!
    exit /b

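So, does this make your eyes glaze over the way it does mine? :)

Either of the routines above is an "answer", just not the one I want. The first is my starting point, and it works within the scope of the original question, which made no reference to arrays. The second is too slow to be viable. @Stephan's answer would work if I weren't dealing with arrays. Sorry for putting this material in an "answer" block; it was too big for a comment (I suppose I should have included the original full code from the start rather than trying to keep the sample code to the "lowest common denominator", so to speak). Sorry...

【Discussion】: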

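Hmm, you do know that Mary Lou Jones would kill your script? (In other words: no data element may contain a space.)
@Stephan - Yes, I know. If this were real data, she'd be MaryLou Jones or just Mary Jones. My real data doesn't run into this problem and, by its nature, never will.
The simplest and by far the fastest solution would be to filter the data first: findstr "^data\ .*\<%_VAL%\>" "%_DATAFILE%" (see findstr /? for \< and \>), but I don't know whether that is safe enough for your data. It can produce false positives when %_VAL% occurs in another field (think of a name that could be either a first or a last name).
@Stephan - That could actually work... wow, mental blind spot! The actual data isn't affected by the first/last issue you mention (every field has its own unique style/pattern). I also don't need to care where the filter criteria appear; they are unique. Playing with this on the command line, I couldn't get it to work: I get either everything (comments included or not) or nothing. With a slight change, findstr "^data " %_DATAFILE% | findstr %_VAL% works as expected. When no condition is used, just set _VAL=data to make sure the second findstr picks up everything.
@Stephan - Works great (I had to put a ^ in front of the pipe to make it work in a batch file rather than on the command line). Loading the sample data still takes 2.2 seconds with the ZIP:10523 filter and almost 5 seconds unfiltered. My larger data file takes over 2 minutes to read unfiltered, and about 100 seconds with a filter that takes about 3/4 of the data. The main slowdown is elsewhere, so #1 is handled, but I can't seem to get #2 and #3 (no multiple conditionals, and speed) at the same time.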
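
The escaped pipe mentioned in the last comment can be illustrated as follows. This is only a sketch: the data file is a hypothetical one written to the TEMP folder for the demonstration, and it uses findstr with /b and /c: (match at line start, literal string) rather than the "^data " pattern, because findstr otherwise treats a space inside the search text as a separator between several search strings.

```batch
@echo off
setlocal
rem Hypothetical data file, written to the TEMP folder only for this demonstration.
set "_DATAFILE=%TEMP%\array_demo.txt"
> "%_DATAFILE%" (
    echo arry FIRST LAST ZIP
    echo data Joe Wilson 20122
    echo data John Doe 10523
    echo data Susan Doe 10523
)
set "_VAL=10523"
rem The caret stops the batch parser from treating the pipe as part of the FOR line itself.
rem The pipe is then handed intact to the command processor that FOR /F starts.
for /f "tokens=1,*" %%a in ('findstr /b /c:"data " "%_DATAFILE%" ^| findstr /c:"%_VAL%"') do (
    echo %%b
)
del "%_DATAFILE%"
endlocal
```

With the three sample rows, only the two lines containing 10523 should be echoed. At an interactive prompt the same pipeline is typed without the caret, which matches the difference described in the comment.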

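【Solution 4】:

EDIT: completely changed due to new information.

Please try this new code. I don't think it gets any faster than this in batch.
Since %_VAR% isn't used, you could simplify the second parameter (and skip disassembling %_COND%).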

@echo off
setlocal enabledelayedexpansion
set "file=%~1"
set "_COND=%~2"
if not defined file goto :eof

for /f "tokens=1* delims=:" %%a in ("%_COND%") do ( 
  set "_VAR=%%a"
  set "_VAL=%%b"
)
rem set _
echo Start: %time%
REM get field names:
for /f "tokens=1*" %%A in ('findstr /b "arry " "%file%"') do (
  set cnt=-1
  for %%C in (%%B) do (
    set /a cnt+=1
    set "%%A[!cnt!]=%%C"
  )
)

REM get Data-Array:
set "cnt=0"  & REM number of data sets 
for /f "tokens=1*" %%A in ('findstr /brc:"data .*%_VAL%\>" %file%') do (
  set /a cnt+=1
  set "cnt2=-1"  & REM number of entries in the dataset
  for %%C in (%%B) do (
    set /a cnt2+=1
    call set "%%arry[!cnt2!]%%[!cnt!]=%%C"
    call set /a %%arry[!cnt2!]%%[0]=cnt 
  )

)
echo End: %time%
set |find "["
echo there are entries 0 to %cnt%.

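Factors that speed it up:

Filtering the input (fewer lines to process)
No IF at all
Works even with an empty %_VAL%, so there is no need to check it on every iteration of the loop

Possible drawbacks:

The filter string is case-sensitive (this can be changed with findstr's /i switch, but that also makes the keyword data case-insensitive).
Possible false positives (John / Johnson is no problem because of the \> word boundary, but JohnJohn would be found by a John filter (Johnjohn would not))
I very much hope the 5-token data sets and the 7-token data sets are in different files

There is no hard-coded limit on the number of tokens in a data set.

Runtime with the sample file: 200 ms (with filter), 260 ms (without). Thanks to the filtering, the filtered list is faster than the unfiltered one.

【Discussion】: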

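Wow. Works great. There are a few things I don't understand, and one thing I need to change (which I can't, because I don't understand some of it?). - First, the 7- and 5-token data sets are distinct. Both are used in the same code, but the data sets are in different files. Some things... - set "file=%~1": why the tilde? - Right, VAR isn't used; I removed that line. It is now a one-liner: for /f "tokens=1* delims=:" %%a in ("%_COND%") do set "_VAL=%%b" ) I may remove the VAR: part entirely, but that is something done in the main code, so later.
(cont.) - I'm not worried about the filter-matching issues you mention. Due to the nature of the real data, they won't be a problem. - Does rem set _ actually do anything? - The findstr doesn't work on the command line but does in the batch file. That's odd; do you know why? - I changed all instances of cnt to IDX (just my style, and what my eyes are used to looking for). - I don't understand the call lines. I do get the delayed-expansion part and the %%C; could you break down the rest?
(cont.) - This is part of a program that expects to find the upper bound (cnt) in the [0] element of each array. Moving the data from 0-to-(cnt-1) to 1-to-cnt is simple, but since I don't get the call lines, I can't see how to set the [0] elements as needed. Possibly something like for /l %%X in (0,1,%IDX2%) do call set "%%arry[%%x][0]=%IDX%" after the end of the for loop with the call set? I've played with variations of this but can't seem to hit it...
In case you're curious, I get 2.21 to 2.37 seconds for an unfiltered read (103 lines) of the 7-element data set, and 1.88 to 2.09 seconds filtered (83 lines). For the 5-element set, unfiltered is 1.04 to 1.65 seconds (19 lines) and filtered is 0.85 to 1.2 seconds (7 lines).
It's just a matter of where to start the counters. See my edited code. I'll answer your other questions tomorrow; I'm a bit short on time right now.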
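
Since the call set lines are asked about in the comments above, here is a hedged breakdown as a small stand-alone sketch; the field names and values are made up. On the first pass over the line, each doubled percent sign becomes a single one, and the FOR variable and the delayed variables are filled in, leaving a line of the form set "%arry[0]%[1]=Joe". CALL then parses that line a second time, and on that pass %arry[0]% is replaced by the field name it holds. The second loop shows one way to get the same assignment without CALL, by copying the inner counter into a FOR variable; this is an assumption about what could be faster, not something measured here.

```batch
@echo off
setlocal enabledelayedexpansion
rem Hypothetical field names, stored the way the first loop of the answer stores them.
set "arry[0]=FIRST"
set "arry[1]=ZIP"

rem Variant with CALL: the line is parsed twice, as described in the text.
set "cnt=1"
set "cnt2=-1"
for %%C in (Joe 20122) do (
    set /a cnt2+=1
    call set "%%arry[!cnt2!]%%[!cnt!]=%%C"
)

rem Variant without CALL: the inner counter is copied into a FOR variable first.
set "cnt=2"
set "cnt2=-1"
for %%C in (John 10523) do (
    set /a cnt2+=1
    for %%N in (!cnt2!) do set "!arry[%%N]![!cnt!]=%%C"
)

rem List the resulting array elements.
set FIRST[
set ZIP[
endlocal
```

Both variants should leave FIRST[1]=Joe, FIRST[2]=John, ZIP[1]=20122 and ZIP[2]=10523 in the environment.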