How to solve the "Word too long" error in Unix? Answers

【Question title】: How to solve "Word too long" error in unix?
【Posted】: 2014-01-04 17:59:25
【Question description】:

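I am writing a TC shell script on Unix named pl_dict. It takes a list of English words in singular form as input and prints the plural form of each word on a separate line. It uses a file containing a list of English words, plus a C program that takes a singular English word as an argument and prints its plural form. Here is my code: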

set dictionary = (/usr/share/dict/words)

set irregular = (/share/files/irregular.txt)

# go over all the input words

foreach word ($argv[*])

    set irregularWord = `grep $word $irregular | cut -d" " -f1`

    #the word is found in irregular.txt file
    if ("$irregularWord" != "") then
       gcc -o pluralize pluralize.c
       ./pluralize -f irregular.txt $word

    else #the word is not found in the irregular file

       #search for it in the dictionary
       set realEnglishWord = `grep $word $dictionary`

       #the word is a real English word
       if ("$realEnglishWord" != "") then
          gcc -o pluralize pluralize.c
          ./pluralize $word
       else
          echo "$word":" word not found in dictionary."
       endif
    endif
end

It worked fine until I tried running it with: pl_dict fish foot fox house mouse

The output I got was:

fish

feet

foox: word not found in dictionary.

Word too long.

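What is the problem, and how can I fix it?

Thanks.

【Question comments】:

Try adding some debugging echoes to see what $word is... grep is probably matching several lines in the dictionary... if so, you could use awk and do an exact match on $1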
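As a rough illustration of the awk suggestion in the comment above, here is a minimal sketch. It is written in POSIX sh so it can be tried anywhere, and it uses a small temporary file as a stand-in for /usr/share/dict/words; the sample words and the variable names are assumptions made up for the demo. The awk command itself does not depend on the shell, so it should be usable inside tcsh backticks as well.

```shell
# Hypothetical sample dictionary, standing in for /usr/share/dict/words.
dict=$(mktemp)
printf '%s\n' fox foxes foxglove outfox > "$dict"

word=fox

# Plain grep counts every line that contains "fox" as a substring.
loose=$(grep -c "$word" "$dict")

# awk compares the whole first field, so only the exact word is printed.
exact=$(awk -v w="$word" '$1 == w' "$dict")

echo "grep matched $loose lines; awk matched: $exact"
# prints: grep matched 4 lines; awk matched: fox

rm -f "$dict"
```

With the substring match, the backtick expansion in the script would receive all four lines at once, which is how a single variable can grow far larger than expected on a full dictionary.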

Tags: shell unix csh

【Solution 1】:

Try the following steps:

Step 1:

sudo apt-get install tcsh

Step 2:

sudo update-alternatives --config csh

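Select tcsh from the list of available options.

【Comments】:

【Solution 2】:

I suspect that message comes from the pluralize program; we would need the program's source to help you.

Also, you don't need to compile the program (the gcc lines) every time the script runs. You can compile it once and then use the binary.

【Comments】:

【Solution 3】:

Before tcsh 6.15, the length of each line was limited. If I remember correctly, the limit was 4K characters. Exceeding it produces this message.

This is usually caused by the shell expanding a long variable. I ran into it when I tried to expand a complicated $PATH twice on the same line.

To fix it, first find the long variable. Use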

env | grep VARIABLE_NAME

and/or

set | grep VARIABLE_NAME

to inspect the suspect variable before it gets expanded.
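If it is not obvious which variable is the long one, something like the following sketch may help narrow it down. It lists the names of environment variables whose NAME=value line is longer than a threshold. The 4096 threshold is only an assumption based on the "4K" figure recalled above, and DEMO_LONG_VAR is a made-up variable created so the demo has something to find. The env | awk pipeline does not depend on the shell, so it should run from tcsh too.

```shell
# Hypothetical oversized variable (5000 characters), created only for the demo.
DEMO_LONG_VAR=$(printf '%05000d' 0)
export DEMO_LONG_VAR

# Assumed threshold, taken from the "4K" figure mentioned in the answer.
limit=4096

# Print the name of every environment entry whose full NAME=value line
# is longer than the threshold.
long_vars=$(env | awk -F= -v limit="$limit" 'length($0) > limit { print $1 }')

echo "$long_vars"
```

Note that env only shows exported variables. Shell variables created with set in tcsh would still need to be checked with set | grep, as described above.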

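Also, because grep can return thousands of lines (for a word like is, for example), you can use \< and \> to mark word boundaries if you want an exact match: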

grep "\<WORD\>" /usr/share/dict/words

Or use awk, as technosaurus commented.
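One caveat worth checking on your own dictionary: a word-boundary match can still return more than one line, because an apostrophe counts as a boundary (many word lists contain entries such as fox's). The sketch below compares grep -w, the option form of the word-boundary idea, with grep -x, which only accepts a match covering the whole line. The sample file and variable names are assumptions for the demo, not the real /usr/share/dict/words.

```shell
# Hypothetical sample dictionary with a possessive entry.
dict=$(mktemp)
printf '%s\n' fox "fox's" foxes outfox > "$dict"

# -w: "fox" must be delimited by non-word characters, so "fox's" still matches.
word_hits=$(grep -c -w fox "$dict")

# -x: the pattern must match the entire line, so only "fox" itself matches.
line_hits=$(grep -c -x fox "$dict")

echo "-w matched $word_hits lines, -x matched $line_hits line"
# prints: -w matched 2 lines, -x matched 1 line

rm -f "$dict"
```

For a one-word-per-line dictionary, the whole-line match is probably the closer fit to "is this exact word in the list".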

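【Comments】:

【Solution 4】:

I just ran into the same problem; it was the result of expanding a shell variable that was "too long". I was also using grep, like this: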

set test_error = "`grep -P '^UVM_(ERROR|FATAL)\s+[^:]' $mylog`"

... which matched multiple lines in $mylog and caused $test_error to become a huge multi-line string. The fix was to use "-m 1" so that grep stops after the first match, like this:

set test_error = "`grep -P -m 1 '^UVM_(ERROR|FATAL)\s+[^:]' $mylog`"

In my application I only needed the first match. Not sure whether this applies to your use case.
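To see the effect of "-m 1" in isolation, here is a small sketch in POSIX sh. The log contents are invented for the demo, and it uses grep -E with [[:space:]] instead of -P with \s so that it does not depend on PCRE support. That substitution is mine, not part of the original command.

```shell
# Hypothetical log file with two matching lines and one non-matching line.
mylog=$(mktemp)
printf '%s\n' \
  'UVM_INFO  @ 0: start' \
  'UVM_ERROR @ 10: first failure' \
  'UVM_FATAL @ 20: second failure' > "$mylog"

pattern='^UVM_(ERROR|FATAL)[[:space:]]+[^:]'

# Without -m, every matching line is returned (counted here with -c).
all_count=$(grep -E -c "$pattern" "$mylog")

# With -m 1, grep stops after the first matching line.
first_only=$(grep -E -m 1 "$pattern" "$mylog")

echo "total matches: $all_count"
echo "first match:   $first_only"
# prints: total matches: 2
# prints: first match:   UVM_ERROR @ 10: first failure

rm -f "$mylog"
```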

【Comments】: