sed: switching the order of the first and last word (answers)

【Question title】: sed switch the order of first and last word
【Posted】: 2015-07-18 06:17:05
【Question description】:

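I am trying to use sed to switch the order of the first and last word in a sentence, and I think I don't understand what a "greedy" regular expression means in this case. I fail miserably even on a simple three-word sentence.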

$ echo hello world mike | sed 's/\([a-z]*\).* \([a-z]*\).*/\2 \1/'
mike hello

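Why is the output not "world hello mike"? Some extra information that may be useful:

\1 and \2 refer to the first and second capture groups of the regular expression
I am following the tutorial here.

My end goal is to switch the order of the first and last word in a sentence, regardless of how many words it contains.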

【Question comments】:

Tags: regex sed

【Solution 1】:

You did not include the world part in any of your capture groups, so it is matched but never output. Try:

$ sed -E 's/([a-z]+) (.+) ([a-z]+)/\3 \2 \1/' <<< "hello world mike"
mike world hello
$ sed -E 's/([a-z]+) (.+) ([a-z]+)/\3 \2 \1/' <<< "hello world foo bar baz mike"
mike world foo bar baz hello

(Note: I also removed your useless use of echo.)
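To see why the command in the question prints "mike hello", it can help to make every part of the match visible. The sketch below is only an illustration, not part of the original answer: the helper name `show_groups` is hypothetical, and the question's expression is changed in one way only, by wrapping the middle `.* ` in an extra group so that its text can be printed.

```shell
# Hypothetical helper: print what each part of the question's regex matched.
# The middle ".* " is wrapped in its own group purely so it can be displayed.
show_groups() {
    printf '%s\n' "$1" | sed 's/\([a-z]*\)\(.* \)\([a-z]*\).*/[\1][\2][\3]/'
}

show_groups 'hello world mike'
# prints: [hello][ world ][mike]
```

The greedy `.*` runs up to the last space on the line, so it swallows " world ". In the question that text was matched without being captured, which is why it disappeared from the output.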

You can also replace [a-z] with [[:alpha:]] to handle uppercase letters:

$ sed -E 's/([[:alpha:]]+) (.+) ([[:alpha:]]+)/\3 \2 \1/' <<< "Hello world Mike"
Mike world Hello

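【Comments】:

Nice use of [[:alpha:]]. You do assume that the first non-letter character is a space. What if it is a tab? Why not capture "everything" in the second group and omit the spaces from the replacement string: s/([[:alpha:]]+)(.*)([[:alpha:]])/\3\2\1/?
@Floris, the OP's post used spaces, and I just wanted to keep it easy to read.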
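One way to act on the tab concern raised above is to describe words as runs of non-whitespace and to keep the separators inside the middle group. This is a sketch under assumptions, not the commenter's exact expression: the function name `swap_first_last` is hypothetical, the sed in use is assumed to accept `-E`, and lines are assumed to have no leading or trailing whitespace. Because a greedy `.*` sits in the middle, the last group is forced to start right after a whitespace character; otherwise it would be left with only the final character of the line.

```shell
# Hypothetical helper: swap the first and last word of each input line,
# treating spaces and tabs alike and leaving the middle text untouched.
swap_first_last() {
    sed -E 's/^([^[:space:]]+)(.*[[:space:]])([^[:space:]]+)$/\3\2\1/'
}

# A tab after the first word and two spaces before the last one.
printf 'Hello\tworld  Mike\n' | swap_first_last
```

With this input the tab and the double space stay where they were and only "Hello" and "Mike" trade places. A line that contains a single word does not match and is passed through unchanged.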

【Solution 2】:

Another version, using awk

echo hello world mike | awk '{s=$1;$1=$NF;$NF=s}1'
mike world hello

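It simply swaps the last field and the first field.

【Comments】:

+1. This is the "right" way to solve the problem; sed is probably not the appropriate tool.
@CarlNorum That may be fine, but compared with the sed solutions or the awk solution posted by sputnik, this one has the possibly negative (and probably unexpected to the OP) side effect of turning every run of whitespace between the words in the middle of the sentence into a single blank character.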
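The side effect described in the last comment is easy to reproduce. The following is a small sketch for comparison only; the function names `awk_swap` and `sed_swap` are hypothetical, and the sed used is assumed to accept `-E`. Assigning to a field makes awk rebuild the line with its output field separator (a single space by default), while a sed substitution that captures the middle text verbatim leaves the spacing alone.

```shell
# Field swap in awk: assigning to $1 and $NF rebuilds the record using OFS.
awk_swap() {
    awk '{ s = $1; $1 = $NF; $NF = s } 1'
}

# sed substitution that keeps the middle of the line, separators included.
sed_swap() {
    sed -E 's/([^ ]+)(.* )([^ ]+)/\3\2\1/'
}

printf 'hello   world  mike\n' | awk_swap
# prints: mike world hello
printf 'hello   world  mike\n' | sed_swap
# prints: mike   world  hello
```

Which behavior is preferable depends on whether the original spacing matters for the data at hand.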

【Solution 3】:

$ echo "hello world mike" | sed -r 's/([^ ]+)(.* )([^ ]+)/\3\2\1/'
mike world hello
$ echo "this is a simple sentence" | sed -r 's/([^ ]+)(.+ )([^ ]+)/\3\2\1/'
sentence is a simple this

Or, with an older sed that only supports BRE rather than ERE:

$ echo "hello world mike" | sed 's/\([^ ]*\)\(.* \)\([^ ]*\)/\3\2\1/'
mike world hello
$ echo "this is a simple sentence" | sed 's/\([^ ]*\)\(.* \)\([^ ]*\)/\3\2\1/'
sentence is a simple this

【Comments】:

【Solution 4】:

A sed command that uses a word boundary:

sed 's/\([A-Za-z]\+\)\(.\+\)\b\([A-Za-z]\+\)/\3\2\1/'

Or in extended mode:

sed -r 's/([A-Za-z]+)(.+)\b([A-Za-z]+)/\3\2\1/'
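The word boundary is what keeps the last group whole: without `\b`, the greedy `.+` would leave only the final letter for it. The sketch below wraps the extended-mode command in a hypothetical function `swap_wb` and feeds it a line with punctuation. It assumes GNU sed, since `\b` is a GNU extension rather than part of POSIX regular expressions.

```shell
# Assumes GNU sed: \b (word boundary) is a GNU extension, not POSIX.
# swap_wb is a hypothetical name used only for this illustration.
swap_wb() {
    sed -E 's/([A-Za-z]+)(.+)\b([A-Za-z]+)/\3\2\1/'
}

printf '%s\n' 'hello, world mike' | swap_wb
# prints: mike, world hello
```

The comma stays in place because it is part of the middle group, not of either word.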

【Comments】:

【Solution 5】:

awk:

$ echo 'hello world mike' | awk '{v1=$1;v2=$NF;$1=$NF="";print v2, $0, v1}'
mike  world  hello

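【Comments】:

+1 for a nice solution that uses awk while keeping the spacing between words, but drop the commas in the print so that you do not add unnecessary OFS characters.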
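The suggestion in the comment above can be sketched as follows. The function name `swap_no_commas` is hypothetical, and the input is assumed to contain at least two words per line (a one-word line would be printed twice in a row by this command). Without the commas, print concatenates the three pieces directly, so the only separators left are the ones awk put into $0 when the first and last fields were emptied.

```shell
# Hypothetical variant of the awk command with the commas removed from print.
# Assumes at least two words per line.
swap_no_commas() {
    awk '{ v1 = $1; v2 = $NF; $1 = $NF = ""; print v2 $0 v1 }'
}

printf '%s\n' 'hello world mike' | swap_no_commas
# prints: mike world hello
```

Note that assigning to fields still makes awk rebuild the line with single OFS characters, so wider gaps in the input are not kept by this variant.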

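【Solution 6】:

You asked to swap the first and last word on a line, so you need to make sure you capture exactly those (and not the first and second words, as many of the answers above do).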

echo "hello cruel and unkind world" | sed 's/^\([^ ]*\) \(.*\) \([^ ]*\)$/\3 \2 \1/'

results in

world cruel and unkind hello

Here is how it works:

^\([^ ]*\)  - starting at the beginning of the line (^), capture as many non-space characters as possible (stops at the first space)
              note: depending on the flavor of sed you use, there are special symbols that match "a non-whitespace character", e.g. \S
            - the next space is matched but not captured
\(.*\)      - capture "everything" after this, until...
 \([^ ]*\)$ - a space followed by all non-space characters followed by the end of the line

When you output the three capture groups in reverse order, with a space between them, you get what you asked for.
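One consequence of the breakdown above is that the pattern needs two separating spaces, one on each side of the middle group, so a line with only two words does not match and is left as it is. The sketch below wraps the same command in a hypothetical function `swap_anchored` to show both cases.

```shell
# Hypothetical wrapper around the anchored command from this answer.
swap_anchored() {
    printf '%s\n' "$1" | sed 's/^\([^ ]*\) \(.*\) \([^ ]*\)$/\3 \2 \1/'
}

swap_anchored 'hello cruel and unkind world'
# prints: world cruel and unkind hello
swap_anchored 'hello world'
# prints: hello world   (no match, so the line is unchanged)
```

If two-word lines need to be swapped as well, one option is to fold the separators into the middle group, as in `\([^ ]*\)\(.* \)\([^ ]*\)`.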

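【Comments】:

【Solution 7】:

I would use a different approach, such as split() in a more powerful language, but with sed you have to group everything between the two edge words: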

echo hello world mike | sed 's/\([a-z]*\)\(.*\) \([a-z]*\).*/\3\2 \1/'

It produces:

mike world hello
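For comparison, the split-style approach mentioned at the top of this answer can also be sketched in plain POSIX shell, without any regular expression. This is only an illustration: the function name `swap_split` and its variables are hypothetical, words are assumed to be separated by whitespace, and runs of whitespace are collapsed to single spaces in the result.

```shell
# Hypothetical helper: split the argument into words, then print the last
# word, the middle words, and the first word, joined by single spaces.
# Note: POSIX sh has no local variables, so "first" and "middle" are global,
# and globbing is switched back on unconditionally at the end of the split.
swap_split() {
    set -f          # disable globbing so that * or ? in the input stay literal
    set -- $1       # unquoted on purpose: split on whitespace into $1, $2, ...
    set +f
    if [ "$#" -lt 2 ]; then
        printf '%s\n' "$*"      # zero or one word: nothing to swap
        return
    fi
    first=$1
    shift
    middle=
    while [ "$#" -gt 1 ]; do    # collect every word except the last one
        middle="$middle $1"
        shift
    done
    printf '%s%s %s\n' "$1" "$middle" "$first"
}

swap_split 'hello world mike'
# prints: mike world hello
```

Unlike the sed command above, this version also handles a two-word sentence and leaves a single word unchanged.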

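【Comments】:

@Floris: Right, but nobody asked me for a robust command; I just fixed his mistake. I would not use sed for this. I assume this is a dummy example and that he will know how to adapt it to more complex data.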