Python sorting by mask [closed]: answers

【Question title】: Python sorting by mask [closed]
【Posted】: 2013-01-21 21:50:00
【Question】:

I have some text:

228;u;Ali;
129;cr;Daan;
730;c;Arton;
466;cr;Frynk;
314;c;Katuhkay;
9822;c;Kinberley;

I want to write this text to a file, but only the lines that contain ';cr;'.

【Comments】:

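What do you mean by "I have some text"? As a list? As a string? As an input file?
Do you want to sort, write to a file, or extract text?
As an input file; I want to extract items.
You don't want to sort (that would mean reordering). You want to select. Search each line for a match, then write it to the file.
What if my name is entered as "Smith;cr;John"?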

Tags: python sorting text

【Solution 1】:

Something like this:

with open("input.txt") as f, open("output.txt", "w") as f2:
    for line in f:                # iterate over each line of input.txt
        if ";cr;" in line:        # if ';cr;' is found
            f2.write(line)        # write it to output.txt; line keeps its own '\n'

In Python, you can easily check for a substring with the in operator:

In [167]: "f" in "qwertyferty"
Out[167]: True

In [168]: "z" in "qwertyferty"
Out[168]: False
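
Note that the substring test matches ';cr;' anywhere in the line, so a record whose name field contains "Smith;cr;John" (the case raised in the question comments) would also be selected. Below is a minimal sketch of a stricter check, assuming the code is always the second ';'-separated field. The helper name has_code and the record with id 100 are hypothetical.

```python
def has_code(line, code="cr"):
    """Return True if the second ';'-separated field equals `code`.

    Assumes the layout shown in the question: id;code;name;
    """
    fields = line.strip().split(";")
    return len(fields) > 1 and fields[1] == code


lines = [
    "129;cr;Daan;",
    "730;c;Arton;",
    "100;c;Smith;cr;John;",   # hypothetical record: ';cr;' appears in the name
]

# The plain substring test would keep the third line; the field test does not.
selected = [line for line in lines if has_code(line)]
```

Whether the extra strictness is needed depends on whether names can contain ';' at all, which the question does not say.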

【Discussion】:

When I do this with a local file it works, but when I try to use a file from the internet it fails :(

【Solution 2】:

with open("input.csv", "r") as inp, open("output", "w") as out:
    inpList = inp.read().split()    # read the whole file, split on whitespace
    out.write('\n'.join(el for el in inpList if ';cr;' in el))    # keep only the ';cr;' entries

If you want to read the data from the web, use the following:

from urllib2 import urlopen    # Python 2 module; Python 3 has urllib.request.urlopen, whose read() returns bytes
inp = urlopen("<URL>")
with open("output", "w") as out:
    inpList = inp.read().split()
    out.write('\n'.join(el for el in inpList if ';cr;' in el))

read() reads the whole file at once. split() breaks the result into a list, using whitespace as the separator.

read(...)
    read([size]) -> read at most size bytes, returned as a string.

    If the size argument is negative or omitted, read until EOF is reached.
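
To make the effect of read().split() concrete, here is a small sketch on part of the sample data. io.StringIO stands in for the opened input file, which is an assumption made only to keep the example self-contained. Because split() breaks on any whitespace, a name containing a space would be cut into two elements, whereas splitlines() splits on line boundaries only.

```python
import io

# io.StringIO stands in for the opened input file (assumption for illustration)
inp = io.StringIO("129;cr;Daan;\n730;c;Arton;\n466;cr;Frynk;\n")

text = inp.read()          # the whole content as one string
tokens = text.split()      # split on any run of whitespace
rows = text.splitlines()   # split on line boundaries only

# A hypothetical record whose name contains a space
spaced = "129;cr;Daan Smith;\n"
broken = spaced.split()        # the space inside the name splits the record
intact = spaced.splitlines()   # the record stays in one piece
```

For the data in the question the two approaches give the same list; they differ only when a field contains whitespace.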

To write the file, '\n'.join([elem1,...]) builds a single string from all the elements of inpList that contain ';cr;'. That string is passed to write(str), which writes it to the output file.
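
Since urllib2 exists only in Python 2, the following is a hedged sketch of the same idea for Python 3, where urlopen lives in urllib.request and read() returns bytes that must be decoded before the ';cr;' test. The function names filter_lines and save_filtered, the UTF-8 encoding, and the default output name are assumptions, and "<URL>" remains a placeholder. The sketch also uses splitlines() rather than split(), so that names containing spaces are not broken up.

```python
from urllib.request import urlopen


def filter_lines(raw, marker=";cr;", encoding="utf-8"):
    """Decode raw bytes and return the lines containing `marker`, joined by newlines."""
    lines = raw.decode(encoding).splitlines()
    return "\n".join(line for line in lines if marker in line)


def save_filtered(url, out_path="output"):
    """Download `url` and write the matching lines to `out_path`."""
    with urlopen(url) as inp, open(out_path, "w") as out:
        out.write(filter_lines(inp.read()))
```

Calling save_filtered("<URL>") with a real address in place of the placeholder would fetch the data and write the selected lines. Keeping the filtering in its own function makes it possible to check the logic without network access.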

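【Discussion】:

What I'm reading is just a csv file.
The [] inside join() is redundant.
In a way, yes, but cProfile shows that I can save a few function calls by writing the [].
@GermanShaich - sorry, that was a typo. Fixed.
@sidi Generators are useful because they are evaluated lazily. For large input, a list comprehension consumes a lot of memory because it builds all the elements first.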
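
The lazy-evaluation point above can be sketched as follows: iterate over the file object line by line and hand a generator expression to writelines(), so the whole input never has to be held in memory as a list. The io.StringIO objects stand in for real files here, an assumption made to keep the example self-contained.

```python
import io

# Stand-ins for the input and output files (assumption for illustration)
src = io.StringIO("228;u;Ali;\n129;cr;Daan;\n730;c;Arton;\n466;cr;Frynk;\n")
dst = io.StringIO()

# The generator yields matching lines one at a time; each line keeps its
# own trailing newline, so nothing needs to be appended.
dst.writelines(line for line in src if ";cr;" in line)

result = dst.getvalue()
```

With real files, src and dst would be the objects returned by open(), and the writelines() call would stay the same.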