Read specific area or string in a text file - Answers

【Question title】: Read specific area or string in a text file
【Posted】: 2015-01-04 23:30:52
【Question description】:

I have a text file with user data written to it: username, e-mail, and password.

This is what the user file looks like right now:

[<< LOGIN >>]

Username: admin

Password: 12345678

E-Mail: hue@hue.hue

[<< LOGIN END >>]

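Now to the question.

How do I tell Python to read only the password? Right now we happen to know what the password is and how long it is. But once I encrypt it and get 30-odd characters of gibberish, how am I supposed to read the password?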

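【Question discussion】:

What have you tried? Usually you just keep reading lines and discarding them until you reach the one you want.
Could there be many usernames and passwords?
The line with Password: is the one you want, so why do you think it is hard to find?
I played around with if statements but couldn't get things to work. Basically I tried something like [if 'Password: ' + login.loginUserPassword in readUF:] or [if readUF == 'Password: ' + login.loginUserPassword:]. I managed to check whether the user input is actually in the text file, but there is still a problem. For example, when the password is 123 and the user enters 1, it returns true, because the string 1 is indeed in 123.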

Tags: python-3.x string passwords area

【Solution 1】:

The line will contain the password, so just split once and take the second element:

In [20]: from simplecrypt import encrypt, decrypt

In [21]: ciph = encrypt('password', "12345678")

In [22]: line = "Password: " + ciph

In [23]: line
Out[23]: 'Password: sc\x00\x01\x0cP\xa1\xee\'$"\xc1\x85\xe0\x04\xd2wg5\x98\xbf\xb4\xd0\xacr\xd3\\\xbc\x9e\x00\xf1\x9d\xbe\xdb\xaa\xe6\x863Om\xcf\x0fc\xdeX\xfa\xa5\x18&\xd7\xcbh\x9db\xc9\xbeZ\xf6\xb7\xd3$\xcd\xa5\xeb\xc8\xa9\x9a\xfa\x85Z\xc5\xb3%~\xbc\xdf'

In [24]: line.split(None,1)[1]
Out[24]: 'sc\x00\x01\x0cP\xa1\xee\'$"\xc1\x85\xe0\x04\xd2wg5\x98\xbf\xb4\xd0\xacr\xd3\\\xbc\x9e\x00\xf1\x9d\xbe\xdb\xaa\xe6\x863Om\xcf\x0fc\xdeX\xfa\xa5\x18&\xd7\xcbh\x9db\xc9\xbeZ\xf6\xb7\xd3$\xcd\xa5\xeb\xc8\xa9\x9a\xfa\x85Z\xc5\xb3%~\xbc\xdf'

In [25]: decrypt("password",line.split(None,1)[1])
Out[25]: '12345678'

In [26]: "12345678" == decrypt("password",line.split(None,1)[1])
Out[26]: True

When you iterate over the file, just use if line.startswith("Password")...

with open(your_file) as f:
    for line in f:
        if line.startswith("Password"):
            password = line.rstrip().split(None, 1)[1]
            # do your check
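
For the check itself, comparing the extracted value with == avoids the substring problem raised in the comments, where the input 1 passed against the password 123. A minimal sketch, assuming the file layout shown in the question; the names read_password, check_password and the demo file are hypothetical:

```python
import os
import tempfile


def read_password(path):
    """Return the text after 'Password:' on the first matching line, or None."""
    with open(path) as f:
        for line in f:
            if line.startswith("Password"):
                # Split once on whitespace: ['Password:', '<rest of line>']
                parts = line.rstrip().split(None, 1)
                return parts[1] if len(parts) > 1 else None
    return None


def check_password(path, user_input):
    """Exact comparison, so '1' does not pass when the stored value is '123'."""
    stored = read_password(path)
    return stored is not None and stored == user_input


# Hypothetical demo file, written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "users.txt")
with open(demo_path, "w") as f:
    f.write("Username: admin\n\nPassword: 123\n\nE-Mail: hue@hue.hue\n")

print(check_password(demo_path, "1"))    # only a substring of the password
print(check_password(demo_path, "123"))  # the full password
```

With an encrypted password, the same comparison would be made against the decrypted value instead of the raw text.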

You could also use a dict and pickle it, using the password as the key, and then just do a lookup.
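
The dict-and-pickle idea could look roughly like the sketch below. Note one deliberate difference: the answer mentions the password as the key, while this sketch keys on the username so that a lookup by user is possible. That choice, the verify helper and the file name are all assumptions, not part of the original answer:

```python
import os
import pickle
import tempfile

# Hypothetical store: username -> password (or its encrypted form).
users = {"admin": "12345678"}

store_path = os.path.join(tempfile.mkdtemp(), "users.pkl")

# Save the whole dict in one call.
with open(store_path, "wb") as f:
    pickle.dump(users, f)

# Load it back; the result is an ordinary dict again.
with open(store_path, "rb") as f:
    loaded = pickle.load(f)


def verify(store, username, user_input):
    """True only when the user exists and the stored value matches exactly."""
    return username in store and store[username] == user_input


print(verify(loaded, "admin", "12345678"))
print(verify(loaded, "admin", "1"))
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.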

【Discussion】:

There is a stray backtick at the start of the if... line.

【Solution 2】:

How do I tell Python to read only the password?

data.txt:

[<< LOGIN >>]

Username: admin

Password: 12345678

E-Mail: hue@hue.hue

[<< LOGIN END >>]

[<< LOGIN >>]

Username: admin

Password: XxyYo345320945!@#!$@#!@#$%^%^^@#$%!@#$@!#41211 

E-Mail: hue@hue.hue

[<< LOGIN END >>]

...

import re

f = open('data.txt')

pattern = r"""
    Password     #Match the word 'Password', followed by...
    \s*          #whitespace(\s), 0 or more times(*), followed by...
    :            #a colon
    \s*          #whitespace, 0 or more times...
    (.*)         #any character(.), 0 or more times(*).  The parentheses 'capture' this part of the match.
"""

regex = re.compile(pattern, re.X)  #When you use a pattern over and over for matching, it's more efficient to 'compile' the pattern.

for line in f:
    match_obj = regex.match(line)

    if match_obj:  #then the pattern matched the line
        password = match_obj.group(1)  #group(1) is what matched the 'first' set of parentheses in the pattern
        print(password)

f.close()

--output:--
12345678
XxyYo345320945!@#!$@#!@#$%^%^^@#$%!@#$@!#41211
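
If the username and e-mail are needed along with the password, the same idea extends to every Key: value line. A sketch assuming the [<< LOGIN >>] and [<< LOGIN END >>] markers from data.txt above; parse_logins and FIELD are hypothetical names:

```python
import re

# Matches "Key: value" lines such as "Password: 12345678".
FIELD = re.compile(r"\s*([^:\[\]]+?)\s*:\s*(.*)")


def parse_logins(lines):
    """Collect the fields of each [<< LOGIN >>] ... [<< LOGIN END >>] block."""
    records = []
    current = None  # dict for the block being read, or None outside a block
    for line in lines:
        stripped = line.strip()
        if stripped == "[<< LOGIN >>]":
            current = {}
        elif stripped == "[<< LOGIN END >>]":
            if current is not None:
                records.append(current)
            current = None
        elif current is not None:
            match = FIELD.match(stripped)
            if match:
                current[match.group(1)] = match.group(2)
    return records


sample = """\
[<< LOGIN >>]

Username: admin

Password: 12345678

E-Mail: hue@hue.hue

[<< LOGIN END >>]
"""

for record in parse_logins(sample.splitlines()):
    print(record["Password"])
```

Each block becomes one dict, so the password of a given user can be picked out by matching on record["Username"] first.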

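A regular expression (or RE) specifies a set of strings that matches it; the functions in this module let you check if a particular string matches a given regular expression (or if a given regular expression matches a particular string, which comes down to the same thing).

Regular expressions can be concatenated to form new regular expressions; if A and B are both regular expressions, then AB is also a regular expression. In general, if a string p matches A and another string q matches B, the string pq will match AB. This holds unless A or B contain low precedence operations; boundary conditions between A and B; or have numbered group references. Thus, complex expressions can easily be constructed from simpler primitive expressions like the ones described here. For details of the theory and implementation of regular expressions, consult the Friedl book referenced above, or almost any textbook about compiler construction.

A brief explanation of the format of regular expressions follows. For further information and a gentler presentation, consult the Regular Expression HOWTO.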

https://docs.python.org/3/library/re.html#module-re
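
The concatenation rule quoted above can be checked directly. A small sketch in which the patterns A and B and the strings p and q are made-up examples based on the Password line from data.txt:

```python
import re

A = r"Password\s*:"  # the label part of the line
B = r"\s*\d+"        # optional whitespace followed by digits
p = "Password:"
q = " 12345678"

print(bool(re.fullmatch(A, p)))          # p matches A
print(bool(re.fullmatch(B, q)))          # q matches B
print(bool(re.fullmatch(A + B, p + q)))  # pq matches AB
```

re.fullmatch is used here so that each check covers the whole string instead of only a prefix.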

【Discussion】: