Traverse a dictionary and do the same thing, how to optimize? Answers

【Question title】: Traverse a dictionary and do the same thing, how to optimize?
【Posted】: 2020-11-17 06:36:43
【Question description】:

I have a directory of files, structured as follows:

+-- folder1
|   +-- folder2
|   |   +-- A.py
|   |   +-- A.txt
|   +-- folder3
|   |   +-- folder4
|   |       +-- B.py
|   |       +-- B.txt
|   +-- C.py
|   +-- C.txt

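I want to find every .py file under folder1 and write out its relative path with the components joined by _. For example, B.py should become folder1_folder3_folder4_B.py. This is what I have so far.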

import os

folder1 = 'folder1'  # root folder to scan (assumed to be in the current directory)
file_list = os.listdir(folder1)
for file in file_list:
    if len(file.split('.')) == 2 and file.split('.')[-1] == 'py':  # C.py
        print(folder1 + '_' + file)
    elif len(file.split('.')) == 2 and file.split('.')[-1] != 'py':  # C.txt
        pass
    else:  # no extension, so treat it as a subfolder
        file1_list = os.listdir(os.path.join(folder1, file))
        for file1 in file1_list:
            if len(file1.split('.')) == 2 and file1.split('.')[-1] == 'py':  # A.py
                print(folder1 + '_' + file + '_' + file1)
            elif len(file1.split('.')) == 2 and file1.split('.')[-1] != 'py':  # A.txt
                pass
            else:
                file2_list = os.listdir(os.path.join(folder1, file, file1))
                for file2 in file2_list:
                    if len(file2.split('.')) == 2 and file2.split('.')[-1] == 'py':  # B.py
                        print(folder1 + '_' + file + '_' + file1 + '_' + file2)
                    elif len(file2.split('.')) == 2 and file2.split('.')[-1] != 'py':  # B.txt
                        pass
                    else:
                        pass  # Actually I don't know what to write here

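This approach has two drawbacks:

(1) I don't know when to stop nesting for loops, although I could work out the maximum depth of folder1.

(2) The for loops repeat the same operations at every level, which can clearly be optimized.

Does anyone have a good answer?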

【Question comments】:

Tags: python for-loop listdir

【Solution 1】:

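os.walk traverses a directory tree recursively. fnmatch.fnmatch matches file names against wildcard patterns. os.path.relpath strips a complicated root path, leaving only the path relative to that root.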
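As a quick illustration of the two helper functions on their own, here is a minimal sketch. The paths are hypothetical and do not need to exist on disk, since both calls only work on strings.

```python
import os
from fnmatch import fnmatch

# fnmatch compares a file name against a shell-style wildcard pattern.
print(fnmatch('A.py', '*.py'))   # True
print(fnmatch('A.txt', '*.py'))  # False

# os.path.relpath expresses a path relative to a starting directory.
# These paths are made up for illustration only.
root = os.path.join('some', 'long', 'root')
path = os.path.join(root, 'folder1', 'folder2')
print(os.path.relpath(path, root))  # folder1/folder2 (folder1\folder2 on Windows)
print(os.path.relpath(root, root))  # .
```

Note that the relative path of the root to itself is '.', which is worth handling separately if the result is later joined with underscores.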

Given testdir:

C:\TESTDIR
└───folder1
    │   C.py
    │   C.txt
    ├───folder2
    │       A.py
    │       A.txt
    └───folder3
        └───folder4
                B.py
                B.txt

and this code:

import os
from fnmatch import fnmatch

def magic(root):
    for path,dirs,files in os.walk(root):
        # make path relative to root so a root such as '.\testdir' does not leak into the names; the root itself maps to ''
        relpath = '' if root == path else os.path.relpath(path,root)
        for file in files:
            if fnmatch(file,'*.py'):
                name = os.path.join(relpath,file)
                yield name.replace(os.path.sep,'_')

root = r'.\testdir' # A path that starts with . for testing

for name in magic(root):
    print(name)

Output:

folder1_C.py
folder1_folder2_A.py
folder1_folder3_folder4_B.py

You should think about what you want to happen if a file or folder name itself contains an underscore, though.
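To make the underscore concern concrete, the sketch below shows two different files collapsing to the same flattened name. The flatten helper is hypothetical (it is not part of the answer above) and shows just one possible policy: refuse any component that already contains the separator, rather than silently producing an ambiguous name.

```python
def flatten(parts, sep='_'):
    """Join path components with sep, refusing components that already contain sep."""
    for part in parts:
        if sep in part:
            raise ValueError(f'{part!r} already contains {sep!r}')
    return sep.join(parts)

# With a plain join, two different files end up with the same name.
print('_'.join(['a_b', 'c.py']))   # a_b_c.py
print('_'.join(['a', 'b_c.py']))   # a_b_c.py

# Components without underscores are joined as usual.
print(flatten(['folder1', 'folder3', 'folder4', 'B.py']))  # folder1_folder3_folder4_B.py
```

Other policies, such as choosing a separator that cannot occur in the names, may suit some projects better.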

【Comments】:

【Solution 2】:

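You want to use recursion, which is a fancy name for a function that calls itself. Write a function that takes a folder name as its argument. It should run os.listdir() and loop over the results, just as you are doing now. When you reach a subfolder, simply run the function again!

Also, take a look at the .endswith() string method, which is easier than all that splitting. You can test if file.endswith('.py').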
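A minimal sketch of that advice might look like the following. The function name find_py is an assumption made for illustration, and the entries are sorted only to make the order predictable.

```python
import os

def find_py(folder):
    """Return the paths of all .py files under folder, searching subfolders recursively."""
    found = []
    for name in sorted(os.listdir(folder)):
        full = os.path.join(folder, name)
        if os.path.isdir(full):
            # The function calls itself for each subfolder, however deep the tree goes.
            found.extend(find_py(full))
        elif name.endswith('.py'):
            found.append(full)
    return found
```

Calling find_py('folder1') on the tree from the question would return the paths of C.py, A.py and B.py. Using os.path.isdir instead of checking for a dot in the name also copes with folders whose names contain dots.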

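【Comments】:

Thanks, Andrew. I did as you suggested, but there is a problem. It works for folder3, but for folder4 the path becomes folder1_folder4, because the intermediate folder is not kept. What would you suggest?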
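The commenter's code is not shown, so the cause can only be guessed at. One common fix for this symptom is to pass the accumulated name down as an extra argument, so each recursive call extends the prefix it received instead of starting again from the top folder. The sketch below assumes that approach; joined_names and its prefix parameter are hypothetical names.

```python
import os

def joined_names(folder, prefix=None):
    """Yield underscore-joined names of .py files, keeping every intermediate folder."""
    # On the first call, start the prefix with the top folder's own name.
    if prefix is None:
        prefix = os.path.basename(os.path.normpath(folder))
    for name in sorted(os.listdir(folder)):
        full = os.path.join(folder, name)
        if os.path.isdir(full):
            # Pass the extended prefix down so deeper levels keep the whole chain.
            yield from joined_names(full, prefix + '_' + name)
        elif name.endswith('.py'):
            yield prefix + '_' + name
```

With the tree from the question, list(joined_names('folder1')) should contain folder1_folder3_folder4_B.py rather than folder1_folder4_B.py, because folder3 is already part of the prefix when folder4 is visited.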