How to pass a list of lists through a for loop in Python?

【Question title】: How to pass a list of lists through a for loop in Python?
【Posted】: 2017-02-19 18:22:24
【Question description】:

I have the following lists:

sample = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
count = [[4,3],[4,2]]
correctionfactor  = [[1.33, 1.5],[1.33,2]]

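I calculate the frequency of each character (pi), square it, and sum the squares (then I compute het = 1 - sum).

Desired output: [[1,2],[1,2]] #NOTE: These are NOT the real expected values. I just need the real values to be in this format.

Problem: I don't know how to pass the lists of lists (sample, count) through this loop to extract the values I need. I previously used this code with only a single flat list (e.g. ['TACT','TTTT'..]).

I suspect I need to add an outer for loop that indexes over each element of sample (i.e. over sample[0] = ['TTTT', 'CCCZ'] and sample[1] = ['ATTA', 'CZZC']). I don't know how to incorporate that into the code.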

Code:

list_of_hets = []
for idx, element in enumerate(sample):
    count_dict = {}
    square_dict = {}
    for base in list(element):
        if base in count_dict:
            count_dict[base] += 1
        else:
            count_dict[base] = 1
    for allele in count_dict: #Calculate frequency of every character
        square_freq = (count_dict[allele] / count[idx])**2 #Square the frequencies
        square_dict[allele] = square_freq        
    pf = 0.0
    for i in square_dict:
        pf += square_dict[i]   # pf --> pi^2 + pj^2...pn^2 #Sum the frequencies
    het = 1-pf                    
    list_of_hets.append(het)
print list_of_hets

"Failed" OUTPUT:
line 70, in <module>
square_freq = (count_dict[allele] / count[idx])**2
TypeError: unsupported operand type(s) for /: 'int' and 'list'

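【Question comments】:

The error message tells you exactly what went wrong: square_freq = (count_dict[allele] / counts[idx])**2 raises TypeError: unsupported operand type(s) for /: 'int' and 'list'. You can't divide an int by a list. BTW, that doesn't match the code you posted, which would probably raise a different TypeError when you try to pass counts[idx] to float.
I'm trying to use zip, as in square_freq = [[n/d for n, d in zip(subq, subr)] for subq, subr in zip(count_dict[allele], counts)]. But I still get an error. Any other suggestions?
@PM2Ring I've corrected it. Thanks for pointing that out.
What are subq and subr???
Also, I've edited the question to highlight the real problem (which I realized while troubleshooting).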
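
The zip idea from the comments can be made to work by zipping at two levels: once to pair each sub-list of sample with its sub-list of count, and once more to pair each sequence with its own integer total. The following is only a rough sketch of that idea, not the asker's final code. It assumes that 'Z' is a placeholder that should not be tallied (so that the tallies agree with the supplied count values); that assumption is not stated in the question.

```python
sample = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
count = [[4, 3], [4, 2]]

list_of_hets = []
# Outer loop: pair each sub-list of sequences with its sub-list of totals.
for seqs, totals in zip(sample, count):
    hets = []
    # Inner loop: pair each sequence with its own integer total.
    for seq, total in zip(seqs, totals):
        count_dict = {}
        for base in seq:
            if base == 'Z':  # ASSUMPTION: 'Z' is a placeholder and is skipped
                continue
            count_dict[base] = count_dict.get(base, 0) + 1
        # total is now an int, not a list, so the division is valid;
        # float() avoids integer division under Python 2.
        pf = sum((n / float(total)) ** 2 for n in count_dict.values())
        hets.append(1 - pf)
    list_of_hets.append(hets)

print(list_of_hets)  # [[0.0, 0.0], [0.5, 0.0]]
```

The result keeps the nested shape of sample, because one inner list is appended per sub-list.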

Tags: python list for-loop division

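【Solution 1】:

It's not entirely clear to me how you want to handle the 'Z' items in your data, but this code reproduces the output for the sample data at https://eval.in/658468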

from __future__ import division

bases = set('ACGT')
#sample = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
sample = [['ATTA', 'TTGA'], ['TTCA', 'TTTA']]

list_of_hets = []
for element in sample:
    hets = []
    for seq in element:
        count_dict = {}
        for base in seq:
            if base in count_dict:
                count_dict[base] += 1
            else:
                count_dict[base] = 1
        print count_dict

        #Calculate frequency of every character
        count = sum(1 for u in seq if u in bases)
        pf = sum((base / count) ** 2 for base in count_dict.values())
        hets.append(1 - pf)
    list_of_hets.append(hets)

print list_of_hets

Output

{'A': 2, 'T': 2}
{'A': 1, 'T': 2, 'G': 1}
{'A': 1, 'C': 1, 'T': 2}
{'A': 1, 'T': 3}
[[0.5, 0.625], [0.625, 0.375]]

This code can be simplified further by using collections.Counter instead of count_dict.
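
One possible shape for that Counter-based simplification is sketched below. The helper name het is hypothetical. Unlike the loop above, this version filters out non-ACGT symbols before tallying, which is an assumption about how such symbols should be treated. For the sample data, which contains only A, C, G and T, the results are the same.

```python
from collections import Counter

BASES = set('ACGT')  # ASSUMPTION: only these symbols count toward the total


def het(seq):
    """Return 1 - sum(p_i ** 2) over the A/C/G/T symbols in seq (hypothetical helper)."""
    counts = Counter(base for base in seq if base in BASES)
    total = sum(counts.values())
    # float() keeps true division under Python 2 as well as Python 3.
    return 1 - sum((n / float(total)) ** 2 for n in counts.values())


sample = [['ATTA', 'TTGA'], ['TTCA', 'TTTA']]
# The nested comprehension mirrors the nesting of the input.
list_of_hets = [[het(seq) for seq in element] for element in sample]
print(list_of_hets)  # [[0.5, 0.625], [0.625, 0.375]]
```

Note that a sequence with no A/C/G/T symbols at all would make total zero and raise ZeroDivisionError; the sketch does not guard against that case.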

BTW, if the symbols that are not in 'ACGT' are always 'Z', we can speed up the calculation of count. Get rid of bases = set('ACGT') and change

count = sum(1 for u in seq if u in bases)

to

count = sum(1 for u in seq if u != 'Z')
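
As a quick sanity check, the two expressions can be compared on the sequences from the question. The third variant, built on str.count, is not part of the answer; it is offered only as a possible alternative that avoids the Python-level loop, and it rests on the same assumption that 'Z' is the only non-ACGT symbol.

```python
bases = set('ACGT')
seqs = ['TTTT', 'CCCZ', 'ATTA', 'CZZC']

by_membership = [sum(1 for u in seq if u in bases) for seq in seqs]
by_exclusion = [sum(1 for u in seq if u != 'Z') for seq in seqs]
# Alternative (not from the answer): subtract the number of 'Z' symbols
# from the sequence length.
by_str_count = [len(seq) - seq.count('Z') for seq in seqs]

print(by_membership)  # [4, 3, 4, 2]
print(by_membership == by_exclusion == by_str_count)  # True
```

These values match the count list given in the question, flattened.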

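【Comments】:

My final output has to be in the form [[0.5, 0.625],[0.625, 0.375]], because I need to be able to distinguish the first element of set1 (['ATTA', 'TTGA']) from the first element of set2 (['TTCA', 'TTTA']).
Also, don't worry about the 'Z's, I've figured out a way to handle them :)
@biogeek: That's easy to do. See the new version of my answer.
Also, I don't want to use an "external" function that converts a flat list into a nested list (e.g. stackoverflow.com/a/6614975/6824986). This is just sample data; I need it to work on whatever input the user specifies (e.g. if the input is [['AA','TT','GG'], ['GG','CC','TC'], ['AA','TT','GG']] ... the final output should look like [[1,2,3], [1,2,3], [1,2,3]]).
Thank you so much! I've been struggling with this for a while now.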