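The question is quite ambiguous, so there are many possible answers. I'm assuming here that you want to count the combinations of the first column with each of the columns that follow it.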
    import csv
    from collections import Counter

    counter = Counter()
    with open('data.csv', newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter='|')
        for line in reader:
            # Skip blank lines and rows whose first cell is empty.
            if line and line[0]:
                # Pair the first cell with every later cell in the row.
                for fragment in line[1:]:
                    entry = line[0] + '-' + fragment
                    counter[entry] += 1
    print(counter)
Output:
Counter({'data1-data2': 2, 'data1-data3': 1, 'data1-data4': 1, 'data1-data6': 1, 'data2-data3': 1, 'data4-data5': 1, 'data4-data6': 1, 'data5-data7': 1})
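The same idea can be sketched as a reusable function that keys the counter by tuples rather than by joined strings, which avoids ambiguity when a cell itself contains `-`. This is only a sketch: the function name `count_first_column_pairs` and the `SAMPLE` data are hypothetical and are not taken from the question's `data.csv`.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; the question's actual file is not reproduced here.
SAMPLE = "a|b|c\na|b\n\nd|a\n"


def count_first_column_pairs(lines, delimiter="|"):
    """Count (first cell, later cell) pairs across all rows."""
    counter = Counter()
    for row in csv.reader(lines, delimiter=delimiter):
        if not row or not row[0]:
            continue  # skip blank lines and rows with an empty first cell
        first, *rest = row
        # Ignore empty cells so they do not form pairs.
        counter.update((first, cell) for cell in rest if cell)
    return counter


counts = count_first_column_pairs(io.StringIO(SAMPLE))
print(counts)
```

Passing an open file object instead of `io.StringIO(SAMPLE)` should work the same way, since `csv.reader` accepts any iterable of lines.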
Edit 1:
Assuming you want every combination of data fields that actually occurs (that is, every pair with a non-zero count):
    import csv
    from collections import Counter
    from itertools import combinations

    counter = Counter()
    with open('data.csv', newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter='|')
        for line in reader:
            # Count every pair of cells in the row, in row order.
            counter.update(combinations(line, 2))
    print(counter)
Output:
Counter({('data1', 'data2'): 2, ('data2', 'data3'): 2, ('data4', 'data6'): 2, ('data1', 'data3'): 1, ('data1', 'data4'): 1, ('data1', 'data6'): 1, ('data4', 'data5'): 1, ('data5', 'data6'): 1, ('data5', 'data7'): 1})
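Note that `combinations(line, 2)` keeps the order in which cells appear in a row, so `('a', 'b')` and `('b', 'a')` would be counted as different keys if rows list the same values in different orders. If the pairs are meant to be unordered, one possible variant is to sort each row's cells first. The sketch below assumes that empty cells should be ignored and that a value repeated within one row counts only once; the function name and sample rows are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_unordered_pairs(rows):
    """Count pairs of cells per row, treating (a, b) and (b, a) as the same."""
    counter = Counter()
    for row in rows:
        # Drop empty cells, deduplicate, and sort so each pair has one form.
        cells = sorted({cell for cell in row if cell})
        counter.update(combinations(cells, 2))
    return counter


# Hypothetical rows, as csv.reader would yield them.
rows = [["b", "a", "c"], ["a", "b"], ["c", "", "a"]]
pair_counts = count_unordered_pairs(rows)
print(pair_counts)
```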
Edit 2:
Assuming you want every data cell combined with every other data cell, including pairs that never appear together in the same row:
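    import csv
    from collections import Counter
    from itertools import combinations

    counter = Counter()
    unique = set()
    with open('data.csv', newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter='|')
        for line in reader:
            unique.update(line)  # remember every cell value seen
            counter.update(combinations(line, 2))
    # Adding 0 creates keys for unseen pairs without changing existing counts.
    # Set iteration order is arbitrary, so a zero entry may be the reverse of a
    # counted pair, e.g. ('data7', 'data5') next to ('data5', 'data7') below.
    counter.update({entry: 0 for entry in combinations(unique, 2)})
    print(counter)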
Output:
Counter({('data1', 'data2'): 2, ('data2', 'data3'): 2, ('data4', 'data6'): 2, ('data1', 'data3'): 1, ('data1', 'data4'): 1, ('data1', 'data6'): 1, ('data4', 'data5'): 1, ('data5', 'data6'): 1, ('data5', 'data7'): 1, ('data7', 'data5'): 0, ('data7', 'data2'): 0, ('data7', 'data4'): 0, ('data7', 'data6'): 0, ('data7', 'data1'): 0, ('data7', 'data'): 0, ('data7', 'data3'): 0, ('data5', 'data2'): 0, ('data5', 'data4'): 0, ('data5', 'data1'): 0, ('data5', 'data'): 0, ('data5', 'data3'): 0, ('data2', 'data4'): 0, ('data2', 'data6'): 0, ('data2', 'data1'): 0, ('data2', 'data'): 0, ('data4', 'data1'): 0, ('data4', 'data'): 0, ('data4', 'data3'): 0, ('data6', 'data1'): 0, ('data6', 'data'): 0, ('data6', 'data3'): 0, ('data1', 'data'): 0, ('data', 'data3'): 0})
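In the output above, some zero-count entries are just reversed forms of pairs that were already counted, for example `('data7', 'data5')` next to `('data5', 'data7')`, because iterating over a set gives no guaranteed order. If each unordered pair should appear exactly once, a possible variant is to sort both the row cells and the set of unique values before pairing them. This is a sketch under that assumption, with a hypothetical function name and sample rows; it also ignores empty cells, which the code above does not.

```python
from collections import Counter
from itertools import combinations


def count_pairs_with_zeros(rows):
    """Count unordered pairs per row and add unseen pairs with a count of 0."""
    counter = Counter()
    unique = set()
    for row in rows:
        cells = sorted({cell for cell in row if cell})
        unique.update(cells)
        counter.update(combinations(cells, 2))
    # Sorting gives every pair one canonical form, so no reversed duplicates.
    for pair in combinations(sorted(unique), 2):
        counter.setdefault(pair, 0)
    return counter


# Hypothetical rows, as csv.reader would yield them.
rows = [["a", "b"], ["c", "b"], ["d"]]
all_pairs = count_pairs_with_zeros(rows)
print(all_pairs)
```

With n distinct values this produces n * (n - 1) / 2 keys, so the counter grows quadratically with the number of distinct cells.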