Track movement of data between two data sets (or matrices): answers

【Question title】: Track movement of data between two data sets (or matrices)
【Posted】: 2019-01-22 07:12:42
【Question description】:

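I have two data sets, for example the data for July and for August, where each number is the quantity of orders placed. I want to compare the two sets and find every difference between them. There are three main kinds of difference (plus a fourth that is ignored here):
a) = order cancelled (the number exists in the left data set but not in the right one)
b) = new order (the number exists in the right data set but not in the left one)
c) = order postponed (the number exists in both data sets, but in the right data set it appears in a later month, M0)
d) = order preponed (for simplicity, let's assume this does not happen)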

      July                          August
Type  M09  M10  M11  M12      Type  M09  M10  M11  M12
A     0    1    2    0        A     0    1    2    0
B     1    0    0    0        B     0    0    1    0
C     1    2    0    0        C     0    1    0    0
D     1    2    2    0        D     1    0    0    2

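Note: for type D above, the number 2 appears twice in the left data set (in M10 and M11) but only once in the right data set (in M12). Because the program checks the data from left to right, as soon as it finds that the 2 in M10 (left data set) exists in M12 (right data set), the 2 in M11 (left data set) is effectively cancelled, because there is no "free" 2 left in the right data set to match it.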
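The left-to-right rule in the note can be sketched in Python roughly as follows. This is an illustration under assumptions, not the asker's code: the helper name `match_row` is hypothetical, each row is taken to be a plain list of quantities for M09 to M12, equal values in the same month are paired first as "no change", and a right-hand cell counts as "free" until a left-hand cell has been matched to it.

```python
def match_row(left, right):
    """Greedily pair each non-zero left cell with a free right cell.

    Returns one entry per left cell: the index of the right cell it was
    matched to, or None if the cell is empty or has no match.
    (Hypothetical helper, for illustration only.)
    """
    free = [qty != 0 for qty in right]  # right cells not yet matched
    matches = [None] * len(left)

    # Pass 1: the same quantity in the same month is "no change".
    for i, qty in enumerate(left):
        if qty != 0 and qty == right[i]:
            matches[i] = i
            free[i] = False

    # Pass 2: scan left to right and look for the same quantity in a
    # later, still-free month on the right. No match means cancellation.
    for i, qty in enumerate(left):
        if qty == 0 or matches[i] is not None:
            continue
        for j in range(i + 1, len(right)):
            if free[j] and right[j] == qty:
                matches[i] = j
                free[j] = False
                break
    return matches


# Type D from the tables above (months M09..M12).
print(match_row([1, 2, 2, 0], [1, 0, 0, 2]))  # [0, 3, None, None]
```

In the result for type D, M09 is matched to itself (no change), M10 is matched to index 3 (M12), and the 2 in M11 stays unmatched because the only 2 on the right has already been used. Any right-hand cell that is non-zero and still free after both passes would be a candidate for "new order".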

The program should work like this:

Type    M09                   M10                   M11            M12
A       -                     no change             no change      -
B       postponement to M11   -                     -              -
C       postponement to M10   cancellation          -              -
D       no change             postponement to M11   cancellation   -

As the output, I would like to get something like this:

Type    Detail          Previous month  New month
x       Postponement    M07             M11
x       Postponement    M08             M12
y       Cancellation    M08             -
z       New order       -               M12

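【Question comments】:

Nice formatting edit, @Saeed, well done! (Your profile says "Do it right, do it fast, do it once", and you did exactly that.)
@user23456 For D, shouldn't it be a postponement to M12 rather than to M11?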

Tags: python r excel vba

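【Solution 1】:

I still have a few doubts, so I could only generate the intermediate output. Please add more explanation of the desired output for the examples given for types A, B, C and D.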

# Left (July) and right (August) data sets; the first row is the header.
data_2dlist_1 = [
    ['Type', 'M09', 'M10', 'M11', 'M12'],
    ['A', 0, 1, 2, 0],
    ['B', 1, 0, 0, 0],
    ['C', 1, 2, 0, 0],
    ['D', 1, 2, 2, 0]]

data_2dlist_2 = [
    ['Type', 'M09', 'M10', 'M11', 'M12'],
    ['A', 0, 1, 2, 0],
    ['B', 0, 0, 1, 0],
    ['C', 0, 1, 0, 0],
    ['D', 1, 0, 0, 2]]

output_2d = [['Type', 'M09', 'M10', 'M11', 'M12']]
# Header of the final report; its rows are not generated yet.
final_output_2d = [['Type', 'Detail', 'Previous month', 'New month']]

for d1, d2 in zip(data_2dlist_1[1:], data_2dlist_2[1:]):
    output_2d.append([d1[0]])  # start a new row for this type
    looked = {}      # value -> right-hand column it was last matched to
    claimed = set()  # right-hand columns already used by a postponement

    for i in range(1, len(d1)):

        if d1[i] == d2[i]:
            # Same value in the same month (0 means there is no order).
            output_2d[-1].append('-' if d1[i] == 0 else 'no change')

        elif d1[i] != 0:
            # Look for the same value in a later month on the right,
            # past any column that an earlier match already used.
            start = max(i, looked.get(d1[i], i)) + 1
            try:
                found_at = d2.index(d1[i], start)
                output_2d[-1].append('postponement to ' + str(data_2dlist_2[0][found_at]))
                looked[d1[i]] = found_at
                claimed.add(found_at)

            except ValueError:  # not found
                output_2d[-1].append('cancellation')

        elif i in claimed:
            # d1[i] == 0 and the right-hand value was postponed into this month.
            output_2d[-1].append('-')

        else:
            # d1[i] == 0 and nothing on the left accounts for d2[i].
            output_2d[-1].append('new order')

print(output_2d)

'''       

[['Type', 'M09', 'M10', 'M11', 'M12'],
['A', '-', 'no change', 'no change', '-'],
['B', 'postponement to M11', '-', '-', '-'],
['C', 'postponement to M10', 'cancellation', '-', '-'],
['D', 'no change', 'postponement to M12', 'cancellation', '-']]

'''
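One possible way to get from this intermediate grid to the long format requested in the question is to flatten it, one report row per non-trivial cell. The sketch below is an assumption about what the final table should contain, not something the question confirms: the function name `flatten` is hypothetical, `-` and `no change` cells are skipped, and the target month is parsed out of the `postponement to <month>` labels used above.

```python
def flatten(grid):
    """Turn the per-month grid into (Type, Detail, Previous month, New month) rows.

    Hypothetical helper: `grid` has the same shape as `output_2d` above,
    with the header row first.
    """
    header, *rows = grid
    prefix = 'postponement to '
    report = [['Type', 'Detail', 'Previous month', 'New month']]
    for row in rows:
        order_type = row[0]
        for month, cell in zip(header[1:], row[1:]):
            if cell.startswith(prefix):
                report.append([order_type, 'Postponement', month, cell[len(prefix):]])
            elif cell == 'cancellation':
                report.append([order_type, 'Cancellation', month, '-'])
            elif cell == 'new order':
                report.append([order_type, 'New order', '-', month])
            # '-' and 'no change' cells produce no report row
    return report


grid = [
    ['Type', 'M09', 'M10', 'M11', 'M12'],
    ['A', '-', 'no change', 'no change', '-'],
    ['B', 'postponement to M11', '-', '-', '-'],
    ['C', 'postponement to M10', 'cancellation', '-', '-'],
    ['D', 'no change', 'postponement to M12', 'cancellation', '-'],
]
for line in flatten(grid):
    print(line)
```

For the sample grid this yields one postponement row for B, a postponement and a cancellation for C, and the same pair for D; type A contributes nothing because none of its cells changed.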

【Comments】: