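【Posted】: 2016-09-21 13:05:14
【Problem description】:
How to combine the information from two matrices, where one is a sub-matrix of the other but holds different information.
I have two matrices (1223x1223) and (7096x7096). Both matrices contain DrugBank drug-drug distance scores ranging from 0 to 1. The larger matrix contains the chemical structure similarity of the drugs, and the smaller matrix contains a different similarity score.
I would like to know how to combine these two into a single matrix (data fusion) that captures the information from both. For example, if drug 1 and drug 2 score 0.5 and 0.7 in the two matrices respectively, what is the best way to fuse the data so that no information is lost?
Here is a sample of my data: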
Data1
DB00006 DB00014 DB00035 DB00050 DB00091 DB00093 DB00104 DB00115
DB00006 1.0000000 0.8139535 0.8205128 0.7976190 0.6075949 0.6835443 0.6547619 0.6666667
DB00014 0.8139535 1.0000000 0.7500000 0.8111111 0.5617978 0.6292135 0.6966292 0.7200000
DB00035 0.8205128 0.7500000 1.0000000 0.7325581 0.5243902 0.8450704 0.7564103 0.6122449
DB00050 0.7976190 0.8111111 0.7325581 1.0000000 0.5764706 0.6091954 0.6976744 0.6700000
DB00091 0.6075949 0.5617978 0.5243902 0.5764706 1.0000000 0.4871795 0.5250000 0.5543478
DB00093 0.6835443 0.6292135 0.8450704 0.6091954 0.4871795 1.0000000 0.8028169 0.5360825
DB00104 0.6547619 0.6966292 0.7564103 0.6976744 0.5250000 0.8028169 1.0000000 0.5816327
Data2
DB07768 DB07886 DB07702 DB07465 DB08567 DB07129 DB08298
DB00014 0.260115607 0.19402985 0.22112211 0.11636364 0.26256983 0.18936877 0.29700855
DB00035 0.176344086 0.19935691 0.19545455 0.15606936 0.21489362 0.19523810 0.23456790
DB00050 0.037470726 0.05490196 0.05298013 0.09090909 0.03318584 0.05755396 0.03664921
DB00091 0.211974110 0.21052632 0.14814815 0.11666667 0.28192372 0.15856777 0.32452830
DB00104 0.200686106 0.20642202 0.15877437 0.12420382 0.26795096 0.19174041 0.31653226
DB00122 0.002469136 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
Data (dput output)
Data1 <-
structure(c(1, 0.813953488, 0.820512821, 0.797619048, 0.607594937,
0.683544304, 0.654761905, 0.813953488, 1, 0.75, 0.811111111,
0.561797753, 0.629213483, 0.696629213, 0.820512821, 0.75, 1,
0.73255814, 0.524390244, 0.845070423, 0.756410256, 0.797619048,
0.811111111, 0.73255814, 1, 0.576470588, 0.609195402, 0.697674419,
0.607594937, 0.561797753, 0.524390244, 0.576470588, 1, 0.487179487,
0.525, 0.683544304, 0.629213483, 0.845070423, 0.609195402, 0.487179487,
1, 0.802816901, 0.654761905, 0.696629213, 0.756410256, 0.697674419,
0.525, 0.802816901, 1, 0.666666667, 0.72, 0.612244898, 0.67,
0.554347826, 0.536082474, 0.581632653), .Dim = 7:8, .Dimnames = list(
c("DB00006", "DB00014", "DB00035", "DB00050", "DB00091",
"DB00093", "DB00104"), c("DB00006", "DB00014", "DB00035",
"DB00050", "DB00091", "DB00093", "DB00104", "DB00115")))
Data2 <-
structure(c(0.260115607, 0.176344086, 0.037470726, 0.21197411,
0.200686106, 0.002469136, 0.194029851, 0.199356913, 0.054901961,
0.210526316, 0.206422018, 0, 0.221122112, 0.195454545, 0.052980132,
0.148148148, 0.158774373, 0, 0.116363636, 0.156069364, 0.090909091,
0.116666667, 0.124203822, 0, 0.262569832, 0.214893617, 0.033185841,
0.281923715, 0.267950963, 0, 0.189368771, 0.195238095, 0.057553957,
0.158567775, 0.191740413, 0, 0.297008547, 0.234567901, 0.036649215,
0.324528302, 0.316532258, 0), .Dim = 6:7, .Dimnames = list(c("DB00014",
"DB00035", "DB00050", "DB00091", "DB00104", "DB00122"), c("DB07768",
"DB07886", "DB07702", "DB07465", "DB08567", "DB07129", "DB08298"
)))
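【Discussion】:
-
Perhaps the easiest approach is to reshape both matrices to long format (see reshape2::melt), so that you have three columns: drug1, drug2, and value. Then merge them (see merge).
-
So the merge will simply add the values from the drug1-drug2 relationship?
-
@Anurag: if you melt the two matrices, you will have two three-column data frames. You can then merge them by the first two columns (which will contain the drug names from the matrix rows and columns). For more specific advice, you should edit your question with a small, reproducible example.
-
I have just shared a small part of the data. Please check.
-
@user20650 - I want it in R. Thanks.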
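-
The melt-then-merge idea from the comments could be sketched roughly as follows. This is only a minimal illustration under a few assumptions: the matrices `m1` and `m2` are made-up toy data (not the asker's), the helper `to_long` and the column names `score1`, `score2`, and `fused` are hypothetical, and base R's `as.data.frame(as.table())` stands in for `reshape2::melt` so that no extra package is needed. The final averaging step is just one possible fusion rule, not a recommendation.

```r
# Hypothetical toy matrices (not the asker's data); drug names are placeholders.
m1 <- matrix(c(1.0, 0.8, 0.8, 1.0), nrow = 2,
             dimnames = list(c("A", "B"), c("A", "B")))
m2 <- matrix(c(0.3, 0.1, 0.2, 0.4), nrow = 2,
             dimnames = list(c("B", "C"), c("A", "B")))

# Reshape a named matrix to long format: one row per (drug1, drug2) pair.
# Base-R stand-in for reshape2::melt.
to_long <- function(m, value_name) {
  long <- as.data.frame(as.table(m), stringsAsFactors = FALSE)
  names(long) <- c("drug1", "drug2", value_name)
  long
}

long1 <- to_long(m1, "score1")
long2 <- to_long(m2, "score2")

# Full outer join on the drug pair; a pair that is missing from one
# matrix gets NA in that matrix's score column.
merged <- merge(long1, long2, by = c("drug1", "drug2"), all = TRUE)

# One possible fusion rule (an assumption): mean of the scores that exist.
merged$fused <- rowMeans(merged[, c("score1", "score2")], na.rm = TRUE)

print(merged)
```

Keeping both score columns side by side means nothing is discarded at the merge step; whether to average, take the maximum, or weight the two scores is a separate modelling decision that depends on what the scores measure.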