【Question title】: Ordering a character matrix by numerical column
【Posted】: 2015-01-13 01:10:12
【Question】:

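I am working with a matrix, read from a csv, that contains both numbers and characters. This is a smaller matrix, but it is essentially what I am working with: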

[,1] [,2] [,3]         [,4]    [,5]    [,6]    [,7]    [,8]    [,9]
V2  "A"  "1"  "Sample X1"  "34712" "39390" "38858" "38574" "38660" 
V3  "A"  "2"  "Sample X2"  "35333" "39940" "40533" "39936" "40669" 
V4  "A"  "3"  "Sample X3"  "33612" "39601" "38658" "39220" "39465" 
V5  "A"  "4"  "Sample X4"  "34309" "39200" "38597" "39820" "40081" 
V6  "A"  "5"  "Sample X5"  "33637" "39404" "40497" "39388" "40033" 
V7  "A"  "6"  "Sample X6"  "35314" "39522" "40345" "38624" "40306" 
V8  "A"  "7"  "Sample X7"  "35548" "39000" "41408" "38310" "39849" 
V9  "A"  "8"  "Sample X8"  "33972" "39930" "39777" "39582" "39570" 
V10 "A"  "9"  "Sample X9"  "34808" "39857" "39252" "39248" "38465" 
V11 "A"  "10" "Sample X10" "34316" "39798" "39776" "39516" "38812" 
V12 "A"  "11" "Sample X11" "34476" "38581" "39672" "38997" "38794" 
V13 "A"  "12" "Sample X12" "36246" "38809" "37872" "38100" "36925" 
V14 "B"  "1"  "Sample X13" "33642" "40201" "40202" "39320" "40426" 
V15 "B"  "2"  "Sample X14" "33381" "40624" "40349" "41350" "40490" 
V16 "B"  "3"  "Sample X15" "34465" "42096" "41194" "40613" "40416" 
V17 "B"  "4"  "Sample X16" "33957" "41905" "42273" "40710" "40681" 
V18 "B"  "5"  "Sample X17" "33877" "42040" "42226" "40788" "41261" 
V19 "B"  "6"  "Sample X18" "33970" "41860" "41149" "41093" "40877" 
V20 "B"  "7"  "Sample X19" "34745" "42040" "40186" "40862" "41044" 
V21 "B"  "8"  "Sample X20" "34140" "41274" "39880" "40356" "40496" 
V22 "B"  "9"  "Sample X21" "33929" "40652" "41410" "40760" "40718" 
V23 "B"  "10" "Sample X22" "33684" "39220" "40478" "41500" "40094"
V24 "B"  "11" "Sample X23" "33141" "41446" "41121" "40726" "41020"
V25 "B"  "12" "Sample X24" "33405" "38481" "37716" "38562" "38218" 
V26 "C"  "1"  "Sample X25" "71560" "86402" "85614" "84273" "83264" 
V27 "C"  "2"  "Sample X26" "72144" "86266" "88082" "87672" "87356" 
V28 "C"  "3"  "Sample X27" "71946" "90201" "89156" "88386" "88006" 
V29 "C"  "4"  "Sample X28" "71758" "89108" "88225" "86006" "88654" 
V30 "C"  "5"  "Sample X29" "71144" "86558" "88614" "87028" "88809" 
V31 "C"  "6"  "Sample X30" "70504" "89230" "88869" "86653" "86356" 
V32 "C"  "7"  "Sample X31" "67874" "88405" "84878" "84914" "85425" 
V33 "C"  "8"  "Sample X32" "70273" "87865" "87529" "87945" "86172" 

I would like to order the matrix by the second column (ignoring the header), like this:

A 1 . . .
B 1
C 1
A 2
B 2
C 2
A 3
. 
.
.
A 12
B 12
C 12 . . .

I looked around and found that I could use order:

data <- data[order(data[,2]),]

But the result looks like this:

A 1 . . .
B 1
C 1
A 10
B 10
C 10
A 11
B 11
C 11
A 12
B 12
C 12
A 2
B 2
C 2
.
.
.
A 9
B 9
C 9 . . .

Is this because the matrix is a character matrix? How can I make only the second column numeric so that I can order by it?

Thanks

【Discussion】:

    Tags: r matrix character


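    【Solution 1】:

    Keeping your data in a matrix is a bad idea when you want mixed classes across columns (e.g. numeric and character), because a matrix can hold only one type and everything gets coerced to character. Instead, you should use a data frame.

    Ideally, read the data into a data frame with read.csv or read.table. Otherwise, use as.data.frame to coerce your matrix to a data frame.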
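As a rough illustration of the read.csv route, here is a minimal sketch. The inline `csv_text` is a hypothetical stand-in for the real file (a few rows taken from the question, with no header row assumed), and the column names V1, V2, ... are simply the defaults R assigns when `header = FALSE`.

```r
# Hypothetical CSV content standing in for the real file (assumed: no header row).
csv_text <- "A,1,Sample X1,34712
A,2,Sample X2,35333
A,10,Sample X10,34316
B,1,Sample X13,33642
B,2,Sample X14,33381
B,10,Sample X22,33684"

# header = FALSE because the data has no column names;
# stringsAsFactors = FALSE keeps text columns as character rather than factor.
d <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

# Each column keeps its own class, so the number column is already numeric.
col_classes <- sapply(d, class)

# Order by the numeric column first, then by the letter column to break ties.
sorted <- d[order(d$V2, d$V1), ]
print(sorted)
```

With a real file you would pass the file path instead of `text = csv_text`. Since R 4.0.0 the default is already `stringsAsFactors = FALSE`, but stating it explicitly does no harm on older versions.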

    Given a matrix m (data in your case):

    d <- as.data.frame(m, stringsAsFactors=FALSE)
    d[, 3] <- as.numeric(d[, 3]) # coerce the relevant column to numeric
    d[order(d[, 3]), ]
    

    Note that you can order the matrix directly with m[order(as.numeric(m[, 3])), ] if you prefer, but the resulting columns will all still be character.
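To make the matrix-only route concrete, here is a small sketch on a hypothetical three-column matrix built from a few of the question's rows. In this toy matrix the numbers sit in column 2, so the index differs from the 3 used above; adjust it to wherever the number column is in your own data.

```r
# Toy character matrix (assumed layout: letter, number, sample name).
m <- matrix(
  c("A", "1",  "Sample X1",
    "A", "2",  "Sample X2",
    "A", "10", "Sample X10",
    "B", "1",  "Sample X13",
    "B", "2",  "Sample X14",
    "B", "10", "Sample X22"),
  ncol = 3, byrow = TRUE
)

# Ordering on the raw column compares strings, so "10" lands before "2".
by_char <- m[order(m[, 2]), ]

# Converting only the sort key to numeric gives the intended order;
# the matrix itself is not modified by the conversion.
by_num <- m[order(as.numeric(m[, 2])), ]

print(by_num)
print(is.character(by_num))  # the result is still a character matrix
```

Only the key passed to order is converted, which is why every column of the result stays character.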

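    Note: the ordering behaviour you witnessed is explained by the fact that character vectors are sorted as text, character by character, so anything beginning with 1 (e.g. 10) comes before 2.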
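The difference between text and numeric comparison can be seen on a plain vector. This is only an illustrative sketch; the vector `x` is made up for the purpose.

```r
# The same values stored as text.
x <- c("1", "2", "9", "10", "11", "12")

# Sorted as text: compared character by character, so "10" precedes "2".
as_text <- sort(x)

# Sorted as numbers: compared by value.
as_number <- sort(as.numeric(x))

# The underlying comparisons behave differently as well.
text_compare <- "10" < "2"  # string comparison
num_compare <- 10 < 2       # numeric comparison

print(as_text)
print(as_number)
```

Here `as_text` comes out as 1, 10, 11, 12, 2, 9, which is the pattern seen in the question, while `as_number` is 1, 2, 9, 10, 11, 12.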

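    【Comments】:

    • Thanks for the tip about data.frame; I was confused about how to hold data of multiple classes. Is there any way to change how it sorts, so that it goes 1, 2, 3, 4, ..., 10, 11, 12? Or is the best way to cut those rows out and put them at the end?
    • @IlyaLederman Not sure what you mean. The code I provided (using d), or data[order(as.numeric(data[, 3])), ], should both order the rows the way you want.
    • My order is still 1,10,11,12,2,3,4,5,9,7,8,9. I ran lapply(data, class) and it says everything is a factor. I don't really understand what a factor is. I read data in from the csv.
    • For factor data you need to use either d <- as.data.frame(data, stringsAsFactors=FALSE); d[order(d[, 3]), ] or data[order(as.numeric(as.character(data[, 3]))), ]
    • With as.data.frame it still sorted as character, but using as.numeric together with as.character made it work. Thanks!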
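The factor issue raised in the comments can be sketched as follows. A factor stores integer codes that point into a sorted set of levels, so as.numeric applied directly returns those codes rather than the numbers you see printed. The vector `f` below is a made-up example, not the asker's data.

```r
# A factor built from numeric-looking text, as older read.csv defaults would produce.
f <- factor(c("1", "10", "2", "12"))

# Levels are sorted as text: "1", "10", "12", "2".
lev <- levels(f)

# as.numeric on a factor returns the internal level codes, not the values.
codes <- as.numeric(f)

# Going through as.character first recovers the actual numbers.
values <- as.numeric(as.character(f))

# Ordering by the codes reproduces the text ordering; ordering by values fixes it.
by_codes <- as.character(f[order(codes)])
by_values <- as.character(f[order(values)])

print(by_codes)
print(by_values)
```

This is why data[order(as.numeric(as.character(data[, 3]))), ] worked for the asker while ordering the factor column directly did not.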