【Posted】: 2015-05-11 13:25:23
【Question】:
I want to count the number of characters in each string of the name column. My data frame sample looks like this:
date name expenditure type
23MAR2013 KOSH ENTRP 4000 COMPANY
23MAR2013 JOHN DOE 800 INDIVIDUAL
24MAR2013 S KHAN 300 INDIVIDUAL
24MAR2013 JASINT PVT LTD 8000 COMPANY
25MAR2013 KOSH ENTRPRISE 2000 COMPANY
25MAR2013 JOHN S DOE 220 INDIVIDUAL
25MAR2013 S KHAN 300 INDIVIDUAL
26MAR2013 S KHAN 300 INDIVIDUAL
Why does nchar give me a list of seemingly random numbers? str_length() from the stringr package does the same.
Length <- aggregate(nchar(sample$name), by=list(sample$name), FUN=nchar)
Output:
Group.1 x
1 JASINT PVT LTD 2
2 JOHN DOE 1
3 JOHN S DOE 2
4 KOSH ENTRP 2
5 KOSH ENTRPRISE 2
6 S KHAN 1, 1, 1
Expected output:
Group.1 x
1 JASINT PVT LTD 14
2 JOHN DOE 8
3 JOHN S DOE 10
4 KOSH ENTRP 10
5 KOSH ENTRPRISE 14
6 S KHAN 6
CSV of the table above:
"Date","name","expenditure","type"
"23MAR2013","KOSH ENTRP",4000,"COMPANY"
"23MAR2013 ","JOHN DOE",800,"INDIVIDUAL"
"24MAR2013","S KHAN",300,"INDIVIDUAL"
"24MAR2013","JASINT PVT LTD",8000,"COMPANY"
"25MAR2013","KOSH ENTRPRISE",2000,"COMPANY"
"25MAR2013","JOHN S DOE",220,"INDIVIDUAL"
"25MAR2013","S KHAN",300,"INDIVIDUAL"
"26MAR2013","S KHAN",300,"INDIVIDUAL"
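For reference, here is a minimal sketch of what seems to be going on, assuming the name column is a character vector (the data frame is rebuilt by hand from the CSV above, and name_length, digits and name_lengths are illustrative names only). The aggregate() call applies nchar() twice: once to the names and once more, through FUN, to the resulting counts, so it appears to report the number of digits in each count rather than the count itself.

```r
# Rebuild the example data from the CSV above (only the columns needed here).
sample <- data.frame(
  name = c("KOSH ENTRP", "JOHN DOE", "S KHAN", "JASINT PVT LTD",
           "KOSH ENTRPRISE", "JOHN S DOE", "S KHAN", "S KHAN"),
  expenditure = c(4000, 800, 300, 8000, 2000, 220, 300, 300),
  stringsAsFactors = FALSE  # keep name as character; nchar() needs that
)

# nchar() is vectorised: one count per row, spaces included.
sample$name_length <- nchar(sample$name)

# Applying nchar() to a count gives the number of digits in that count:
# "JASINT PVT LTD" has 14 characters, and "14" has 2 characters.
digits <- nchar(nchar("JASINT PVT LTD"))

# One row per distinct name: pair each name with its length, then deduplicate.
name_lengths <- unique(data.frame(name = sample$name,
                                  x = nchar(sample$name),
                                  stringsAsFactors = FALSE))
name_lengths <- name_lengths[order(name_lengths$name), ]
rownames(name_lengths) <- NULL
print(name_lengths)
```

If name was read in as a factor (for example by read.csv with stringsAsFactors = TRUE), wrapping it in as.character() before calling nchar() should give the same result.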
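【Comments】:
- Do you need spaces to be included in the count as well? There is some inconsistency in the character counts in the expected output. For example, in the first row the spaces are counted, but in the last row the 5 spaces are omitted (if that is not simply a typo).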
Tags: r dataframe string-length