【Posted】: 2016-10-30 16:26:36
【Question】:
I am extracting comment data from a web page's source and building a tree with:
`tmpTree <- FromListExplicit(postData[[1]], nameName = "poster", childrenName = "child")`
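where postData is a list of nodes extracted with xml_find_all, and postData[[1]] changes each time I create a new tree from the list. The function that extracts the nodes can be found in the SO question I posted in August, which was thankfully answered by jdharrison, the creator of RSelenium himself.
What I would like to know is whether I can create a tree along these lines: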
newTree <- Node$new("Tree67770")
newTree$AddChild(tmpTree)
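so that I end up with one tree made up of three other trees, which then become nodes in the final tree, and when I plot the big tree I can see all the names (the posters').
The approach above does not work. The error cannot coerce type 'environment' to vector of type 'character' is understandable, since each tmpTree is not a character but a list. I thought about converting each tree to a data.frame and then adding all the data.frames back together to build one big tree, but that seems too long and too cumbersome to me. Any help would be much appreciated. Thanks.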
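To illustrate the distinction that seems to be behind the error, here is a minimal sketch. It assumes (based on my reading of the data.tree documentation) that Node$new() and AddChild() expect a character name, while AddChildNode() accepts an existing Node object. The small tmpTree built by hand here is a hypothetical stand-in for one produced by FromListExplicit().

```r
library(data.tree)

# Hypothetical stand-in for one tmpTree returned by FromListExplicit().
tmpTree <- Node$new("MMM")
tmpTree$AddChild("David Farrugia")   # AddChild() creates a child from a name

# The container tree that should hold the individual comment trees.
newTree <- Node$new("Tree67770")

# AddChildNode() attaches an existing Node (with its whole subtree) as a child.
newTree$AddChildNode(tmpTree)

print(newTree)
```

If this assumption holds, the subtree is attached by reference, so tmpTree stops being a root and its parent becomes newTree.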
Edited to add dput examples:
Example 1:
structure(list(postId = 2794984430, date = "Thursday, July 21, 2016 11:17 AM",
poster = "MMM", disqusUname = "disqus_rVXuxnq9MP", message = "\rI am against abortion but I am in favour of contraceptives. Is the MAP a (emergency) contraceptive or not? Is the MAP abortive or not? Unless there is clear unequivocal evidence about this, the circus will continue!\r",
child = list(structure(list(postId = 2795948275, date = "Thursday, July 21, 2016 9:07 PM",
poster = "David Farrugia", disqusUname = "davidfarrugia",
message = "\rIt all depends when the soul has been installed into the egg. LOL\r"), .Names = c("postId",
"date", "poster", "disqusUname", "message")))), .Names = c("postId", "date", "poster", "disqusUname", "message", "child"))
Example 2:
structure(list(postId = 2795142611, date = "Thursday, July 21, 2016 2:04 PM",
poster = "David", disqusUname = "disqus_tTjwlqxma8", message = "\rthis reminds me of the Divorce debate. the dinosaurs from church and the parliament seem to be against anything 'god' does not allow. can they accept the fact that not all of us are into religious fairy tales?\r",
child = list(structure(list(postId = 2796284665, date = "Friday, July 22, 2016 12:30 AM",
poster = "Nessy Testa", disqusUname = "NICOTI", message = "\rno they want to shove their \"morals\" down our throats.. then they go to repent their sins..\r"), .Names = c("postId",
"date", "poster", "disqusUname", "message")))), .Names = c("postId", "date", "poster", "disqusUname", "message", "child"))
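Each of the examples above produces a tree with a root node and one child node. I picked them for their small size, since the other examples are 6 or more levels deep.
I extract the tree with tmpTree <- FromListExplicit(postData[[Example 1 or 2]], nameName = "poster", childrenName = "child") and then try to turn it into a new node with: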
newTree <- Node$new("root6770")
newNode <- Node$new(tmpTree)
newTree$AddChildNode(newNode)
Error in as.vector(x, "character") :
cannot coerce type 'environment' to vector of type 'character'
The error is raised as soon as newNode <- Node$new(tmpTree) is executed.
I hope this example explains my problem better. Thanks for your help.
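For reference, a possible sketch of combining the two examples under one root. The postData list below is an assumption: it holds trimmed copies of Example 1 and Example 2 (date, disqusUname and message are dropped for brevity), and the loop passes each tree returned by FromListExplicit() straight to AddChildNode() instead of wrapping it in Node$new(). I have not tried this on the deeper trees.

```r
library(data.tree)

# Trimmed copies of Example 1 and Example 2 (assumption: only the fields
# needed to show the structure are kept).
postData <- list(
  list(postId = 2794984430, poster = "MMM",
       child = list(list(postId = 2795948275, poster = "David Farrugia"))),
  list(postId = 2795142611, poster = "David",
       child = list(list(postId = 2796284665, poster = "Nessy Testa")))
)

newTree <- Node$new("root6770")

for (post in postData) {
  # Build one comment tree, named after the poster.
  tmpTree <- FromListExplicit(post, nameName = "poster", childrenName = "child")
  # Attach the existing Node directly; no Node$new(tmpTree) in between.
  newTree$AddChildNode(tmpTree)
}

# Show the combined tree together with the postId attribute of each node.
print(newTree, "postId")
```

Each top-level comment becomes a direct child of root6770 and keeps its own replies below it, so the posters' names should all appear when the combined tree is printed or plotted.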
【Comments】: