如何在每个数据类别中绘制多个多边形？答案

【问题标题】：How to plot numerous polygons in each data category?如何在每个数据类别中绘制多个多边形？
【发布时间】：2017-10-30 05:34:58
【问题描述】：

我正在与 ggplot 合作，使用一组单独的工具绘制组中的双变量数据以及这些数据的标准椭圆。这些返回定义每个椭圆的 n=100 x,y 坐标，然后对于每个组，我想绘制大约 10-25 个椭圆。

从概念上讲，如何实现？我可以使用 geom_polygon 轻松绘制单个椭圆，但我很困惑如何组织数据以使其工作，以便绘制多个椭圆并为每个组应用指南（颜色、填充、线型等）。

在传统的 R 绘图中，我可以使用 for 循环继续添加行。

谢谢！

更新：这是一个包含单个椭圆的 100 个坐标的 CSV。

Data

假设我有三组已应用椭圆拟合的二元数据：绿色、红色、蓝色。对于每个组，我想绘制几个椭圆。

我不知道如何以 ggplot 首选的长格式工作并保留组从属关系的方式组织数据。列表有用吗？

更新 2：

这是一个包含原始 x 和 y 数据的 csv，分为两组：河流和湖泊

Data

数据图如下：

test.data <- read.csv("ellipse_test_data.csv")
ggplot(test.data) +
  geom_point(aes(x, y, color = group)) +
  theme_classic()

我正在使用一个名为 SIBER 的包，它将贝叶斯椭圆拟合到数据中，以便按椭圆区域等比较组。以下输出创建一个列表，其中元素数 = 数据组数，每个元素每个拟合椭圆包含 6 xn（n = 绘制次数） - 前四列是向量格式的协方差矩阵 Sigma，后两列是双变量均值：

# options for running jags
parms <- list()
parms$n.iter <- 2 * 10^5   # number of iterations to run the model for
parms$n.burnin <- 1 * 10^3 # discard the first set of values
parms$n.thin <- 100     # thin the posterior by this many
parms$n.chains <- 2        # run this many chains

# define the priors
priors <- list()
priors$R <- 1 * diag(2)
priors$k <- 2
priors$tau.mu <- 1.0E-3

# fit the ellipses which uses an Inverse Wishart prior
# on the covariance matrix Sigma, and a vague normal prior on the 
# means. Fitting is via the JAGS method.
ellipses.test <- siberMVN(siber.test, parms, priors)

列表中第一个元素的前几行：

$`1.river`
     Sigma2[1,1]   Sigma2[2,1]   Sigma2[1,2] Sigma2[2,2]     mu[1]    mu[2]
[1,]   1.2882740  2.407070e-01  2.407070e-01    1.922637 -15.52846 12.14774
[2,]   1.0677979 -3.997169e-02 -3.997169e-02    2.448872 -15.49182 12.37709
[3,]   1.1440816  7.257331e-01  7.257331e-01    4.040416 -15.30151 12.14947

我希望能够提取这些椭圆的随机数，并使用 ggplot 使用 alpha 透明度绘制它们。

SIBER 包有一个函数 (addEllipse) 可以将“6 x n”条目转换为一组定义椭圆的 x 和 y 点，但我不知道如何为 ggplot 组织该输出。我认为可能有一种优雅的方式可以在内部使用 ggplot。

理想的输出应该是这样的，但在 ggplot 中，椭圆可以匹配数据级别的美学：

【问题讨论】：

如果您可以包含数据样本，则更容易重现该示例。见How to make a great R reproducible example
嗨，对不起。我会，但数据示例会非常大，有几千行。
您不需要包含所有数据，只需一个小样本，以便您可以证明您的问题
好的，我包含了一个由 100 个坐标定义的示例多边形。谢谢！
通过提供的数据，我能够使用geom_polygon() 绘制一个椭圆，但这很容易，并且不是您已经提到的问题的核心。有趣的部分是您的二元数据看起来如何以及它们与一个或多个椭圆的关系如何。这些数据的样本会很有帮助。

标签： r ggplot2 ellipse

【解决方案1】：

在来自 SIBER 的捆绑演示数据集上执行此操作的一些代码。

在此示例中，我们尝试使用 ggplot2 创建后椭圆多个样本的图。

library(SIBER)
library(ggplot2)
library(dplyr)
library(ellipse)

将基本 SIBER 模型拟合到与包捆绑在一起的示例数据。

# load in the included demonstration dataset
data("demo.siber.data")
#
# create the siber object
siber.example <- createSiberObject(demo.siber.data)

# Calculate summary statistics for each group: TA, SEA and SEAc
group.ML <- groupMetricsML(siber.example)

# options for running jags
parms <- list()
parms$n.iter <- 2 * 10^4   # number of iterations to run the model for
parms$n.burnin <- 1 * 10^3 # discard the first set of values
parms$n.thin <- 10     # thin the posterior by this many
parms$n.chains <- 2        # run this many chains

# define the priors
priors <- list()
priors$R <- 1 * diag(2)
priors$k <- 2
priors$tau.mu <- 1.0E-3

# fit the ellipses which uses an Inverse Wishart prior
# on the covariance matrix Sigma, and a vague normal prior on the 
# means. Fitting is via the JAGS method.
ellipses.posterior <- siberMVN(siber.example, parms, priors)

# The posterior estimates of the ellipses for each group can be used to
# calculate the SEA.B for each group.
SEA.B <- siberEllipses(ellipses.posterior)

siberDensityPlot(SEA.B, xticklabels = colnames(group.ML), 
                xlab = c("Community | Group"),
                ylab = expression("Standard Ellipse Area " ('\u2030' ^2) ),
                bty = "L",
                las = 1,
                main = "SIBER ellipses on each group"
                )

现在我们想从这些分布中创建一些示例椭圆的图。我们需要为每个组创建一个包含所有椭圆的 data.frame 对象。在这个例子中，我们假设它们彼此独立，我们只取前 10 个后验图，但如果您愿意，您可以随机抽取一个样本。

# how many of the posterior draws do you want?
n.posts <- 10

# decide how big an ellipse you want to draw
p.ell <- 0.95

# for a standard ellipse use
# p.ell <- pchisq(1,2)




# a list to store the results
all_ellipses <- list()

# loop over groups
for (i in 1:length(ellipses.posterior)){

  # a dummy variable to build in the loop
  ell <- NULL
  post.id <- NULL

  for ( j in 1:n.posts){

    # covariance matrix
    Sigma  <- matrix(ellipses.posterior[[i]][j,1:4], 2, 2)

    # mean
    mu     <- ellipses.posterior[[i]][j,5:6]

    # ellipse points

    out <- ellipse::ellipse(Sigma, centre = mu , level = p.ell)


    ell <- rbind(ell, out)
    post.id <- c(post.id, rep(j, nrow(out)))

  }
  ell <- as.data.frame(ell)
  ell$rep <- post.id
  all_ellipses[[i]] <- ell
}

ellipse_df <- bind_rows(all_ellipses, .id = "id")


# now we need the group and community names

# extract them from the ellipses.posterior list
group_comm_names <- names(ellipses.posterior)[as.numeric(ellipse_df$id)]

# split them and conver to a matrix, NB byrow = T
split_group_comm <- matrix(unlist(strsplit(group_comm_names, "[.]")),
                           nrow(ellipse_df), 2, byrow = TRUE)

ellipse_df$community <- split_group_comm[,1]
ellipse_df$group     <- split_group_comm[,2]

ellipse_df <- dplyr::rename(ellipse_df, iso1 = x, iso2 = y)

现在创建图。首先根据需要绘制所有原始数据。

first.plot <- ggplot(data = demo.siber.data, aes(iso1, iso2)) +
  geom_point(aes(color = factor(group):factor(community)), size = 2)+
  ylab(expression(paste(delta^{15}, "N (\u2030)")))+
  xlab(expression(paste(delta^{13}, "C (\u2030)"))) + 
  theme(text = element_text(size=15))
print(first.plot)

现在我们可以尝试按组在顶部和小平面上添加后椭圆

second.plot <- first.plot + facet_wrap(~factor(group):factor(community))
print(second.plot)

# rename columns of ellipse_df to match the aesthetics

third.plot <- second.plot + 
  geom_polygon(data = ellipse_df,
              mapping = aes(iso1, iso2,
                             group = rep,
                             color = factor(group):factor(community),
                             fill = NULL),
               fill = NA,
               alpha = 0.2)
print(third.plot)

Facet-wrapped plot of sample of posterior ellipses by group

【讨论】：