【Question title】:How do I display a correlation coefficient in a scatterplot?
【Posted】:2021-12-27 11:10:00
【Question】:

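In a scatterplot, I would like to display the correlation coefficient together with the equation that describes the relationship between x and y. I have created my data, and this is my code so far: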

library(tidyverse)

# Creation of datamaterial

salary <- c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260)
height <- c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
fakenumbers <- data.frame(salary, height)

cor(height, salary, method = c("pearson"))

# Creation of scatterplot

r <- ggplot(fakenumbers, aes(x = height, y = salary)) + 
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") + 
  labs(y = "Hourly salary 
       (sek)", x = "height (cm)", title = "Relationship between height and salary (made up data)") + 
  theme_classic() + theme(plot.title = element_text(hjust = 0.5, size = 18), 
                          axis.title = element_text(size = 15), 
                          axis.title.y = element_text(angle = 0, vjust = 0.5), 
                          axis.text = element_text(size = 11))

# Adding a regressionline

r + geom_smooth(method = lm, formula = y ~ x, se = FALSE)

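Inside the coordinate system, next to the regression line, I would like to display "r = 0.588" and an equation describing the linear relationship. How can I do this, preferably with ggplot() or some other function?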

【Comments】:

    Tags: r ggplot2 linear-regression scatter-plot pearson-correlation


    【Solution 1】:

    We can do this with the ggpubr package by adding stat_cor(p.accuracy = 0.001, r.accuracy = 0.01) to your code:

    library(ggpubr)
    library(tidyverse)
    
    r <- ggplot(fakenumbers, aes(x = height, y = salary)) + 
      geom_point(size = 3, shape = 21, color = "black", fill = "blue") + 
      stat_cor(p.accuracy = 0.001, r.accuracy = 0.01)+
      labs(y = "Hourly salary 
           (sek)", x = "height (cm)", title = "Relationship between height and salary (made up data)") + 
      theme_classic() + theme(plot.title = element_text(hjust = 0.5, size = 18), 
                              axis.title = element_text(size = 15), 
                              axis.title.y = element_text(angle = 0, vjust = 0.5), 
                              axis.text = element_text(size = 11))
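
The question also asks for the regression equation. As a hedged sketch, ggpubr's stat_regline_equation() can be stacked under stat_cor(); the label.x and label.y coordinates below are assumptions picked by eye for this data set, not values from the answer, and the theme settings are omitted for brevity.

```r
library(ggplot2)
library(ggpubr)

# Same made-up data as in the question
fakenumbers <- data.frame(
  salary = c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260),
  height = c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
)

p <- ggplot(fakenumbers, aes(x = height, y = salary)) +
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Correlation coefficient (with p-value) near the top-left corner
  stat_cor(p.accuracy = 0.001, r.accuracy = 0.01,
           label.x = 150, label.y = 250) +
  # Fitted equation one line below; coordinates are illustrative assumptions
  stat_regline_equation(formula = y ~ x, label.x = 150, label.y = 235)

p
```

Both stats place their label in the top-left corner by default, so giving them different label.y values keeps the two lines from overlapping.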
    

    【Comments】:

      【Solution 2】:

      Here is a base R way. Define a formula fo, fit the regression, and build an equation string eq.

      # height and salary live inside fakenumbers (see Data below)
      corr <- with(fakenumbers, cor(height, salary, method = "pearson"))
      
      fo <- salary ~ height
      fit <- lm(fo, fakenumbers)
      (eq <- paste0(all.vars(fo)[1], ' ~ ', paste0(round(coef(fit), 2),
                    gsub('\\*\\(Intercept\\)', '', 
                         paste0('*', names(coef(fit)))), collapse=' + ')))
      # [1] "salary ~ -281.58 + 2.49*height"
      

      Then use these variables in plot(), abline(), and text().

      plot(fo, fakenumbers, pch=20, col=4,
           xlab='height (cm)', ylab='Hourly salary (sek)',
           main='Relationship between height and salary (made up data)')
      abline(fit, col=4)
      text(149, 250, bquote(italic('r=')~.(round(corr, 3))), adj=0, cex=.8)
      text(149, 235, eq, adj=0, cex=.8)
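
As a variation on the code above, the two labels could be built with sprintf() and placed with legend() instead of hard-coded text() coordinates. This is only a sketch: the names b, slope_sign, eq_label and r_label are illustrative and not part of the original answer.

```r
# Same made-up data as in the question
fakenumbers <- data.frame(
  salary = c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260),
  height = c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
)

fit  <- lm(salary ~ height, data = fakenumbers)
corr <- with(fakenumbers, cor(height, salary))

# Intercept and slope as a plain numeric vector
b <- unname(coef(fit))

# Write the slope's sign explicitly so a negative slope prints as "- 2.49"
slope_sign <- if (b[2] < 0) "-" else "+"
eq_label <- sprintf("salary = %.2f %s %.2f * height", b[1], slope_sign, abs(b[2]))
r_label  <- sprintf("r = %.3f", corr)

plot(salary ~ height, data = fakenumbers, pch = 20, col = 4,
     xlab = "height (cm)", ylab = "Hourly salary (sek)",
     main = "Relationship between height and salary (made up data)")
abline(fit, col = 4)
# A legend without a box puts the text in a corner, no manual coordinates needed
legend("topleft", legend = c(r_label, eq_label), bty = "n", cex = 0.8)
```

Using legend("topleft", ...) means the labels stay inside the plot region even if the data range changes, which fixed coordinates such as text(149, 250, ...) do not guarantee.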
      


      Data:

      fakenumbers <- structure(list(salary = c(95, 100, 105, 110, 120, 124, 135, 150, 
      165, 175, 225, 230, 235, 260), height = c(160, 150, 182, 165, 
      172, 175, 183, 187, 174, 193, 201, 172, 180, 188)), class = "data.frame", row.names = c(NA, 
      -14L))
      

      【Comments】:

        【Solution 3】:

        Another way:

        corr <- round(cor(height, salary, method = "pearson"), 4)
        

        Then use geom_text to display the correlation coefficient:

        r +
          geom_smooth(method = lm, formula = y ~ x, se = FALSE) +
          geom_text(x = 152, y = 250,
                    label = paste0('r = ', corr),
                    color = 'red')
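
One caveat with the geom_text() call above: because the layer inherits the plot data, the same label is drawn once per row (14 times here), which can make the text look heavy. A possible alternative, sketched here as a suggestion rather than as part of the original answer, is annotate(), which draws the label a single time. The coordinates are the same illustrative ones used above.

```r
library(ggplot2)

# Same made-up data as in the question
fakenumbers <- data.frame(
  salary = c(95, 100, 105, 110, 120, 124, 135, 150, 165, 175, 225, 230, 235, 260),
  height = c(160, 150, 182, 165, 172, 175, 183, 187, 174, 193, 201, 172, 180, 188)
)

corr    <- with(fakenumbers, cor(height, salary, method = "pearson"))
r_label <- sprintf("r = %.3f", corr)

p <- ggplot(fakenumbers, aes(x = height, y = salary)) +
  geom_point(size = 3, shape = 21, color = "black", fill = "blue") +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # annotate() adds one text label that is independent of the plot data;
  # hjust = 0 left-aligns the label at x = 152
  annotate("text", x = 152, y = 250, label = r_label,
           hjust = 0, color = "red")

p
```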
        

        【Comments】:
