【Question title】: Correlation coefficient for a scatter plot
【Posted】: 2022-01-03 21:05:00
【Question】:

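I have plotted a scatter plot for each country, and I am trying to add a correlation coefficient below the scatter plot, but I keep getting an error saying "Selections can't have missing values", even when I use na.rm.

Can anyone help me? Any help is appreciated.

Data link: EuropeIndia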

#
# This is a Shiny web application. You can run the application by clicking
# the 'Run App' button above.
#
# Find out more about building applications with Shiny here:
#
#    http://shiny.rstudio.com/
#

library(shiny)
library(plotly)
library(DT)
library(tidyverse)
library(car)
library(ggpubr)
covid <- read.csv("EuropeIndia.csv")


title <- tags$a(href='https://ourworldindata.org/covid-vaccinations?country=OWID_WRL',
                'COVID 19 Vaccinations')

# Define UI for application 
ui <- fluidPage(
  headerPanel(title = title),
  
  # Application title
  titlePanel("COVID vaccinations: Deaths Vs All variables"),
  
  # Sidebar with a slider input for number of bins 
  sidebarLayout(
    sidebarPanel(
      selectInput("location", "1. Select a country",
                  choices = covid$location, selectize = TRUE, multiple = FALSE),
      br(),
      helpText("2. Select variables for scatterplot"),
      selectInput(inputId = "y", label = "Y-axis:",
                  choices = c("total_deaths", "new_deaths"), 
                  selected = "Deaths",),
      br(),
      selectInput(inputId = "x", label = "X-axis:",
                  choices = names(subset(covid,select = -c(total_deaths,new_deaths,
                                                           iso_code, continent,date,location), na.rm =TRUE)),
                  selectize = TRUE,
                  selected = "Comparator variables")
    ),
    mainPanel(
      textOutput("location"),
      #plotOutput("Scatterplot"),
      tabsetPanel(
        type = "tabs",
        tabPanel("Scatterplot", plotlyOutput("scatterplot"),
                 verbatimTextOutput("correlation"),
                 verbatimTextOutput("interpretation")),
        tabPanel("Summary of COVID data", verbatimTextOutput("summary")),
        tabPanel("Dataset", DTOutput("dataset")))
    )
  )
)

# Define server logic 
server <- function(input, output) {
  output$location <- renderPrint({locationfilter <- subset(covid, covid$location == input$location)})
  output$summary <- renderPrint({summary(covid)})
  output$dataset <- renderDT(
    covid, options = list(
      pageLength = 50,
      initComplete = JS('function(setting, json) { alert("done"); }')
    )
  )
  
  output$scatterplot <- renderPlotly({
    ggplotly(
      ggplot(subset(covid, covid$location == input$location),
             aes(y = .data[[input$y]], x = .data[[input$x]],col = factor(stringency_index)))+
        geom_smooth()+geom_point()+labs(col ="Stringency Index") 
                )
  })
  
  output$correlation <- renderText({
    x= subset(covid, covid$location == input$location) %>% dplyr::select(as.numeric(!!!input$x, na.rm =TRUE))
    y= subset(covid, covid$location == input$location) %>% dplyr::select(as.numeric(!!!input$y, na.rm = TRUE))
    var(x,y, na.rm = T, use)
    cor(x,y, method = 'pearson', na.rm =T)
    })
}


# Run the application 
shinyApp(ui = ui, server = server)

【Comments】:

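  • The value you pass to selected should be one of the choices or NULL. That is not the case in selectInput(inputId = "y", label = "Y-axis:", choices = c("total_deaths", "new_deaths"), selected = "Deaths",). You also have an extra comma after selected. Finally, you should use req() in output$scatterplot.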
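
The comment's two suggestions can be sketched as follows. This is a minimal illustration, not the asker's app: the output id `picked` and the variable `y_choices` are hypothetical names introduced here. It assumes only that `selected` should name one of `choices` and that `req()` halts a render quietly until its inputs have a value.

```r
library(shiny)

# Hypothetical stand-in for the Y-axis choices in the question
y_choices <- c("total_deaths", "new_deaths")

ui <- fluidPage(
  # `selected` is taken from `choices`, and there is no trailing comma
  selectInput("y", "Y-axis:", choices = y_choices, selected = y_choices[1]),
  verbatimTextOutput("picked")
)

server <- function(input, output) {
  output$picked <- renderText({
    req(input$y)  # stop quietly until input$y holds a usable value
    paste("Selected:", input$y)
  })
}

# shinyApp(ui = ui, server = server)  # uncomment to run interactively
```

The same `req(input$location, input$x, input$y)` call could be placed at the top of the scatterplot and correlation renderers so they do not run before the inputs are set.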

Tags: r shiny correlation scatter-plot coefficients


【Answer 1】:

First, select a country from the selection list.

To check for errors, I suggest you use the following code.

library(shiny)
library(plotly)
library(DT)
library(tidyverse)
library(car)
library(ggpubr)
covid <- read.csv("EuropeIndia.csv")


title <- tags$a(href='https://ourworldindata.org/covid-vaccinations?country=OWID_WRL',
            'COVID 19 Vaccinations')

# Define UI for application 
ui <- fluidPage(
headerPanel(title = title),

# Application title
titlePanel("COVID vaccinations: Deaths Vs All variables"),

# Sidebar with a slider input for number of bins 
sidebarLayout(
sidebarPanel(
  selectInput("location", "1. Select a country",
              choices = covid$location[1], selectize = TRUE, multiple = FALSE),
  br(),
  helpText("2. Select variables for scatterplot"),
  selectInput(inputId = "y", label = "Y-axis:",
              choices = c("total_deaths", "new_deaths"),
              selected = "total_deaths"),  # `selected` must be one of `choices`
  br(),
  selectInput(inputId = "x", label = "X-axis:",
              choices = names(subset(covid, select = -c(total_deaths, new_deaths,
                                                        iso_code, continent, date, location))),
              selectize = TRUE)  # no `selected`: defaults to the first choice
),
mainPanel(
  textOutput("location"),
  #plotOutput("Scatterplot"),
  tabsetPanel(
    type = "tabs",
    tabPanel("Scatterplot", plotlyOutput("scatterplot"),
             verbatimTextOutput("correlation"),
             verbatimTextOutput("interpretation")),
    tabPanel("Summary of COVID data", verbatimTextOutput("summary")),
    tabPanel("Dataset", DTOutput("dataset")))
)
)
)

# Define server logic 
server <- function(input, output) {
output$location <- renderPrint({locationfilter <- subset(covid, covid$location == input$location)})
output$summary <- renderPrint({summary(covid)})
output$dataset <- renderDT(
covid, options = list(
  pageLength = 50,
  initComplete = JS('function(setting, json) { alert("done"); }')
 ) 
)

output$scatterplot <- renderPlotly({
ggplotly(
  ggplot(subset(covid, covid$location == input$location),
         aes(y = .data[[input$y]], x = .data[[input$x]],col = factor(stringency_index)))+
    geom_smooth()+geom_point()+labs(col ="Stringency Index") 
  )
})

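output$correlation <- renderText({
# Pull the two selected columns for the chosen country as plain vectors
x <- covid[covid$location == input$location, input$x]
y <- covid[covid$location == input$location, input$y]
xy <- data.frame(x, y)
# Keep only rows where both x and y are present
xy <- xy[complete.cases(xy), ]
var(xy)  # computed but not shown: only the last expression is rendered
cor(xy, method = 'pearson')  # 2 x 2 correlation matrix of x and y
})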
}


# Run the application 
shinyApp(ui = ui, server = server)
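
The code above renders the whole correlation matrix of `x` and `y`. If a single coefficient is wanted, one possible variation is to move the calculation into a small helper. The sketch below is an assumption-laden illustration: `country_cor` and the `toy` data frame are hypothetical and are not part of the original answer or the EuropeIndia data.

```r
# Hypothetical helper: Pearson correlation of two columns for one country,
# using only rows where both values are present.
country_cor <- function(data, country, x_var, y_var) {
  rows <- data$location == country
  xy <- data.frame(x = data[rows, x_var], y = data[rows, y_var])
  xy <- xy[complete.cases(xy), ]
  # Too few complete pairs for a meaningful coefficient
  if (nrow(xy) < 3) return(NA_real_)
  cor(xy$x, xy$y, method = "pearson")
}

# Made-up data with the same kind of gaps as the real file
toy <- data.frame(
  location   = c("A", "A", "A", "A", "B", "B", "B"),
  new_cases  = c(1, 2, NA, 4, 5, 3, 1),
  new_deaths = c(2, 4, 6, 8, 1, NA, 5)
)

r_a <- country_cor(toy, "A", "new_cases", "new_deaths")  # 3 complete pairs
r_b <- country_cor(toy, "B", "new_cases", "new_deaths")  # only 2 complete pairs
```

Inside the app, the body of `renderText()` could then call `country_cor(covid, input$location, input$x, input$y)` and format the result with `sprintf()`. The threshold of three complete pairs is an arbitrary choice for this sketch.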

【Comments】:

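  • It worked. It gives me the correlation for each country, though it is printed twice. I did not realise I had to build a data frame first, or that complete.cases, rather than na.rm, is the right way to handle missing values before computing a correlation. Thank you so much for your help.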
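
To illustrate the point about na.rm: `cor()` in base R has no `na.rm` argument; missing values are controlled through `use`. The sketch below, on made-up vectors, compares filtering with `complete.cases()` by hand against `use = "complete.obs"`.

```r
# Made-up vectors with missing values in different positions
x <- c(1, 2, NA, 4, 5)
y <- c(2, 1, 3, NA, 6)

# Default use = "everything": any NA makes the result NA
r_default <- cor(x, y)

# cor() does not accept na.rm, so this call raises an error
r_na_rm <- tryCatch(cor(x, y, na.rm = TRUE), error = function(e) e)

# Option 1: drop incomplete pairs by hand, as in the answer
keep <- complete.cases(x, y)
r_manual <- cor(x[keep], y[keep])

# Option 2: let cor() drop incomplete pairs itself
r_builtin <- cor(x, y, use = "complete.obs")
```

Both options use the same complete pairs, so they give the same coefficient; which one to use is mostly a matter of whether the filtered data is needed elsewhere.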