【Question title】: Using a for-loop to iterate the inputs to a function in R
【Posted】: 2019-09-26 18:12:16
【Question description】:

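I'm working with a function that uses an API to collect data. I'd like to loop dynamically through the different inputs assigned to the function in "Years". Specifically, I'm trying to write a for loop that iterates over each year and feeds it into the function.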

    Years = c("2019", "2018", "2017", "2016", "2015", "2014", "2013", "2012", "2011", "2010")

    for (Year in names(Years)){

        YearVar <- Year
        Month <- "08"
        Day <- "01"

        Date <- paste(YearVar, Month, Day, sep = "/")

        getHourlyLMP <- function(day = Date, locID = 4004, user = getOption(x = "ISO_NE_USER"), password = getOption(x = "PASSWORD"),
                                 out.tz = "America/New_York", ...){

            dd_Year <- format(as.Date(day), "%Y%m%d")
            json_Year <- get_path(path = paste0("/hourlylmp/da/final/day/", dd_Year, "/location/", locID), user = user, password = password, ...)

            dat_Year <- do.call(what = "rbind",
                                lapply(json_Year$HourlyLmps$HourlyLmp,
                                       FUN = function(x){
                                           dd_Year <- as.data.frame(x = x, stringsAsFactors = FALSE)
                                           locId <- dd_Year[1,"Location"]
                                           dd_Year <- dd_Year[2,]
                                           dd_Year$locId <- locId
                                           dd_Year
                                       } ))

            dat_Year$BeginDate <- lubridate::ymd_hms(dat_Year$BeginDate, tz = out.tz)

            rownames(dat_Year) <- 1:nrow(dat_Year)

            return(dat_Year)
        }
    }

When I run it, I get the following error: "Error: $ operator is invalid for atomic vectors"

Any idea what is causing the error? Thanks!

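【Comments】:

  • Where does the get_path function come from? Are you sure it returns an object with an element named $HourlyLmps? It's easier to help you if you include a simple reproducible example with sample input and desired output that can be used to test and verify possible solutions. Do you intend to redefine the function on every iteration of the loop? Where do you actually call getHourlyLMP? Also note that names(Years) returns NULL because Years is not a named vector, so that doesn't look right.
  • 'get_path' is a function that uses the 'httr' package to pull a json file. The json file does contain '$HourlyLmps'. The function worked fine before I added the for loop.
  • Years doesn't have names... only length
  • Thanks Carl! How do I enter names?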
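
To make the point from the comments concrete, here is a minimal sketch (the shortened `Years` vector and the `n_iter` and `dates` variables are illustrative assumptions, not part of the original post). It shows that looping over `names()` of an unnamed vector runs zero times, and two ways around it: iterate over the values directly, or give the vector names with `setNames()`.

```r
Years <- c("2019", "2018", "2017")

# An unnamed vector has no names, so names(Years) is NULL
is.null(names(Years))

# Looping over NULL never executes the body
n_iter <- 0
for (Year in names(Years)) {
  n_iter <- n_iter + 1
}

# Option 1: loop over the values themselves
dates <- character(0)
for (Year in Years) {
  dates <- c(dates, paste(Year, "08", "01", sep = "/"))
}

# Option 2: name the vector, after which names(named_years) works as intended
named_years <- setNames(Years, Years)
dates_from_names <- character(0)
for (Year in names(named_years)) {
  dates_from_names <- c(dates_from_names, paste(Year, "08", "01", sep = "/"))
}
```

Either way, the loop variable then holds an actual year string on each pass.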

Tags: r function for-loop


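【Solution 1】:

Without much information to go on, this is the approach I would take:

First, move the getHourlyLMP function outside the loop... you can then call it on each iteration and pass the variable in as an argument.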

# The function we want to run for each year.

getHourlyLMP <- function(day = Date, locID = 4004, user = getOption(x = "ISO_NE_USER"), password = getOption(x = "PASSWORD"), out.tz = "America/New_York", ...){

  dd_Year <- format(as.Date(day), "%Y%m%d")
  json_Year <- get_path(path = paste0("/hourlylmp/da/final/day/", dd_Year, "/location/", locID), user = user, password = password, ...)

  dat_Year <- do.call(what = "rbind", 
                      lapply(json_Year$HourlyLmps$HourlyLmp, 
                             FUN = function(x){
                               dd_Year <- as.data.frame(x = x, stringsAsFactors = FALSE)
                               locId <- dd_Year[1,"Location"]
                               dd_Year <- dd_Year[2,]
                               dd_Year$locId <- locId
                               dd_Year
                             } ))

  dat_Year$BeginDate <- lubridate::ymd_hms(dat_Year$BeginDate, tz = out.tz)

  rownames(dat_Year) <- 1:nrow(dat_Year)

  return(dat_Year)
}

# Build the Date object in an easier way using ?sprintf
Years = c("2019", "2018", "2017", "2016", "2015", "2014", "2013", "2012", "2011", "2010")  

Date <- sprintf("%s/08/01", Years)

# Date
# [1] "2019/08/01" "2018/08/01" "2017/08/01" "2016/08/01" "2015/08/01" "2014/08/01" "2013/08/01" "2012/08/01" "2011/08/01" "2010/08/01"

# Now just loop through the Date objects
lapply(Date, function(i){
  getHourlyLMP(day = i)
})
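
If an explicit for loop is preferred over `lapply`, the same pattern might look like the sketch below. Because the real API needs credentials, `fake_lmp` is a hypothetical stand-in for `getHourlyLMP` that only reformats the date; the shortened `Years` vector and the `results` list are likewise assumptions for illustration.

```r
# Hypothetical stand-in for getHourlyLMP(): no API call, just echoes the
# date in the YYYYMMDD form the real function builds for its request path.
fake_lmp <- function(day) {
  data.frame(day = format(as.Date(day), "%Y%m%d"), stringsAsFactors = FALSE)
}

Years <- c("2019", "2018", "2017")
Dates <- sprintf("%s/08/01", Years)

# Pre-allocate a named list, then fill one slot per iteration
results <- vector("list", length(Dates))
names(results) <- Years

for (i in seq_along(Dates)) {
  results[[i]] <- fake_lmp(day = Dates[i])
}
```

The function is defined once, outside the loop, and the loop only supplies a different `day` argument each time.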

After some research and registering for the API....

Gets the current Final DA Hourly LMP for the given location.

# Parameters
# name        description                                type  default
# day         The day to retrieve data for (YYYYMMDD)    path  FORMAT IS VITAL
# locationId  The location id                            path
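
Since the `day` path parameter has to be in YYYYMMDD form, one way to build the request URL for an arbitrary date is sketched here. `build_lmp_url` is a hypothetical helper name introduced only for illustration; the URL pattern is the one used in this answer.

```r
# Hypothetical helper: format any date as YYYYMMDD and drop it, together
# with the location id, into the endpoint path used in this answer.
build_lmp_url <- function(day, loc_id = 4004) {
  base <- "https://webservices.iso-ne.com/api/v1.1/hourlylmp/da/final/day/%s/location/%s"
  sprintf(base, format(as.Date(day), "%Y%m%d"), loc_id)
}

# sprintf() is vectorised, so several dates can be handled in one call
urls <- build_lmp_url(c("2019/08/01", "2018/08/01"))
```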

get_paths2 <- function(year = NULL, user = getOption("ISO_NE_USER"), pass = getOption("ISO_NE_PASS"), loc_id = 4004, out.tz = "America/New_York"){
  # Needs the httr, jsonlite, dplyr and lubridate packages installed
  library(jsonlite)  # rbind_pages()
  library(dplyr)     # %>% and mutate()
  # Pay attention to the path query, which needs to be YYYYMMDD (e.g. 20190927). Since you gave a static date, I hard-coded
  # the 0801, but change that for your needs moving forward
  base <- 'https://webservices.iso-ne.com/api/v1.1/hourlylmp/da/final/day/%s0801/location/%s'

  # Build our call url
  api_url <- sprintf(base, year, loc_id)

  # Call the API
  req <- httr::GET(api_url, httr::authenticate(user = user, password = pass, type = "basic"))
  # Confirm the request worked by checking for a 200 status code
  if(httr::status_code(req) == 200L){
    data <- httr::content(req)

    # Your data manipulation functions here:
    # I tested on just 2 items in the list, and it works fine...
    # > rbind_pages(lapply(a$HourlyLmps$HourlyLmp[1:2], function(x){
    #     as.data.frame(x)
    # }))
    #
    # Building a ?tryCatch in here...
    tryCatch({
      lapply(data$HourlyLmps$HourlyLmp, function(i){
        dd_Year <- as.data.frame(i, stringsAsFactors = FALSE)
        dd_Year %>% mutate(
          BeginDate = lubridate::ymd_hms(BeginDate, tz = out.tz, quiet = TRUE)
        )
      }) %>% rbind_pages()
    }, error = function(e){
      NA
    })
  } else {
    # Return NA on a failed request as well, so is.na() can flag it later
    NA
  }
}

Note: it turns out that the year 2010 returned no data......

out_test <- setNames(lapply(Years, function(i){
    get_paths2(i)
}), Years)




> which(is.na(out_test))
2010 
  10 

> Map(slice, out_test[which(!is.na(out_test))], 1)
$`2019`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2019-08-01            4004         LOAD ZONE .Z.CONNECTICUT     23.8           24.46                   0         -0.66

$`2018`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2018-08-01            4004         LOAD ZONE .Z.CONNECTICUT    24.54           24.59                   0         -0.05

$`2017`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2017-08-01            4004         LOAD ZONE .Z.CONNECTICUT    19.63           19.38                   0          0.25

$`2016`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2016-08-01            4004         LOAD ZONE .Z.CONNECTICUT    28.26           28.08                   0          0.18

$`2015`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2015-08-01            4004         LOAD ZONE .Z.CONNECTICUT    19.42           19.21                   0          0.21

$`2014`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2014-08-01            4004         LOAD ZONE .Z.CONNECTICUT    22.27           21.98                   0          0.29

$`2013`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2013-08-01            4004         LOAD ZONE .Z.CONNECTICUT    27.11           26.73                   0          0.38

$`2012`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2012-08-01            4004         LOAD ZONE .Z.CONNECTICUT    26.59           26.45                   0          0.14

$`2011`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2011-08-01            4004         LOAD ZONE .Z.CONNECTICUT    40.49           39.93                   0          0.56

You can then combine them all with rbind_pages(out_test[which(!is.na(out_test))])
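
For readers without jsonlite, the same filter-then-bind step can be sketched in base R. This is an alternative, not what the answer itself uses; the toy list below is made up for illustration and only borrows two LmpTotal values from the output above.

```r
# Toy stand-in for out_test: two years with data, one that failed (NA)
out_test <- list(
  `2019` = data.frame(LmpTotal = 23.8),
  `2018` = data.frame(LmpTotal = 24.54),
  `2010` = NA
)

# Keep only the elements that really are data frames
ok <- Filter(is.data.frame, out_test)

# Tag each data frame with its year, then stack them row-wise
tagged <- Map(function(df, yr) {
  data.frame(Year = yr, df, stringsAsFactors = FALSE)
}, ok, names(ok))

combined <- do.call(rbind, tagged)
rownames(combined) <- NULL
```

Adding the `Year` column before binding keeps track of which request each row came from once the list names are gone.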

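【Comments】:

  • Good suggestion! Unfortunately, I'm still getting the same error. In the lapply statement, do I need to name the 'json_Year' dataset iteratively?
  • Can you post that function?
  • Possible errors: does getOption return what you expect? Is the file path actually being pulled correctly?
  • get_path
  • I believe the lapply function is being passed an atomic vector, which can't be accessed with '$'. Any ideas on how to get around this?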
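
One plausible cause of that error, offered as a guess rather than a diagnosis, is that the response reaches `$` as a character string (unparsed JSON or an error message) instead of a parsed list. The sketch below reproduces the error on a string and adds a guard; `extract_hourly`, `raw_text` and `parsed` are hypothetical names, and the nested list only mimics the shape the question's code expects.

```r
# Hypothetical guard: fail with a clear message when the response has not
# been parsed into a list, instead of hitting the `$` error further down.
extract_hourly <- function(resp) {
  if (!is.list(resp)) {
    stop("Expected a parsed list but got an atomic ", class(resp)[1],
         " vector; parse the JSON first (e.g. with jsonlite::fromJSON)")
  }
  resp$HourlyLmps$HourlyLmp
}

# A raw JSON string is an atomic character vector, so `$` fails on it
raw_text <- '{"HourlyLmps": {"HourlyLmp": []}}'
err <- tryCatch(raw_text$HourlyLmps, error = function(e) conditionMessage(e))

# A parsed response is a nested list, so `$` works
parsed <- list(HourlyLmps = list(HourlyLmp = list(
  list(LmpTotal = 23.8),
  list(LmpTotal = 24.54)
)))
records <- extract_hourly(parsed)

# The guard turns the raw string into an explicit, readable error
guard_msg <- tryCatch(extract_hourly(raw_text), error = function(e) conditionMessage(e))
```

Checking `is.list()` or `str()` on whatever `get_path` returns would show quickly whether this is the situation here.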