【Question title】: R: random subsample of a large data frame based on a grid
【Posted】: 2021-03-28 23:33:38
【Question description】:

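I have a very large data frame (7000 columns and 14000 rows of observations). The values are grayscale observations for each pixel of an image: 7000 pixels along the x axis and 14000 pixels along the y axis. I am looking for a way to do the following: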

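  • Divide the data frame into a grid of 1000 x 1000 pixel cells (that is, 1000 columns by 1000 rows each). In this case that gives a 7 x 14 grid.
  • Randomly select 1, 2, or more pixels from each grid cell.
  • Store the returned values in a new data frame, together with their x and y coordinates and, if possible, the grid cell they came from. That means each grid cell also has to be numbered.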

Any ideas on how I could do this would be greatly appreciated.
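
The numbering of grid cells mentioned in the last bullet can be sketched as follows. This assumes cells are numbered left to right and then top to bottom, which the question does not specify; the helper name `grid_id` and its arguments are illustrative only.

```r
# Map pixel coordinates (x = column, y = row) to a grid-cell number.
# Assumption: cells are numbered left to right, then top to bottom.
grid_id <- function(x, y, cell = 1000, n_cells_x = 7) {
  gx <- (x - 1) %/% cell   # 0-based cell column
  gy <- (y - 1) %/% cell   # 0-based cell row
  gy * n_cells_x + gx + 1
}

# Corners of the first cell, first pixel of the second cell, last pixel overall
grid_id(x = c(1, 1000, 1001, 7000), y = c(1, 1000, 1, 14000))
# [1]  1  1  2 98
```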

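【Question discussion】:

  • Does this mean your output will contain only 1000 rows and 1000 columns, with a single value selected from each 7 x 14 block of the original data, so that the output holds one randomly selected pixel for each cell of that 1000 x 1000 grid? Also, are your data values int/dbl/string?
  • You're right. The values are numeric... measurements.
  • In that case, the suggested answer may work for you.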

Tags: r matrix random


【Solution 1】:

If you want a matrix of blocks, you can try the code below

blks <- t(
  sapply(
    split(
      df,
      ceiling(seq(nrow(df)) / 1000)
    ),
    function(x) {
      Map(
        as.matrix,
        split.default(x, ceiling(seq_along(x) / 1000))
      )
    }
  )
)

You will see

> blks
  1               2               3               4
1 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
2 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
3 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
4 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
5 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
6 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
7 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
  5               6               7               8
1 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
2 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
3 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
4 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
5 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
6 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
7 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
  9               10              11              12
1 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
2 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
3 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
4 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
5 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
6 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
7 Numeric,1000000 Numeric,1000000 Numeric,1000000 Numeric,1000000
  13              14
1 Numeric,1000000 Numeric,1000000
2 Numeric,1000000 Numeric,1000000
3 Numeric,1000000 Numeric,1000000
4 Numeric,1000000 Numeric,1000000
5 Numeric,1000000 Numeric,1000000
6 Numeric,1000000 Numeric,1000000
7 Numeric,1000000 Numeric,1000000

> dim(blks[[2,3]])
[1] 1000 1000

> str(blks[[2,3]])
 num [1:1000, 1:1000] 0.909 0.833 0.347 0.837 0.58 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:1000] "1001" "1002" "1003" "1004" ...
  ..$ : chr [1:1000] "X2001" "X2002" "X2003" "X2004" ...

Data

set.seed(123)
df <- data.frame(matrix(runif(7000 * 14000), ncol = 14000))
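
A possible next step, not part of the answer above: once the blocks exist, one value can be drawn from each block and its position in the full data recovered from the block indices. The sketch below rebuilds the same kind of block matrix on a small stand-in data frame (6 x 8, blocks of 3 x 4) so that it runs quickly; `small`, `bs_r`, `bs_c` and `picked` are illustrative names.

```r
set.seed(1)
# Small stand-in for the real data: 6 rows x 8 columns
small <- data.frame(matrix(runif(6 * 8), ncol = 8))
bs_r <- 3   # rows per block
bs_c <- 4   # columns per block

# Same construction as above, with the block size as a parameter
blks_small <- t(sapply(
  split(small, ceiling(seq(nrow(small)) / bs_r)),
  function(x) Map(as.matrix, split.default(x, ceiling(seq_along(x) / bs_c)))
))

# Draw one element per block and convert its local position to global coordinates
picked <- do.call(rbind, lapply(seq_along(blks_small), function(k) {
  i <- row(blks_small)[k]          # block row
  j <- col(blks_small)[k]          # block column
  b <- blks_small[[k]]
  r <- sample.int(nrow(b), 1)      # local row inside the block
  cc <- sample.int(ncol(b), 1)     # local column inside the block
  data.frame(block_row = i, block_col = j,
             row = (i - 1) * bs_r + r,
             col = (j - 1) * bs_c + cc,
             value = b[r, cc])
}))
picked
```

With the full data, `bs_r` and `bs_c` would both be 1000 and `blks_small` would be replaced by `blks`.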

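【Discussion】:

    【Solution 2】:

    You can downsample by choosing the random observations you want and then subsetting the data frame. For example, the code below generates one random position in each block along each axis and then subsets to the selected points.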

    library(tidyverse)
    set.seed(42)
    df <- data.frame(matrix(runif(7000 * 14000), ncol = 7000, nrow = 14000))
    h <- 14
    w <- 7
    j <- 1000
    x_places <- sample.int(j, w, replace = TRUE) + seq(0, (w-1) * j, by = j)
    y_places <- sample.int(j, h, replace = TRUE) + seq(0, (h-1) * j, by = j)
    new_df <- df[y_places, x_places]
    
    

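    If you want a finer or coarser grid, change h, w, and j accordingly. Also, if you give the rows and columns of the original data frame names, you will have the positions by default.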
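
The subset above reuses the same row for every cell in a grid row and the same column for every cell in a grid column. If independent positions per cell are wanted, one option is to draw an offset for every cell and index the data with a two-column (row, column) matrix. The following is a minimal sketch on a small stand-in matrix, not the answer's own method; `m` and `cells` are illustrative names, and a data frame would first need `as.matrix()`.

```r
set.seed(42)
# Small stand-in: 8 rows x 6 columns, square cells of side j = 2
m <- matrix(runif(8 * 6), nrow = 8, ncol = 6)
j <- 2
h <- nrow(m) / j   # number of cells along y
w <- ncol(m) / j   # number of cells along x

# One row per grid cell (gx = cell column, gy = cell row)
cells <- expand.grid(gx = seq_len(w), gy = seq_len(h))

# Independent random offsets for every cell, in both directions
cells$x <- (cells$gx - 1) * j + sample.int(j, nrow(cells), replace = TRUE)
cells$y <- (cells$gy - 1) * j + sample.int(j, nrow(cells), replace = TRUE)

# Indexing with a two-column matrix extracts one element per (row, column) pair
cells$value <- m[cbind(cells$y, cells$x)]
cells
```

The result is a long table with one row per cell, which matches the x / y / value layout rather than a matrix.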

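    【Discussion】:

    • Thanks Kelly. That's great, but not quite what I'm after. The code gives me a data frame, but the data points in each column all come from the same x value. What I want is for each 1000 x 1000 pixel grid cell to be sampled randomly in both the x and y directions. Thinking about it, that means the output can't be a matrix, which is fine. It should probably be something like columns for x and y plus the value at that x and y, for example: x y value, 234 566 0.9876, 345 1343 0.8762, and so on. Does that make sense?
    【Solution 3】:

    Follow this strategy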

    #create a sample dataframe
    set.seed(123)
    df <- data.frame(matrix(runif(7000*14000), 14000))
    
    #Step-1: create a blank output df say `df2`
    df2 <- as.data.frame(matrix(rep(NA, 1000*1000), 1000))
    #step-2: for loop to store sampled values in output `df2`
    for(i in 0:999){
      for(j in 0:999){
        df2[i+1, j+1] <- sample(matrix(as.matrix(df[0:13999 %/% 14 == i, 0:6999 %/% 7 == j]),1),1)
      }
    }
    

    Check its dimensions

    > dim(df2)
    [1] 1000 1000
    

    Check one of its elements to see that the loop worked

    > df2[5,45]
    [1] 0.1724635
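
The double loop subsets the data frame a million times, which is slow. A vectorised variant of the same idea (one value per 14 x 7 block) could look like the sketch below. It is shown on a small stand-in matrix, it will not reproduce the same random draws as the loop, and `br`, `bc`, `block_r`, `block_c` and `out` are illustrative names.

```r
set.seed(123)
# Small stand-in: 28 x 21 matrix, blocks of 14 rows x 7 columns -> 2 x 3 output
m <- matrix(runif(28 * 21), nrow = 28)
br <- 14                 # rows per block
bc <- 7                  # columns per block
nr <- nrow(m) / br       # output rows
nc <- ncol(m) / bc       # output columns

# Block indices for every output cell, in column-major order
block_r <- rep(seq_len(nr), times = nc)
block_c <- rep(seq_len(nc), each = nr)

# One random row and one random column inside each block
rows <- (block_r - 1) * br + sample.int(br, nr * nc, replace = TRUE)
cols <- (block_c - 1) * bc + sample.int(bc, nr * nc, replace = TRUE)

# Pull all sampled elements at once and reshape to the output grid
out <- matrix(m[cbind(rows, cols)], nrow = nr, ncol = nc)
dim(out)
# [1] 2 3
```

For the full data, `m` would be `as.matrix(df)`, which needs enough memory to hold a second copy of the data.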
    

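    【Discussion】:

    • Anil .. I think this should work, but my little Mac couldn't cope .. I've posted my very clumsy but workable solution. Thanks for your input.
    【Solution 4】:

    I'm sure someone better than me has a more elegant solution, but this is what I ended up doing to randomly sample each 1000 x 1000 grid cell in a 7000 x 14000 matrix. It is verbose, but it works.

    Set the minimum and maximum for the random numbers (1 and 1000) and work row by row, randomly picking a row value and a column value and pulling that element out of the matrix.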
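
One detail worth noting about picking a random integer between a minimum and a maximum: with `round(runif(1, min, max), 0)` the two end values come up only about half as often as the values in between, because each end owns a half-width interval. A small sketch of an alternative using `sample.int()` follows; the helper name `rand_int` is illustrative.

```r
# Draw one integer uniformly from lo..hi, both ends included
rand_int <- function(lo, hi) {
  lo - 1 + sample.int(hi - lo + 1, 1)
}

set.seed(42)
ran_r <- rand_int(1001, 2000)   # a row inside the second row of grid cells
ran_c <- rand_int(1, 1000)      # a column inside the first column of grid cells
c(ran_r, ran_c)
```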

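    For the first row of grid cells this is done seven times (7000/1000). Then 1000 is added to the minimum and maximum for the second row. Each time a value is extracted, it is stored in "g" and appended to df_new with rbind.

    Repeat until row 14 is reached, then tidy the data and rename the columns.

    The end result is a data frame of 98 values, together with the grid number each came from and its x and y coordinates in the matrix.

    It is not very flexible and is fixed to a 7000 x 14000 matrix and a 1000 x 1000 grid. But I don't need to change that..... but!!!

    My wish list would be a function that lets me set:

    • the grid size (x and y), so I can have rectangular grid cells
    • the number of samples drawn from each grid cell (1, 2, 3, or more)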
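
A function along the lines of this wish list might look like the sketch below. It is one possible design, not a tested replacement for the code that follows: the name `sample_grid` and its arguments are hypothetical, cells are numbered left to right and then top to bottom, and the `n` pixels drawn within a cell are distinct.

```r
# Sample n distinct pixels from every cell_w x cell_h cell of a matrix-like object.
# Returns one row per sample: grid number, x (column), y (row), value.
sample_grid <- function(m, cell_w = 1000, cell_h = 1000, n = 1) {
  m <- as.matrix(m)   # copies the data if m is a data frame
  # 0-based offsets of the top-left corner of every cell; x0 varies fastest
  cells <- expand.grid(x0 = seq(0, ncol(m) - 1, by = cell_w),
                       y0 = seq(0, nrow(m) - 1, by = cell_h))

  pick <- function(k) {
    # Cells on the right or bottom edge may be smaller than the nominal size
    w <- min(cell_w, ncol(m) - cells$x0[k])
    h <- min(cell_h, nrow(m) - cells$y0[k])
    idx <- sample.int(w * h, n)               # n distinct positions in the cell
    x <- cells$x0[k] + (idx - 1) %% w + 1
    y <- cells$y0[k] + (idx - 1) %/% w + 1
    data.frame(grid = k, x = x, y = y, value = m[cbind(y, x)])
  }
  do.call(rbind, lapply(seq_len(nrow(cells)), pick))
}

# Small usage example: 6 x 4 matrix, cells 2 wide and 3 high, 2 samples per cell
set.seed(1)
demo <- matrix(seq_len(6 * 4), nrow = 6)
res <- sample_grid(demo, cell_w = 2, cell_h = 3, n = 2)
res
```

For the data in the question the call would be something like `sample_grid(df, 1000, 1000, n = 1)`, giving 98 rows.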

    Thanks everyone for your input. Much appreciated.

    # MAKE A DUMMY MATRIX
    set.seed(123)
    df <- data.frame(matrix(runif(7000*14000), 14000))
    
    # PICK 1 RANDOM VALUE FROM EACH 1000 BY 1000 GRID
    # STARTING WITH THE TOP ROW (GRIDS 1 TO 7)
    # THEN THE NEXT ROW (GRIDS 8 TO 14) ETC ETC 
    # UNTIL GRID 98
    
    set.seed(42)
    
    # 1st row of grids
    min=1
    max=1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    df_new = c(ran_c, ran_r, df[ran_r,ran_c])
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 2nd row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 3rd row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 4th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 5th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 6th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 7th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 8th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 9th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 10th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 11th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 12th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 13th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    # 14th row of grids
    min = min + 1000
    max = max +1000
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1,1000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,1001,2000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,2001,3000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,3001,4000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,4001,5000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,5001,6000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    ran_r = round(runif(1,min,max),0)
    ran_c = round(runif(1,6001,7000),0)
    g = c(ran_c, ran_r, df[ran_r,ran_c])
    df_new = rbind(df_new, g)
    
    
    # SOME DATA TIDYING
    library(dplyr)
    df_new = as.data.frame(df_new) # convert to data frame
    df_new = df_new %>% mutate(grid = 1:n()) # add a sequential column named grid
    df_new = df_new %>%
      relocate(grid)
    rownames(df_new)<-c(1:nrow(df_new)) # add row names 1 to 98 or nrow(df_new)
    df_new = df_new %>%
      rename(
        x = 2,
        y = 3,
        value = 4) # rename columns
    
    # Clear data no longer needed to prepare for next run
    rm(g)
    rm(max)
    rm(min)
    rm(ran_c)
    rm(ran_r)
    

    【Discussion】:
