
Recreating a vector from print() console output

r
  •  5
  • Mikko Marttila  · Tech community  · 6 years ago

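    Regrettably often, you see questions that present data in a format that is not reproducible, usually just print() console output. Take these vectors: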

    set.seed(1)
    
    x <- sample(LETTERS, 40, replace = TRUE)
    y <- rnorm(20)
    

    …which might be shown like this:

    x
     [1] "G" "J" "O" "X" "F" "X" "Y" "R" "Q" "B" "F" "E" "R" "J" "U" "M" "S"
    [18] "Z" "J" "U" "Y" "F" "Q" "D" "G" "K" "A" "J" "W" "I" "M" "P" "M" "E"
    [35] "V" "R" "U" "C" "S" "K"
    

    …or:

    y
     [1]  0.91897737  0.78213630  0.07456498 -1.98935170  0.61982575
     [6] -0.05612874 -0.15579551 -1.47075238 -0.47815006  0.41794156
    [11]  1.35867955 -0.10278773  0.38767161 -0.05380504 -1.37705956
    [16] -0.41499456 -0.39428995 -0.05931340  1.10002537  0.76317575
    

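    Ideally, I would like to copy the text from the blocks above to my clipboard and call some function foo() such that all.equal(foo(), x) holds for discrete data types and all(near(foo(), y)) holds for floating-point numbers (given the printed precision).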
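    To make the "given the printed precision" caveat concrete, here is a minimal sketch in base R. The `original` values are invented for illustration (printed values plus digits that print() would not show), and the tolerance of half a unit in the eighth decimal place is an assumption based on the listing above, not something the question specifies.

```r
# Values as they might be retyped from a print() listing (8 decimal places).
printed <- c(0.91897737, 0.78213630, 0.07456498)

# Hypothetical "true" values carrying extra digits that print() did not show.
original <- printed + c(3e-9, -4e-9, 1e-9)

# Assumed tolerance: half a unit in the last printed decimal place.
tol <- 0.5 * 10^-8

exact_match <- identical(original, printed)        # exact comparison
near_match  <- all(abs(original - printed) < tol)  # tolerance comparison
```

    The exact comparison fails while the tolerance comparison succeeds, which is why a reconstructed numeric vector can only be expected to be near the original.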

    Is there a simple way to (approximately) reconstruct a simple vector from copied print() output?


    Edit: Ironically, I realised that my own example was not completely reproducible. Here is code to create the copied print output:
    y_printed <- capture.output(y)
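    For readers unfamiliar with capture.output(), the sketch below shows what it returns: a character vector with one element per printed console line. The vector `v` and the names `printed_lines` and `pasted` are made up for illustration.

```r
v <- seq(0.5, 15, by = 0.5)          # 30 illustrative values

# One character string per line that print() would write to the console.
printed_lines <- capture.output(v)

# Joining the lines with newlines gives a single string, similar to
# what copying the console output by hand would produce.
pasted <- paste(printed_lines, collapse = "\n")
```

    Every element of `printed_lines` starts with an index marker such as [1], which is the part any reconstruction has to strip.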
    
    3 answers  |  up to 6 years ago
        1
  •  2
  •   Rui Barradas    6 years ago

    I use scan for that problem.

    You can make a function out of the following code:

    y <-
      '[1]  0.91897737  0.78213630  0.07456498 -1.98935170  0.61982575
     [6] -0.05612874 -0.15579551 -1.47075238 -0.47815006  0.41794156
    [11]  1.35867955 -0.10278773  0.38767161 -0.05380504 -1.37705956
    [16] -0.41499456 -0.39428995 -0.05931340  1.10002537  0.76317575'
    
    y <- scan(what = character(), text = y)
    y <- sub("^\\s*\\[\\d+\\]", "", y)
    y <- as.numeric(y[y != ""])
    

    As suggested by @moody_mudskipper in the comments, the pattern can be updated to "^\\s*\\[\\d+\\]" to support the OP's example, whose lines start with a space.
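    A small sketch of what the pattern does when applied to whole lines, with and without leading whitespace. The two input lines are invented, and splitting on whitespace with strsplit() is only one way to get at the values (the answer itself uses scan).

```r
# Two printed lines: one with a leading space before the index, one without.
lines <- c(" [1]  0.5 1.5", "[10] 2.5 3.5")

# Remove optional leading whitespace followed by a bracketed index.
stripped <- sub("^\\s*\\[\\d+\\]", "", lines)

# Split what is left on runs of whitespace and convert to numbers.
values <- as.numeric(unlist(strsplit(trimws(stripped), "\\s+")))
```

    Because the pattern is anchored with ^, only an index at the start of a line is removed; bracketed text further along the line is left alone.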

    The function could be:

    recreateVector <- function(X, numeric = TRUE, quiet = FALSE){
      X <- scan(what = character(), text = X, quiet = quiet)
      X <- sub("^\\s*\\[\\d+\\]", "", X)
      X <- X[X != ""]
      if(numeric) X <- as.numeric(X)
      X
    }
    
    
    recreateVector(y)   # Use the original y
    #Read 24 items
    # [1]  0.91897737  0.78213630  0.07456498 -1.98935170  0.61982575
    # [6] -0.05612874 -0.15579551 -1.47075238 -0.47815006  0.41794156
    #[11]  1.35867955 -0.10278773  0.38767161 -0.05380504 -1.37705956
    #[16] -0.41499456 -0.39428995 -0.05931340  1.10002537  0.76317575
    

    With character vectors, set the argument numeric = FALSE; it defaults to TRUE.

    x <-
    '[1] "G" "J" "O" "X" "F" "X" "Y" "R" "Q" "B" "F" "E" "R" "J" "U" "M" "S"
    [18] "Z" "J" "U" "Y" "F" "Q" "D" "G" "K" "A" "J" "W" "I" "M" "P" "M" "E"
    [35] "V" "R" "U" "C" "S" "K"'
    
    recreateVector(x, numeric = FALSE)
    #Read 43 items
    # [1] "G" "J" "O" "X" "F" "X" "Y" "R" "Q" "B" "F" "E" "R" "J" "U"
    #[16] "M" "S" "Z" "J" "U" "Y" "F" "Q" "D" "G" "K" "A" "J" "W" "I"
    #[31] "M" "P" "M" "E" "V" "R" "U" "C" "S" "K"
    

    Note the argument quiet. I have set its default to FALSE, as in the definition of scan, because I prefer to see whether anything was actually read in.
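    As a quick illustration of that argument, the sketch below calls scan() both ways on a made-up input string. The values read are the same; only the report of how many items were read differs.

```r
# Default behaviour: scan() reports how many items it read.
loud <- scan(text = "1 2 3")

# With quiet = TRUE the report is suppressed.
silent <- scan(text = "1 2 3", quiet = TRUE)

same_values <- identical(loud, silent)
```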

        2
  •  2
  •   Nicolas2    6 years ago

    We can mimic the data type guessing that happens when reading a CSV file:

    library(tidyverse)
    unprint <- function(s) {
      s %>% str_replace_all(" *\\[\\d+\\] *","") %>% str_replace_all(" +","\n") %>% 
      textConnection %>% read.table
    }
    unprint(' [1]  0.91897737  0.78213630  0.07456498 -1.98935170  0.61982575
     [6] -0.05612874 -0.15579551 -1.47075238 -0.47815006  0.41794156
    [11]  1.35867955 -0.10278773  0.38767161 -0.05380504 -1.37705956
    [16] -0.41499456 -0.39428995 -0.05931340  1.10002537  0.76317575') %>% head
    
    #           V1
    #1  0.91897737
    #2  0.78213630
    #3  0.07456498
    #4 -1.98935170
    #5  0.61982575
    #6 -0.05612874
    
    
    unprint(' [1] "G" "J" "O" "X" "F" "X" "Y" "R" "Q" "B" "F" "E" "R" "J" "U" "M" "S"
    [18] "Z" "J" "U" "Y" "F" "Q" "D" "G" "K" "A" "J" "W" "I" "M" "P" "M" "E"
    [35] "V" "R" "U" "C" "S" "K"') %>% head
    
    #  V1
    #1  G
    #2  J
    #3  O
    #4  X
    #5  F
    #6  X
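    The same kind of type guessing is available in base R without going through a data frame. The sketch below uses utils::type.convert() on tokens that have already been split out; the token vectors are copied by hand from the printed examples, and this is offered as an alternative illustration, not as what the answer does internally.

```r
# Tokens as they would look after stripping the indices and splitting.
tokens_num <- c("0.91897737", "-1.98935170", "0.07456498")
tokens_chr <- c("G", "J", "O")

# as.is = TRUE keeps strings as character instead of converting to factor.
guessed_num <- utils::type.convert(tokens_num, as.is = TRUE)
guessed_chr <- utils::type.convert(tokens_chr, as.is = TRUE)

class(guessed_num)
class(guessed_chr)
```

    The result is a plain vector whose type depends on the content of the tokens.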
    

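    A more elaborate version that handles brackets inside strings. It also returns the proper kind of output: a vector rather than a data frame.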

    unprint <- function(s) {
      t <- s %>% textConnection %>% readLines %>% 
        str_replace(" *\\[\\d+\\] *","") %>%
        paste(collapse=' ') %>% str_replace_all(" ","\n") %>% 
        textConnection %>% read.table(stringsAsFactors=FALSE) 
      t$V1 %>% str_replace_all("\n"," ")
    }
    
    x <- unprint(' [1] "x + y  [1]" "x + z  [2]"')
    x
    #[1] "x + y  [1]" "x + z  [2]"
    
        3
  •  0
  •   Mikko Marttila    6 years ago

    For my own use, I ended up adapting Rui Barradas' answer. Some features I wanted: reading from the clipboard, and type guessing (with help from readr).

    rescue_vector <- function(x = readClipboard()) {
      x <- gsub("(^|\n)\\s*\\[\\d+\\]", "", x)
      x <- scan(text = x, what = character(),
                allowEscapes = TRUE, quiet = TRUE)
      readr::parse_guess(x, na = character())
    }
    

    It works on the given example data:

    set.seed(1)
    
    x <- sample(LETTERS, 40, replace = TRUE)
    all.equal(x, rescue_vector(capture.output(x)))
    #> [1] TRUE
    
    y <- rnorm(20)
    all.equal(y, rescue_vector(capture.output(y)))
    #> [1] TRUE
    

    Reading from the clipboard:

    writeClipboard(capture.output(y))
    all.equal(y, rescue_vector())
    #> [1] TRUE
    

    It also handles some odd cases:

    z <- c("[1] first \n second", "[2] + 1")
    all.equal(z, rescue_vector(capture.output(z)))
    #> [1] TRUE
    

    But missing values are still a problem:

    na <- c("", "NA", NA)
    rescue_vector(capture.output(na))
    #> [1] "" NA NA
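    The difficulty is that once quotes are stripped, the string "NA" and a missing value look the same. One possible way around it, sketched here in base R under the assumption that print() shows a missing string as a bare NA and the literal string as "NA" in quotes, is to keep the quotes while tokenising. The input line is written out by hand, and backslash escapes inside strings are matched but not unescaped, so this is not a full solution.

```r
# A printed line for c("", "NA", NA), written out by hand.
printed <- '[1] ""   "NA" NA'

# Drop the leading index.
body <- sub("^\\s*\\[\\d+\\]\\s*", "", printed)

# A token is either a double-quoted string (allowing backslash escapes)
# or a run of non-whitespace characters.
pattern <- '"(\\\\.|[^"\\\\])*"|\\S+'
tokens <- regmatches(body, gregexpr(pattern, body, perl = TRUE))[[1]]

# Remember which tokens were quoted, then strip the quotes from those.
is_quoted <- startsWith(tokens, '"')
values <- ifelse(is_quoted, substr(tokens, 2, nchar(tokens) - 1), tokens)

# Only an unquoted NA is treated as missing.
values[!is_quoted & tokens == "NA"] <- NA
```

    With this approach the empty string and the quoted "NA" survive as strings, and only the bare NA becomes a missing value.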
    

    As @moody_mudskipper mentioned in the comments, further development could also include attempting to rescue pasted tables.
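    As a possible starting point for that, and not something taken from any of the answers, base R's read.table() can already parse a simply printed data frame from text. The tiny table below is invented; printed tables with spaces inside cells or wrapped columns would need more work.

```r
# A small printed data frame, as it might be copied from the console.
printed_df <- "  a b\n1 1 x\n2 2 y"

# The header has one field fewer than the data rows, so read.table()
# treats the first column as row names.
df <- read.table(text = printed_df, header = TRUE, stringsAsFactors = FALSE)
```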