
Rendering problem when called from a package

  •  4
  • F. Privé  · 7 years ago

    I made a small package to reproduce the problem:

    # example package
    devtools::install_github("privefl/minipkg")
    
    # example Rmd
    rmd <- system.file("extdata", "Matrix.Rmd", package = "minipkg")
    writeLines(readLines(rmd))  ## see content
    
    # works fine
    rmarkdown::render(
      rmd,
      "all",
      envir = new.env(),
      encoding = "UTF-8"
    )
    
    # !! does not work !!
    minipkg::my_render(rmd)
    minipkg::my_render  ## see source code
    

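    I don't understand why the behavior differs, or how to fix it.

    Edit: I know I could use Matrix::t(). My question is more "why do I need to use it in this particular case and not in all the others (e.g. when calling rmarkdown::render() outside of a package)?".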


    Error

    Quitting from lines 10-13 (Matrix.Rmd) 
    Error in t.default(mat) : argument is not a matrix
    

    The Matrix.Rmd file

    ---
    output: html_document
    ---
    ```{r setup, include=FALSE}
    knitr::opts_chunk$set(echo = TRUE)
    ```
    
    ```{r}
    library(Matrix)
    mat <- rsparsematrix(10, 10, 0.1)
    t(mat)
    ```
    

    Console output:

    > # example package
    > devtools::install_github("privefl/minipkg")
    Downloading GitHub repo privefl/minipkg@master
    ✔  checking for file ‘/private/var/folders/md/03gdc4c14z18kbqwpfh4jdfc0000gr/T/RtmpKefs4h/remotes685793b9df4/privefl-minipkg-c02ae62/DESCRIPTION’ ...
    ─  preparing ‘minipkg’:
    ✔  checking DESCRIPTION meta-information ...
    ─  checking for LF line-endings in source and make files and shell scripts
    ─  checking for empty or unneeded directories
    ─  building ‘minipkg_0.1.0.tar.gz’
    
    * installing *source* package ‘minipkg’ ...
    ** R
    ** inst
    ** byte-compile and prepare package for lazy loading
    ** help
    *** installing help indices
    ** building package indices
    ** testing if installed package can be loaded
    * DONE (minipkg)
    > # example Rmd
    > rmd <- system.file("extdata", "Matrix.Rmd", package = "minipkg")
    > writeLines(readLines(rmd))  ## see content
    ---
    output: html_document
    ---
    
    ```{r setup, include=FALSE}
    knitr::opts_chunk$set(echo = TRUE)
    ```
    
    ```{r}
    library(Matrix)
    mat <- rsparsematrix(10, 10, 0.1)
    t(mat)
    ```
    
    > # works fine
    > rmarkdown::render(
    +   rmd,
    +   "all",
    +   envir = new.env(),
    +   encoding = "UTF-8"
    + )
    
    
    processing file: Matrix.Rmd
      |.............                                                    |  20%
      ordinary text without R code
    
      |..........................                                       |  40%
    label: setup (with options) 
    List of 1
     $ include: logi FALSE
    
      |.......................................                          |  60%
      ordinary text without R code
    
      |....................................................             |  80%
    label: unnamed-chunk-1
      |.................................................................| 100%
      ordinary text without R code
    
    
    output file: Matrix.knit.md
    
    /usr/local/bin/pandoc +RTS -K512m -RTS Matrix.utf8.md --to html4 --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash+smart --output Matrix.html --email-obfuscation none --self-contained --standalone --section-divs --template /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/h/default.html --no-highlight --variable highlightjs=1 --variable 'theme:bootstrap' --include-in-header /var/folders/md/03gdc4c14z18kbqwpfh4jdfc0000gr/T//RtmpKefs4h/rmarkdown-str68525040df1.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --metadata pagetitle=Matrix.utf8.md 
    
    Output created: Matrix.html
    > # !! does not work !!
    > minipkg::my_render(rmd)
    
    
    processing file: Matrix.Rmd
      |.............                                                    |  20%
      ordinary text without R code
    
      |..........................                                       |  40%
    label: setup (with options) 
    List of 1
     $ include: logi FALSE
    
      |.......................................                          |  60%
      ordinary text without R code
    
      |....................................................             |  80%
    label: unnamed-chunk-1
    Quitting from lines 10-13 (Matrix.Rmd) 
    Error in t.default(mat) : argument is not a matrix
    
    > minipkg::my_render  ## see source code
    function (rmd) 
    {
        rmarkdown::render(rmd, "all", envir = new.env(), encoding = "UTF-8")
    }
    <bytecode: 0x7f89c416c2a8>
    <environment: namespace:minipkg>
    >
    
    1 answer  |  as of 7 years ago
        1
  •  2
  •   AlexR    7 years ago

    How it works

    The problem is envir = new.env(). What you need is envir = new.env(parent = globalenv()):

    devtools::install_github("privefl/minipkg")
    rmd <- system.file("extdata", "Matrix.Rmd", package = "minipkg")
    
    minipkg::my_render(rmd)
    # Fails
    
    f <- minipkg::my_render
    body(f) <- quote(rmarkdown::render(rmd, "all", envir = new.env(parent = globalenv()), encoding = "UTF-8"))
    
    ns <- getNamespace("minipkg")
    unlockBinding("my_render", ns)
    assign("my_render", f, envir = ns)
    
    minipkg::my_render(rmd)
    # Patched one works :)
    

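    Why it works

    Look at the default arguments of new.env(): the default parent environment is parent.frame(). Note that from the console this will be globalenv(), whereas inside a package function it is that function's evaluation frame, which is enclosed by the package namespace (not the same thing as the package environment!).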
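
    The default parent can be checked directly. This is only a minimal sketch: make_env is a hypothetical helper, and the stats namespace stands in for a real package namespace such as minipkg's.

```r
# Hypothetical helper: calls new.env() with its default parent, parent.frame()
make_env <- function() new.env()

# Assumption for illustration: pretend the helper was defined inside a package
# by giving it a namespace as its enclosing environment
environment(make_env) <- asNamespace("stats")

e <- make_env()

# The parent of e is the evaluation frame of make_env();
# the enclosure of that frame is the namespace
frame_enclosure <- parent.env(parent.env(e))
identical(frame_enclosure, asNamespace("stats"))

# With an explicit parent, lookup starts from the global environment instead
e_global <- new.env(parent = globalenv())
identical(parent.env(e_global), globalenv())
```

    Both identical() calls should return TRUE, which is the difference between the failing and the patched my_render.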

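    You can get it with getNamespace("pkg"). It is the environment that contains all objects of the package, including internal ones. The problem is that this environment is, in a sense, "disconnected" from the usual search/method lookup mechanism in R: lookup from a namespace goes through the package's imports and the base namespace before it reaches globalenv() and the search path, so t() resolves to base::t (hence the t.default error) and the required methods are not found even though they are attached to search().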

    Choosing new.env(parent = globalenv()) sets the parent environment to the top of the search path, so all attached methods can be found.
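
    To see which t() each kind of environment resolves to, a small sketch along these lines may help. It assumes the Matrix package is installed and again uses the stats namespace as a stand-in for a package namespace; env_ns and env_global are illustrative names, not part of minipkg.

```r
library(methods)
library(Matrix)  # attaches Matrix, and with it an S4 generic for t(), to the search path

# Assumption: any package namespace behaves the same way for this lookup
pkg_ns <- asNamespace("stats")

env_ns     <- new.env(parent = pkg_ns)       # roughly what the failing my_render() ends up with
env_global <- new.env(parent = globalenv())  # what the patched my_render() uses

# Look up the function named "t" starting from each environment
t_from_ns     <- get("t", envir = env_ns, mode = "function")
t_from_global <- get("t", envir = env_global, mode = "function")

# From the namespace side the base function is reached first
identical(t_from_ns, base::t)

# From the global side the S4 generic attached by Matrix is reached first
is(t_from_global, "standardGeneric")
```

    If the lookup behaves as described, the first check shows that the plain base::t is found, which only does S3 dispatch and so ends up in t.default, while the second shows that the S4 generic is found, which is the one able to dispatch on sparse matrices.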

    Benchmarking different approaches

    All three of the following approaches produce the proper HTML file:

    #' Render an Rmd file
    #' @param rmd Path of the R Markdown file to render.
    #' @export
    my_render <- function(rmd) {
      rmarkdown::render(
        rmd,
        "all",
        envir = new.env(parent = globalenv()),
        encoding = "UTF-8"
      )
    }
    
    #' Render an Rmd file
    #' @param rmd Path of the R Markdown file to render.
    #' @export
    my_render2 <- function(rmd) {
      cl <- parallel::makePSOCKcluster(1)
      on.exit(parallel::stopCluster(cl), add = TRUE)
      parallel::clusterExport(cl, "rmd", envir = environment())
      parallel::clusterEvalQ(cl, {
        rmarkdown::render(rmd, "all", encoding = "UTF-8")
      })[[1]]
    }
    
    #' Render an Rmd file
    #' @param rmd Path of the R Markdown file to render.
    #' @export
    my_render3 <- function(rmd) {
        system2(
            command = "R",
            args = c("-e", shQuote(sprintf("rmarkdown::render('%s', 'all', encoding = 'UTF-8')", gsub("\\\\", "/", normalizePath(rmd))))),
            wait = TRUE
        )
    }
    

    Now it is interesting to compare their speed:

    > microbenchmark::microbenchmark(my_render("inst/extdata/Matrix.Rmd"), my_render2("inst/extdata/Matrix.Rmd"), my_render3("inst/extdata/Matrix.Rmd"), times = 10L)
    
    [...]
    
    Unit: milliseconds
                                      expr       min       lq      mean    median        uq      max neval
      my_render("inst/extdata/Matrix.Rmd")  352.7927  410.604  656.5211  460.0608  560.3386 1836.452    10
     my_render2("inst/extdata/Matrix.Rmd") 1981.8844 2015.541 2163.1875 2118.0030 2307.2812 2407.027    10
     my_render3("inst/extdata/Matrix.Rmd") 2061.7076 2079.574 2152.0351 2138.9546 2181.1284 2377.623    10
    

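    Conclusion

    • envir = new.env(parent = globalenv()) is by far the fastest (roughly 4 times faster than the alternatives).
      I expect the overhead to be constant, so it should be irrelevant for larger Rmd files.
    • There is no noticeable difference between spawning a new process with system2 and using a parallel SOCK cluster with 1 node.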