
cSplit_e does not return a binary data frame

  • James L.  · 8 years ago

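    I have a data frame with a Genre column that contains rows such as Action,Romance. I want to split these values and create a binary vector. If Action,Romance,Drama are all the possible genres, then the row above would be 1,1,0 in the output data frame.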

    I found this SO post and this CRAN doc covering cSplit_e, but when I use it I don't get a binary data frame as output. Instead I get the original data frame back, with some values scrambled.

    a = cSplit_e(df4, "Genre", sep = ",", mode = "binary", type = "character", drop=TRUE, fixed=TRUE,fill = 0)
    

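    Edit: The problem seems to be that the new columns are added to the old data frame instead of a new data frame being created. How can I get the genres into a data frame of their own?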

    > names(a)
     [1] "Title"             "Year"              "Rated"             "Released"          "Runtime"           "Genre"             "Director"          "Writer"            "Actors"           
    [10] "Plot"              "Language"          "Country"           "Awards"            "Poster"            "Metascore"         "imdbRating"        "imdbVotes"         "imdbID"           
    [19] "Type"              "tomatoMeter"       "tomatoImage"       "tomatoRating"      "tomatoReviews"     "tomatoFresh"       "tomatoRotten"      "tomatoConsensus"   "tomatoUserMeter"  
    [28] "tomatoUserRating"  "tomatoUserReviews" "tomatoURL"         "DVD"               "BoxOffice"         "Production"        "Website"           "Response"          "Budget"           
    [37] "Domestic_Gross"    "Gross"             "Date"              "Genre_Action"      "Genre_Adult"       "Genre_Adventure"   "Genre_Animation"   "Genre_Biography"   "Genre_Comedy"     
    [46] "Genre_Crime"       "Genre_Documentary" "Genre_Drama"       "Genre_Family"      "Genre_Fantasy"     "Genre_Film-Noir"   "Genre_Game-Show"   "Genre_History"     "Genre_Horror"     
    [55] "Genre_Music"       "Genre_Musical"     "Genre_Mystery"     "Genre_N/A"         "Genre_News"        "Genre_Reality-TV"  "Genre_Romance"     "Genre_Sci-Fi"      "Genre_Short"      
    [64] "Genre_Sport"       "Genre_Talk-Show"   "Genre_Thriller"    "Genre_War"         "Genre_Western"    
    
    1 answer  |  last active 8 years ago
  •   A5C1D2H2I1M1N2O1R2T1    8 years ago

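    The drop argument applies only to the column being split, not to the other columns of the data.frame. So, to extract only the split columns afterwards, select them by the original column name, which cSplit_e uses as the prefix of the new columns.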

    Example:

    > a <- cSplit_e(df4, "Genre", ",", mode = "binary", type = "character", fill = 0, drop = TRUE)
    > a
      id Genre_Action Genre_Drama Genre_Romance
    1  1            1           0             1
    2  2            1           1             1
    > a[startsWith(names(a), "Genre")]
      Genre_Action Genre_Drama Genre_Romance
    1            1           0             1
    2            1           1             1
    

    where df4 is:

    df4 <- structure(list(Genre = c("Action,Romance", "Action,Romance,Drama"), id = 1:2), 
      .Names = c("Genre", "id"), row.names = 1:2, class = "data.frame")
    