How to read CSV columns in an organized way in R

Time: 2018-10-10 13:59:03

Tags: r

I have the following CSV file:

Evaluator   5    9    2     8
Parser      10   5    16    2
Tokenizer   19   3    7     10

I want to read the columns like this:

Evaluator   5
Parser      10
Tokenizer   19

Evaluator   9
Parser      5
Tokenizer   3

Evaluator   2
Parser      16
Tokenizer   7

Evaluator   8
Parser      2
Tokenizer   10

How can I do this in R?

3 answers:

Answer 0 (score: 2)

We can take advantage of R's recycling here. Read the CSV as it is, then reshape it with
data.frame(V1 = df$V1 , V2 = unlist(df[-1]))

#       V1  V2
# Evaluator  5
#    Parser 10
# Tokenizer 19
# Evaluator  9
#    Parser  5
# Tokenizer  3
# Evaluator  2
#    Parser 16
# Tokenizer  7
# Evaluator  8
#    Parser  2
# Tokenizer 10

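where V1 is the first column of the data frame. Because V1 has 3 entries and unlist(df[-1]) has 12, V1 is recycled four times to fill the longer column.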
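As a minimal end-to-end sketch of the read-then-reshape step, assuming the file is whitespace-delimited with no header row (as it appears in the question). The names csv_text and long are illustrative, and read.table(text = ...) stands in for reading the real file from disk:

```r
# Assumption: whitespace-delimited data with no header, as shown in the question.
csv_text <- "Evaluator   5    9    2     8
Parser      10   5    16    2
Tokenizer   19   3    7     10"

# Read the data as-is; columns are named V1..V5 automatically.
df <- read.table(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

# unlist() concatenates V2..V5 column by column (12 values);
# the 3 labels in V1 are recycled to match.
long <- data.frame(V1 = df$V1, V2 = unlist(df[-1]), row.names = NULL)
long
```

For a file on disk, the same call would take the file path instead of text =, with sep adjusted to whatever delimiter the file actually uses.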


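If you need to sort each group in descending order, you can create a grouping variable and use arrange. Each group contains as many rows as there are original entries in V1 (3 in this case), and within each group we sort in descending order.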

library(dplyr)

data.frame(V1 = df$V1 , V2 = unlist(df[-1])) %>%
  arrange(rep(1:(n()/length(df$V1)), each = length(df$V1)), -V2) 

#          V1 V2
#1  Tokenizer 19
#2     Parser 10
#3  Evaluator  5
#4  Evaluator  9
#5     Parser  5
#6  Tokenizer  3
#7     Parser 16
#8  Tokenizer  7
#9  Evaluator  2
#10 Tokenizer 10
#11 Evaluator  8
#12    Parser  2 
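The same grouping idea can be sketched in base R, without dplyr. This is an illustration rather than part of the original answer; the names long, n_types, and group are assumptions introduced here:

```r
# Sample data in the long layout produced by the recycling step.
df <- data.frame(V1 = c("Evaluator", "Parser", "Tokenizer"),
                 V2 = c(5L, 10L, 19L), V3 = c(9L, 5L, 3L),
                 V4 = c(2L, 16L, 7L), V5 = c(8L, 2L, 10L),
                 stringsAsFactors = FALSE)
long <- data.frame(V1 = df$V1, V2 = unlist(df[-1]), row.names = NULL)

# One group id per original column: 1 1 1 2 2 2 3 3 3 4 4 4
n_types <- nrow(df)
group <- rep(seq_len(nrow(long) / n_types), each = n_types)

# Sort by group first, then by value in descending order within each group.
sorted <- long[order(group, -long$V2), ]
rownames(sorted) <- NULL
sorted
```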

Or, a better approach, using gather (which comes from tidyr):

library(dplyr)
library(tidyr)

df %>%
  gather(Type, Value, -V1) %>%
  arrange(Type, -Value) %>%
  select(-Type)

#          V1 Value
#1  Tokenizer    19
#2     Parser    10
#3  Evaluator     5
#4  Evaluator     9
#5     Parser     5
#6  Tokenizer     3
#7     Parser    16
#8  Tokenizer     7
#9  Evaluator     2
#10 Tokenizer    10
#11 Evaluator     8
#12    Parser     2
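In newer tidyr releases, gather is superseded by pivot_longer. A sketch of the same pipeline with that function, assuming tidyr 1.0.0 or later is installed, might look like this (the df below is a plain data frame copy of the sample data, and result is an illustrative name):

```r
library(dplyr)
library(tidyr)

df <- data.frame(V1 = c("Evaluator", "Parser", "Tokenizer"),
                 V2 = c(5L, 10L, 19L), V3 = c(9L, 5L, 3L),
                 V4 = c(2L, 16L, 7L), V5 = c(8L, 2L, 10L),
                 stringsAsFactors = FALSE)

result <- df %>%
  # Every column except V1 becomes a (Type, Value) pair.
  pivot_longer(-V1, names_to = "Type", values_to = "Value") %>%
  # pivot_longer emits rows in row-major order, so sort by source column first.
  arrange(Type, desc(Value)) %>%
  select(-Type)

result
```

Unlike gather, pivot_longer walks the input row by row, which is why the arrange(Type, ...) step is what restores the column-by-column grouping.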

Data

df <- structure(list(V1 = structure(1:3, .Label = c("Evaluator", "Parser", 
"Tokenizer"), class = "factor"), V2 = c(5L, 10L, 19L), V3 = c(9L, 
5L, 3L), V4 = c(2L, 16L, 7L), V5 = c(8L, 2L, 10L)), .Names = c("V1", 
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA, 
-3L))

Answer 1 (score: 1)

We can try reading the CSV file and then stacking each value column under the label column with rbind:

df1 <- data.frame(type=c("Evaluator", "Parser", "Tokenizer"),
                  v1=c(5, 10, 19),
                  v2=c(9, 5, 3),
                  v3=c(2, 16, 7),
                  v4=c(8, 2, 10), stringsAsFactors=FALSE)

df2 <- data.frame(type=character(), value=numeric(), stringsAsFactors=FALSE)
names <- c("type", "value")
df2 <- rbind(df2, setNames(df1[, c(1,2)], names))
df2 <- rbind(df2, setNames(df1[, c(1,3)], names))
df2 <- rbind(df2, setNames(df1[, c(1,4)], names))
df2 <- rbind(df2, setNames(df1[, c(1,5)], names))

df2
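The four rbind calls follow the same pattern, so they can be generated in a loop. A possible generalization, offered as a sketch rather than the answerer's own code (pieces is an illustrative name), works for any number of value columns:

```r
df1 <- data.frame(type = c("Evaluator", "Parser", "Tokenizer"),
                  v1 = c(5, 10, 19),
                  v2 = c(9, 5, 3),
                  v3 = c(2, 16, 7),
                  v4 = c(8, 2, 10), stringsAsFactors = FALSE)

# Pair the label column with each value column in turn.
pieces <- lapply(2:ncol(df1), function(j) {
  setNames(df1[, c(1, j)], c("type", "value"))
})

# Stack all the two-column pieces into one long data frame.
df2 <- do.call(rbind, pieces)
rownames(df2) <- NULL
df2
```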


Answer 2 (score: 1)

This is not the smartest approach, but you can do it like this:

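# Sample data: a tibble-classed data frame, as produced by readr
df <- structure(list(Data = c("Evaluator", "Parser", "Tokenizer"),
                     A = c(5L, 10L, 19L), B = c(9L, 5L, 3L),
                     C = c(2L, 16L, 7L), D = c(8L, 2L, 10L)),
                row.names = c(NA, -3L),
                class = c("tbl_df", "tbl", "data.frame"),
                spec = structure(list(cols = list(Data = structure(list(),
                    class = c("collector_character", "collector")))),
                    class = "col_spec"))

library(reshape2)
# Stack columns A-D into long format, keeping Data as the identifier
df <- melt(df, id.vars = "Data")
# Drop the "variable" column, leaving only Data and value
df[-2]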
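If reshape2 is not available, a similar wide-to-long step can be sketched in base R with stack(), swapped in here for melt. The follow-up split() is an addition for illustration: it yields one block per original column, matching the layout shown in the question. The names long and blocks are assumptions introduced for this sketch:

```r
df <- data.frame(Data = c("Evaluator", "Parser", "Tokenizer"),
                 A = c(5L, 10L, 19L), B = c(9L, 5L, 3L),
                 C = c(2L, 16L, 7L), D = c(8L, 2L, 10L),
                 stringsAsFactors = FALSE)

# stack() returns two columns: values (the numbers) and ind (the source column).
# The 3 labels in Data are recycled across the 12 stacked rows.
long <- data.frame(Data = df$Data, stack(df[-1]))

# One two-column data frame per original column: blocks$A, blocks$B, ...
blocks <- split(long[c("Data", "values")], long$ind)
blocks$B
```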