Question

I am reading in many large .csv files that share the same column names and row-binding them with the following code (as shown in https://serialmentor.com/blog/2016/6/13/reading-and-combining-many-tidy-data-files-in-R):

library(readr)  # for read_csv()
library(purrr)  # for map(), reduce(), and the %>% pipe

# find all file names ending in .csv
# (pattern is a regular expression, not a glob)
files <- dir(pattern = "\\.csv$")
files

data <- files %>%
  map(read_csv) %>%    # read in all the files individually, using
                       # the function read_csv() from the readr package
  reduce(rbind)        # reduce with rbind into one dataframe
data

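However, one column of my data needs to be read in as character, because its entries are strings of numbers separated by ","; otherwise read_csv converts that column to a number with the commas dropped.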

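How can I

1.) specify that just one column is read in as character (preferably by name)?

or

2.) simply read all columns in as character?

The second option is not ideal, because I would then have to convert many columns back to numeric.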

I tried using:

col_types = cols(.default = "c")

as described in https://github.com/tidyverse/readr/issues/148 and https://github.com/tidyverse/readr/issues/292.

My attempt looks like this:

data <- files %>%
   map(read_csv( col_types = cols(.default = "c" ))) %>%
   reduce(rbind)   
data

However, this does not work, because read_csv() requires the input 'x' (i.e. the path to a .csv file). It throws this error:

Error in read_delimited(file, tokenizer, col_names = col_names, col_types = col_types,  : 
  argument "file" is missing, with no default

Answer 1

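The nine (or however many) column names are the same in every .csv file. Only two columns (in this case "scan_start" and "scan_end") are read as numeric; all the rest are read as character: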

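library(readr)  # for read_csv(), cols()
library(purrr)  # for map_df() and the %>% pipe

# pattern is a regular expression, so escape the dot and anchor the end
files <- dir(pattern = "\\.csv$")

metadata <- files %>%
  map_df(~read_csv(., col_types = cols(.default = "c",
    scan_end = "n", scan_start = "n")))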
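For the first option in the question (only one column as character, chosen by name), the same idea can be turned around: name that one column in cols() and leave the rest to readr's type guessing. The attempt in the question failed because read_csv(col_types = ...) is evaluated immediately, without a file, before map() ever sees it; wrapping the call in a formula (~) defers it so that map() can supply each file name. Below is a minimal sketch of that. The column names id, codes and value and the two temporary files are made up for illustration, and dplyr::bind_rows() is swapped in for reduce(rbind).

```r
library(readr)  # read_csv(), cols(), col_character()
library(purrr)  # map()
library(dplyr)  # bind_rows() and the %>% pipe

# Write two small example files into a temporary directory.
# The "codes" column holds comma-separated digits, quoted so that the
# commas are not treated as field separators.
tmp <- tempfile("csv_demo_")
dir.create(tmp)
writeLines(c('id,codes,value', '1,"1,2,3",0.5', '2,"4,5",1.5'),
           file.path(tmp, "a.csv"))
writeLines(c('id,codes,value', '3,"7,8,9",2.5', '4,"10,11",3.5'),
           file.path(tmp, "b.csv"))

# List the files with their full paths (pattern is a regular expression).
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Force only "codes" to character; every other column is still guessed.
data <- files %>%
  map(~ read_csv(.x, col_types = cols(codes = col_character()))) %>%
  bind_rows()

data
```

Extra arguments can also be passed straight through map(), as in map(files, read_csv, col_types = cols(codes = col_character())), which avoids the formula altogether.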

Reading some (or all) columns as character when combining multiple .csv files with tidyr functions
