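I have multiple CSV files with differing numbers of columns that I need to reformat into fixed-width text files.
At the moment I comment out and uncomment (hash and unhash) the lines for the columns I need to edit, but this is tedious, and I cannot add new columns without changing the program.
Is there a simpler way to read, split, and edit all the columns, regardless of how many columns the file contains?
Here is my code so far: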
library(tidyverse)
library(jsonlite)
flight.df.time <- data.frame(icao24 = character(),
                             callsign = character(),
                             origin_country = character(),
                             time_position = double(),
                             time_velocity = double(),
                             longitude = double(),
                             latitude = double(),
                             altitude = double(),
                             on_ground = character(),
                             velocity = double(),
                             heading = double(),
                             vertical_rate = double(),
                             sensors = character(),
                             Time = character(),
                             stringsAsFactors = FALSE)
write.table(flight.df.time,'G:/DCIST/OpenSky/Data/flight_week.csv', sep = ",", row.names = FALSE)
for (day in 1:100) {
  for (i in seq(1, 8640)) {
    flight.df <- data.frame()
    flight_url <- 'https://opensky-network.org/api/states/all'
    # Return NULL on a failed request so a stale or missing result is never reused.
    all_flights <- tryCatch(
      fromJSON(txt = flight_url),
      error = function(e) {
        cat("ERROR :", conditionMessage(e), "\n")
        NULL
      }
    )
    # Nothing to process this round: wait and try again.
    if (is.null(all_flights$states)) {
      Sys.sleep(10)
      next
    }
    dc_flights <- as.data.frame(all_flights$states) %>%
      select(-(V14:V18)) %>%
      # as.character() first, so the conversion works for factor and character columns alike.
      mutate(V6 = as.numeric(as.character(V6)),
             V7 = as.numeric(as.character(V7)),
             Time = Sys.time()) %>%
      filter(between(V6, -78.361647, -75.872761),
             between(V7, 38.197760, 39.646129))
    flight.df.time <- rbind(dc_flights, flight.df)
    write.table(flight.df.time, 'G:/DCIST/OpenSky/Data/flight_week.csv', append = TRUE, sep = ",", col.names = FALSE, row.names = FALSE)
    print(Sys.time())
    Sys.sleep(10)
  }
}
Answer 0 (score: 3)
Do you mean something like this?
perl -aF/,/ -lne 'print map sprintf("%10s", $_), @F' FILENAME.csv > FILENAME.txt
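For comparison, here is a minimal Python sketch of the same idea: split each line into fields and right-justify every field to 10 characters, whatever the number of columns. The function name `to_fixed_width` and the sample data are made up for illustration, and the width of 10 simply mirrors `%10s` above. It reads the input with the `csv` module, so quoted fields that contain commas stay intact, which a plain split on `,` would not handle.

```python
import csv
import io


def to_fixed_width(csv_text, width=10):
    """Right-justify every field of every row to `width` characters."""
    rows = csv.reader(io.StringIO(csv_text))
    # One output line per input row; the column count may differ per row.
    return ["".join(f"{field:>{width}}" for field in row) for row in rows]


# Hypothetical sample with a different number of columns on each line.
sample = "a,b,c\n1,22,333,4444\n"
for line in to_fixed_width(sample):
    print(line)
```

Like `%10s`, the `>` format pads short fields but does not truncate long ones, so a field longer than the width will push the following columns to the right.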
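Answer 1 (score: -1)
Whenever you find yourself using sequentially numbered variables, you should be using an array. In this case, since the array is used only once, you do not even need to store it in a variable.
Also: use lexical filehandles, which is better practice.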
#!/usr/bin/env perl
use strict;
use warnings;
my $input_file = 'FILENAME.csv';
my $output_file = 'FILENAME.txt';
my $format = '%10s';
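open( my $input_fh, "<", $input_file ) or die "\n !! Cannot open $input_file: $!";
# ">>" appends to an existing output file rather than overwriting it.
open( my $output_fh, ">>", $output_file ) or die "\n !! Cannot open $output_file for appending: $!";
while ( my $line = <$input_fh> ) {
    chomp $line;    # drop the newline so it is not padded as part of the last field
    # Pad every field to the same width, however many columns the line has.
    print {$output_fh} join( "", map { sprintf $format, $_ } split /,/, $line ), "\n";
}
close $input_fh;
close $output_fh or die "\n !! Cannot close $output_file: $!";
exit;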
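Keeping the fields in an array also makes it easy to give each column its own width without naming the columns one by one. The following is only a sketch in Python, not part of the script above: `format_row`, `widths`, and `default_width` are hypothetical names, and the widths are invented. Columns beyond the end of the `widths` list fall back to the default, so a file with extra columns needs no change to the program.

```python
def format_row(fields, widths, default_width=10):
    """Pad each field to its column's width; extra columns use default_width."""
    padded = []
    for index, field in enumerate(fields):
        # Use the configured width if there is one for this column position.
        width = widths[index] if index < len(widths) else default_width
        padded.append(f"{field:>{width}}")
    return "".join(padded)


# Hypothetical widths for the first two columns only.
widths = [6, 12]
print(format_row("AB12,Lufthansa,Germany".split(","), widths))
```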