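I have a csv file that is created on the fly by some Java-based code (Processing). The problem is that when I try to load it in R, it adds a column at the beginning for no apparent reason and then leaves a column filled with NA in the middle.
Here is what the csv file looks like: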
x1,x2,y1,y2,angle,size1,size2,distance1,distance2
400.0,1100.0,500.0,500.0,0.0,,0.0,0.0,-100.0,600.0
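The thing is, I tried opening it in OpenOffice just for giggles, and it opened fine.
In R, however, read.csv() loads it with the extra leading column and the NA column described above.
So I figured the best place to start investigating is where the file is created.
Here is the Processing code: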
out.println("x1,"+ "x2," + "y1," + "y2," + "angle," + "size1," + "size2," + "distance1," + "distance2");
for (int i = 0; i < directions; i++)
{
//extraneous code skipped
String output = pointX + "," + point2X + "," + pointY + "," + point2Y + "," + (double)angle + "," + "," + size1 + "," + size2 + "," + distance + "," + distance2;
out.println(output);
}
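Anyway, I could use some tips for tracking down the problem, or suggestions for a fix.
Answer 0 (score: 2)
If we count the fields, we see that there are 9 header columns but 10 data columns, so R assumes that the extra data column is the first one and that this first column holds the row names.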
count.fields(textConnection(Lines), sep = ",")
[1] 9 10
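To work around this, skip the header and read the data, dropping the extra column 6. Then read the header line on its own and apply it to the data frame as column names.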
# test data
Lines <- "x1,x2,y1,y2,angle,size1,size2,distance1,distance2
400.0,1100.0,500.0,500.0,0.0,,0.0,0.0,-100.0,600.0"
DF <- read.table(text = Lines, skip = 1, sep = ",")[-6]
names(DF) <- unlist(read.table(text = Lines, nrows = 1, sep = ","))
We used text = Lines to keep this self-contained, but of course you would use something like file = "myfile.csv" instead.
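The same field-count check can also be run on the generating side before the file ever reaches R. The following is only a minimal sketch in Java: the class FieldCounter and its method countFields are hypothetical names, not part of the original code, and it assumes a plain comma-separated file with no quoted fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original post): counts the
// comma-separated fields on each line, similar in spirit to R's count.fields().
// Assumption: no quoted fields, so a plain split on "," is enough.
final class FieldCounter {
    static List<Integer> countFields(List<String> lines) {
        List<Integer> counts = new ArrayList<>();
        for (String line : lines) {
            // A limit of -1 keeps trailing empty fields, so "a,b," counts as 3.
            counts.add(line.split(",", -1).length);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "x1,x2,y1,y2,angle,size1,size2,distance1,distance2",
            "400.0,1100.0,500.0,500.0,0.0,,0.0,0.0,-100.0,600.0");
        System.out.println(countFields(lines)); // prints [9, 10]
    }
}
```

Any line whose count differs from the header's count points at a row with a missing or extra separator.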
Answer 1 (score: 1)
As already explained in the comments, your input contains a double comma (,,):
cat 'wrong.csv'
x1,x2,y1,y2,angle,size1,size2,distance1,distance2
400.0,1100.0,500.0,500.0,0.0,,0.0,0.0,-100.0,600.0
Removing it fixes the problem:
cat 'right.csv'
x1,x2,y1,y2,angle,size1,size2,distance1,distance2
400.0,1100.0,500.0,500.0,0.0,0.0,0.0,-100.0,600.0
Here you can see the difference:
Rscript -e 'read.csv("wrong.csv");read.csv("right.csv")'
x1 x2 y1 y2 angle size1 size2 distance1 distance2
400.0 1100 500 500 0 NA 0 0 -100 600
x1 x2 y1 y2 angle size1 size2 distance1 distance2
1 400 1100 500 500 0 0 0 -100 600
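The reason is that R treats ,, as a column with no value. Since nothing indicates that this is a character column, the empty field is interpreted not as an empty string ("") but as a missing value (NA).
Because your input therefore has one more data column than the header has names, read.csv interprets the first column as the row names of the resulting data.frame.
So you do not get an error, but you do get unexpected output.
Once the number of columns is corrected, R understands that column 1 is actually x1, and so on.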
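On the Processing side, the double comma comes from the `+ "," + "," +` after `(double)angle` in the line that builds `output`. One way to make this kind of slip harder is to join the values with a single separator and check the field count against the header. This is only a sketch under assumptions: the class CsvRow and the method buildRow are hypothetical names, all values are assumed to be numeric, and String.join requires Java 8 or later.

```java
// Hypothetical sketch (not the original code): joins the values with a single
// separator so a stray "," cannot slip in, and checks the number of fields
// against the header before the row is written.
final class CsvRow {
    static final String HEADER = "x1,x2,y1,y2,angle,size1,size2,distance1,distance2";

    static String buildRow(double... values) {
        int expected = HEADER.split(",", -1).length;
        if (values.length != expected) {
            throw new IllegalArgumentException(
                "expected " + expected + " fields but got " + values.length);
        }
        String[] fields = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            fields[i] = String.valueOf(values[i]);
        }
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        // Sample values taken from the data row shown in the question.
        System.out.println(HEADER);
        System.out.println(buildRow(400.0, 1100.0, 500.0, 500.0, 0.0, 0.0, 0.0, -100.0, 600.0));
    }
}
```

In the original loop, the call would presumably become something like out.println(CsvRow.buildRow(pointX, point2X, pointY, point2Y, angle, size1, size2, distance, distance2)), assuming those variables are all numeric.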