I have a data table like the one shown below.
// Print each student's test scores and their average.
// Assumes a static import of System.out (import static java.lang.System.out;).
public static void printStudentReport(String[] names, int[][] scores) {
    out.println("STUDENT REPORT");
    for (int i = 0; i < names.length; i++) {
        out.println("Scores for " + names[i] + ": ");
        double sum = 0;
        // Use this student's own row length so rows of different sizes are handled.
        for (int s = 0; s < scores[i].length; s++) {
            out.println("Test #" + (s + 1) + ": " + scores[i][s]);
            sum += scores[i][s];
        }
        out.println();
        // Skip the average when the student has no scores (avoids dividing by zero).
        if (scores[i].length > 0) {
            double average = sum / scores[i].length;
            out.println("Average for " + names[i] + ": " + average);
            out.println();
        }
    }
    out.println();
}
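For the "Name" field, I want the part after the second underscore in a separate data.table column. At the moment I use strsplit with "_" as the separator, but my problem is that some records have 3 elements and others have 4. The table is: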
ID Name
1: 2760925 01_HOOFD_010
2: 2760925 01_HOOFD_015
3: 2771451 01_HOOFD_010
4: 2771451 01_HOOFD_190_2
5: 2771451 01_HOOFD_030_2
6: 2771451 08_AWB45_020_2
7: 2771451 08_AWB45_040
8: 2771451 01_HOOFD_065_2
My current solution is the line below, but I doubt it is the cleanest or most concise way. Do you have a better idea? Thanks.
DT$code_3<-DT[,.(lapply(strsplit(Name,"_"),"[",3:4)),][,.(lapply(V1,function(x) paste(na.omit(x),collapse="_"))),]
Answer 0 (score: 1)
x <-structure(list(ID = c(2760925L, 2760925L, 2771451L, 2771451L,
2771451L, 2771451L, 2771451L, 2771451L), Name = c("01_HOOFD_010",
"01_HOOFD_015", "01_HOOFD_010", "01_HOOFD_190_2", "01_HOOFD_030_2",
"08_AWB45_020_2", "08_AWB45_040", "01_HOOFD_065_2")), .Names = c("ID",
"Name"), row.names = c(NA, -8L), class = c("data.table", "data.frame"
))
x$two <- gsub( "(.*?)_(.*?)_(.*?)" , "" , x$Name )
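A possible variant of the same idea, sketched here as an alternative and not as the answerer's own code: anchor the pattern at the start of the string and use sub instead of gsub, so only the first two underscore-delimited fields are removed, even if a name had more than four parts. It adds the column by reference with data.table's := operator. The column name code_3 is taken from the question; the table is rebuilt from the sample rows.

```r
library(data.table)

# Sample data copied from the question.
DT <- data.table(
  ID = c(2760925L, 2760925L, 2771451L, 2771451L,
         2771451L, 2771451L, 2771451L, 2771451L),
  Name = c("01_HOOFD_010", "01_HOOFD_015", "01_HOOFD_010", "01_HOOFD_190_2",
           "01_HOOFD_030_2", "08_AWB45_020_2", "08_AWB45_040", "01_HOOFD_065_2")
)

# "^[^_]*_[^_]*_" matches the first two fields and their trailing underscores.
# sub() replaces only that first match, so the rest of Name is kept as is.
DT[, code_3 := sub("^[^_]*_[^_]*_", "", Name)]

print(DT)
```

Because the pattern never looks past the second underscore, records with 3 and with 4 elements are handled by the same expression and no NA handling is needed.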