Fast replacement in a data.table

Time: 2019-04-17 15:04:44

Tags: r data.table stringi

Given the following table:

df <- structure(list(V1 = c("Prodigal_2|LOCUS_00010", "Prodigal_2|LOCUS_00010", 
"Prodigal_2|LOCUS_00010", "Prodigal_2|LOCUS_00010", "Prodigal_2|LOCUS_00010", 
"Prodigal_2|LOCUS_00010"), V2 = c("WP_001212884.1", "WP_042596810.1", 
"WP_131250681.1", "WP_001212880.1", "WP_016079538.1", "WP_086396124.1"
), V3 = c(100, 99.7, 99.7, 99.7, 99.7, 99.7), V4 = c(381L, 381L, 
381L, 381L, 381L, 381L), V5 = c(0L, 1L, 1L, 1L, 1L, 1L), V6 = c(0L, 
0L, 0L, 0L, 0L, 0L), V7 = c(1L, 1L, 1L, 1L, 1L, 1L), V8 = c(381L, 
381L, 381L, 381L, 381L, 381L), V9 = c(1L, 1L, 1L, 1L, 1L, 1L), 
    V10 = c(381L, 381L, 381L, 381L, 381L, 381L), V11 = c(1.3e-206, 
    1.7e-206, 1.7e-206, 3e-206, 3e-206, 3e-206), V12 = c(728, 
    727.6, 727.6, 726.9, 726.9, 726.9)), row.names = c(NA, -6L
), class = c("data.table", "data.frame"))

which looks like this:

                       V1             V2    V3  V4 V5 V6 V7  V8 V9 V10      V11 V12
1: Prodigal_2|LOCUS_00010 WP_001212884.1 100.0 381  0  0  1 381  1 381 1.3e-206 728
2: Prodigal_2|LOCUS_00010 WP_042596810.1  99.7 381  1  0  1 381  1 381 1.7e-206 728
3: Prodigal_2|LOCUS_00010 WP_131250681.1  99.7 381  1  0  1 381  1 381 1.7e-206 728
4: Prodigal_2|LOCUS_00010 WP_001212880.1  99.7 381  1  0  1 381  1 381 3.0e-206 727
5: Prodigal_2|LOCUS_00010 WP_016079538.1  99.7 381  1  0  1 381  1 381 3.0e-206 727
6: Prodigal_2|LOCUS_00010 WP_086396124.1  99.7 381  1  0  1 381  1 381 3.0e-206 727

I want to replace every |LOCUS_XXXXX substring in column V1 with nothing, so that the result looks like this:

          V1             V2    V3  V4 V5 V6 V7  V8 V9 V10      V11 V12
1 Prodigal_2 WP_001212884.1 100.0 381  0  0  1 381  1 381 1.3e-206 728
2 Prodigal_2 WP_042596810.1  99.7 381  1  0  1 381  1 381 1.7e-206 728
3 Prodigal_2 WP_131250681.1  99.7 381  1  0  1 381  1 381 1.7e-206 728
4 Prodigal_2 WP_001212880.1  99.7 381  1  0  1 381  1 381 3.0e-206 727
5 Prodigal_2 WP_016079538.1  99.7 381  1  0  1 381  1 381 3.0e-206 727
6 Prodigal_2 WP_086396124.1  99.7 381  1  0  1 381  1 381 3.0e-206 727

I tried the following:

Lookup <- c("\\|LOCUS_[0-9]+")
Rename <- ""

library(data.table)
library(stringi)

setDT(df)[, Result := Rename[stri_detect_regex(V1, Lookup)], by = V1]

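The "Result" column comes out empty. Ideally, I would like to do the replacement directly in column V1. The table is large, with 2.2 million rows.

1 answer:

Answer 0: (score: 2)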

We need str_replace rather than str_detect. A detect function only reports TRUE or FALSE for whether the pattern matches; it does not change the string.
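The answer's code did not survive on this page, so the following is only a minimal sketch of the idea, not the original author's code. It assumes stringi's stri_replace_first_regex as a stand-in for stringr's str_replace (the question already loads stringi), and it uses a small hypothetical table dt with just two of the question's columns.

```r
library(data.table)
library(stringi)

# Hypothetical stand-in for the question's table (only V1 and V2 kept)
dt <- data.table(
  V1 = rep("Prodigal_2|LOCUS_00010", 3),
  V2 = c("WP_001212884.1", "WP_042596810.1", "WP_131250681.1")
)

# Same pattern as in the question: a literal "|", then "LOCUS_" and digits
Lookup <- "\\|LOCUS_[0-9]+"

# := updates V1 by reference, so the table is not copied.
# The replacement is vectorised over the whole column, so no `by` is needed.
dt[, V1 := stri_replace_first_regex(V1, Lookup, "")]

print(dt)
```

Dropping by = V1 avoids a grouping pass that adds nothing here, which should matter on a table with millions of rows. Base R's sub(Lookup, "", V1) would be an alternative if an extra package is not wanted.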