NAs are not allowed

Date: 2014-09-18 22:00:50

Tags: r na subscript

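I am new to R, so please forgive me if I have missed some protocol, but here is my problem: I am creating temporary vectors in order to add '0's where they are needed. I ultimately want every value to consist of 12 digits, and where that is not the case I will add as many '0's as are needed. However, after trying to paste the appropriate zeros onto my temporary vector using the indices, I receive the error message shown after the sample data. Here is my code: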

colnames(ALLMBRS) <- c("SSN","tracts","GeoBlock","GeoCodeBlck","GeoMatch") #TA Members Tracts
#Remove special characters and decimals
tmp1 <- str_replace_all(ALLMBRS$GeoCode,"[[:punct:]]","")
#Temporary Vector of ALLMBRS
tmp2 <- tmp1
#Vectors of Indices used to add 0's
add1 <- str_length(ALLMBRS$tracts) == 11
add2 <- str_length(ALLMBRS$tracts) == 10
add3 <- str_length(ALLMBRS$tracts) == 9
add4 <- str_length(ALLMBRS$tracts) == 8
add5 <- str_length(ALLMBRS$tracts) == 7
#Paste temporary vector indices into temporary vector
tmp2[add1] <- paste(tmp2[add1],"0",sep="")
tmp2[add2] <- paste(tmp2[add2],"00",sep="")
tmp2[add3] <- paste(tmp2[add3],"000",sep="")
tmp2[add4] <- paste(tmp2[add4],"0000",sep="")
tmp2[add5] <- paste(tmp2[add5],"00000",sep="")

Sample data:

 [1] "0"            "0"            "0"            "0"            "0"            "0"           
 [7] "0"            "360010146121" "720210310133" "0"            "517100023001" "90034808002" 
[13] "250158202021" "250158211004" "250138125003" "290470203002" "250138124031" "250158202033"
[19] "250138019012" "250138112002"

I want every value to contain 12 digits, so I would like to see:

[1]000000000000 

[12]900348080020

Error Message: Error in tmp2[add1] <- paste(tmp2[add1],"0",sep = ""):
NAs are not allowed in subscripted assignments

If there are NAs in my data, how can I get around this so that I can complete the task? Thanks for your help.

1 answer:

Answer 0 (score: 1)

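You can pad the strings with str_pad from the stringr package. Set the width to 12 and the pad argument to "0"; by default the padding is added on the left.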

> x <- c("0", "0", "0", "0", "0", "0", "0", "360010146121",
         "720210310133", "0", "517100023001", "90034808002",
         "250158202021", "250158211004", "250138125003", 
         "290470203002", "250138124031", "250158202033",
         "250138019012", "250138112002")
> library(stringr)
> str_pad(x, 12, pad = "0")
# [1] "000000000000" "000000000000" "000000000000" "000000000000"
# [5] "000000000000" "000000000000" "000000000000" "360010146121"
# [9] "720210310133" "000000000000" "517100023001" "090034808002"
#[13] "250158202021" "250158211004" "250138125003" "290470203002"
#[17] "250138124031" "250158202033" "250138019012" "250138112002"
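
One thing to watch: str_pad pads on the left by default, which gives "090034808002" for element 12, whereas the output requested in the question ("900348080020") has the zero appended on the right. If trailing zeros are what is wanted, the side argument of str_pad can be set to "right". A small sketch on a subset of the sample data (the names x_sub, left and right are only for illustration):

```r
library(stringr)

# Three values taken from the sample data in the question
x_sub <- c("0", "90034808002", "250158202021")

# Default behaviour: zeros are added on the left
left <- str_pad(x_sub, 12, pad = "0")
left
# [1] "000000000000" "090034808002" "250158202021"

# side = "right": zeros are appended, as in the question's expected output
right <- str_pad(x_sub, 12, side = "right", pad = "0")
right
# [1] "000000000000" "900348080020" "250158202021"
```

Which side is correct depends on what the codes mean, so it is worth checking against the data source before choosing.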

Update: for a vector that contains NA values, you can do

x[!is.na(x)] <- str_pad(x[!is.na(x)], 12, pad = "0")

to pad the values and leave the NAs untouched. For example,

> y <- c("0", NA, "123", "68")
> y[!is.na(y)] <- str_pad(y[!is.na(y)], 12, pad = "0")
> y
# [1] "000000000000" NA             "000000000123" "000000000068"
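
As for the error itself: R refuses a subscripted assignment when the index contains NA and the replacement has more than one element, because it is ambiguous whether a replacement element should be used up for the NA position. In the question, str_length() returns NA for missing values, so the logical indices add1 to add5 presumably contain NA. A minimal base R sketch that reproduces the message and shows how which() avoids it (the names v, idx, msg and keep are made up for the example):

```r
v <- c("a", NA, "ccc")
idx <- c(TRUE, NA, FALSE)   # a logical index that contains NA

# Assigning a length-2 replacement through an index with NA fails
msg <- tryCatch({
  v[idx] <- paste0(v[idx], "0")
  "no error"
}, error = function(e) conditionMessage(e))
msg
# [1] "NAs are not allowed in subscripted assignments"

# which() keeps only the positions that are TRUE and drops the NA
keep <- which(idx)
v[keep] <- paste0(v[keep], "0")
v
# [1] "a0"  NA    "ccc"
```

With str_pad none of the add1 to add5 indices are needed, so this only matters if the manual paste approach is kept.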