My data looks like this (output of dput(sequence_data)):
structure(list(Obs = 1:13, Seq.1 = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("a", "b", "c"
), class = "factor"), Seq.2 = structure(c(1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("c", "d"), class = "factor"),
Seq.3 = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("", "d", "e"), class = "factor"),
Seq.4 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
1L, 1L, 2L), .Label = c("", "f"), class = "factor")), .Names = c("Obs",
"Seq.1", "Seq.2", "Seq.3", "Seq.4"), class = "data.frame", row.names = c(NA,
-13L))
I am trying to produce a transition plot. This is my code:
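library(Gmisc)     # provides the Transition reference class
library(magrittr)  # provides the %>% pipe
transitions <- table(sequence_data$Seq.1, sequence_data$Seq.2) %>%
  getRefClass("Transition")$new(label=c("1st Iteration", "2nd Iteration"))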
transitions$box_width = 0.25;
transitions$box_label_cex = 0.7;
transitions$arrow_type = "simple";
transitions$arrow_rez = 300;
table(sequence_data$Seq.2,sequence_data$Seq.3) %>% transitions$addTransitions(label = '3rd Iteration')
transitions$render()
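Is it possible to remove the empty values from the plot so that it looks cleaner? I tried dropping them, but the table() call requires its arguments to have the same length.
I am using the Gmisc package (library(Gmisc)) for the plot.
Thanks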
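Answer 0 (score: 3)
This may be a bit hacky, but it will get you there. Basically, you manually set the unwanted entries of the transition tables (transitions$transitions) to 0.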
transitions$transitions[[2]][1,1] = 0
transitions$transitions[[2]][2,1] = 0
transitions$render()
Perhaps this loop can automatically set all such values to 0 (although I have not checked it on larger data):
# For each transition table, zero every column whose name is the empty string
for (level_n in seq_along(transitions$transitions)) {
  x = transitions$transitions[[level_n]]
  for (cols in seq_len(ncol(x))) {
    if (dimnames(x)[[2]][cols] == "") {
      transitions$transitions[[level_n]][, cols] = 0
    }
  }
}
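The column-zeroing step in the loop above can be tried on plain tables first, without Gmisc. The following is only a sketch: zero_blank_columns is a hypothetical helper name, and the data frame is rebuilt with character columns rather than the factors in the question. Note that an empty-string name cannot be used as a character index in R, so the blank column is selected with a logical mask instead.

```r
# Rebuild the question's data with character columns (an assumption; the original uses factors)
sequence_data <- data.frame(
  Obs   = 1:13,
  Seq.1 = rep(c("a", "b", "c"), times = c(4, 6, 3)),
  Seq.2 = rep(c("c", "d"), times = c(4, 9)),
  Seq.3 = c("", "", "", "d", "", "", rep("e", 7)),
  Seq.4 = c(rep("", 9), "f", "", "", "f"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: zero every column whose name is the empty string
zero_blank_columns <- function(tab) {
  tab[, colnames(tab) == ""] <- 0
  tab
}

tabs <- list(
  table(sequence_data$Seq.1, sequence_data$Seq.2),
  table(sequence_data$Seq.2, sequence_data$Seq.3)
)
cleaned <- lapply(tabs, zero_blank_columns)
print(cleaned[[2]])
```

Like the loop, this only looks at column names. A table such as table(Seq.3, Seq.4) would also have a blank row, which this sketch leaves alone.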
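Answer 1 (score: 3)
You can use lapply to set the values to zero for every column whose name is the empty string. Then, when you run transitions$render(), the empty transitions will disappear. At first I thought this could be done simply as follows: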
# Set transitions table columns with a blank name to zeros
transitions$transitions = lapply(transitions$transitions, function(tab) {
  tab[ , which(colnames(tab)=="")] = 0
  tab
})
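However, lapply drops the "transitions" attribute from the output list, which causes an error (if anyone knows a way around this, please let me know). So I save the updated list in a temporary object called tmp, restore the "transitions" attribute, and then reset the value of transitions$transitions: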
# Set transitions table columns with a blank name to zeros
tmp = lapply(transitions$transitions, function(tab) {
  tab[ , which(colnames(tab)=="")] = 0
  tab
})
# Restore "transitions" attribute
attributes(tmp)$transitions = TRUE
# Set transitions to the new values we just created
transitions$transitions = tmp
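The attribute loss described above is ordinary lapply behaviour and can be reproduced on a plain list. The sketch below also shows the usual base-R idiom for keeping attributes, assigning into x[] instead of replacing x. It uses a stand-in list with a made-up "transitions" attribute. Whether the same idiom works directly on the transitions field of a Gmisc Transition object has not been checked here.

```r
# Stand-in list carrying a custom attribute (not a real Transition object)
x <- list(a = 1:3, b = 4:6)
attr(x, "transitions") <- TRUE

# lapply returns a fresh list: names survive, other attributes do not
dropped <- lapply(x, function(v) v * 2)
print(is.null(attr(dropped, "transitions")))

# Assigning into x[] replaces the elements but keeps the list's attributes
kept <- x
kept[] <- lapply(x, function(v) v * 2)
print(isTRUE(attr(kept, "transitions")))
```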
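While working on this, I wondered what is supposed to happen to the blank transition values. The resulting plot seems misleading, because it appears to show that all c values in the 2nd iteration go to d and all d values in the 2nd iteration go to e. In fact, 5 of the 13 values go to "" (i.e. the empty string). Do they just disappear? If so, shouldn't the total height of the 3rd iteration bar be 8/13 of the height of the 1st and 2nd iteration bars? Or maybe try something like this, just to show that some of the values transition into oblivion: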
transitions$fill_clr[[3]] = c("white", transitions$fill_clr[[3]][-1])
transitions$render()
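The counts behind that question can be checked directly from the data. This is a small sketch that rebuilds the question's data frame with character columns (an assumption; the original uses factors) and tallies how many observations land in the blank state at the 3rd iteration.

```r
sequence_data <- data.frame(
  Obs   = 1:13,
  Seq.1 = rep(c("a", "b", "c"), times = c(4, 6, 3)),
  Seq.2 = rep(c("c", "d"), times = c(4, 9)),
  Seq.3 = c("", "", "", "d", "", "", rep("e", 7)),
  stringsAsFactors = FALSE
)

tab23 <- table(sequence_data$Seq.2, sequence_data$Seq.3)
blank <- colnames(tab23) == ""     # logical mask; "" cannot be used as a name index

lost      <- sum(tab23[, blank])   # observations that go to ""
remaining <- sum(tab23[, !blank])  # observations that go to d or e
print(c(lost = lost, remaining = remaining))
# lost = 5, remaining = 8
```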
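Or do the blanks actually represent values that stayed unchanged in the transition from the 2nd to the 3rd iteration? If that is the case, it would probably be better to fill each blank with the corresponding value from the preceding iteration. The plot for that case can be produced as follows: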
library(zoo)
# Convert empty values to NA
sequence_data[sequence_data==""] = NA
# Fill NA values with last value carried forward
sequence_data=as.data.frame(t(apply(sequence_data, 1, na.locf)))
transitions <- table(sequence_data$Seq.1, sequence_data$Seq.2) %>%
getRefClass("Transition")$new(label=c("1st Iteration", "2nd Iteration"))
transitions$box_width = 0.25;
transitions$box_label_cex = 1;
transitions$box_cex = 2;
transitions$arrow_type = "simple";
transitions$arrow_rez = 300;
table(sequence_data$Seq.2,sequence_data$Seq.3) %>% transitions$addTransitions(label = '3rd Iteration')
transitions$render()
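For comparison, the same carry-forward fill can be sketched in base R without zoo. This variant works column by column, so Obs stays an integer instead of being converted to character by apply. It assumes the Seq columns are character vectors with "" marking blanks; with factor columns they would need as.character first. The name filled is introduced here for illustration.

```r
sequence_data <- data.frame(
  Obs   = 1:13,
  Seq.1 = rep(c("a", "b", "c"), times = c(4, 6, 3)),
  Seq.2 = rep(c("c", "d"), times = c(4, 9)),
  Seq.3 = c("", "", "", "d", "", "", rep("e", 7)),
  Seq.4 = c(rep("", 9), "f", "", "", "f"),
  stringsAsFactors = FALSE
)

seq_cols <- grep("^Seq\\.", names(sequence_data), value = TRUE)
filled <- sequence_data

# Walk left to right; each blank takes the (already filled) value to its left
for (i in seq_along(seq_cols)[-1]) {
  cur  <- seq_cols[i]
  prev <- seq_cols[i - 1]
  blank <- filled[[cur]] == ""
  filled[[cur]][blank] <- filled[[prev]][blank]
}

print(table(filled$Seq.2, filled$Seq.3))
```

The resulting filled data frame can be passed to table() in place of sequence_data when building the Transition object.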