R: make a circle/chord diagram with circlize from a data frame

Time: 2014-12-16 08:01:42

Tags: r plot chord-diagram circlize

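I would like to make a chord diagram using the circlize package. I have a data frame about cars with four columns. The first two columns hold the brand and model of the car the respondent owned, and the last two columns hold the brand and model the respondent migrated to.

Here is a simple example of the data frame: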

   Brand_from model_from Brand_to Model_to
1:      VOLVO        s80      BMW  5series
2:        BMW    3series      BMW  3series
3:      VOLVO        s60    VOLVO      s60
4:      VOLVO        s60    VOLVO      s80
5:        BMW    3series     AUDI       s4
6:       AUDI         a4      BMW  3series
7:       AUDI         a5     AUDI       a5

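It would be great to be able to turn this into a chord diagram. I found a working example in the help, but I cannot get my data into the right format to make the plot. The code below is from the help of the circlize package. It produces one layer, and I think I need two: brand and model.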

mat = matrix(1:18, 3, 6)
rownames(mat) = paste0("S", 1:3)
colnames(mat) = paste0("E", 1:6)

rn = rownames(mat)
cn = colnames(mat)
factors = c(rn, cn)
factors = factor(factors, levels = factors)
col_sum = apply(mat, 2, sum)
row_sum = apply(mat, 1, sum)
xlim = cbind(rep(0, length(factors)), c(row_sum, col_sum))

par(mar = c(1, 1, 1, 1))
circos.par(cell.padding = c(0, 0, 0, 0))
circos.initialize(factors = factors, xlim = xlim)
circos.trackPlotRegion(factors = factors, ylim = c(0, 1), bg.border = NA,
                       bg.col = c("red", "green", "blue", rep("grey", 6)), track.height = 0.05,
                       panel.fun = function(x, y) {
                         sector.name = get.cell.meta.data("sector.index")
                         xlim = get.cell.meta.data("xlim")
                         circos.text(mean(xlim), 1.5, sector.name, adj = c(0.5, 0))
})

col = c("#FF000020", "#00FF0020", "#0000FF20")
for(i in seq_len(nrow(mat))) {
  for(j in seq_len(ncol(mat))) {
    circos.link(rn[i], c(sum(mat[i, seq_len(j-1)]), sum(mat[i, seq_len(j)])),
                cn[j], c(sum(mat[seq_len(i-1), j]), sum(mat[seq_len(i), j])),
                col = col[i], border = "white")
  }
}
circos.clear()

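This code produces the following plot:

[image: plot produced by the code above]

The ideal result would look like the example below, but with car brands instead of continents, and with the models belonging to each brand on the inner circle.

[image: example chord diagram]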

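3 answers:

Answer 0: (score: 8)

The key here is to convert your data into a matrix (an adjacency matrix in which rows correspond to 'from' and columns correspond to 'to').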

df = read.table(textConnection("
 Brand_from model_from Brand_to Model_to
      VOLVO        s80      BMW  5series
        BMW    3series      BMW  3series
      VOLVO        s60    VOLVO      s60
      VOLVO        s60    VOLVO      s80
        BMW    3series     AUDI       s4
       AUDI         a4      BMW  3series
       AUDI         a5     AUDI       a5
"), header = TRUE, stringsAsFactors = FALSE)

from = paste(df[[1]], df[[2]], sep = ",")
to = paste(df[[3]], df[[4]], sep = ",")

mat = matrix(0, nrow = length(unique(from)), ncol = length(unique(to)))
rownames(mat) = unique(from)
colnames(mat) = unique(to)
for(i in seq_along(from)) mat[from[i], to[i]] = 1

The value of mat is

> mat
            BMW,5series BMW,3series VOLVO,s60 VOLVO,s80 AUDI,s4 AUDI,a5
VOLVO,s80             1           0         0         0       0       0
BMW,3series           0           1         0         0       1       0
VOLVO,s60             0           0         1         1       0       0
AUDI,a4               0           1         0         0       0       0
AUDI,a5               0           0         0         0       0       1
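The loop above writes 1 into each cell, so a transition that appears in several rows of the data frame would still be stored as 1. Below is a minimal sketch of a counting variant, assuming base R's table() is acceptable for this step. The name count_mat is made up for this illustration, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Example data, same seven rows as in the question
df <- data.frame(
  Brand_from = c("VOLVO", "BMW", "VOLVO", "VOLVO", "BMW", "AUDI", "AUDI"),
  model_from = c("s80", "3series", "s60", "s60", "3series", "a4", "a5"),
  Brand_to   = c("BMW", "BMW", "VOLVO", "VOLVO", "AUDI", "BMW", "AUDI"),
  Model_to   = c("5series", "3series", "s60", "s80", "s4", "3series", "a5"),
  stringsAsFactors = FALSE
)

from <- paste(df$Brand_from, df$model_from, sep = ",")
to   <- paste(df$Brand_to, df$Model_to, sep = ",")

# table() counts how often each (from, to) pair occurs
tab <- table(from, to)

# Convert to a plain numeric matrix with row and column names
# (count_mat is a hypothetical name, not part of circlize)
count_mat <- matrix(as.vector(tab), nrow = nrow(tab),
                    dimnames = list(rownames(tab), colnames(tab)))
```

With these seven rows every transition is unique, so count_mat holds the same 0/1 values as mat, only with rows and columns in sorted order. The two differ only when a transition is repeated in the data.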

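Then send the matrix to chordDiagram, specifying order and directional. order is specified manually to make sure that sectors of the same brand are grouped together.

library(circlize)
par(mar = c(1, 1, 1, 1))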
chordDiagram(mat, order = sort(union(from, to)), directional = TRUE)
circos.clear()

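To make the figure more elaborate, you can create a track for brand names, a track for brand identification lines, and a track for model names. We can also make the gap between brands larger than the gap inside each brand.

1. Set gap.degree:

circos.par(gap.degree = c(2, 2, 8, 2, 8, 2, 8))

This must be done before drawing the chord diagram.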
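The gap vector above is written by hand for three AUDI, two BMW and two VOLVO sectors. As a rough sketch, it could also be derived from the sector names, assuming the sectors are already in plotting order and each name has the form "BRAND,model". The names sectors, brand_of, is_last and gap are made up for this illustration.

```r
# Sector names in plotting order (assumed to match sort(union(from, to)))
sectors <- c("AUDI,a4", "AUDI,a5", "AUDI,s4",
             "BMW,3series", "BMW,5series",
             "VOLVO,s60", "VOLVO,s80")

# Brand is the part before the comma
brand_of <- sub(",.*$", "", sectors)

# A sector is the last one of its brand if the next sector has a
# different brand; the final sector always closes a brand
is_last <- c(brand_of[-1] != brand_of[-length(brand_of)], TRUE)

# Wide gap (8 degrees) after each brand, narrow gap (2 degrees) otherwise
gap <- ifelse(is_last, 8, 2)
```

The resulting gap could then be passed as circos.par(gap.degree = gap) in place of the literal vector.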

2. We create two empty tracks, one for the brand names and one for the identification lines, through the preAllocateTracks argument.

par(mar = c(1, 1, 1, 1))
chordDiagram(mat, order = sort(union(from, to)),
    directional = TRUE, annotationTrack = "grid", preAllocateTracks = list(
        list(track.height = 0.02),
        list(track.height = 0.02))
)

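3. Add the model names to the annotation track (this track is created by default; it is the thicker track in both the left and right figures. Note that it is the third track counting from the outside of the circle to the inside).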

circos.trackPlotRegion(track.index = 3, panel.fun = function(x, y) {
    xlim = get.cell.meta.data("xlim")
    ylim = get.cell.meta.data("ylim")
    sector.index = get.cell.meta.data("sector.index")
    model = strsplit(sector.index, ",")[[1]][2]
    circos.text(mean(xlim), mean(ylim), model, col = "white", cex = 0.8, facing = "inside", niceFacing = TRUE)
}, bg.border = NA)

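4. Add the brand identification lines. Because a brand covers more than one sector, we need to calculate the start and end degrees of the line (arc) manually. In the code below, rou1 and rou2 are the heights of the two borders of the second track. The identification lines are drawn in the second track.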

all_sectors = get.all.sector.index()
rou1 = get.cell.meta.data("yplot", sector.index = all_sectors[1], track.index = 2)[1]
rou2 = get.cell.meta.data("yplot", sector.index = all_sectors[1], track.index = 2)[2]

start.degree = get.cell.meta.data("xplot", sector.index = all_sectors[1], track.index = 2)[1]
end.degree = get.cell.meta.data("xplot", sector.index = all_sectors[3], track.index = 2)[2]
draw.sector(start.degree, end.degree, rou1, rou2, clock.wise = TRUE, col = "red", border = NA)

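5. First get the coordinates of the text in the polar coordinate system, then map them to the data coordinate system with reverse.circlize. Note that the cell you map the coordinates into and the cell in which you draw the text should be the same cell.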

m = reverse.circlize( (start.degree + end.degree)/2, 1, sector.index = all_sectors[1], track.index = 1)
circos.text(m[1, 1], m[1, 2], "AUDI", cex = 1.2, facing = "inside", adj = c(0.5, 0), niceFacing = TRUE, 
    sector.index = all_sectors[1], track.index = 1)

For the other two brands, use the same code.

start.degree = get.cell.meta.data("xplot", sector.index = all_sectors[4], track.index = 2)[1]
end.degree   = get.cell.meta.data("xplot", sector.index = all_sectors[5], track.index = 2)[2]
draw.sector(start.degree, end.degree, rou1, rou2, clock.wise = TRUE, col = "green", border = NA)
m = reverse.circlize( (start.degree + end.degree)/2, 1, sector.index = all_sectors[1], track.index = 1)
circos.text(m[1, 1], m[1, 2], "BMW", cex = 1.2, facing = "inside", adj = c(0.5, 0), niceFacing = TRUE, 
    sector.index = all_sectors[1], track.index = 1)

start.degree = get.cell.meta.data("xplot", sector.index = all_sectors[6], track.index = 2)[1]
end.degree  = get.cell.meta.data("xplot", sector.index = all_sectors[7], track.index = 2)[2]
draw.sector(start.degree, end.degree, rou1, rou2, clock.wise = TRUE, col = "blue", border = NA)
m = reverse.circlize( (start.degree + end.degree)/2, 1, sector.index = all_sectors[1], track.index = 1)
circos.text(m[1, 1], m[1, 2], "VOLVO", cex = 1.2, facing = "inside", adj = c(0.5, 0), niceFacing = TRUE, 
    sector.index = all_sectors[1], track.index = 1)

circos.clear()
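The code above hard-codes which sectors open and close each brand (1 and 3, 4 and 5, 6 and 7). A small sketch of how those index pairs could be computed instead, assuming the sector names are in plotting order and follow the "BRAND,model" pattern. brand_range is a made-up name, and the snippet only computes indices; it does not draw anything.

```r
# Sector names in plotting order (what get.all.sector.index() is
# expected to return for this example; listed here so the snippet
# runs without a plot)
sectors <- c("AUDI,a4", "AUDI,a5", "AUDI,s4",
             "BMW,3series", "BMW,5series",
             "VOLVO,s60", "VOLVO,s80")

brand_of <- sub(",.*$", "", sectors)

# For every brand, the index of its first and last sector
brand_range <- lapply(split(seq_along(sectors), brand_of), range)

# Each pair can replace the hard-coded indices, e.g. for "BMW":
#   all_sectors[brand_range[["BMW"]][1]]  # first sector of the brand
#   all_sectors[brand_range[["BMW"]][2]]  # last sector of the brand
```

Looping over names(brand_range) would then allow one block of draw.sector and circos.text calls to serve all brands.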

If you want to set colors, please see the package vignette. If needed, you can also use circos.axis to add axes to the plot.

[image: resulting chord diagram]

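Answer 1: (score: 8)

I have updated the package a bit since then, and there is now a simpler way to do this. I am adding another answer here in case someone is interested.

In the most recent versions of circlize, chordDiagram() accepts both an adjacency matrix and an adjacency list as input, which means you can now pass a data frame containing pairwise relations to the function. There is also a highlight.sector() function that can highlight or label more than one sector at a time.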
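To make the two input shapes concrete, here is a minimal base-R sketch that turns a small adjacency matrix into an adjacency list, that is, a data frame with one row per non-zero link. The toy matrix, the column names from, to and value, and the name adj_list are assumptions chosen for this illustration.

```r
# A toy adjacency matrix: rows are 'from', columns are 'to'
mat <- matrix(c(1, 0, 2, 3), nrow = 2,
              dimnames = list(c("a", "b"), c("c", "d")))

# One row per matrix cell; as.vector() reads the matrix column by column,
# so the row names cycle fastest and the column names change slowest
adj_list <- data.frame(
  from  = rep(rownames(mat), times = ncol(mat)),
  to    = rep(colnames(mat), each = nrow(mat)),
  value = as.vector(mat),
  stringsAsFactors = FALSE
)

# Keep only the links that actually exist
adj_list <- adj_list[adj_list$value != 0, ]
```

In the car example the data frame is already in this long form, with one row per observed move, which is why two of its columns can be handed over directly.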

I will reproduce the plot I showed before, but with shorter code:

df = read.table(textConnection("
 brand_from model_from brand_to model_to
      VOLVO        s80      BMW  5series
        BMW    3series      BMW  3series
      VOLVO        s60    VOLVO      s60
      VOLVO        s60    VOLVO      s80
        BMW    3series     AUDI       s4
       AUDI         a4      BMW  3series
       AUDI         a5     AUDI       a5
"), header = TRUE, stringsAsFactors = FALSE)

brand = c(structure(df$brand_from, names=df$model_from),
          structure(df$brand_to,names= df$model_to))
brand = brand[!duplicated(names(brand))]
brand = brand[order(brand, names(brand))]
brand_color = structure(2:4, names = unique(brand))
model_color = structure(2:8, names = names(brand))

The values of brand, brand_color and model_color are:

> brand
     a4      a5      s4 3series 5series     s60     s80
 "AUDI"  "AUDI"  "AUDI"   "BMW"   "BMW" "VOLVO" "VOLVO"
> brand_color
 AUDI   BMW VOLVO
    2     3     4
> model_color
     a4      a5      s4 3series 5series     s60     s80
      2       3       4       5       6       7       8
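Because brand is deduplicated by model name, this approach relies on every model name belonging to exactly one brand. A hedged sketch of a sanity check for that assumption follows. The names pairs, brands_per_model and ambiguous are made up here, and the data frame is rebuilt inline so the snippet is self-contained.

```r
df <- data.frame(
  brand_from = c("VOLVO", "BMW", "VOLVO", "VOLVO", "BMW", "AUDI", "AUDI"),
  model_from = c("s80", "3series", "s60", "s60", "3series", "a4", "a5"),
  brand_to   = c("BMW", "BMW", "VOLVO", "VOLVO", "AUDI", "BMW", "AUDI"),
  model_to   = c("5series", "3series", "s60", "s80", "s4", "3series", "a5"),
  stringsAsFactors = FALSE
)

# All distinct (model, brand) combinations from both sides of the move
pairs <- unique(data.frame(
  model = c(df$model_from, df$model_to),
  brand = c(df$brand_from, df$brand_to),
  stringsAsFactors = FALSE
))

# A model listed under more than one brand would be ambiguous as a sector name
brands_per_model <- table(pairs$model)
ambiguous <- names(brands_per_model)[as.vector(brands_per_model > 1)]
```

For the example data ambiguous is empty. If it were not, pasting brand and model together, as in the earlier answer, would be one way to keep sector names unique.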

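This time we add only one additional track, which holds the lines and the brand names. You can also see that the input variable is actually a data frame (df[, c(2, 4)]).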

library(circlize)
gap.degree = do.call("c", lapply(table(brand), function(i) c(rep(2, i-1), 8)))
circos.par(gap.degree = gap.degree)

chordDiagram(df[, c(2, 4)], order = names(brand), grid.col = model_color,
    directional = 1, annotationTrack = "grid", preAllocateTracks = list(
        list(track.height = 0.02))
)

As before, the model names are added manually:

circos.trackPlotRegion(track.index = 2, panel.fun = function(x, y) {
    xlim = get.cell.meta.data("xlim")
    ylim = get.cell.meta.data("ylim")
    sector.index = get.cell.meta.data("sector.index")
    circos.text(mean(xlim), mean(ylim), sector.index, col = "white", cex = 0.6, facing = "inside", niceFacing = TRUE)
}, bg.border = NA)

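Finally, we add the lines and the brand names with the highlight.sector() function. Here the value of sector.index can be a vector of length greater than 1, and the line (or thin rectangle) will cover all of the specified sectors. A label is added in the middle of the sectors, and its radial position is controlled by the text.vjust option.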

for(b in unique(brand)) {
  model = names(brand[brand == b])
  highlight.sector(sector.index = model, track.index = 1, col = brand_color[b], 
    text = b, text.vjust = -1, niceFacing = TRUE)
}

circos.clear()
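The expression names(brand[brand == b]) in the loop picks out the models of one brand at a time. As a small illustration of that grouping, here is a sketch using split() on the same named vector. models_by_brand is a made-up name, and brand is retyped here so the snippet runs without the earlier code.

```r
# Same content as the brand vector printed earlier: names are models,
# values are brands
brand <- structure(
  c("AUDI", "AUDI", "AUDI", "BMW", "BMW", "VOLVO", "VOLVO"),
  names = c("a4", "a5", "s4", "3series", "5series", "s60", "s80")
)

# One character vector of model names per brand
models_by_brand <- split(names(brand), brand)
```

Each element of models_by_brand is what the loop passes as sector.index for the corresponding brand.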

[image: resulting chord diagram]

Answer 2: (score: -1)

Read in your data with read.table, which gives a 7x4 data.frame (brand.txt should be tab-delimited).

mt <- read.table("//your-path/brand.txt", header = TRUE, sep = "\t", na.strings = "NA")

Your variable names (in mt) are "Brand_from", "model_from", "Brand_to" and "Model_to". Select the two variables you are interested in, for example:

mat <- table(mt$Brand_from, mt$model_from)

This results in the following table:

# >mat
#        3series a4 a5 s60 s80
# AUDI        0  1  1   0   0
# BMW         2  0  0   0   0
# VOLVO       0  0  0   2   1

You can then run your circlize script starting from "rn = rownames(mat)", as provided in your question.
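As a variation on the same idea, and only as a sketch: if the two selected variables are Brand_from and Brand_to, table() gives a brand-level from/to matrix. The data frame below retypes the two brand columns from the question so the snippet does not depend on brand.txt, and brand_mat is a made-up name.

```r
# The two brand columns from the question's example data
mt <- data.frame(
  Brand_from = c("VOLVO", "BMW", "VOLVO", "VOLVO", "BMW", "AUDI", "AUDI"),
  Brand_to   = c("BMW", "BMW", "VOLVO", "VOLVO", "AUDI", "BMW", "AUDI"),
  stringsAsFactors = FALSE
)

# Rows are the brand left, columns are the brand moved to,
# cells count how many respondents made that move
brand_mat <- table(from = mt$Brand_from, to = mt$Brand_to)
```

Such a matrix describes a single layer (brands only), so it does not by itself give the two-level brand and model layout asked for in the question.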

[image]
