R, GenomicRanges: finding the width of overlapping genomic ranges

Time: 2016-01-14 22:50:47

Tags: r overlap bioconductor

Given two GenomicRanges objects, such as:

library(GenomicRanges)

gr1 <- 
  makeGRangesFromDataFrame(
    data.frame(
      chr = c("1","1","2","2"),
      start = c(10,50,10,50),
      end = c(20,60,20,60)
    )
  )

gr2 <- 
  makeGRangesFromDataFrame(
    data.frame(
      chr = c("2","2","3","3"),
      start = c(15,40,10,50),
      end = c(25,55,20,60)
    )
  )

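I need to find the size (width) of the overlapping segments. In my example this would be 6 (for gr1[3] and gr2[1]) and 6 (for gr1[4] and gr2[2]), since range coordinates are inclusive. The solution given here, which uses ranges() on the Hits object, does not seem to work for GenomicRanges classes: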

mm <- findOverlaps(gr1,gr2)
ranges(mm,gr1,gr2)
  

Error in .local(x, ...) : 'query' must be a Ranges of length equal to number of queries

One would hope that GenomicRanges::subsetByOverlaps() had an argument to clip the ranges and return only the overlapping parts.

UPDATE (see below): the solution is in the package itself, GenomicRanges::intersect(), so:

width(intersect(gr1, gr2))

1 answer:

Answer 0: (score: 0)

The GenomicRanges package has a dedicated function for this, intersect(). So the solution is simple:

width(intersect(gr1, gr2))
  

[1] 6 6

(which is correct)
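
One caveat: intersect() is a set operation. It treats each object as a set of positions, so it does not report which range of gr1 overlaps which range of gr2, and overlapping ranges within one object are merged first. If the width is needed per overlapping pair, a pairwise variant may be a better fit. The following is a minimal sketch, not part of the original answer. It assumes the same coordinates as in the question, builds both objects with a shared set of sequence levels so they can be compared without warnings, and uses the illustrative names hits, overlap, and overlap_widths.

```r
library(GenomicRanges)

# Same coordinates as in the question; a shared Seqinfo is an assumption
# made here so both objects carry identical sequence levels.
si <- Seqinfo(c("1", "2", "3"))
gr1 <- GRanges(
  seqnames = c("1", "1", "2", "2"),
  ranges = IRanges(start = c(10, 50, 10, 50), end = c(20, 60, 20, 60)),
  seqinfo = si
)
gr2 <- GRanges(
  seqnames = c("2", "2", "3", "3"),
  ranges = IRanges(start = c(15, 40, 10, 50), end = c(25, 55, 20, 60)),
  seqinfo = si
)

# One row per overlapping (query, subject) pair
hits <- findOverlaps(gr1, gr2)

# Parallel intersection: clip each query range to its matching subject range
overlap <- pintersect(gr1[queryHits(hits)], gr2[subjectHits(hits)])

# Table of pair indices and overlap widths
overlap_widths <- data.frame(
  query = queryHits(hits),
  subject = subjectHits(hits),
  width = width(overlap)
)
print(overlap_widths)
```

For this data the result has two rows, pairing gr1[3] with gr2[1] and gr1[4] with gr2[2], each with width 6, which matches width(intersect(gr1, gr2)).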