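Merge based on multiple ranges

Time: 2018-11-15 14:31:31

Tags: r data.table sqldf

I want to merge two data frames on two ranges. A representative example is below. The sqldf solution works, but I would like to know whether there is a better way to do this (for example, with data.table).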

library(data.table)
library(magrittr)  # provides %>%
library(sqldf)

base <- data.frame(lower1 = c(12, 12, 3, 2), upper1 = c(20, 20, 20, 4), 
                   lower2 = c(12, 12, 3, 2), upper2 = c(20, 20, 20, 4)) %>% 
  data.table()

more_info <- data.frame(color = 'red', value1 = 4, value2 = 4, thing1 = 5, thing2 = 5) %>% 
  data.table()

setkey(base,      lower1, upper1, lower2, upper2)
setkey(more_info, value1, value2, thing1, thing2)

# works
sqldf('select * from base left join more_info
      on (    base.lower1 <= more_info.value1 and base.upper1 >= more_info.value1
          and base.lower2 <= more_info.thing1 and base.upper2 >= more_info.thing1)')

# doesn't work but is what i would like to do
setkey(base,      lower1, upper1, lower2, upper2)
setkey(more_info, value1, value2, thing1, thing2)

foverlaps(more_info, base, by.x = key(more_info), by.y = key(base), type = 'within', 
          mult = 'all', nomatch = NA)

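As a bit of background, I have a matching algorithm whose run time I need to improve. It works by filtering a large set of loans down to a smaller number of potential matches based on certain characteristics. I then apply whatever other statistical techniques are necessary to find the best match. The delay comes from repeatedly filtering the large data set, once for every item being matched, to cut down the number of potential matches. My goal is to find a faster way to create a data frame of potential matches, and then use group-by and other vectorized functions to finish the matching process.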
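To make the intended workflow concrete, here is a minimal sketch of the two steps: build all candidate pairs in one range join, then pick the best candidate per group. The tables `loans` and `targets`, their columns, and the "closest rate" criterion are hypothetical stand-ins for illustration, not the real data or matching rule.

```r
library(data.table)

# Hypothetical data: each loan has a rate and a balance; each target
# defines an acceptable range for both, plus a desired rate.
loans   <- data.table(loan_id = 1:4,
                      rate    = c(4, 5, 6, 15),
                      balance = c(5, 9, 30, 10))
targets <- data.table(target_id = 1:2,
                      rate_lo = c(3, 12), rate_hi = c(8, 20),
                      bal_lo  = c(3, 5),  bal_hi  = c(10, 20),
                      want_rate = c(5.2, 14))

# Step 1: a single non-equi join builds every (target, loan) candidate pair.
# The x. prefix retrieves the original values from loans.
candidates <- loans[targets,
                    .(target_id, loan_id = x.loan_id, rate = x.rate, want_rate),
                    on = .(rate >= rate_lo, rate <= rate_hi,
                           balance >= bal_lo, balance <= bal_hi),
                    nomatch = NULL]

# Step 2: a grouped pass keeps the candidate whose rate is closest
# to the desired rate (an assumed criterion) for each target.
best <- candidates[, .SD[which.min(abs(rate - want_rate))], by = target_id]
print(best)
```

With this toy data, target 1 has two candidates (loans 1 and 2) and target 2 has one (loan 4), so the repeated per-item filtering is replaced by one join and one grouped operation.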

1 Answer:

Answer 0: (score: 1)

Something like this, using a non-equi join:

more_info[base, .(lower1, upper1, lower2, upper2, color, value1 = x.value1, 
                       value2 = x.value2, thing1 = x.thing1, thing2 = x.thing2), 
          on = .(value1 >= lower1, value1 <= upper1, thing1 >= lower2, thing1 <= upper2)]