"lateral join" equivalent in KDB?

Time: 2018-03-25 21:12:59

Tags: join kdb

How do you "unpack" an array-valued column in kdb?

I have a table T which includes an array-valued column C. I want a result in which each row of T is duplicated as many times as it has entries in column C, with each duplicate containing one of the values from that column.

In PostgreSQL this would be a "lateral join".

2 answers:

Answer 0 (score: 4)

Suppose you have the following table:

q)t:([]a:`a`b`c`d;b:(1,();(2;3);4,();(5;6;7)))
q)t
a b    
-------
a ,1   
b 2 3  
c ,4   
d 5 6 7

If you want one row for each value in column b, you can use ungroup:

q) ungroup t
a b
---
a 1
b 2
b 3
c 4
d 5
d 6
d 7

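Answer 1 (score: 1)

The simplest way to flatten nested columns is the ungroup command. It also works when there are several nested columns, provided that within each row the lists in those columns have the same length.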

q)show tab:([]a:`a`b`c;b:(1#`d;`e`f;`g`h);c:(1#1;2 3;4 5))
a b    c
----------
a ,`d  ,1
b `e`f 2 3
c `g`h 4 5
q)ungroup tab
a b c
-----
a d 1
b e 2
b f 3
c g 4
c h 5

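The drawback of this approach is that every nested column is ungrouped, and if the lists within a row have different lengths the command fails: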

q)show tab2:([]a:`a`b`c;b:(1#`d;`e`f;`g`h);c:(1#1;2 3;1#4))
a b    c
----------
a ,`d  ,1
b `e`f 2 3
c `g`h ,4        / different length lists
q)ungroup tab2
'length
  [0]  ungroup tab2
       ^
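
If the other nested column is not needed in the result, one way to sidestep the length error is to select only the columns of interest before calling ungroup. This is a minimal sketch that assumes column b can be dropped, which is not something the question states:

```q
q)tab2:([]a:`a`b`c;b:(1#`d;`e`f;`g`h);c:(1#1;2 3;1#4))
q)ungroup select a,c from tab2     / only c is nested here, so lengths cannot clash
a c
---
a 1
b 2
b 3
c 4
```

The nested column b is lost this way, so this only helps when it is not wanted in the output.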

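One possible way to ungroup by a single column is the function below, which duplicates each row once per element of its c value and leaves the other nested columns as they are: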

q)f:{[t;c]@[t where count each r;c;:;raze r:t c]}
q)f[tab2;`c]
a b    c
--------
a ,`d  1
b `e`f 2
b `e`f 3
c `g`h 4
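
To see why f works, it may help to evaluate its pieces separately on tab2. The sketch below only unrolls the body of f for the column c; the name r is the same local that f uses, defined here at the console for illustration:

```q
q)tab2:([]a:`a`b`c;b:(1#`d;`e`f;`g`h);c:(1#1;2 3;1#4))
q)r:tab2`c                    / the nested column as a list of lists
q)count each r                / number of elements in each row's list
1 2 1
q)where count each r          / row index i repeated (count r i) times
0 1 1 2
q)raze r                      / the flattened values, one per output row
1 2 3 4
q)res:@[tab2 where count each r;`c;:;raze r]   / duplicate rows, then overwrite c
q)res`c
1 2 3 4
```

Here res is the same table that f[tab2;`c] returns. Note that a row whose list in c is empty has a count of 0, so where drops it from the result, just as ungroup would.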