Performing joins in pandas

Time: 2017-11-22 07:09:15

Tags: python sql database pandas

I am familiar with performing joins in pandas:

pd.merge(A,B,on='key',how='inner')

But how do I write queries in pandas for these 3 given joins, since they need IF NULL?

[image: the three joins in question]

2 answers:

Answer 0 (score: 1)

Setup

A = pd.DataFrame(dict(key=range(0, 5), col1=list('abcde')))
B = pd.DataFrame(dict(key=range(2, 7), col2=list('vwxyz')))

print(A, B, sep='\n' * 2)

  col1  key
0    a    0
1    b    1
2    c    2
3    d    3
4    e    4

  col2  key
0    v    2
1    w    3
2    x    4
3    y    5
4    z    6

Using pd.DataFrame.merge

The most direct approach is to use the indicator parameter.

A.merge(B, 'outer', indicator=True)

  col1  key col2      _merge
0    a    0  NaN   left_only
1    b    1  NaN   left_only
2    c    2    v        both
3    d    3    w        both
4    e    4    x        both
5  NaN    5    y  right_only
6  NaN    6    z  right_only
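As a small sketch of what the indicator parameter adds: the `_merge` column is categorical and takes the values `left_only`, `right_only` and `both`. Passing a string instead of `True` names the column yourself; the name `source` below is only an illustrative choice.

```python
import pandas as pd

A = pd.DataFrame(dict(key=range(0, 5), col1=list('abcde')))
B = pd.DataFrame(dict(key=range(2, 7), col2=list('vwxyz')))

# indicator=True adds a categorical column named "_merge"
merged = A.merge(B, how='outer', indicator=True)

# Count how many rows came from each side
counts = merged['_merge'].value_counts().to_dict()
print(counts)

# A string names the indicator column ("source" is an arbitrary name)
named = A.merge(B, how='outer', indicator='source')
print(named.columns.tolist())
```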

We can then use pd.DataFrame.query

A - B

A.merge(B, 'outer', indicator=True).query('_merge == "left_only"')

  col1  key col2     _merge
0    a    0  NaN  left_only
1    b    1  NaN  left_only

B - A

A.merge(B, 'outer', indicator=True).query('_merge == "right_only"')

  col1  key col2      _merge
5  NaN    5    y  right_only
6  NaN    6    z  right_only

Symmetric difference

A.merge(B, 'outer', indicator=True).query('_merge != "both"')

  col1  key col2      _merge
0    a    0  NaN   left_only
1    b    1  NaN   left_only
5  NaN    5    y  right_only
6  NaN    6    z  right_only
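The three queries above differ only in the filter on `_merge`, so they can be folded into one helper. This is a minimal sketch; the function name `excluding_join` and its `which` argument are made up for illustration and are not part of pandas.

```python
import pandas as pd

def excluding_join(left, right, on, which='left'):
    """Return rows unmatched on `on`.

    which: 'left' (A - B), 'right' (B - A) or 'both' (symmetric difference).
    """
    merged = left.merge(right, on=on, how='outer', indicator=True)
    if which == 'left':
        mask = merged['_merge'] == 'left_only'
    elif which == 'right':
        mask = merged['_merge'] == 'right_only'
    elif which == 'both':
        mask = merged['_merge'] != 'both'
    else:
        raise ValueError("which must be 'left', 'right' or 'both'")
    # Drop the helper column so the result looks like a plain join
    return merged[mask].drop(columns='_merge')

A = pd.DataFrame(dict(key=range(0, 5), col1=list('abcde')))
B = pd.DataFrame(dict(key=range(2, 7), col2=list('vwxyz')))

print(excluding_join(A, B, on='key', which='left'))
print(excluding_join(A, B, on='key', which='right'))
print(excluding_join(A, B, on='key', which='both'))
```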

Using pd.Series.isin (mostly)

However, I would rather simply use pd.Series.isin as a boolean mask.

A - B

A[~A.key.isin(B.key)]

  col1  key
0    a    0
1    b    1

B - A

B[~B.key.isin(A.key)]

  col2  key
3    y    5
4    z    6

Symmetric difference

pd.concat([A[~A.key.isin(B.key)], B[~B.key.isin(A.key)]])

Or

pd.concat([A, B]).drop_duplicates('key', keep=False)

  col1 col2  key
0    a  NaN    0
1    b  NaN    1
3  NaN    y    5
4  NaN    z    6
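One caveat worth checking before picking between the two symmetric-difference forms: they agree only when `key` is unique within each frame. The sketch below uses made-up data in which `A` repeats a key, to show where they diverge.

```python
import pandas as pd

# Hypothetical data: key 0 appears twice in A and never in B
A = pd.DataFrame(dict(key=[0, 0, 2], col1=list('abc')))
B = pd.DataFrame(dict(key=[2, 5], col2=list('vy')))

# isin keeps every unmatched row, including repeated keys
via_isin = pd.concat([A[~A.key.isin(B.key)], B[~B.key.isin(A.key)]])

# keep=False drops every key that occurs more than once anywhere,
# so the repeated key 0 disappears even though B does not contain it
via_dedup = pd.concat([A, B]).drop_duplicates('key', keep=False)

print(sorted(via_isin.key.tolist()))   # [0, 0, 5]
print(sorted(via_dedup.key.tolist()))  # [5]
```

With unique keys, as in the setup of this answer, both forms return the same rows.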

Answer 1 (score: 0)

Although piRSquared's answer is very good, here is another approach:

import pandas as pd

Create the DataFrames A and B:

A = pd.DataFrame({'key': range(1, 6), 'A': ['a'] * 5})
B = pd.DataFrame({'key': range(3, 8), 'B': ['b'] * 5})

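Solution for case A (i.e. the left excluding join): first perform a left join, then keep only the rows of A that have no corresponding row in B:

pd.merge(A, B, on = 'key', how = 'left')[~A.key.isin(B.key)]
    key A   B
0   1   a   NaN
1   2   a   NaN

Solution for case B (i.e. the right excluding join): very similar to solution A, but with a right join:

pd.merge(A, B, on = 'key', how = 'right')[~B.key.isin(A.key)]
    key A   B
3   6   NaN b
4   7   NaN b

Solution for case C (i.e. the outer excluding join): first perform a full outer join:

outer = pd.merge(A, B, on = 'key', how = 'outer')

Then keep only the rows whose key has no counterpart in the other DataFrame:

outer[outer.key.isin(list(A.key[~A.key.isin(B.key)]) + list(B.key[~B.key.isin(A.key)]))]
    key A   B
0   1   a   NaN
1   2   a   NaN
5   6   NaN b
6   7   NaN b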
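Since the question asks about IF NULL, a closer analogue of SQL's `WHERE B.key IS NULL` is to filter the merged frame on its own missing values. This is a sketch that assumes the columns `A` and `B` contain no NaN before the merge; if they might, the indicator approach from the other answer is safer.

```python
import pandas as pd

A = pd.DataFrame({'key': range(1, 6), 'A': ['a'] * 5})
B = pd.DataFrame({'key': range(3, 8), 'B': ['b'] * 5})

# Case A: left join, keep rows where nothing came from B
left = pd.merge(A, B, on='key', how='left')
left_excl = left[left['B'].isna()]

# Case B: right join, keep rows where nothing came from A
right = pd.merge(A, B, on='key', how='right')
right_excl = right[right['A'].isna()]

# Case C: outer join, keep rows missing from either side
outer = pd.merge(A, B, on='key', how='outer')
outer_excl = outer[outer['A'].isna() | outer['B'].isna()]

print(left_excl, right_excl, outer_excl, sep='\n' * 2)
```

The mask here is built from the merged result itself, so it does not depend on the merged rows lining up with the index of `A` or `B`.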
