DataFrame join on multiple columns

Time: 2020-01-02 13:24:02

Tags: pyspark pyspark-sql pyspark-dataframes

I am trying to join two DataFrames, df_m100 and df_d200, using the syntax below.

df_m100:

 root
 |-- id: integer (nullable = true)
 |-- season: integer (nullable = true)
 |-- player_of_match: string (nullable = true)

df_d200:

 root
 |-- match_id: integer (nullable = true)
 |-- batsman: string (nullable = true)
 |-- batsman_score: long (nullable = true)

Join syntax (joining on two fields), which raises an error:

df_new = df_m100.join(df_d200, df_m100.id == df_d200.match_id & df_m100.player_of_match == df_d200.batsman)

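Please advise what needs to change.

1 answer:

Answer 0: (score: 1)

The syntax is a little tricky. Each comparison needs its own parentheses for the join condition to work.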

df_new = df_m100.join(
    df_d200, 
    (df_m100.id == df_d200.match_id) & (df_m100.player_of_match == df_d200.batsman) 
)

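This is because, in Python, & binds more tightly than ==. Without the parentheses, df_d200.match_id & df_m100.player_of_match is evaluated first, instead of the two comparisons.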
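
The precedence rule can be checked without Spark. The sketch below is only an illustration: it uses Python's standard ast module to parse two expressions with the same shape as the join conditions, and the names a and b are placeholders, not real DataFrames.

```python
import ast

# Same shape as the two join conditions; `a` and `b` are placeholder names.
without_parens = "a.id == b.match_id & a.player_of_match == b.batsman"
with_parens = "(a.id == b.match_id) & (a.player_of_match == b.batsman)"


def top_level(expr):
    """Return the top-level node Python builds when parsing an expression."""
    return ast.parse(expr, mode="eval").body


bad = top_level(without_parens)
good = top_level(with_parens)

# Without parentheses the whole thing is one chained comparison,
# and the `&` sits inside it as the middle operand.
print(type(bad).__name__, "with", len(bad.ops), "comparison operators")
print("middle operand:", type(bad.comparators[0]).__name__,
      type(bad.comparators[0].op).__name__)

# With parentheses the top-level operation is `&`,
# and each side is a separate comparison.
print(type(good).__name__, type(good.op).__name__)
print("sides:", type(good.left).__name__, type(good.right).__name__)
```

The first expression parses as a single comparison chain around the & operation, while the second parses as an & of two comparisons, which is the form the join condition needs.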
