我正在尝试使用条件将一个 DF 与另外两个加入。我有以下 DF。 DF1,我想与 df_cond1 和 df_cond2 一起加入的 DF。
如果 DF1 InfoNum col 是 NBC 我想加入 df_cond1 否则如果 DF1 InfoNum Column 是 BBC 我想加入 df_cond2 但我不知道我该怎么做。
DF1
+-------------+----------+-------------+
| Date | InfoNum | Sport |
+-------------+----------+-------------+
| 31/11/2020 | NBC | football |
| 11/01/2020 | BBC | tennis |
+-------------+----------+-------------+
df_cond1
+-------------+---------+-------------+
| Periodicity | Info | Description |
+-------------+---------+-------------+
| Monthly | NBC | DATAquality |
+-------------+---------+-------------+
df_cond2
+-------------+---------+-------------+
| Periodicity | Info | Description |
+-------------+---------+-------------+
| Daily | BBC | InfoIndeed |
+-------------+---------+-------------+
final_df
+-------------+----------+-------------+-------------+
| Date | InfoNum | Sport | Description |
+-------------+----------+-------------+-------------+
| 31/11/2020 | NBC | football | DATAquality |
| 11/01/2020 | BBC | tennis | InfoIndeed |
+-------------+----------+-------------+-------------+
我一直在寻找,但没有找到好的解决方案,你能帮我吗?
答案 0 :(得分:0)
以下是您加入的方式
val df = Seq(
("31/11/2020", "NBC", "football"),
("1/01/2020", "BBC", "tennis")
).toDF("Date", "InfoNum", "Sport")
val df_cond1 = Seq(
("Monthly", "NBC", "DATAquality")
).toDF("Periodicity", "Info", "Description")
val df_cond2 = Seq(
("Daily", "BBC", "InfoIndeed")
).toDF("Periodicity", "Info", "Description")
df.join(df_cond1.union(df_cond2), $"InfoNum" === $"Info")
.drop("Info", "Periodicity")
.show(false)
输出:
+----------+-------+--------+-----------+
|Date |InfoNum|Sport |Description|
+----------+-------+--------+-----------+
|31/11/2020|NBC |football|DATAquality|
|1/01/2020 |BBC |tennis |InfoIndeed |
+----------+-------+--------+-----------+