Pattern matching in R

Time: 2018-06-04 09:36:25

Tags: r data-analysis

I have a SQL file from which I need to extract the table names and their corresponding column names. For example:

Select T1.Name , T1.Age, T2.Dept_Name from employee T1 , department T2 where T1.Dept_No= T2.Dept_No

I would like the result to look like this:

Table_Name        Column_Name
employee             Name
employee             Age
department          Dept_Name

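Is this possible in R?

1 Answer:

Answer 0: (score: 0)

One approach is to use pattern matching, as shown below.

(Assumption: queries are separated by newlines, and each complete SQL query sits on a single line.)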

library(stringr)
library(dplyr)
library(tidyr)

# Read the file containing the SQL queries (one query per line)
txt <- readLines("test.txt")

# Extract the select list and the from clause (the keywords are matched case-sensitively)
df <- data.frame(column_name = str_match(txt, "Select\\s+(.*?)\\s+from")[,2],
                 table_name  = str_match(txt, "from\\s+(.*?)\\s+where")[,2])

# Split both lists into rows, keep the rows where the column's alias matches the table's alias, then strip the aliases
df <- df %>%
  separate_rows(column_name, sep = ",") %>%
  separate_rows(table_name, sep = ",") %>%
  filter(word(trimws(column_name), 1, sep = "\\.") == word(trimws(table_name), -1)) %>%
  mutate(column_name = word(trimws(column_name), -1, sep = "\\."),
         table_name  = word(trimws(table_name), 1))

which gives

> df
  column_name table_name
1        Name   employee
2         Age   employee
3   Dept_Name department
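
The same alias matching can also be done with an explicit alias-to-table lookup instead of expanding every column/table combination and filtering. The following is only a minimal base R sketch of that alternative, not the method used in the answer above. It assumes every table in the from clause has an alias, every selected column is written as alias.column, and the query contains a where clause. The names `query`, `m`, `lookup`, `refs` and `result` are illustrative.

```r
query <- "Select T1.Name , T1.Age, T2.Dept_Name from employee T1 , department T2 where T1.Dept_No= T2.Dept_No"

# Capture the select list (group 1) and the from clause (group 2);
# ignore.case = TRUE also accepts SELECT/FROM/WHERE in upper case
m <- regmatches(query,
                regexec("select\\s+(.*?)\\s+from\\s+(.*?)\\s+where",
                        query, ignore.case = TRUE, perl = TRUE))[[1]]

# Build a named vector mapping alias -> table from the "table alias" pairs
pairs  <- strsplit(trimws(strsplit(m[3], ",", fixed = TRUE)[[1]]), "\\s+")
lookup <- setNames(vapply(pairs, `[`, character(1), 1),
                   vapply(pairs, `[`, character(1), 2))

# Split each "alias.column" reference and resolve the alias through the lookup
refs <- strsplit(trimws(strsplit(m[2], ",", fixed = TRUE)[[1]]), ".", fixed = TRUE)
result <- data.frame(
  Table_Name  = unname(lookup[vapply(refs, `[`, character(1), 1)]),
  Column_Name = vapply(refs, `[`, character(1), 2),
  stringsAsFactors = FALSE
)
print(result)
```

The columns come out in the Table_Name, Column_Name order requested in the question. A column whose alias is missing from the from clause would get NA as its table name rather than being dropped silently.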


Sample data: test.txt contains

Select T1.Name , T1.Age, T2.Dept_Name from employee T1 , department T2 where T1.Dept_No= T2.Dept_No
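
The sample file satisfies the one-query-per-line assumption. If a file instead spreads queries over several lines, the lines could be normalised first. The following is a rough sketch that assumes each query ends with a semicolon and that no semicolon appears inside a string literal. The vector `raw` stands in for the output of readLines() and its contents are hypothetical.

```r
# Hypothetical multi-line input, as readLines() might return it
raw <- c("Select T1.Name , T1.Age, T2.Dept_Name",
         "from employee T1 , department T2",
         "where T1.Dept_No= T2.Dept_No;",
         "Select T3.City from location T3",
         "where T3.Id = 1;")

# Join everything into one string, then cut it at the statement terminators
joined  <- paste(raw, collapse = " ")
queries <- strsplit(joined, ";", fixed = TRUE)[[1]]

# Tidy up: trim the ends, squeeze runs of whitespace, drop empty pieces
queries <- gsub("\\s+", " ", trimws(queries))
queries <- queries[nzchar(queries)]
print(queries)
```

Each element of `queries` is then a complete statement on one line and can be used in place of `txt` in the code above.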