Extracting data from a specific format

Time: 2016-02-18 16:21:29

Tags: sql regex plsql oracle11g expression

I have data in a column of data type varchar2(4000) in an Oracle 11g table.

The data looks like this:

"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE" 

"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"  

Can someone help me with how to do this in Oracle SQL?

Expected output:

LOCT = NY 

DISPLAY_TYP = M,F,B   

ID = 100  

PRICE = >=20  

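I need to extract the data by LOCT, PRICE, DISPLAY_TYP, ID and FILTER in order to retrieve the corresponding values.

Thanks

1 Answer:

Answer 0: (score: 2)

Here is how to split all the values out into separate columns: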

with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
       regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
       regexp_substr(str, '"ID":"([^",]*)"', 1, 1, null, 1) id,
       regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
from   sample_data;

      PKEY LOCT PRICE DISPLAY_TYP     ID    FILTER
---------- ---- ----- --------------- ----- ------
         1 MA   10    M,F,B           101   LTE   
         2 NY   30    M,F,B           100   GTE   
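
The expected output in the question shows the price together with a comparison operator (PRICE = >=20), which the query above does not produce. Below is a minimal sketch of one way to build such a string. It assumes that the FILTER value GTE stands for >= and LTE for <=, which is a guess, since the question does not define these codes. The price_condition alias is made up for illustration.

```sql
with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
select pkey,
       'PRICE = ' ||
       -- assumed mapping of FILTER codes to operators; any other code yields no operator
       case regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1)
         when 'GTE' then '>='
         when 'LTE' then '<='
       end ||
       regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price_condition
from   sample_data;
```

With the two sample rows this should give PRICE = <=10 for pkey 1 and PRICE = >=30 for pkey 2. Because Oracle treats a null operand of || as an empty string, an unrecognised FILTER code would simply leave the operator out.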

Here is how to split it out into separate rows:

with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       str_part||' = '||val sub_str
from   (select pkey,
               str,
               regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
               regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
               replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
               regexp_substr(str, '"ID":"([^",]*)"', 1, 1, null, 1) id,
               regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
        from   sample_data) res
unpivot (val for str_part in (loct, price, display_typ, id, filter));

      PKEY SUB_STR                            
---------- -----------------------------------
         1 LOCT = MA                          
         1 PRICE = 10                         
         1 DISPLAY_TYP = M,F,B                
         1 ID = 101                           
         1 FILTER = LTE                       
         2 LOCT = NY                          
         2 PRICE = 30                         
         2 DISPLAY_TYP = M,F,B                
         2 ID = 100                           
         2 FILTER = GTE      

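N.B. Both solutions rely on the fact that a double quote (or, given the [^",]* character class, a comma) will not appear within the value of any sub-part, with the exception of display_typ (where it is expected that ] and [ will not appear as part of the value).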
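
To illustrate the display_typ caveat, here is a small sketch using a hypothetical row in which a second bracketed list follows DISPLAY_TYP. The "TAGS" key and the greedy_match and bounded_match aliases are invented for this example and do not come from the question. It compares the greedy (.*) capture used above with a bracket expression that excludes ].

```sql
-- Hypothetical row: a second bracketed list ("TAGS") follows DISPLAY_TYP.
with sample_data as (select 1 pkey, '"LOCT":"MA","DISPLAY_TYP": ["M","F"],"TAGS":["X"],"ID":"101"' str from dual)
select pkey,
       -- greedy .* runs on to the last ] in the string
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') greedy_match,
       -- [^]]* matches anything except ], so the capture ends at the first ]
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[([^]]*)\]', 1, 1, null, 1), '"') bounded_match
from   sample_data;
```

On this row the greedy version would be expected to return M,F],TAGS:[X, while the bounded version returns M,F. If your data can contain more than one bracketed list, the bounded form is probably the safer choice.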

For the case where the ID may or may not be enclosed in quotes (I'm not sure whether it sometimes is), this should work:

with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":101,"FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual union all
                     select 3 pkey, '"LOCT":"OH","DISPLAY_TYP":["F","B"],"PRICE":"50","FILTER":"BOO","ID":"102"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
       regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
       regexp_substr(str, '"ID":"?([^",]*)"?', 1, 1, null, 1) id,
       regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
from   sample_data;

      PKEY LOCT PRICE DISPLAY_TYP     ID    FILTER
---------- ---- ----- --------------- ----- ------
         1 MA   10    M,F,B           101   LTE   
         2 NY   30    M,F,B           100   GTE   
         3 OH   50    F,B             102   BOO   

The "? in the regular expression means that the double quote must appear 0 or 1 times at that point in the pattern.

If double quotes will never appear around the ID value, you can drop the "? parts and use "ID":([^",]*) as the pattern.
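
As a quick check of the unquoted case, here is a minimal sketch that assumes the ID is never wrapped in double quotes. The two sample rows are shortened, made-up variants of the strings above.

```sql
-- Hypothetical rows in which ID is always unquoted, once mid-string and once at the end.
with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","ID":101,"FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"OH","FILTER":"BOO","ID":102' str from dual)
select pkey,
       -- no optional quotes: capture everything after "ID": up to the next comma or quote
       regexp_substr(str, '"ID":([^",]*)', 1, 1, null, 1) id
from   sample_data;
```

This should return 101 for pkey 1 and 102 for pkey 2. Note that if a quoted ID did turn up, this pattern would capture an empty string at that position, which Oracle treats as null, so the "? version is the more forgiving one when you are unsure.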