Matching array elements with a regex in a Perl script

Time: 2018-03-26 14:47:02

Tags: perl

My input file:

select count(*) as counter, my_column
  from my_table
 where regexp_instr (my_column, ',') > 0
   and regexp_like(replace(replace(my_column, ' ', ''), ',', ''), '[0-9]')
 group by my_column
 order by counter desc;

My array elements:

my $inp = "sample.txt";

#Sample.txt
As the HF exchange `\mathcal{\mathsf{}}` operator adopted 
in, the same HF exchange operator is adopted in without further 
optimization. However, the remaining `\mathbb{\mathbbm{}}`
`\mathbm{\mathbf{}}`, `\mathbf{\mathit{}}`. When compared with those
adopted in the MR hybrid functionals developed by Henderson {\it et al.}
for different `\mathrm{\mathscr{}}`, `\mathsf{\mathfrak{}}`
  

My concern is that I need to check for the following patterns:

my @arr = qw(boldsymbol mathbb mathbbm mathbf mathcal mathbf mathit mathbf mathcal mathfrak mathit mathrm mathscr mathsf);

\\$arr[0]{$arr[1] ... \\$arr[0]{$arr[2] .... \\$arr[0]{$arr[3] ... \\$arr[0]{$arr[13]

...

...

For example:

\\$arr[13]{$arr[0] ... \\$arr[13]{$arr[1] ... \\$arr[13]{$arr[2] ... \\$arr[13]{$arr[13]

\boldsymbol{\mathbb} and \\boldsymbol{\mathbbm} ...

\mathbb{\boldsymbol} and \\mathbb{\mathbbm} ...

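Could anyone please point out what I am doing wrong in this approach?

1 answer:

Answer 0 (score: 0)

Loop over the list of terms twice, with one loop nested inside the other. This produces the Cartesian product of the list with itself, that is, every ordered pair of terms.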

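use 5.026;
use strictures;                 # CPAN module: strict plus fatal warnings
use Data::Munge qw(list2re);    # CPAN module
my @markup = qw(boldsymbol mathbb mathbbm  mathbf mathcal mathbf mathit
    mathbf mathcal  mathfrak mathit mathrm mathscr mathsf);

my $BS = '\\'; # a single backslash
my @expressions;
# Build every ordered pair "\first{\second" (the Cartesian product).
for my $first_term (@markup) {
    for my $second_term (@markup) {
        push @expressions, "$BS${first_term}{$BS$second_term";
    }
}
# One regex that matches any of the literal strings in @expressions.
my $regex = list2re @expressions;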

my $input = <<'';
As the HF exchange \mathcal{\mathsf{}} operator adopted 
in, the same HF exchange operator is adopted in without further 
optimization. However, the remaining \mathbb{\mathbbm{}}
\mathbm{\mathbf{}}, \mathbf{\mathit{}}. When compared with those
adopted in the MR hybrid functionals developed by Henderson {\it et al.}
for different \mathrm{\mathscr{}} \mathsf{\mathfrak{}}

my @results = $input =~ m/$regex/gms;
# (
#     '\\mathcal{\\mathsf',
#     '\\mathbb{\\mathbbm',
#     '\\mathbf{\\mathit',
#     '\\mathrm{\\mathscr',
#     '\\mathsf{\\mathfrak'
# )
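
If installing `strictures` and `Data::Munge` is not an option, the same idea can be sketched with core Perl only. This is a minimal sketch rather than the answerer's code: it assumes the term list from the question with duplicates removed, and it replaces the explicit list of pairs with a single alternation used twice, which matches the same set of strings. The names `$alt` and the `END` heredoc terminator are illustrative choices.

```perl
use strict;
use warnings;

# Term list from the question, with the repeated entries removed.
my @markup = qw(boldsymbol mathbb mathbbm mathbf mathcal mathfrak
    mathit mathrm mathscr mathsf);

# Longest terms first, so that "mathbbm" is tried before "mathbb".
my $alt = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) } @markup;

# Backslash, a term, an opening brace, backslash, a term.
my $regex = qr/\\(?:$alt)\{\\(?:$alt)/;

my $input = <<'END';
As the HF exchange \mathcal{\mathsf{}} operator adopted
in, the same HF exchange operator is adopted in without further
optimization. However, the remaining \mathbb{\mathbbm{}}
\mathbm{\mathbf{}}, \mathbf{\mathit{}}. When compared with those
adopted in the MR hybrid functionals developed by Henderson {\it et al.}
for different \mathrm{\mathscr{}} \mathsf{\mathfrak{}}
END

# With /g in list context and no capture groups, each element is a whole match.
my @results = $input =~ /$regex/g;
print "$_\n" for @results;
```

As in the answer above, `\mathbm{\mathbf{}}` is not reported, because `mathbm` is not in the term list. Sorting the alternatives by length matters for the second term: without it, `\mathbb{\mathbbm` could be reported as `\mathbb{\mathbb`.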