Question

Here is a sample of the data:

'1.'    'Numeric types'
'1.1.'  'Integer'
'1.2.'  'Float'
...
'1.10'  'Double'

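To sort it naturally, we can use string_to_array with '.' as the separator, cast the resulting text[] to int[], and sort by the integer array. However, the field itself is of type text, and a user may decide to use non-numeric symbols, e.g. 1.1.3a, which causes a cast error. To solve this, I decided to use a regexp: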

select regexp_matches('1.2.3.4.', E'(?:(\\d+)\.?)+')

The expected result is the array {'1', '2', '3', '4'}, but I only get the last element of that array. However, if I use the following regexp:

select regexp_matches('1.2.3.4.', E'((?:\\d+)\.?)+')

the result is {'1.2.3.4.'}.

Using the global flag 'g' is not an option, because regexp_matches then returns a column (one row per match).
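To illustrate that concern, here is a minimal sketch (assuming the default standard_conforming_strings = on, so the backslash needs no E'' prefix). With the 'g' flag, every match comes back as its own row holding a one-element array, rather than as a single array:

```sql
-- One row per match; each row is a text[] containing the captured group.
select regexp_matches('1.2.3.4.', '(\d+)', 'g');
-- returns four rows: {1}, {2}, {3}, {4}
```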

Is there any way to convert '1.2.3.4a.'::text to {1, 2, 3, 4}::int[] using only one regexp_matches?

Fiddle

Answer 1

You can use the global 'g' flag with regexp_matches, but you need to aggregate the values into an array (most simply with the array() constructor):

select array(select m[1] from regexp_matches(dt_code, '(\d+)', 'g') m)::int[] nums, *
from data_types
order by 1;
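As a quick check that does not depend on the data_types table, the same expression can be run on the literal from the question. This is only a sketch: the inline rows in the second query are made-up sample values modelled on the question's data, and the column name dt_name is an assumption.

```sql
-- The trailing 'a' is simply never matched by (\d+), so the cast succeeds.
select array(
  select m[1] from regexp_matches('1.2.3.4a.', '(\d+)', 'g') as m
)::int[];
-- {1,2,3,4}

-- Integer arrays compare element by element, so 1.10 sorts after 1.2.
with sample(dt_code, dt_name) as (
  values ('1.10', 'Double'),
         ('1.2.', 'Float'),
         ('1.1.', 'Integer'),
         ('1.',   'Numeric types')
)
select dt_code, dt_name
from sample
order by array(select m[1] from regexp_matches(dt_code, '(\d+)', 'g') as m)::int[];
-- order of dt_code: 1.  1.1.  1.2.  1.10
```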

Alternatively, you can split the string into an array with string_to_array(), but you still need a regexp to remove any non-numeric characters:

select string_to_array(trim(regexp_replace(dt_code, '[^\d\.]+', '', 'g'), '.'), '.')::int[] nums, *
from data_types
order by 1;
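A small sketch of this variant on literals, to show where it works and where it still breaks. The 'g' flag is used here so that every non-numeric run is removed, not just the first one. The second input is a made-up edge case, not taken from the question.

```sql
-- 'a' is stripped, the outer dots are trimmed, and the rest splits cleanly.
select string_to_array(
         trim(regexp_replace('1.2.3.4a.', '[^\d.]+', '', 'g'), '.'),
         '.'
       )::int[];
-- {1,2,3,4}

-- A segment made up only of non-digits leaves an empty element behind
-- ('1.a.2' becomes '1..2'), and casting '' to int raises an error.
select string_to_array(
         trim(regexp_replace('1.a.2', '[^\d.]+', '', 'g'), '.'),
         '.'
       )::int[];
-- fails with an invalid-input-syntax error for the empty string
```

The regexp_matches version above does not have this problem, since it only ever collects digit runs.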

For more advanced natural-like sorting, you need to split the text into tokens yourself. See the related SO question for more information.

I could come up with a simplified, reusable function:

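create or replace function natural_order_tokens(text)
  returns table (
    txt text,      -- non-digit run that precedes the number
    num int,       -- digit run as an integer (0 when there are no digits)
    num_rep text   -- digit run exactly as written
  )
  language sql
  strict
  immutable
as $func$
  -- split the input into consecutive (non-digits, digits) pairs
  select m[1], (case m[2] when '' then '0' else m[2] end)::int, m[2]
    from regexp_matches($1, '(\D*)(\d*)', 'g') m
   -- skip the empty match the pattern can produce
   where m[1] != '' or m[2] != ''
$func$;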
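To see what the function produces, it can be called on a single value. This is a sketch that assumes the function above has been created as written; the input '1.1.3a' is the example code from the question.

```sql
select * from natural_order_tokens('1.1.3a');
--  txt | num | num_rep
-- -----+-----+---------
--      |   1 | 1
--  .   |   1 | 1
--  .   |   3 | 3
--  a   |   0 |
```

Each row pairs a run of non-digits with the run of digits that follows it, so comparing arrays of these rows compares the text parts as text and the numeric parts as integers.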

With this function, natural sorting becomes this easy:

select *
from data_types
order by array(select t from natural_order_tokens(dt_code) t);

SQLFiddle

Natural sort of a multi-level list with regular expressions
