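Splitting sentences after a period followed by digits

Time: 2016-01-25 07:15:00

Tags: python regex python-3.x split

I'm using Python 3 and trying to split text in which a note number comes directly after the period: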

text = "Reproduction now becomes posited as “natural” production.16 Fortunati joins Marx in a minute but crucial declension from usevalue to nonvalue. "

This is the closest sentence-splitting regex I have found that still works:

sentences = re.split(r' *[\.\?!][\'"\)\]]* +', text)

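I'm basically lost w/r/t capturing, via regex, digits that come immediately after a period. Any help with correctly incorporating [0-9] into the expression? Thanks.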

Edit: this is what the ideal split would look like:

sentences[0]= "Reproduction now becomes posited as “natural” production.16"
sentences[1]= " Fortunati joins Marx in a minute but crucial declension from usevalue to nonvalue."

1 answer:

Answer 0: (score: 0)

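Use re.findall from the standard library or, if you can use a third-party module, the regex package, which allows variable-width lookbehind assertions and splitting on empty (zero-width) matches:

>>> import regex
>>> regex.split(r'(?<=\.\d+\b)', text, flags=regex.VERSION1)
['Reproduction now becomes posited as “natural” production.16',
 ' Fortunati joins Marx in a minute but crucial declension ...']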
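For the standard-library re.findall route, here is a minimal sketch. The pattern and the helper name split_sentences are assumptions made for illustration, not something taken from the original answer. It treats a sentence as a run of non-terminator characters, one terminator, any closing quotes or brackets, and any digits attached directly after them.

```python
import re

text = (
    "Reproduction now becomes posited as “natural” production.16 "
    "Fortunati joins Marx in a minute but crucial declension "
    "from usevalue to nonvalue. "
)

# Hypothetical pattern: non-terminators, one terminator, optional closing
# quotes/brackets, then any note digits glued to the end (e.g. ".16").
SENTENCE = re.compile(r"""[^.?!]+[.?!]['"”)\]]*\d*""")


def split_sentences(s):
    """Return the sentences in s, keeping trailing note numbers attached."""
    return SENTENCE.findall(s)


sentences = split_sentences(text)
print(sentences)
```

The leading space on the second sentence is kept, as in the ideal split from the question; call str.strip on each item if you don't want it. This sketch has limits: decimals such as 3.14 and abbreviations such as "e.g." would also be split, and trailing text without a terminator is dropped.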
