Question

I have a very long text file that looks like this:

'a_lot(icl>how)', '', 'TO', 'A', 'VERY', 'GREAT', 'DEGREE', 'OR', 'EXTENT', 'WE', 'ENJOYED', 'OURSELVES', 'A', 'LOT', 'beaucoup', '{CAT(CATADV)}', '', 'a_lot(icl>how)', '', 'TO', 'A', 'VERY', 'GREAT', 'DEGREE', 'OR', 'EXTENT', 'WE', 'ENJOYED', 'OURSELVES', 'A', 'LOT', 'cher', '{CAT(CATADV)}'

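What I want to do with a Python regex is remove all the uppercase words, such as 'TO', 'A', 'VERY', 'GREAT', 'DEGREE', 'OR', 'EXTENT', 'WE', 'ENJOYED', 'OURSELVES', 'A', 'LOT'.

How can I do this with a regular expression while keeping words such as beaucoup or cher (French, in lowercase) and '{CAT(CATADV)}'?

To be clearer, I want the output to be: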

 'a_lot(icl>how)', '', 'beaucoup', '{CAT(CATADV)}', '', 'a_lot(icl>how)', 'cher', '{CAT(CATADV)}'

Answer 1

Try this:

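 import enchant

 d = enchant.Dict("en_US")
 # Named "words" rather than "list" so the built-in list type is not shadowed.
 words = ['a_lot(icl>how)', '', 'TO', 'A', 'VERY', 'GREAT', 'DEGREE', 'OR', 'EXTENT', 'WE', 'ENJOYED', 'OURSELVES', 'A', 'LOT', 'beaucoup', '{CAT(CATADV)}', '', 'a_lot(icl>how)', '', 'TO', 'A', 'VERY', 'GREAT', 'DEGREE', 'OR', 'EXTENT', 'WE', 'ENJOYED', 'OURSELVES', 'A', 'LOT', 'cher', '{CAT(CATADV)}']

 # Keep every token that the English dictionary does not recognize.
 # Empty strings are kept without a lookup, because d.check('') raises ValueError.
 list_of_words = [word for word in words if not word or not d.check(word)]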

Answer 2

It's not clear what you want to achieve here. If you only want to print the uppercase words, try

    for word in words:
        # "word and" skips empty strings, for which all() would be trivially True
        if word and all(x.isupper() for x in word):
            print(word)

If you insist on using a regular expression, try

    import re

    regex = re.compile(r'^[A-Z]+$')
    for word in words:
        if regex.match(word):
            print(word)

Neither version prints words that mix uppercase letters with other symbols that are not lowercase letters; if you need those as well, explore other useful predicates, or simply try not x.islower() instead of x.isupper().
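If the goal is the filtered output from the question rather than a printout of the uppercase words, the same regex test can be inverted. The following is only a minimal sketch: the function names drop_uppercase_words and strip_uppercase_items are made up for illustration, and the second one assumes the file really is a flat run of single-quoted items separated by commas, each removable item being followed by a comma.

```python
import re

# Tokens made up only of ASCII uppercase letters, e.g. 'TO' or 'DEGREE'.
UPPER_WORD = re.compile(r"[A-Z]+")

# A quoted all-uppercase item plus the comma and whitespace that follow it.
# Assumption: a removable item is never the very last one on the line.
UPPER_ITEM = re.compile(r"'[A-Z]+',\s*")


def drop_uppercase_words(words):
    """Return the tokens that are not entirely uppercase letters (hypothetical helper)."""
    return [w for w in words if not UPPER_WORD.fullmatch(w)]


def strip_uppercase_items(text):
    """Remove quoted all-uppercase items from the raw text (hypothetical helper)."""
    return UPPER_ITEM.sub("", text)


words = ['a_lot(icl>how)', '', 'TO', 'A', 'VERY', 'LOT', 'beaucoup', '{CAT(CATADV)}']
print(drop_uppercase_words(words))

text = "'a_lot(icl>how)', '', 'TO', 'A', 'beaucoup', '{CAT(CATADV)}'"
print(strip_uppercase_items(text))
```

Empty strings and tokens such as '{CAT(CATADV)}' are kept, because fullmatch only succeeds when the whole token consists of uppercase letters.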

Remove some words in uppercase but keep the words in lowercase

2 answers: