Question: How do I create an array based on an existing array I created?

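I created an array by splitting a sentence into characters (including punctuation and spaces). I want to create another array based on the characters in this one. Here is the code I wrote.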

def split(arr, size):
    # Collect every run of `size` consecutive items (sliding by one).
    arrs = []
    while len(arr) > size:
        piece = arr[:size]
        arrs.append(piece)
        arr = arr[1:]
    arrs.append(arr)
    return arrs

def main():
    x_mat = [list(word.rstrip()) for word in result]

    for rows in x_mat:
        if len(rows) == 3:
            rows = rows.append(' ')

    myarray = np.asarray(x_mat)
    m = len(myarray)
    n = len(myarray[0])

    numx_mat = np.zeros([m, n])

    for j in range(len(myarray)):
        if( 97 <= ord(myarray([j,1])) <= 122):
            numx_mat([j,1])  == 1
        elif( myarray([j,2]) == '.' or '?' or '!'):
            numx_mat([j,2]) == 1
        elif( myarray([j,3]) == ' '):
            numx_mat([j,3]) == 1
        elif(65 <= ord(myarray([j,4])) <= 90 ):
            numx_mat([j,4]) == 1
        else:
            continue

main()

The error message I get is:

Traceback (most recent call last):
  in main
    if( 97 <= ord(myarray([j,1])) <= 122):
TypeError: 'numpy.ndarray' object is not callable

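myarray is the 2D array I built from the string, including punctuation and spaces, taking 4 characters at a time in sliding order (1234, 2345, and so on). I want to create a new array of 1's and 0's in which I look for a specific pattern in the character sequence. (If the first character is a lowercase letter, the new array has a 1 in the first position; if the second character is a space, the new array has a 1 in the second position; and so on.) So how do I check whether the first entry (1,1) of the existing array is a lowercase letter?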

Answer 1
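
On the error itself: myarray([j,1]) uses parentheses, so Python tries to call the array as a function, which is what "'numpy.ndarray' object is not callable" means. NumPy arrays are indexed with square brackets and zero-based positions, so the first entry is myarray[0, 0]. Two more things in the loop would likely misbehave once the indexing is fixed: == compares rather than assigns, and x == '.' or '?' or '!' is always true because the non-empty string '?' is truthy. Below is a minimal sketch of the flag matrix as I understand it, assuming every row holds exactly four single characters and that each position should be checked independently (so plain if rather than elif). The names windows and flag_matrix are illustrative, not from the question.

```python
import numpy as np


def windows(text, size=4):
    """Return every run of `size` consecutive characters as a list of chars."""
    return [list(text[i:i + size]) for i in range(len(text) - size + 1)]


def flag_matrix(myarray):
    """Per window, mark positions matching lowercase / [.?!] / space / uppercase."""
    flags = np.zeros(myarray.shape, dtype=int)
    for j in range(myarray.shape[0]):
        # Square brackets index the array; positions start at 0.
        if 'a' <= myarray[j, 0] <= 'z':
            flags[j, 0] = 1  # a single '=' assigns
        if myarray[j, 1] in '.?!':
            flags[j, 1] = 1
        if myarray[j, 2] == ' ':
            flags[j, 2] = 1
        if 'A' <= myarray[j, 3] <= 'Z':
            flags[j, 3] = 1
    return flags


# Small made-up input: three windows of four characters each.
myarray = np.asarray(windows('Hi. Ok'))
print(flag_matrix(myarray))
```

Only the middle window, 'i. O', fits the pattern at all four positions, so only its row comes out as all ones.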

Stepping outside numpy for a moment, you may also want to consider other approaches.

Using re.match to parse the string

import re

def gen_array(string):
    new_array = []
    # test every 4-character window against the pattern
    for x in range(0, len(string)-3):
        if re.match(r'[a-z][?.!]\s[A-Z]', string[x:x+4]):
            new_array.append(1)
        else:
            new_array.append(0)
    # append trailing three zeros
    new_array.extend([0, 0, 0])
    return new_array

This yields an array of 1's and 0's:

>>> string
'This is a sentence. Followed by a bang! And a Question?'
>>> gen_array(string)
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

The 1's fall on the "e" in "sentence" and the "g" in "bang".

Using re.finditer to get the match offsets as an array

>>> string
'This is a sentence. Followed by a bang! And a Question?'
>>> results = list(re.finditer(r'[a-z][?.!]\s[A-Z]', string))
>>> results
[<_sre.SRE_Match object; span=(17, 21), match='e. F'>, <_sre.SRE_Match object; span=(37, 41), match='g! A'>]

>>> for match in results:
...     print(match.group(), "at", match.span())
... 
e. F at (17, 21)
g! A at (37, 41)
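
If the 0/1 array is still what you want, the match offsets can be turned back into one by writing a 1 at each match start. A minimal sketch, assuming the same string and pattern as above; flags_from_matches is an illustrative name. Since no two matches of this particular pattern can overlap, re.finditer should find the same positions as the sliding re.match loop, but that would not hold for an arbitrary pattern.

```python
import re

PATTERN = re.compile(r'[a-z][?.!]\s[A-Z]')


def flags_from_matches(string):
    """Return a list with 1 at each match start offset and 0 elsewhere."""
    flags = [0] * len(string)
    for match in PATTERN.finditer(string):
        flags[match.start()] = 1
    return flags


string = 'This is a sentence. Followed by a bang! And a Question?'
flags = flags_from_matches(string)
print([i for i, flag in enumerate(flags) if flag])  # [17, 37]
```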

The character classes (sets) can be adjusted for Unicode or other punctuation as needed.
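
For text beyond ASCII, one option is to drop the regular expression and use Python's Unicode-aware string methods instead. A minimal sketch, assuming the sentence-ending punctuation set below is the one you want; END_MARKS and gen_array_unicode are illustrative names. Unlike gen_array, it returns a list exactly as long as the input even for strings shorter than four characters.

```python
END_MARKS = set('.?!。？！')  # assumed set of sentence-ending marks; extend as needed


def gen_array_unicode(string):
    """Flag window starts using str methods that understand Unicode letters."""
    new_array = [0] * len(string)
    for x in range(len(string) - 3):
        a, b, c, d = string[x:x + 4]
        if a.islower() and b in END_MARKS and c.isspace() and d.isupper():
            new_array[x] = 1
    return new_array


sample = 'Привет. Мир? Да'
result = gen_array_unicode(sample)
print([i for i, flag in enumerate(result) if flag])  # [5, 10]
```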
