text = "hellovision hey creator yoyo b creator great publisher"
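I want to extract the creators' names and the publisher's name from the text.
The result should be
creator = hellovision hey, yoyo b
publisher = great
How can I get this text using a regular expression?
Do I need to use span()?
Here is my code.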
def preprocess2(text):
    text_list = text.split(' ')
    creators = []    # names that precede each 'creator' keyword
    publishers = []  # names that precede each 'publisher' keyword
    temp = []        # words collected since the previous keyword
    for word in text_list:
        if word == 'creator':
            creators.append(' '.join(temp))
            temp.clear()
        elif word == 'publisher':
            publishers.append(' '.join(temp))
            temp.clear()
        else:
            temp.append(word)
    return creators, publishers
Answer 0 (score: 0)
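If you split the string with a regular expression that contains a capturing group, the matched delimiters are also included in the output array.
You can then loop over that array looking for 'creator' or 'publisher' and, in each case, push the preceding entry onto the appropriate collection.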
const text = "hellovision hey creator yoyo b creator great publisher"
const splitArr = text.split(/(creator|publisher)/)
const creators = [], publishers = []
let i = -1, len = splitArr.length
while (++i < len) {
    if (splitArr[i] === "creator") creators.push(splitArr[i - 1].trim())
    else if (splitArr[i] === "publisher") publishers.push(splitArr[i - 1].trim())
}
console.log("creators: ", creators)
console.log("publishers: ", publishers)
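Since the question is written in Python, here is a rough Python sketch of the same idea. It assumes the keywords always appear as whole words and that each name directly precedes its keyword. The names `KEYWORDS`, `extract_names`, and `extract_names_with_spans` are made up for this illustration. The second function shows one way `span()` could be used, via `re.finditer`, in case the match positions are wanted.

```python
import re

# Assumption: 'creator' and 'publisher' only ever appear as whole words.
KEYWORDS = re.compile(r"\b(creator|publisher)\b")


def extract_names(text):
    """Split on the keywords and pair each name with the keyword after it."""
    # The capturing group makes re.split keep the keywords in the result.
    parts = KEYWORDS.split(text)
    creators, publishers = [], []
    # parts alternates: name, keyword, name, keyword, ..., trailing text
    for name, keyword in zip(parts[0::2], parts[1::2]):
        if keyword == "creator":
            creators.append(name.strip())
        else:
            publishers.append(name.strip())
    return creators, publishers


def extract_names_with_spans(text):
    """Same result, but slicing the text using each match's span()."""
    creators, publishers = [], []
    prev_end = 0  # index just past the previous keyword
    for match in KEYWORDS.finditer(text):
        start, end = match.span()
        name = text[prev_end:start].strip()
        if match.group(1) == "creator":
            creators.append(name)
        else:
            publishers.append(name)
        prev_end = end
    return creators, publishers


text = "hellovision hey creator yoyo b creator great publisher"
creators, publishers = extract_names(text)
print("creators:", creators)      # creators: ['hellovision hey', 'yoyo b']
print("publishers:", publishers)  # publishers: ['great']
```

Both functions should give the same lists for the sample text. The `span()` version is only worth the extra bookkeeping if you also need to know where in the string each keyword was found.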