Creating a vocabulary from a column after removing stop words

Time: 2020-07-31 22:39:10

Tags: python pandas

I want to build a corpus/vocabulary from all of the (tokenized) text in one column of my dataframe:


What I want to do is remove the stop words first and then add all of the remaining tokens to a list, i.e.:


I tried the following:

User Text
312  Include details about your goal
41   Describe expected and actual results
421  Include any error messages

But it gives me individual characters instead of words.
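The attempted code is not shown above, so the cause can only be guessed. One common way to end up with characters is to iterate over a string directly instead of over its tokens. A minimal sketch of that difference, using one of the sample sentences (the variable names are illustrative):

```python
text = "Include any error messages"

# Iterating over the string itself walks it character by character.
chars = [ch for ch in text]

# Splitting on whitespace first yields whole words.
words = text.split()

print(chars[:3])
print(words)
```

If the column holds raw strings rather than lists of tokens, any loop or `extend` call over a cell would behave like the first case.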

1 answer:

Answer 0 (score: 1)

You don't need `apply`.
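The answer's own code is not preserved here, so the following is only a sketch of one way to do this without `apply`, not necessarily the answerer's method. It assumes the `User`/`Text` sample rows from the question and a small hypothetical stop-word set; in practice the set would come from a fuller list such as NLTK's.

```python
import pandas as pd

# Sample rows as they appear in the question.
df = pd.DataFrame({
    "User": [312, 41, 421],
    "Text": [
        "Include details about your goal",
        "Describe expected and actual results",
        "Include any error messages",
    ],
})

# Hypothetical stop-word set, for illustration only.
stop_words = {"about", "your", "and", "any"}

# Lowercase, split on whitespace, then put one token on each row.
tokens = df["Text"].str.lower().str.split().explode()

# Keep only the tokens that are not stop words and collect them in a list.
vocabulary = tokens[~tokens.isin(stop_words)].tolist()

print(vocabulary)
```

Wrapping the result in `sorted(set(vocabulary))` would give the distinct words if duplicates such as `include` are not wanted.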
