Question

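I need to create a function in R that splits sentences into words and then matches those words against the words in a pos and a neg dictionary. This should yield a sentiment score per word: each positive word found in a sentence counts as 1, and each negative word counts as -1. The desired output looks like this: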

Product_ID        Sentence        Attribute        SentimentScore
1111111              1            graphics                1
1111111              1            windows                 1
1111111              2            loads                  -1
2222222              1            laptops                -1
2222222              2            design                  1

The first sentence for product 1111111 would be something like: ... this product ... beautiful graphics ... runs well on my windows

E.g. the dictionary of positive words (pos.txt) looks like: a+, abound, abounds, abundance, abundant, accessable, accessible, acclaim, acclaimed ... and so on

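and the dictionary of negative words (neg.txt) looks like: 2-faced, 2-faces, abnormal, abolish, abominable, abominably, abominate, abomination, abort, aborted, aborts ... and so on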

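I found a function called score.sentiment on GitHub, but it scores each sentence as a whole, using the difference between the number of pos and neg words in it. I need something very similar, but at the level of individual words.

I would greatly appreciate any help. Thanks in advance.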

Answer 1

Does this fit your needs?

pos = c("abound" , "abounds", "abundant")
neg = c("2-face","abnormal")

sent = "abundant abnormal activity was due to 2-face people"

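# Count the positive dictionary entries that occur in the sentence
p = 0
for (i in seq_along(pos)) {
  if (grepl(pos[i], sent, ignore.case = TRUE)) p = p + 1
}

# Count the negative dictionary entries that occur in the sentence
n = 0
for (i in seq_along(neg)) {
  if (grepl(neg[i], sent, ignore.case = TRUE)) n = n + 1
}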

print(p)
print(n)
print(paste("Overall sentence sentiment score = ", p - n))

Result: 1 positive, 2 negative, overall -1
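One caveat with the loop above: grepl() treats each dictionary entry as a regular expression and matches substrings, so an entry such as "abound" would also hit "abounding". A minimal sketch of a whole-word variant, assuming words are separated by whitespace and that repeated occurrences should each be counted (the loop above counts each dictionary entry at most once):

```r
pos <- c("abound", "abounds", "abundant")
neg <- c("2-face", "abnormal")
sent <- "abundant abnormal activity was due to 2-face people"

# Split on whitespace and lower-case so the comparison ignores case.
words <- tolower(strsplit(sent, "\\s+")[[1]])

p <- sum(words %in% pos)  # number of words found in the positive list
n <- sum(words %in% neg)  # number of words found in the negative list

print(p - n)  # overall sentence score
```

For this sentence the counts are the same as before (1 positive, 2 negative), but punctuation attached to a word would still need to be stripped first.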

Answer 2

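A brute-force approach. It is not optimal because it uses too many loops, but it seems to do what you need. Hopefully it fits your application. You can rearrange things or store the results in another variable so that the output reads [1] [1] and so on.

Code: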

library(stringr)

sent = data.frame(Sentences=c("abundant bad abnormal activity was due to 2-face people","strange exciting activity was due to 2-face people"), user = c(1,2), stringsAsFactors = FALSE)
pos = c("abound" , "abounds", "abundant", "exciting")
neg = c("2-face","abnormal", "strange", "bad", "weird")

# One character vector of words per sentence
words = str_split(sent$Sentences, " ")

tmp <- data.frame()

for (i in 1:nrow(sent)) {
  # Walk over every word of sentence i
  for (j in seq_along(words[[i]])) {
    # Positive match: record sentence index, word and score 1
    for (k in 1:length(pos)){
      if (words[[i]][j] == pos[k]) {
        print(paste(i,words[[i]][j],1))
        tmn <- cbind(i,words[[i]][j],1)
        tmp <- rbind(tmp,tmn)
      }
    }
    # Negative match: record sentence index, word and score -1
    for (m in 1:length(neg)){
      if (words[[i]][j] == neg[m]) {
        print(paste(i,words[[i]][j],-1))
        tmn <- cbind(i,words[[i]][j],-1)
        tmp <- rbind(tmp,tmn)
      }
    }
  }
}

View(tmp)

Result:

    i   V2         V3
1   1   abundant    1
2   1   bad        -1
3   1   abnormal   -1
4   1   2-face     -1
5   2   strange    -1
6   2   exciting    1
7   2   2-face     -1
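The same kind of per-word table can also be built without nested loops. The following is a base-R sketch, not part of the original answer; `score_sentence` is a hypothetical helper name, and it assumes words are separated by single spaces and that no word appears in both lists.

```r
sent <- data.frame(
  Sentences = c("abundant bad abnormal activity was due to 2-face people",
                "strange exciting activity was due to 2-face people"),
  user = c(1, 2),
  stringsAsFactors = FALSE
)
pos <- c("abound", "abounds", "abundant", "exciting")
neg <- c("2-face", "abnormal", "strange", "bad", "weird")

# Hypothetical helper: one row per dictionary word found in sentence i.
score_sentence <- function(i, text) {
  w <- strsplit(text, " ", fixed = TRUE)[[1]]
  w <- w[w %in% c(pos, neg)]               # keep dictionary words only
  data.frame(i = rep(i, length(w)),
             word = w,
             score = ifelse(w %in% pos, 1, -1),
             stringsAsFactors = FALSE)
}

# Apply the helper to every sentence and stack the pieces.
res <- do.call(rbind, Map(score_sentence, seq_len(nrow(sent)), sent$Sentences))
rownames(res) <- NULL
print(res)
```

Using `%in%` replaces the two inner loops over `pos` and `neg`, and `Map` replaces the loop over sentences.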

Answer 3

sent1 = data.frame(Sentences=c("abundant bad abnormal activity was due to 2-face people","strange exciting activity was due to great 2-face people"), user = c(1,2)) 
pos1 = c("abound" , "abounds", "abundant", "exciting", "great")
neg1 = c("2-face","abnormal", "strange", "bad", "weird")

Then I used:

library(stringr)

words = (str_split(unlist(sent1$Sentences)," "))

tmp <- data.frame()
tmn <- data.frame()

for (i in 1:nrow(sent1)) {
  # Note: length(words) is the number of sentences (2), so only the
  # first two words of each sentence are compared here
  for (j in 1:length(words)) {
    for (k in 1:length(pos1)){
      if (words[[i]][j] == pos1[k]) {
        print(paste(i,words[[i]][j],1))
        tmn <- cbind(i,words[[i]][j],1)
        tmp <- rbind(tmp,tmn)
      }
    }
    for (m in 1:length(neg1)){
      if (words[[i]][j] == neg1[m]) {
        print(paste(i,words[[i]][j],-1))
        tmn <- cbind(i,words[[i]][j],-1)
        tmp <- rbind(tmp,tmn)
      }
    }
  }
}

This resulted in the following, which misses several matches (for example "abnormal" and "great"):

print(tmp)
  i       V2 V3
1 1 abundant  1
2 1      bad -1
3 2  strange -1
4 2 exciting  1

If I do it this way instead:

sent1$Sentences <- as.character(sent1$Sentences)
List <- strsplit(sent1$Sentences, " ")
a <- data.frame(Id=rep(sent1$user, sapply(List, length)),    Words=unlist(List))
a$Words <- as.character(a$Words)
a[a$Words %in% pos1,]

it gives the correct positive words:

Id    Words
1 abundant
2 exciting
2    great

And for the negative words, a[a$Words %in% neg1,] gives:

Id    Words
1      bad
1 abnormal
1   2-face
2  strange
2   2-face

But I still need to attach the value 1 to the positive words and -1 to the negative words.
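One possible way to attach those values is a named lookup vector that maps each dictionary word to its score. This is only a sketch under stated assumptions: `lexicon`, `Score` and `scored` are names introduced here for illustration, no word is assumed to appear in both lists, and the sample data is repeated with "2-face" typed as a single token, as the output above implies.

```r
sent1 <- data.frame(
  Sentences = c("abundant bad abnormal activity was due to 2-face people",
                "strange exciting activity was due to great 2-face people"),
  user = c(1, 2),
  stringsAsFactors = FALSE
)
pos1 <- c("abound", "abounds", "abundant", "exciting", "great")
neg1 <- c("2-face", "abnormal", "strange", "bad", "weird")

# Long table: one row per word, tagged with the sentence's user id.
List <- strsplit(sent1$Sentences, " ")
a <- data.frame(Id = rep(sent1$user, lengths(List)),
                Words = unlist(List),
                stringsAsFactors = FALSE)

# Named lookup vector: word -> score (1 for positive, -1 for negative).
lexicon <- c(setNames(rep(1, length(pos1)), pos1),
             setNames(rep(-1, length(neg1)), neg1))

a$Score <- unname(lexicon[a$Words])   # NA for words in neither list
scored <- a[!is.na(a$Score), ]        # keep matched words only
rownames(scored) <- NULL
print(scored)
```

If a per-sentence total is also wanted, `tapply(scored$Score, scored$Id, sum)` adds up the word scores for each Id.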

How to extract individual words from a sentence and match them against the words in pos and neg dictionaries in R
