Computing entropy in R

Time: 2016-02-07 01:02:07

Tags: r entropy

[Image: the entropy formula (not reproduced here)]

    A=c("f","t","t","f","t","f","f","f","t","f")
    B=c("t","t","t","t","t","f","f","f","t","t")
    class=c("+","+","+","-","+","-","-","-","-","-")
    df=data.frame(A,B,class)
    df
       A B class
    1  f t     +
    2  t t     +
    3  t t     +
    4  f t     -
    5  t t     +
    6  f f     -
    7  f f     -
    8  f f     -
    9  t t     -
    10 f t     -

I partitioned the rows on attribute A and on attribute B and counted the classes in each branch, as follows:

          {A}
        [T , F]
       /       \
    -------   -------
    [3+,1-]   [1+,5-]

          {B}
        [T , F]
       /       \
    -------   -------
    [4+,3-]   [0+,3-]

Based on the formula above, I computed the entropy with the following R code.
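The formula itself was posted as an image and is not available as text. Judging from the code that follows, it is presumably the usual Shannon entropy of each branch, averaged with weights equal to the branch sizes. The symbols below ($S$, $S_v$, $p_c$) are notation assumed here for illustration, not taken from the original image.

```latex
% Entropy of one branch S_v, where p_c is the proportion of class c in that branch
H(S_v) = -\sum_{c \in \{+,-\}} p_c \log_2 p_c

% Weighted entropy after splitting S on attribute A with values v
H(S \mid A) = \sum_{v \in \{t,f\}} \frac{|S_v|}{|S|} \, H(S_v)
```

For attribute A this gives $0.6 \times 0.6500224 + 0.4 \times 0.8112781 \approx 0.7145$, which is the value the code arrives at.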

1- Attribute A

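    t = table(A, class)   # class counts for each value of A
    t
       class
    A   - +
      f 5 1
      t 1 3
    prop1 = t[1,] / sum(t[1,])   # class proportions where A = f
    prop1
            -         +
    0.8333333 0.1666667
    prop2 = t[2,] / sum(t[2,])   # class proportions where A = t
    prop2
       -    +
    0.25 0.75
    H1 = -(prop1[1]*log2(prop1[1])) - (prop1[2]*log2(prop1[2]))
    H1
    0.6500224
    H2 = -(prop2[1]*log2(prop2[1])) - (prop2[2]*log2(prop2[2]))
    H2
    0.8112781
    # average of the branch entropies, weighted by branch size
    entropy = (table(A)[1]/length(A))*H1 + (table(A)[2]/length(A))*H2
    entropy
    0.7145247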
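As a cross-check, the same calculation for attribute A can be written more compactly with `prop.table()` and `rowSums()`. This is only a sketch of an equivalent formulation, not the code from the question. The names `tab`, `p`, `H`, `w` and `weighted_entropy` are made up for illustration, and it works only because no cell of the table is zero.

```r
# Same data as in the question
A     <- c("f","t","t","f","t","f","f","f","t","f")
class <- c("+","+","+","-","+","-","-","-","-","-")

tab <- table(A, class)               # class counts for each value of A
p   <- prop.table(tab, margin = 1)   # row-wise class proportions
H   <- -rowSums(p * log2(p))         # entropy of each branch (no zero cells here)
w   <- rowSums(tab) / sum(tab)       # fraction of rows falling in each branch

weighted_entropy <- sum(w * H)
weighted_entropy                     # 0.7145247, the same value as above
```

Using `tab` rather than `t` as the variable name also avoids masking the base function `t()`.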

2- Attribute B

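    t = table(B, class)   # class counts for each value of B
    t
       class
    B   - +
      f 3 0
      t 3 4
    prop1 = t[1,] / sum(t[1,])   # class proportions where B = f
    prop1
     - +
     1 0
    prop2 = t[2,] / sum(t[2,])   # class proportions where B = t
    prop2
            -         +
    0.4285714 0.5714286
    H1 = -(prop1[1]*log2(prop1[1])) - (prop1[2]*log2(prop1[2]))
    H1
    NaN
    H2 = -(prop2[1]*log2(prop2[1])) - (prop2[2]*log2(prop2[2]))
    H2
    0.9852281
    # average of the branch entropies, weighted by branch size
    entropy = (table(B)[1]/length(B))*H1 + (table(B)[2]/length(B))*H2
    entropy
    NaN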

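When I compute the entropy for attribute B, the result is NaN because one of the proportions is zero: log2(0) is -Inf in R, so 0*log2(0) evaluates to NaN. How can I fix this, or how can I make H1 come out as zero instead of NaN?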
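One common way around this is the convention that 0 * log2(0) counts as 0, since p * log2(p) tends to 0 as p tends to 0. A minimal sketch under that assumption follows. The helper `entropy_bits` and the variable names `tab`, `H`, `w` and `entropy_B` are hypothetical, introduced here for illustration. The helper drops zero proportions before taking the logarithm.

```r
log2(0)       # -Inf
0 * log2(0)   # NaN: 0 times -Inf is undefined

# Hypothetical helper: entropy in bits of a vector of proportions,
# treating 0 * log2(0) as 0 by removing zero entries first
entropy_bits <- function(p) {
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Same data as in the question
B     <- c("t","t","t","t","t","f","f","f","t","t")
class <- c("+","+","+","-","+","-","-","-","-","-")

tab <- table(B, class)                                  # class counts for each value of B
H   <- apply(prop.table(tab, 1), 1, entropy_bits)       # branch entropies: f = 0, t = 0.9852281
w   <- rowSums(tab) / sum(tab)                          # branch weights: f = 0.3, t = 0.7

entropy_B <- sum(w * H)
entropy_B                                               # about 0.6897
```

An alternative with the same effect would be to keep the original expressions and wrap each term as `ifelse(p == 0, 0, p * log2(p))`.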

0 answers:

No answers