How to handle negative values of cosine similarity

Asked: 2016-05-26 07:53:51

Tags: python scikit-learn svd cosine-similarity lsa

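I computed tf-idf for my documents based on their terms, then applied LSA to reduce the dimensionality of the term space. 'similarity_dist' contains negative values (see the table below). How can I compute a cosine distance in the range 0 to 1?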

[Image: table of similarity_dist values, not preserved]

1 Answer:

Answer 0 (score: 1)

cosine_similarity ranges from -1 to 1.

Cosine distance is defined as:

    cosine_distance = 1 - cosine_similarity

So cosine_distance will be in the range 0 to 2.
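To make these ranges concrete, here is a minimal sketch in plain Python. The helper names and the three 2-component vectors are hypothetical, chosen only to hit the extremes; they are not from the question. With scikit-learn you would more likely call `sklearn.metrics.pairwise.cosine_similarity` or `cosine_distances` on the LSA output instead.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """1 - cosine similarity, so the result lies in [0, 2]."""
    return 1.0 - cosine_similarity(a, b)

# Made-up vectors standing in for LSA output; negative components
# are normal after an SVD-based reduction.
doc_a = [0.8, -0.6]
doc_b = [-0.8, 0.6]  # points in the opposite direction to doc_a
doc_c = [0.6, 0.8]   # orthogonal to doc_a

print(cosine_distance(doc_a, doc_a))  # close to 0 (same direction)
print(cosine_distance(doc_a, doc_c))  # close to 1 (orthogonal)
print(cosine_distance(doc_a, doc_b))  # close to 2 (opposite direction)
```

A distance above 1 corresponds exactly to a negative similarity, which is why the values in the question fall outside 0 to 1.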

See https://en.wikipedia.org/wiki/Cosine_similarity

The term "cosine distance" is often used for the complement in positive space, that is: D_C(A,B) = 1 - S_C(A,B).

Note: if the value must be in the range 0 to 1, you can use cosine_distance / 2.
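The rescaling can be sketched as below. The function name and the sample similarity values are assumptions made up for illustration; they are not the values from the question's table.

```python
def normalized_cosine_distance(similarity):
    """Map a cosine similarity in [-1, 1] to a distance in [0, 1]."""
    distance = 1.0 - similarity  # plain cosine distance, in [0, 2]
    return distance / 2.0        # halved, so it lies in [0, 1]

# Hypothetical similarity values, including negative ones.
similarity_dist = [1.0, 0.25, 0.0, -0.5, -1.0]
scaled = [normalized_cosine_distance(s) for s in similarity_dist]
print(scaled)  # [0.0, 0.375, 0.5, 0.75, 1.0]
```

Dividing by 2 keeps the ordering of the distances and only compresses the scale: orthogonal vectors now sit at 0.5 rather than 1.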