I want to look up the name in column A from two approximate input values for columns B and C.
Data.csv
A; B; C
ALGOL;3.13614789;40.95564610
ALIOTH;12.90050072;55.95981118
ALKAID;13.79233003;49.31324779
The following code works for exact values:
fid = fopen('Data.csv');
C = textscan(fid, '%s %s %s', 'Delimiter', ';');
fclose(fid);
val1 = input('Enter the first input: ', 's');
val2 = input('Enter the second input: ', 's');
if(find(ismember(C{2},val1)) == find(ismember(C{3},val2)))
output = C{1}{find(ismember(C{2},val1))}
else
disp('No match found!');
end
Result:
Enter the first input: 12.90050072
Enter the second input: 55.95981118
output =
ALIOTH
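But how can I get the same result with approximate values for val1 and val2? For example, val1 = 13.001 and val2 = 57.210 should give "ALIOTH".
Maybe I have to use importdata and then check against a tolerance, but I don't know how. Is there a way to do this?
Answer 0 (score: 4)
I suggest reading the data as floating-point numbers rather than as strings, i.e.: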
C = textscan(fid, '%s %f %f', 'Delimiter', ';', 'HeaderLines', 1);
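This lets you do numerical comparisons. You can then compute the distance (say, the Euclidean distance) between the search values and each row of the data matrix: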
v = [val1, val2];
dist = sqrt(sum(bsxfun(@minus, [C{2:3}], v) .^ 2, 2));
You can then pick the minimum value from dist (this always guarantees a match):
tf = (dist - min(dist) < eps);
Or pick the values that fall below a certain threshold:
tol = 2; %// Tolerance of your choice
tf = (dist < tol);
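The resulting logical (Boolean) vector tf has a 1 at the position of each matching row.
You can convert it to the actual values from the first column by writing: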
result = C{1}(tf)
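The steps above can be put together in a minimal self-contained sketch. It writes the sample rows to a temporary file so it can run on its own; the names fname, query, nearest and withinTol are illustrative and not part of the original answer. Note that the question reads its inputs with input(..., 's'), which returns strings, so they need converting to numbers first (str2double is one way). Using the second output of min is shown here as a simpler alternative to the eps comparison.

```matlab
% Write the sample data to a temporary file (assumption: same rows as Data.csv).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'A; B; C');
fprintf(fid, '%s\n', 'ALGOL;3.13614789;40.95564610');
fprintf(fid, '%s\n', 'ALIOTH;12.90050072;55.95981118');
fprintf(fid, '%s\n', 'ALKAID;13.79233003;49.31324779');
fclose(fid);

% Read names as strings and both coordinates as doubles, skipping the header.
fid = fopen(fname);
C = textscan(fid, '%s %f %f', 'Delimiter', ';', 'HeaderLines', 1);
fclose(fid);
delete(fname);

% Inputs typed as strings must be converted before numeric comparison.
query = [str2double('13.001'), str2double('57.210')];
tol   = 2;   % tolerance of your choice

% Euclidean distance from the query to every data row.
dist = sqrt(sum(bsxfun(@minus, [C{2:3}], query) .^ 2, 2));

% Nearest row: always yields a match, however far away it is.
[minDist, idx] = min(dist);
nearest = C{1}{idx};

% Threshold: may yield zero, one, or several matches.
withinTol = C{1}(dist < tol);
```

With these inputs the nearest row and the only row within the tolerance are both ALIOTH. The two criteria differ when the query is far from every row: the minimum still returns a name, while the threshold returns an empty cell array.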
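This solution generalizes to any number of columns P in the data. In addition, suppose you want to search the data for several different instances of v (assume v is an M-by-P matrix in which each row is a separate instance to match):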
vv = permute(v, [3 2 1]);
dist = permute(sqrt(sum(bsxfun(@minus, [C{2:end}], vv) .^ 2, 2)), [1 3 2]);
Again, you can pick the minimum, which guarantees a match:
tf = (abs(bsxfun(@minus, dist, min(dist))) < eps);
Or set a threshold:
tf = (dist < tol);
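Here tf is a logical N-by-M matrix (N is the total number of rows in the data), in which each column marks the data rows that match the corresponding row of v.
To convert it to values from the first column, you have to store the output in a cell array: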
result = arrayfun(@(x)C{1}(tf(:, x)), 1:size(tf, 2), 'UniformOutput', false);
For example, applying this to the data above:
v = [13, 57.2; 13, 47]; %// Entries to search
vv = permute(v, [3 2 1]);
dist = permute(sqrt(sum(bsxfun(@minus, [C{2:end}], vv) .^ 2, 2)), [1 3 2])
tf = bsxfun(@minus, dist, min(dist)) < eps;
This results in:
tf =
0 0
1 0
0 1
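which means that the first row of v matches the second data row, and the second row of v matches the third data row. To find the matching values in the first data column, we do the following: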
result = arrayfun(@(x)C{1}(tf(:, x)), 1:size(tf, 2), 'UniformOutput', false);
which produces the following cell array:
result =
{ 'ALIOTH' }
{ 'ALKAID' }
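If only the single nearest name per query is needed, the index returned by min can stand in for the logical matrix. The following is a sketch under that assumption, with the data typed inline so it runs on its own; names, coords, idx and nearest are illustrative names rather than part of the answer above.

```matlab
% Sample data entered inline (assumption: same rows as Data.csv).
names  = {'ALGOL'; 'ALIOTH'; 'ALKAID'};
coords = [ 3.13614789, 40.95564610;
          12.90050072, 55.95981118;
          13.79233003, 49.31324779];

v = [13, 57.2; 13, 47];   % entries to search, one per row

% N-by-M distance matrix: rows are data rows, columns are queries.
vv   = permute(v, [3 2 1]);
dist = permute(sqrt(sum(bsxfun(@minus, coords, vv) .^ 2, 2)), [1 3 2]);

% Column-wise minimum: idx(k) is the data row nearest to v(k, :).
[~, idx] = min(dist, [], 1);
nearest  = names(idx);
```

This returns exactly one name per query, so ties are resolved in favour of the first matching row. The logical-matrix version keeps every tied row.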
Answer 1 (score: 1)
Assuming you have some tolerance on how far either number may be from the target, here is one approach:
function testApproximate
% define tolerance
tolerance = 1;
% open file
fid = fopen('Data.csv');
% read headers and discard
textscan(fid, '%s %s %s', 1, 'delimiter', ';');
% read rest of the data, combine columns 2 and 3 into a single matrix
C = textscan(fid, '%s %f %f', 'delimiter', ';', 'CollectOutput', 1);
% close file
fclose(fid);
% ask user for values
val1 = input('Enter the first input: ');
val2 = input('Enter the second input: ');
% use Euclidean distance to find the closest point within tolerance
x = isApproximatelyEqual(C{2}, [val1, val2], tolerance);
if x > 0
output = C{1}{x}
else
disp('No match found!');
end
end
function x = isApproximatelyEqual(vectors, member, tol)
% set default tolerance if it is not provided
if nargin < 3, tol = Inf; end
% v is the difference between all points in vectors and our single
% point in member
v = vectors - repmat(member, size(vectors,1), 1);
% find the minimum value and index of square root of sum of square of
% all difference vectors
[mn, x] = min(sqrt(sum(v .^ 2, 2)));
% if minimum value does not meet tolerance, reset x
if mn > tol
x = 0;
end
% return x
return
end
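This approach uses the Euclidean distance to find the nearest point. If you instead need to check each value individually to see whether it is within tolerance, replace the isApproximatelyEqual function above with: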
function x = isApproximatelyEqual(vectors, member, tol)
% set default tolerance if it is not provided
if nargin < 3, tol = Inf; end
% v is the difference between all points in vectors and our single
% point in member
v = vectors - repmat(member, size(vectors,1), 1);
% return the first pair of points that matches the tolerance
x = find(all(abs(v') < tol), 1);
return
end
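The two criteria can disagree for the same tolerance. The sketch below runs both inline on the sample data; the query [13.8, 56.8] is a hypothetical value chosen to show the difference, and xEuclid and xBox are illustrative names.

```matlab
% Sample coordinates entered inline (assumption: same rows as Data.csv).
coords = [ 3.13614789, 40.95564610;
          12.90050072, 55.95981118;
          13.79233003, 49.31324779];

member = [13.8, 56.8];   % hypothetical query
tol    = 1;

% Difference between every data row and the query.
v = bsxfun(@minus, coords, member);

% Euclidean criterion: nearest row, rejected if it is farther than tol.
[mn, xEuclid] = min(sqrt(sum(v .^ 2, 2)));
if mn > tol
    xEuclid = 0;
end

% Per-coordinate criterion: first row where every |difference| is below tol.
xBox = find(all(abs(v) < tol, 2), 1);
```

Here the second row (ALIOTH) is within 1 of the query in each coordinate separately, so xBox is 2. Its Euclidean distance is about 1.23, so the Euclidean check rejects it and xEuclid is 0. When nothing matches, the per-coordinate version returns an empty array rather than 0, and the calling test if x > 0 treats that as no match.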