digit categorisation using Euclidean distance

Time: 2016-02-12 19:39:50

Tags: java machine-learning artificial-intelligence euclidean-distance categorization

I want to categorise digits that are represented in a 64-dimensional space, i.e. an 8x8 pixel character image. Each attribute is an integer from 0 to 16. I have 20 rows of 64 values, plus one value at the end of each row that gives the category. The categories were already determined by UCI, but I want to know how they arrived at the category for each row. They say they used Euclidean distance to determine the category.

My question is: how do I apply Euclidean distance to 64 values? I tried the following formula (Pythagorean theorem), Math.sqrt(Math.pow(x2-x1, 2) + Math.pow(y2-y1, 2)), within a row, but the result was too big and I do not know what it represents. For example, for the first row I obtained 1612, whose square root is 40.15.

This is my code for the process:

    public static void main(String[]args)
    {
        int row[]= new int[64];
        for(int z=0;z<64;z++)
        {
            row[z]=digits[0][z]; //get the first row and store it

        }

        double result = 0;
        for(int z=0;z<64;z+=2)
        {
            double distance = Math.pow(row[z]-row[z+1],2); 

            result = result+distance; //add  distance each time
            System.out.print(result+", ");
        }
    }

The first row of digits is this: 0,0,5,13,9,1,0,0,0,0,13,15,10,15,5,0,0,3,15,2,0,11,8,0,0,4,12,0,0,8,8,0,0,5,8,0,0,9,8,0,0,4,11,0,1,12,7,0,0,2,14,5,10,12,0,0,0,0,6,13,10,0,0,0,0

I am not sure if this makes sense but if something is not clear please do ask. Thanks in advance.

1 answer:

Answer 0 (score: 0)

My question is how do I apply Euclidean distance to 64 values?

You do not. Distance is a measure between two objects; each of them can have 64 values, but you need two of them. In particular, Euclidean distance is defined as

    dist(x, y) = ||x - y||_2 = sqrt[ SUM_{i=1}^d (x_i - y_i)^2 ]

where d is the number of dimensions, and x_i denotes the ith dimension of x.
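As a minimal sketch of this definition in Java (the class and method names are illustrative and not taken from the question's code), the distance takes two rows and sums the squared differences index by index:

```java
// Hypothetical helper, not part of the original question's code.
final class EuclideanDistanceSketch {

    // Euclidean distance between two vectors of equal length d:
    // sqrt( sum over i of (x[i] - y[i])^2 ).
    static double distance(int[] x, int[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Toy 4-dimensional vectors; real rows would have 64 values each.
        int[] a = {0, 0, 5, 13};
        int[] b = {0, 3, 1, 13};
        // Differences are 0, -3, 4, 0, so the result is sqrt(9 + 16) = 5.0.
        System.out.println(distance(a, b));
    }
}
```

For the data in the question, x and y would be the first 64 values of two different rows (the 65th value is the category label), rather than neighbouring values within one row.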

So they say they used Euclidean distance to determine the category.

They said more than that, as the distance itself does not define anything besides... distance. A category, on the other hand, is an abstract object, which might be defined by some characteristic point (a centroid); you then assign to a sample the category whose centroid is closest in terms of the given distance.
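The centroid idea can be sketched as follows. This is only an illustration under the assumption that each category is summarised by a single centroid (for example, the per-pixel mean of its known rows); it is not a claim about how UCI produced its labels, and the class and method names are hypothetical.

```java
// Hypothetical nearest-centroid sketch; names are illustrative only.
final class NearestCentroidSketch {

    // Euclidean distance between two vectors of equal length.
    static double distance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Returns the index of the centroid closest to the sample,
    // or -1 if no centroids are given.
    static int nearestCentroid(double[] sample, double[][] centroids) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double d = distance(sample, centroids[c]);
            if (d < bestDistance) {
                bestDistance = d;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy 2-dimensional centroids for two categories (0 and 1);
        // digit data would use ten centroids of 64 values each.
        double[][] centroids = {{0, 0}, {10, 10}};
        double[] sample = {2, 1};
        System.out.println(nearestCentroid(sample, centroids)); // closest to {0, 0}
    }
}
```

Here the returned index plays the role of the category: the sample is labelled with whichever centroid lies nearest to it.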