Question

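I have a binary classification problem: determining the class associated with a given document, where each document is represented as a bag-of-words feature vector:

Example: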

Document 1 = ["I", "am", "awesome"]
Document 2 = ["I", "am", "great", "great"]

The dictionary is:

["I", "am", "awesome", "great"]

So the documents, as vectors, look like this:

Document 1 = [1, 1, 1, 0]
Document 2 = [1, 1, 0, 2]
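
For illustration, here is a minimal sketch of how such count vectors could be built from the dictionary and the tokenized documents. The class and method names (`BagOfWords`, `toCountVector`) are hypothetical and are not part of the perceptron code shown later; words that are missing from the dictionary are simply ignored here, which is an assumption.

```java
import java.util.Arrays;
import java.util.List;

class BagOfWords {
    // Counts how often each dictionary word occurs in the document.
    // Words that are not in the dictionary are skipped (an assumption).
    static int[] toCountVector(List<String> dictionary, List<String> document) {
        int[] counts = new int[dictionary.size()];
        for (String word : document) {
            int idx = dictionary.indexOf(word);
            if (idx >= 0) {
                counts[idx]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("I", "am", "awesome", "great");
        List<String> doc1 = Arrays.asList("I", "am", "awesome");
        List<String> doc2 = Arrays.asList("I", "am", "great", "great");
        System.out.println(Arrays.toString(toCountVector(dictionary, doc1))); // [1, 1, 1, 0]
        System.out.println(Arrays.toString(toCountVector(dictionary, doc2))); // [1, 1, 0, 2]
    }
}
```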

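I want to apply the stochastic gradient descent algorithm to this input in order to "minimize the empirical risk involving the hinge loss".

I have searched high and low to see how stochastic gradient descent takes input of this form, but I have not found a simple, clear explanation anywhere.

This is the pseudocode from Wikipedia: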

Choose an initial vector of parameters w and learning rate alpha.
Repeat until an approximate minimum is obtained:
    Randomly shuffle examples in the training set.
    For i = 1, 2, ..., n, do:
        w := w - alpha * grad Q_i(w)

Could someone explain how the input I am using fits into that pseudocode?

I have seen data like this:

private List<Point2D> loadData() 
{
    List<Point2D> data = new ArrayList<>();
    data.add(new Point2D.Double(1, 2));
    data.add(new Point2D.Double(2, 3));
    data.add(new Point2D.Double(3, 4));
    data.add(new Point2D.Double(4, 5));
    data.add(new Point2D.Double(5, 6));
    data.add(new Point2D.Double(6, 7));
    return data;
}

And also like this:

 static double[] x = {2, 4, 6, 8};
 static double[] y = {2, 5, 5, 8};

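I think the latter is better suited to my case.

Here is a perceptron implementation that I would like to modify so that it performs stochastic gradient descent. Could someone point out which changes I need to make, and how?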

public static void perceptron(Set<String> globoDict,
   Map<String, int[]> trainingPerceptronInput,
   Map<String, int[]> testPerceptronInput)
{
    //store weights to be averaged. 
   Map<Integer,double[]> cached_weights = new HashMap<Integer,double[]>();


   final int globoDictSize = globoDict.size(); // number of features

   // one weight per input feature, plus one for the bias
   double[] weights = new double[globoDictSize + 1];
   for (int i = 0; i < weights.length; i++) 
   {
       weights[i] = 0.0;
   }


   int inputSize = trainingPerceptronInput.size();
   double[] outputs = new double[inputSize];
   final double[][] a = Prcptrn_InitOutpt.initializeOutput(trainingPerceptronInput, globoDictSize, outputs, LABEL);


   double globalError;
   int iteration = 0;
   do 
   {
       iteration++;
       globalError = 0;
       // loop through all instances (complete one epoch)
       for (int p = 0; p < inputSize; p++) 
       {
           // calculate predicted class
           double output = Prcptrn_CalcOutpt.calculateOutput(THETA, weights, a, p);
           // difference between predicted and actual class values
           //always either zero or one
           double localError = outputs[p] - output;

           int i;
           for (i = 0; i < a.length; i++) 
           {
               weights[i] += LEARNING_RATE * localError * a[i][p];
           }
           weights[i] += LEARNING_RATE * localError;

           // summation of squared error (error value for all instances)
           globalError += localError * localError;
       }

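       // store a copy of the weights for averaging; storing the array itself
       // would make every cached entry alias the same (final) weights
       cached_weights.put( iteration , weights.clone() );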

       /* Root Mean Squared Error */
       System.out.println("Iteration " + iteration + " : RMSE = " + Math.sqrt(globalError / inputSize));
   } 
   while (globalError != 0 && iteration <= MAX_ITER);



   int size = globoDictSize + 1;
   //compute averages
   double[] sums = new double[size];
   double[] averages = new double[size];

   for (Entry<Integer, double[]> entry : cached_weights.entrySet()) 
   {
       double[] value = entry.getValue();
       for(int pos=0; pos < size; pos++){
           sums[ pos ] +=  value[ pos ]; 
       }
   }
   for(int pos=0; pos < size; pos++){
       averages[ pos ] = sums[ pos ] / cached_weights.size(); 
   }


   System.out.println("\n=======\nDecision boundary equation:");
   int i;
   for (i = 0; i < a.length; i++) 
   {
       System.out.print(" a");
       if (i < 10) System.out.print(0);
       System.out.println( i + " * " + weights[i] + " + " );


   }
   System.out.println(" bias: " + weights[i]);


   //TEST
   //this works because, at this point the weights have already been learned. 
   inputSize = testPerceptronInput.size();
   outputs = new double[inputSize];
   double[][] z = Prcptrn_InitOutpt.initializeOutput(testPerceptronInput, globoDictSize, outputs, LABEL); 

   test_output = Prcptrn_CalcOutpt.calculateOutput(THETA, weights, z, TEST_CLASS);       

   System.out.println("class = " + test_output);

}

Answer 1

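You need to write an expression in which the weights are multiplied by your data representation and the result is plugged into the loss function of your choice. That is how you write Q; your data enters through this expression. I don't think there is anything wrong with how you have represented the input: after you initialize w, you adjust it step by step to arrive at a good decision function.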
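
To make this concrete for the hinge loss: with labels y_i in {-1, +1}, one common choice is Q_i(w) = max(0, 1 - y_i * (w . x_i + b)), where x_i is the count vector of document i. A subgradient of Q_i with respect to w is -y_i * x_i when y_i * (w . x_i + b) < 1, and zero otherwise, so the update in the pseudocode becomes w := w + alpha * y_i * x_i whenever the margin is violated. The following is only a sketch under those assumptions (no regularization term, a fixed learning rate, a fixed number of epochs); the names `HingeSgd`, `train`, `score` and `predict`, as well as the labels in `main`, are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class HingeSgd {
    // Trains a linear classifier with SGD on the (unregularized) hinge loss.
    // x[i] is the count vector of document i, y[i] is its label (-1 or +1).
    // Returns the weights; the last entry is the bias term.
    static double[] train(int[][] x, int[] y, double alpha, int epochs, long seed) {
        int n = x.length;
        int d = x[0].length;
        double[] w = new double[d + 1]; // Java initializes the entries to 0.0
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        Random rng = new Random(seed);

        for (int epoch = 0; epoch < epochs; epoch++) {
            Collections.shuffle(order, rng); // "randomly shuffle examples"
            for (int i : order) {
                double margin = y[i] * score(w, x[i]);
                if (margin < 1.0) {
                    // subgradient of Q_i is -y_i * x_i, so w := w + alpha * y_i * x_i
                    for (int j = 0; j < d; j++) {
                        w[j] += alpha * y[i] * x[i][j];
                    }
                    w[d] += alpha * y[i]; // bias update
                }
                // if margin >= 1 the subgradient is zero and w stays unchanged
            }
        }
        return w;
    }

    // Linear score w . x + b, with b stored as the last entry of w.
    static double score(double[] w, int[] xi) {
        double s = w[w.length - 1];
        for (int j = 0; j < xi.length; j++) {
            s += w[j] * xi[j];
        }
        return s;
    }

    static int predict(double[] w, int[] xi) {
        return score(w, xi) >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        int[][] x = { {1, 1, 1, 0}, {1, 1, 0, 2} };
        int[] y = { 1, -1 }; // hypothetical labels for the two example documents
        double[] w = train(x, y, 0.1, 100, 42L);
        System.out.println(predict(w, x[0]) + " " + predict(w, x[1]));
    }
}
```

Compared with the perceptron loop in the question, the main differences in this sketch are that the labels are encoded as -1/+1 rather than 0/1, the examples are shuffled at the start of each epoch, and an update is triggered whenever the margin is below 1, not only when an example is misclassified.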

Explicit specification of the variables for stochastic gradient descent
