Question

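I'm trying to devise a (good) way to choose a random number from a range of possible numbers, where each number in the range is given a weight. To put it simply: given the range of numbers (0, 1, 2), choose a number where 0 has an 80% probability of being selected, 1 has a 10% chance, and 2 has a 10% chance.

It's been about 8 years since my college statistics class, so you can imagine the proper formula for this escapes me at the moment.

Here's the 'cheap and dirty' method that I came up with. This solution uses ColdFusion. Yours may use whatever language you'd like. I'm a programmer, I think I can handle porting it. Ultimately my solution needs to be in Groovy - I wrote this one in ColdFusion because it's easy to quickly write/test in CF.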

public function weightedRandom( Struct options ) {

    var tempArr = [];

    for( var o in arguments.options )
    {
        var weight = arguments.options[ o ] * 10;
        for ( var i = 1; i<= weight; i++ )
        {
            arrayAppend( tempArr, o );
        }
    }
    return tempArr[ randRange( 1, arrayLen( tempArr ) ) ];
}

// test it
opts = { 0=.8, 1=.1, 2=.1  };

for( x = 1; x<=10; x++ )
{
    writeDump( weightedRandom( opts ) );    
}

I'm looking for better solutions, please suggest improvements or alternatives.

Answer 1

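A lookup table (as in your solution) is the first thing that comes to mind: you build a table with elements populated according to their weight distribution, then pick a random location in the table and return it. As an implementation choice, I would make a higher-order function which takes a spec and returns a function which returns values based on the distribution in the spec; this way you avoid having to build the table for each call. The downsides are that building the table takes time linear in the number of items, and that memory usage can be large for large specs (or those with members with very small or precise weights, e.g. {0: 0.99999, 1: 0.00001}). The upside is that picking a value takes constant time, which might be desirable if performance is critical. In JavaScript: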

function weightedRand(spec) {
  var i, j, table=[];
  for (i in spec) {
    // The constant 10 below should be computed based on the
    // weights in the spec for a correct and optimal table size.
    // E.g. the spec {0:0.999, 1:0.001} will break this impl.
    for (j=0; j<spec[i]*10; j++) {
      table.push(i);
    }
  }
  return function() {
    return table[Math.floor(Math.random() * table.length)];
  }
}
var rand012 = weightedRand({0:0.8, 1:0.1, 2:0.1});
rand012(); // random in distribution...
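The comment in the code above says the constant 10 should be derived from the weights. One possible way to do that, sketched here under the assumption that the weights are short decimals: the helper names `tableScale`, `buildTable`, and `weightedRandScaled` are hypothetical and not part of the answer, and the cap on the scale is an arbitrary choice to keep memory bounded.

```javascript
// Hypothetical helper: smallest power of 10 that turns every weight into
// a whole number, capped at maxScale so tiny weights cannot exhaust memory.
function tableScale(weights, maxScale = 1e6) {
  let scale = 1;
  const isWhole = (w) => Math.abs(w * scale - Math.round(w * scale)) < 1e-9;
  while (!weights.every(isWhole) && scale < maxScale) {
    scale *= 10;
  }
  return scale;
}

// Build the lookup table: each key appears weight * scale times.
function buildTable(spec) {
  const scale = tableScale(Object.values(spec));
  const table = [];
  for (const key of Object.keys(spec)) {
    const copies = Math.round(spec[key] * scale);
    for (let j = 0; j < copies; j++) table.push(key);
  }
  return table;
}

// Same higher-order shape as weightedRand: build once, sample many times.
function weightedRandScaled(spec) {
  const table = buildTable(spec);
  return () => table[Math.floor(Math.random() * table.length)];
}

const randSkewed = weightedRandScaled({0: 0.999, 1: 0.001});
randSkewed(); // "0" almost always, "1" about once in a thousand calls
```

With the spec {0: 0.999, 1: 0.001} this builds a 1000-entry table, which shows the memory cost mentioned above: the table grows with the precision of the weights, not with the number of items.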

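Another strategy is to pick a random number in [0, 1) and iterate over the weight specification, summing the weights; as soon as the random number is less than the running sum, return the associated value. Of course, this assumes that the weights sum to one. This solution has no up-front cost, but its average running time is linear in the number of entries in the spec. For example, in JavaScript: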

function weightedRand2(spec) {
  var i, sum=0, r=Math.random();
  for (i in spec) {
    sum += spec[i];
    if (r <= sum) return i;
  }
}
weightedRand2({0:0.8, 1:0.1, 2:0.1}); // random in distribution...
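If the weights are not guaranteed to sum to one, the random number can be scaled by the total instead. The following is a minimal sketch of that variation, not the answer's own code: `weightedPick` and its optional `rng` parameter (there only so the function can be exercised with a fixed number) are names introduced here for illustration.

```javascript
// Cumulative-sum selection that works for any positive weights,
// e.g. {0: 8, 1: 1, 2: 1} as well as {0: 0.8, 1: 0.1, 2: 0.1}.
function weightedPick(spec, rng = Math.random) {
  const entries = Object.entries(spec);
  if (entries.length === 0) throw new Error("spec must not be empty");

  // Scale the random number to [0, total) instead of assuming total === 1.
  const total = entries.reduce((sum, [, weight]) => sum + weight, 0);
  const r = rng() * total;

  let sum = 0;
  for (const [key, weight] of entries) {
    sum += weight;
    if (r < sum) return key;
  }
  // Only reachable through floating-point rounding; fall back to the last key.
  return entries[entries.length - 1][0];
}

weightedPick({0: 8, 1: 1, 2: 1}); // "0", "1" or "2" (keys come back as strings)
```

As in the original function, the keys of a plain object are returned as strings.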

Answer 2

Generate a random number R between 0 and 1.

If R is in [0, 0.1) -> 1

If R is in [0.1, 0.2) -> 2

If R is in [0.2, 1] -> 3

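If you can't directly get a number between 0 and 1, generate a number in a range that gives you as much precision as you want. For example, if you have the weights (1, 83.7%) and (2, 16.3%), roll a number from 1 to 1000. 1-837 is a 1. 838-1000 is a 2.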
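The mapping from percentages to integer roll ranges can be sketched as follows. This is only an illustration under the assumption that the probabilities sum to 1; `buildRanges` and `valueForRoll` are hypothetical names, and rounding the cumulative sum (rather than each weight) is a choice made here so the last range always ends at the top of the roll.

```javascript
// Turn [value, probability] pairs into inclusive integer roll ranges 1..sides.
function buildRanges(weights, sides) {
  const ranges = [];
  let cumulative = 0;
  let from = 1;
  for (const [value, probability] of weights) {
    cumulative += probability;
    const to = Math.round(cumulative * sides);
    ranges.push({ value, from, to });
    from = to + 1;
  }
  return ranges;
}

// Look up which value a given roll corresponds to.
function valueForRoll(ranges, roll) {
  const hit = ranges.find((r) => roll >= r.from && roll <= r.to);
  return hit ? hit.value : undefined;
}

const ranges = buildRanges([[1, 0.837], [2, 0.163]], 1000);
const roll = 1 + Math.floor(Math.random() * 1000); // integer in 1..1000
console.log(roll, valueForRoll(ranges, roll));
```

For the weights in the example this yields the ranges 1-837 for the value 1 and 838-1000 for the value 2.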

Answer 3

This is more or less a generic version of what @trinithis wrote, in Java. I did it with ints rather than floats to avoid messy rounding errors.

static class Weighting {

    int value;
    int weighting;

    public Weighting(int v, int w) {
        this.value = v;
        this.weighting = w;
    }

}

public static int weightedRandom(List<Weighting> weightingOptions) {

    //determine sum of all weightings
    int total = 0;
    for (Weighting w : weightingOptions) {
        total += w.weighting;
    }

    //select a random value between 0 and our total
    int random = new Random().nextInt(total);

    //loop thru our weightings until we arrive at the correct one
    int current = 0;
    for (Weighting w : weightingOptions) {
        current += w.weighting;
        if (random < current)
            return w.value;
    }

    //shouldn't happen.
    return -1;
}

public static void main(String[] args) {

    List<Weighting> weightings = new ArrayList<Weighting>();
    weightings.add(new Weighting(0, 8));
    weightings.add(new Weighting(1, 1));
    weightings.add(new Weighting(2, 1));

    for (int i = 0; i < 100; i++) {
        System.out.println(weightedRandom(weightings));
    }
}

Answer 4

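Here are 3 solutions in JavaScript, since I'm not sure which language you want it in. Depending on your needs, one of the first two might work, but the third is probably the easiest to implement for large sets of numbers.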

function randomSimple(){
  return [0,0,0,0,0,0,0,0,1,2][Math.floor(Math.random()*10)];
}

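function randomCase(){
  var n=Math.floor(Math.random()*100);
  // switch(true) runs the first case whose comparison is true;
  // switch(n) would compare n against a boolean and never match.
  switch(true){
    case n<80:
      return 0;
    case n<90:
      return 1;
    case n<100:
      return 2;
  }
}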

function randomLoop(weight,num){
  var n=Math.floor(Math.random()*100),amt=0;
  for(var i=0;i<weight.length;i++){
    //amt+=weight[i]; *alternative method
    //if(n<amt){
    if(n<weight[i]){
      return num[i];
    }
  }
}

weight=[80,90,100];
//weight=[80,10,10]; *alternative method
num=[0,1,2]

Answer 5

How about

int [] numbers = {0,0,0,0,0,0,0,0,1,2};

Then you can pick a random element from numbers: 0 will have an 80% chance, 1 a 10% chance, and 2 a 10% chance.

Answer 6

I use the following:

function weightedRandom(min, max) {
  return Math.round(max / (Math.random() * max + min));
}

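This is my go-to "weighted" random, where I use an inverse function of "x" (where x is a random value between min and max) to generate a weighted result, in which the minimum is the heaviest element and the maximum is the lightest (least chance of being returned).

So basically, using weightedRandom(1, 5) means the chance of getting a 1 is higher than a 2, which is higher than a 3, which is higher than a 4, which is higher than a 5.

It might not be useful for your use case, but it is probably useful for people searching for this same question.

After a run of 100 iterations, it gave me: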

==================
| Result | Times |
==================
|      1 |    55 |
|      2 |    28 |
|      3 |     8 |
|      4 |     7 |
|      5 |     2 |
==================
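The probabilities behind that table can be worked out directly from the formula, because `Math.round(max / (u * max + min))` equals k exactly when the uniform draw u falls in a certain interval. The sketch below is an analysis aid and not part of the answer; `inverseWeightedProbabilities` is a hypothetical name, and it assumes min and max are positive.

```javascript
// Exact distribution of Math.round(max / (u * max + min)) for u uniform in [0, 1).
function inverseWeightedProbabilities(min, max) {
  const clamp = (u) => Math.min(1, Math.max(0, u));
  // uFor(x) is the u at which max / (u * max + min) === x, clamped to [0, 1].
  const uFor = (x) => clamp((max / x - min) / max);

  const smallest = Math.round(max / (max + min)); // result as u approaches 1
  const largest = Math.round(max / min);          // result at u = 0
  const probabilities = {};
  for (let k = smallest; k <= largest; k++) {
    // The result rounds to k when k - 0.5 <= max / (u * max + min) < k + 0.5,
    // and the expression decreases as u grows.
    const upperU = k - 0.5 > 0 ? uFor(k - 0.5) : 1;
    const lowerU = uFor(k + 0.5);
    probabilities[k] = upperU - lowerU;
  }
  return probabilities;
}

console.log(inverseWeightedProbabilities(1, 5));
```

For weightedRandom(1, 5) the probabilities come out at roughly 0.53, 0.27, 0.11, 0.06, and 0.02 for the results 1 to 5, which is in line with the 100-iteration run above. Note that the shape is fixed by min and max, so this approach does not let you choose the individual weights.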

Answer 7

This one is in Mathematica, but it's easy to port to another language. I use it in my games, and it can handle decimal weights:

weights = {0.5, 1, 2}; (* The weights *)
weights = N@weights/Total@weights; (* Normalize weights so that the list's sum is always 1. *)
min = 0; (* First min value should be 0 *)
max = weights[[1]]; (* First max value should be the first element of the newly created weights list. Note that in Mathematica the first element has index 1, not 0. *)
random = RandomReal[]; (* Generate a random float from 0 to 1 *)
For[i = 1, i <= Length@weights, i++,
    If[random >= min && random < max,
        Print["Chosen index number: " <> ToString@i]
    ];
    min += weights[[i]];
    If[i == Length@weights,
        max = 1,
        max += weights[[i + 1]]
    ]
]

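(From here on I'm assuming lists whose first element has index 0.) The idea behind this is that, given a normalized list weights, index n should be returned with probability weights[n], so the distance between min and max at step n should be weights[n]. The total distance from the smallest min (which we set to 0) to the largest max is the sum of the list weights.

The good thing about this is that you don't append to any array or nest for loops, both of which would greatly increase execution time.

Here is the code in C#, which doesn't need to normalize the weights list and drops some of the code: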

int WeightedRandom(List<float> weights) {
    float total = 0f;
    foreach (float weight in weights) {
        total += weight;
    }

    float max = weights [0],
    random = Random.Range(0f, total);

    for (int index = 0; index < weights.Count; index++) {
        if (random < max) {
            return index;
        } else if (index == weights.Count - 1) {
            return weights.Count-1;
        }
        max += weights[index+1];
    }
    return -1;
}

Answer 8

Here are the inputs and ratios: 0 (80%), 1 (10%), 2 (10%)

Let's draw them out so they're easy to visualize.

                0                       1        2
-------------------------------------________+++++++++

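Let's add up the total weight and call it TR, for total ratio. In this case that is 100. Let's randomly pick a number from (0 to TR), which is (0 to 100) in this case, 100 being the total weight. Call it RN, for random number.

So now we have TR as the total weight and RN as a random number between 0 and TR.

Let's imagine we picked a random number from 0 to 100, say 21, so that's effectively 21%.

We must convert/match this to our input numbers, but how?

Let's loop over each weight (80, 10, 10) and keep the sum of the weights we have already visited. The moment that running sum is greater than the random number RN (21 in this case), we stop the loop and return that element's position.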

double sum = 0;
int position = -1;
for (double weight : weights) { // weights holds {80, 10, 10}
    position++;
    sum = sum + weight;
    if (sum > 21) // (80 > 21) so break on first pass
        break;
}
// position will be 0 so we return array[0] --> 0

Let's say the random number (between 0 and 100) is 83. Let's do it again:

double sum = 0;
int position = -1;
for (double weight : weights) { // weights holds {80, 10, 10}
    position++;
    sum = sum + weight;
    if (sum > 83) // (90 > 83) so break
        break;
}

// we did two passes in the loop so position is 1 so we return array[1] --> 1
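The two walkthroughs above can be reproduced with one small function that takes the random number as a parameter. This is a sketch for checking the reasoning, not the answer's code: `pickPosition` is a hypothetical name, and it returns the running sum alongside the position so the stopping condition is visible.

```javascript
// Walk the weights, keeping a running sum, and stop at the first
// position where the sum exceeds the random number rn.
function pickPosition(weights, rn) {
  let sum = 0;
  for (let position = 0; position < weights.length; position++) {
    sum += weights[position];
    if (sum > rn) return { position, sum };
  }
  // Only reached when rn equals the total weight; use the last position.
  return { position: weights.length - 1, sum };
}

const weights = [80, 10, 10];
console.log(pickPosition(weights, 21)); // stops on the first pass, since 80 > 21
console.log(pickPosition(weights, 83)); // stops on the second pass, since 90 > 83

// In real use, RN would be drawn at random from [0, TR):
const totalRatio = weights.reduce((a, b) => a + b, 0);
console.log(pickPosition(weights, Math.random() * totalRatio).position);
```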

Answer 9

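I suggest repeatedly checking each probability against what remains of the random number.

This function first sets the return value to the last possible index, then iterates until the remainder of the random value is smaller than the current probability.

The probabilities have to sum to one.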

function getRandomIndexByProbability(probabilities) {
    var r = Math.random(),
        index = probabilities.length - 1;

    probabilities.some(function (probability, i) {
        if (r < probability) {
            index = i;
            return true;
        }
        r -= probability;
    });
    return index;
}

var i,
    probabilities = [0.8, 0.1, 0.1],
    count = probabilities.map(function () { return 0; });

for (i = 0; i < 1e6; i++) {
    count[getRandomIndexByProbability(probabilities)]++;
}

console.log(count);

Answer 10

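I have a slot machine, and I used the code below to generate random numbers. In probabilitiesSlotMachine, the keys are the outputs of the slot machine and the values represent the weights.

Now, to generate a random output, I use this code:

const random = allSlotMachineResults[Math.floor(Math.random() * allSlotMachineResults.length)]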

Answer 11

8 years late, but here's my solution in 3 lines.

1) Prepare an array for the probability mass function such that

pmf[array_index] = P(X = array_index):

var pmf = [0.8, 0.1, 0.1]

2) Prepare an array for the corresponding cumulative distribution function such that

cdf[array_index] = P(X <= array_index):

var cdf = pmf.map((sum => value => sum += value)(0))
// [0.8, 0.9, 1]

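3a) Generate a random number.

3b) Get an array of the cdf elements that are less than or equal to this number.

3c) Return its length.

// Draw once: calling Math.random() inside the filter would compare
// each element against a different random number.
(r => cdf.filter(el => r >= el).length)(Math.random())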
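For long weight lists, the scan over the cdf can be replaced by a binary search, since the cdf is sorted. The following is a sketch of that variation and goes beyond what the answer itself proposes: `makeCdfSampler` and its optional `rng` parameter are hypothetical names, and scaling by the last cdf entry is an added guard against the sum drifting slightly away from 1.

```javascript
// Precompute the cdf once, then sample in O(log n) per draw.
function makeCdfSampler(pmf, rng = Math.random) {
  let sum = 0;
  const cdf = pmf.map((p) => (sum += p));
  const total = cdf[cdf.length - 1];

  return function sample() {
    const r = rng() * total;
    // Binary search for the first index whose cdf value is greater than r.
    let lo = 0;
    let hi = cdf.length - 1;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (r < cdf[mid]) {
        hi = mid;
      } else {
        lo = mid + 1;
      }
    }
    return lo;
  };
}

const sample = makeCdfSampler([0.8, 0.1, 0.1]);
sample(); // 0, 1 or 2
```

The returned index is the same quantity the filter-and-length approach computes: the number of cdf entries that are less than or equal to the random number.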

Generate a weighted random number
