Question

I have already tried the example from Joe's answer https://stackoverflow.com/a/32187103/2229367 and it works great, but when I tried to edit the code a little:

$pool = new Pool(4);

while (@$i++<10) {
    $pool->submit(new class($i) extends Collectable {
        public function __construct($id) {
            $this->id = $id;
        }

        public function run() {
            printf(
                "Hello World from %d\n", $this->id);
            $this->html = file_get_contents('http://google.fr?q=' . $this->query);
            $this->setGarbage();
        }

        public $id;
        public $html;
    });
}

while ($pool->collect(function(Collectable $work){
    printf(
        "Collecting %d\n", $work->id);
    var_dump($work->html);
    return $work->isGarbage();
})) continue;

$pool->shutdown();

The number of "Hello World" lines differs from the number of "Collecting" lines. The documentation is out of date. What is going on here?

Answer 1

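Worker::collect is not intended to enable you to reap results; it is non-deterministic.

Worker::collect is only intended to run garbage collection on objects referenced in the stack of Worker objects.

If the intention is to process each result as it becomes available, the code might look something like this: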

<?php
$pool = new Pool(4);
$results = new Volatile();
$expected = 10;
$found = 0;

while (@$i++ < $expected) {
    $pool->submit(new class($i, $results) extends Threaded {

        public function __construct($id, Volatile $results) {
            $this->id = $id;
            $this->results = $results;
        }

        public function run() {
            $result = file_get_contents('http://google.fr?q=' . $this->id);

            $this->results->synchronized(function($results, $result){
                $results[$this->id] = $result;
                $results->notify();
            }, $this->results, $result);
        }

        private $id;
        private $results;
    });
}

do {
    $next = $results->synchronized(function() use(&$found, $results) {
        while (!count($results)) {
            $results->wait();
        }

        $found++;

        return $results->shift();
    });

    var_dump($next);
} while ($found < $expected);

while ($pool->collect()) continue;

$pool->shutdown();
?>

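This is obviously not very tolerant of errors, but the main difference is that I use a shared Volatile collection of results, and I synchronize properly to fetch results in the main context as they become available.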
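As a starting point for the error handling that the example leaves out, here is a minimal sketch of a fetch wrapper that reports failure explicitly instead of silently storing false. The function name fetchResult and the shape of the returned array are assumptions made for illustration; how the outcome is then stored in the shared collection is not covered here.

```php
<?php
// Hypothetical helper (not part of pthreads): wraps file_get_contents()
// and returns a small array describing whether the fetch succeeded.
function fetchResult(string $url): array
{
    // file_get_contents() returns false on failure; @ suppresses the warning
    // so that the failure is reported only through the return value.
    $body = @file_get_contents($url);

    if ($body === false) {
        return ['ok' => false, 'url' => $url, 'body' => null];
    }

    return ['ok' => true, 'url' => $url, 'body' => $body];
}

// A path that does not exist is used here so the sketch runs without network access.
$outcome = fetchResult('/nonexistent/path/used-only-for-illustration');
var_dump($outcome['ok']);
```

A task's run() method could call such a helper and decide what to store based on the ok flag, so that the main context can tell a failed request apart from an empty response.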

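If you want to wait for all results to become available, and possibly avoid some contention for locks (which you should always try to avoid if you can), then the code looks simpler, something like: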

<?php
$pool = new Pool(4);
$results = new Volatile();
$expected = 10;

while (@$i++ < $expected) {
    $pool->submit(new class($i, $results) extends Threaded {

        public function __construct($id, Volatile $results) {
            $this->id = $id;
            $this->results = $results;
        }

        public function run() {
            $result = file_get_contents('http://google.fr?q=' . $this->id);

            $this->results->synchronized(function($results, $result){
                $results[$this->id] = $result;
                $results->notify();
            }, $this->results, $result);
        }

        private $id;
        private $results;
    });
}

$results->synchronized(function() use($expected, $results) {
    while (count($results) != $expected) {
        $results->wait();
    }
});

var_dump(count($results));

while ($pool->collect()) continue;

$pool->shutdown();
?>

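It is worth noting that the Collectable interface is already implemented by Threaded in the most recent versions of pthreads, which is the version you should be using ... always ...

The docs are out of date, sorry about that ... one human ...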

Answer 2

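pthreads V3 is much less forgiving than V2. collect is a no-go in V3.

Rule n°1: I run all the queries inside the threads and avoid passing large amounts of data into them. Passing large data was fine with V2, but it no longer is with V3. I keep the arguments passed to workers as small as possible. This also makes processing faster.

Rule n°2: I do not exceed the number of CPU threads available for each pool, and I chunk the work accordingly in a loop. This way I make sure there is no memory overhead from a large number of pools, and each time a loop iteration completes I force a garbage collection. This turned out to be necessary for me because of very high RAM needs across threads. That may not be your case, but make sure the RAM you consume does not exceed your PHP limit. The larger the arguments you pass to the threads, the faster RAM usage grows.

Rule n°3: Properly declare your object arrays in workers with (array) to make sure all results are returned.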
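The batching described in rule n°2 can be sketched on its own, without pthreads, to show how the work is split. The thread count and the placeholder query strings below are assumptions chosen only for illustration.

```php
<?php
// Assumed values for illustration only.
$threads = 4;                      // hypothetical number of CPU threads
$queries = [];
for ($n = 1; $n <= 10; $n++) {
    $queries[] = "SELECT $n";      // placeholder query strings
}

// Split the work so that no batch is larger than the thread count.
$batches = array_chunk($queries, $threads);

foreach ($batches as $index => $batch) {
    // In real code a Pool would be created here, given one task per query
    // in $batch, collected, and shut down before the next batch starts.
    printf("batch %d: %d queries\n", $index, count($batch));

    // Force a collection cycle between batches, as rule n°2 describes.
    gc_collect_cycles();
}
```

With 10 queries and 4 threads this yields three batches of 4, 4 and 2 queries, so at most 4 tasks are in flight in any one pool cycle.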

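Here is a basic rewritten working example that follows the 3 rules and stays as close to your example as I can:

It uses an array of queries to be multithreaded.
It uses a Collectable implementation to grab the results, in place of collect.
It batches pools according to the number of CPU threads, to avoid RAM overhead.
The queries are threaded, each with its own connection rather than one passed across workers.
All results are pushed into an array at the end.

Code: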

    define("SQLHOST", "127.0.0.1");
    define("SQLUSER", "root");
    define("SQLPASS", "password");
    define("SQLDBTA", "mydatabase");

    $Nb_of_th = 12; // number of CPU threads (6 CPU cores in this example)
    $queries = array_chunk($queries, $Nb_of_th); // $queries: whatever list of queries you want to pass to the workers
    $global_data = array(); // all results from all pool cycles

    // first we set the main loop: one pool cycle per chunk of queries
    foreach ($queries as $key => $chunks) {
        $pool = new Pool($Nb_of_th, Worker::class); // 12 workers max per pool
        $workCount = count($chunks);

        // second we launch the submits
        foreach ($chunks as $chunk) {
            $pool->submit(new MyWorkers($chunk));
        }

        $data = []; // pool cycle result array
        $collector = function (\Collectable $work) use (&$data) {
            $isGarbage = $work->isGarbage();
            if ($isGarbage) {
                $data[] = $work->result; // thread result
            }
            return $isGarbage;
        };

        // keep collecting until every submitted task has reported its result
        do {
            $count = $pool->collect($collector);
            $isComplete = count($data) === $workCount;
        } while (!$isComplete);

        array_push($global_data, $data); // push pool results into main

        // complete purge
        unset($data);
        $pool->shutdown();
        unset($pool);
        gc_collect_cycles(); // force garbage collector before new pool cycle
    }

    var_dump($global_data); // results for all pool cycles

    class MyWorkers extends \Threaded implements \Collectable {

        private $isGarbage;
        public $result;
        private $process;

        public function __construct($process) {
            $this->process = $process;
        }

        public function run() {
            // each task opens its own connection instead of sharing one across workers
            $con = new PDO('mysql:host=' . SQLHOST . ';dbname=' . SQLDBTA . ';charset=UTF8', SQLUSER, SQLPASS);
            $proc = (array) $this->process; // important ! avoid volatile destruction in V3
            // assumes each task receives a single SQL string, so the cast yields a one-element array
            $stmt = $con->prepare($proc[0]);
            $stmt->execute();
            $obj = $stmt->fetchAll(PDO::FETCH_ASSOC); // fetch all rows as associative arrays

            /* do whatever you want to do here */
            $this->result = (array) $obj; // important ! avoid volatile destruction in V3
            $this->isGarbage = true;
        }

        public function isGarbage() : bool
        {
            // cast so that an unset (null) flag is reported as false until run() has finished
            return (bool) $this->isGarbage;
        }
    }

Why don't all threads complete?
