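Inserting 1 million records using Laravel DB Seed

Time: 2016-02-16 10:12:08

Tags: php mysql laravel

I am using Faker to generate dummy data and am trying to add 1 million records. For some reason I can only reach about 100,000 rows. Here is my code: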

$no_of_rows = 1000000;

for( $i=1; $i <= $no_of_rows; $i++ ){
        $user_data[] = [
            'status' => 'ACTIVE',
            'username' => $faker->userName,
            'email' => $faker->email,
            'password' => $password,
            'firstname' => $faker->firstName,
            'surname' => $faker->lastName,
            'mobilenumber' => $faker->phoneNumber,
            'confirmed' => (int)$faker->boolean(50),
            'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
            'dob' => $faker->date(),
            'address_line_1' => $faker->address,
            'address_line_2' => '',
            'post_code' => $faker->postcode,
        ];


}

User::insert($user_data);

I get the following error message:

PHP Fatal error:  Allowed memory size of 1073741824 bytes exhausted

I have already set ini_set('memory_limit', '1024M');

Any helpful ideas or solutions?

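4 answers:

Answer 0: (score: 4)

The core of this problem is that the Faker lib instance (commonly used to generate data in Laravel) is memory heavy, and the garbage collector does not clean it up properly when it is used in large loops.

I agree with the chunked handling that @Rob Mkrtchyan added above, but since this is Laravel, I would suggest a more elegant solution using the Factory facility.

You can create a specific model factory (in Laravel 5.3 this should go in database/factories/), for example: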

$factory->define(Tests::class, function (Faker\Generator $faker) {
    return [
        'status' => 'ACTIVE',
        'username' => $faker->userName,
        'email' => $faker->email,
        'password' => bcrypt('secret'),
        'firstname' => $faker->firstName,
        'surname' => $faker->lastName,
        'mobilenumber' => $faker->phoneNumber,
        'confirmed' => (int)$faker->boolean(50),
        'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
        'dob' => $faker->date(),
        'address_line_1' => $faker->address,
        'address_line_2' => '',
        'post_code' => $faker->postcode,
    ];
});

Running the factory from your DB seeder class is then simple. Note that the number 200 is the number of seed data entries to create.

factory(Tests::class, 200)
    ->create();

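The reason to use seed factories is that they give you more flexibility in setting variables and so on. For documentation, you can refer to the Laravel docs on DB seeding.

Now, since you are dealing with a large number of records, it is trivial to implement a chunked solution that helps PHP's garbage collection. For example, 5000 batches of 200 rows give the 1 million records you need: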

for ($i=0; $i < 5000; $i++) {
    factory(Tests::class, 200)
        ->create();
}

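I did a quick test, and in this configuration your script's memory usage should stay at around 12-15 MB regardless of how many data entries are created (depending on other system factors, of course).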
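The batching pattern can be tried without a database. The following is a minimal sketch, not the test described above: createBatch() is a hypothetical stand-in for factory(...)->create(), and the row contents are made up. It does not reproduce Faker's own memory behaviour; it only shows a way to check that nothing accumulates between batches when each batch is built and released inside its own function call. It assumes PHP 7 or later.

```php
<?php
// Hypothetical stand-in for factory(Tests::class, 200)->create():
// builds $count rows, "inserts" them, and lets them go out of scope.
function createBatch(int $count): int
{
    $rows = [];
    for ($i = 0; $i < $count; $i++) {
        $rows[] = [
            'status'   => 'ACTIVE',
            'username' => 'user' . $i,
            'email'    => 'user' . $i . '@example.com',
        ];
    }
    // A real seeder would insert $rows here.
    return count($rows); // $rows can be freed once the function returns
}

$created = 0;
$before  = memory_get_usage();

// 5000 batches of 200 rows = 1,000,000 rows in total.
for ($i = 0; $i < 5000; $i++) {
    $created += createBatch(200);
}

$after = memory_get_usage();

printf(
    "rows: %d, memory growth: %d bytes, peak: %.1f MB\n",
    $created,
    $after - $before,
    memory_get_peak_usage() / 1048576
);
```

Comparing the reported growth with a variant that appends every row to one long-lived array gives a rough idea of how much the batching saves on a given system.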

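Answer 1: (score: 2)

The variable set in the foreach loop is never used, so if the only purpose of the foreach loop is to add one million records, you could drop the foreach and use something like this? That way, the array used to populate the database is re-declared on each iteration instead of having more and more entries appended to it.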

$no_of_rows = 1000000;

for( $i=0; $i < $no_of_rows; $i++ ){
    $user_data = array(
        'status' => 'ACTIVE',
        'username' => $faker->userName,
        'email' => $faker->email,
        'password' => $password,
        'firstname' => $faker->firstName,
        'surname' => $faker->lastName,
        'mobilenumber' => $faker->phoneNumber,
        'confirmed' => (int)$faker->boolean(50),
        'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
        'dob' => $faker->date(),
        'address_line_1' => $faker->address,
        'address_line_2' => '',
        'post_code' => $faker->postcode,
    );

    User::insert( $user_data );
    $user_data=null;
}

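Based on your last comment I can see why chunks were used. There was no way to know about the SQL syntax before posting a reply, so perhaps this is more appropriate?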

$no_of_rows = 1000000;
$range=range( 1, $no_of_rows );
$chunksize=1000;

foreach( array_chunk( $range, $chunksize ) as $chunk ){
    $user_data = array();/* array is re-initialised each major iteration */
    foreach( $chunk as $i ){
        $user_data[] = array(
            'status' => 'ACTIVE',
            'username' => $faker->userName,
            'email' => $faker->email,
            'password' => $password,
            'firstname' => $faker->firstName,
            'surname' => $faker->lastName,
            'mobilenumber' => $faker->phoneNumber,
            'confirmed' => (int)$faker->boolean(50),
            'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
            'dob' => $faker->date(),
            'address_line_1' => $faker->address,
            'address_line_2' => '',
            'post_code' => $faker->postcode
        );      
    }
    User::insert( $user_data );
}
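One possible variation, which is not part of the answer above, is to replace range() and array_chunk() with a generator, so that the one-million-element index array is never allocated. This is a minimal sketch assuming PHP 7 or later; rowChunks() and the $makeRow callback are hypothetical names, with the callback standing in for the Faker calls.

```php
<?php
/**
 * Yields arrays of up to $chunkSize rows until $total rows have been produced.
 * $makeRow is a hypothetical callback that builds one row (e.g. from Faker).
 */
function rowChunks(int $total, int $chunkSize, callable $makeRow): Generator
{
    $chunk = [];
    for ($i = 1; $i <= $total; $i++) {
        $chunk[] = $makeRow($i);
        if (count($chunk) === $chunkSize) {
            yield $chunk;
            $chunk = []; // start a fresh chunk so memory stays bounded
        }
    }
    if ($chunk !== []) {
        yield $chunk; // final, partially filled chunk
    }
}

// Made-up row builder used only for this demonstration.
$makeRow = function (int $i): array {
    return ['status' => 'ACTIVE', 'username' => 'user' . $i];
};

$inserted = 0;
$chunks   = 0;
foreach (rowChunks(2500, 1000, $makeRow) as $chunk) {
    // In a seeder, this is where User::insert($chunk) would be called.
    $inserted += count($chunk);
    $chunks++;
}

echo "$inserted rows in $chunks chunks\n";
```

Only one chunk exists at a time, and a total that is not a multiple of the chunk size is handled by the final yield.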

Answer 2: (score: 1)

Hello: this is a very good and very fast solution for inserting data

$no_of_data = 1000000;
$test_data = array();
for ($i = 0; $i < $no_of_data; $i++){
  $test_data[$i]['number'] = "1234567890";
  $test_data[$i]['message'] = "Test Data";
  $test_data[$i]['status'] = "Delivered";
}
$chunk_data = array_chunk($test_data, 1000);
if (isset($chunk_data) && !empty($chunk_data)) {
  foreach ($chunk_data as $chunk_data_val) {
     DB::table('messages')->insert($chunk_data_val);
  }
}

Answer 3: (score: 0)

Hello: here is a good solution

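public function run(){
    // Assumption: the original snippet does not show where $faker and $password
    // come from; these two lines are placeholders.
    $faker = \Faker\Factory::create();
    $password = bcrypt('secret');

    // 1000 batches of 1000 rows = 1,000,000 rows
    for($j = 0; $j < 1000; $j++){
        // Reset the batch so that only 1000 rows are held in memory
        // and earlier rows are not inserted again.
        $user_data = [];
        for($i = 0; $i < 1000; $i++){
             $user_data[] = [
                 'status' => 'ACTIVE',
                 'username' => $faker->userName,
                 'email' => $faker->email,
                 'password' => $password,
                 'firstname' => $faker->firstName,
                 'surname' => $faker->lastName,
                 'mobilenumber' => $faker->phoneNumber,
                 'confirmed' => (int)$faker->boolean(50),
                 'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
                 'dob' => $faker->date(),
                 'address_line_1' => $faker->address,
                 'address_line_2' => '',
                 'post_code' => $faker->postcode,
             ];
        }

        User::insert($user_data);
    }
}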

This code only keeps an array of length 1000 in memory, so you can run it without changing any default PHP settings.

Enjoy.