How to fetch data from a big table row by row

Date: 2018-06-15 12:05:37

Tags: mysql perl dbi

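I need to fetch all the data from a MySQL table. What I have tried so far is:

my $query = $connection->prepare("select * from table");
$query->execute();
while (my @row = $query->fetchrow_array) {
    #....do stuff
}

But there is always a but...

The table has about 600M rows, and apparently all the results of the query are stored in memory after the execute() command. There is not enough memory for that :(

My question is:

Is there a way to use Perl DBI to fetch data from the table row by row? Something like this:

my $query = $connection->prepare("select * from table");
while (my @row = $query->fetchrow_array) {
    #....do stuff
}

By the way, pagination is too slow :/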

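3 answers:

Answer 0 (score: 2)

The fetchall_arrayref method takes two parameters, and the second one lets you limit the number of rows fetched from the table at a time.

The following code reads 1,000 rows at a time from the table and processes each row: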

my $sth = $dbh->prepare("SELECT * FROM table");
$sth->execute;

while ( my $chunk = $sth->fetchall_arrayref( undef, 1000 ) ) {

    last unless @$chunk;    # Empty array returned at end of table

    for my $row ( @$chunk ) {

        print format_row(@$row);
    }
}

Answer 1 (score: 2)

"apparently all the results of the query are stored in memory after the execute() command"

That is the default behaviour of the mysql client library. You can disable it by using the mysql_use_result attribute on the database or statement handle.

Note that the read lock you'll have on the table will be held much longer while all the rows are being streamed to the client code. If that might be a concern you may want to use SQL_BUFFER_RESULT.
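
As a rough illustration of the attribute mentioned above, here is a minimal sketch that sets mysql_use_result on a single statement handle. The helper name stream_rows and its callback argument are hypothetical, and $dbh is assumed to be an already connected DBD::mysql database handle.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI): pass every row returned by $sql
# to $callback, one row at a time, and return the number of rows seen.
# $dbh is assumed to be a connected DBD::mysql database handle.
sub stream_rows {
    my ($dbh, $sql, $callback) = @_;

    # mysql_use_result asks the MySQL client library not to buffer the
    # whole result set; rows are pulled from the server as they are fetched.
    my $sth = $dbh->prepare($sql, { mysql_use_result => 1 });
    $sth->execute;

    my $count = 0;
    while (my $row = $sth->fetchrow_arrayref) {
        # DBI reuses the same array reference for each row,
        # so copy the values if they must outlive this iteration.
        $callback->($row);
        $count++;
    }
    return $count;
}
```

A call might look like stream_rows($dbh, "SELECT * FROM table", sub { my ($row) = @_; ... }). While such a result set is still being read, the same connection generally cannot run other queries, so the callback would need a second connection for any writes.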

Answer 2 (score: 0)

When working with huge tables, I fetch the data in packages, using dynamically built SQL statements such as

$sql = "SELECT * FROM table WHERE id>" . $lastid . " ORDER BY id LIMIT " . $packagesize

The application fills in $lastid dynamically according to each package it processes.

If table has an ID field id, it also has an index built on that field, so performance is quite good.
It also limits the database load through short pauses between queries.
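
The package-by-package approach described above could be sketched roughly as follows. The helper name process_in_packages, the callback, and the pause argument are hypothetical. The sketch also assumes that id is a positive integer key and that $dbh is an already connected DBI handle. Unlike the concatenated statement in the answer, it passes the last seen id through a placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the table in packages of $package_size rows,
# ordered by the indexed id column, and return the number of rows handled.
# `table` and `id` are the placeholder names used in the answer.
sub process_in_packages {
    my ($dbh, $package_size, $callback, $pause_seconds) = @_;
    die "package size must be a positive integer\n"
        unless defined $package_size && $package_size =~ /^[1-9][0-9]*$/;
    $pause_seconds //= 0;

    # LIMIT is interpolated as a validated integer; the id is bound
    # through a placeholder on every execute.
    my $sql = sprintf 'SELECT * FROM table WHERE id > ? ORDER BY id LIMIT %d',
        $package_size;
    my $sth = $dbh->prepare($sql);

    my $last_id = 0;    # assumption: all ids are greater than 0
    my $total   = 0;
    while (1) {
        $sth->execute($last_id);
        my $rows = $sth->fetchall_arrayref({});    # each row as a hash ref
        last unless @$rows;

        $callback->($_) for @$rows;
        $last_id = $rows->[-1]{id};    # resume after the last id seen
        $total  += @$rows;

        last if @$rows < $package_size;    # a short package means the end
        # Short rest between queries to limit database load.
        select(undef, undef, undef, $pause_seconds) if $pause_seconds;
    }
    return $total;
}
```

Because each query starts from the last id rather than from an offset, the cost of a package should stay roughly constant however far into the table the loop has advanced, which is what distinguishes this from OFFSET-based pagination.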
