Optimizing Python MySQL Connector speed

Asked: 2015-10-16 20:29:54

Tags: python mysql csv pandas

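I have a table in MySQL, prices, whose columns include contract_id, datetime, open, high, low and close. I read from it in Python like this: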

import pandas as pd
import mysql.connector 

con = mysql.connector.connect(**CONFIG) 
cur = con.cursor()

def get_data1():
    df = pd.read_sql(
        """
        SELECT datetime, open, high, low, close 
        FROM prices
        WHERE contract_id = 1 
            AND datetime >= '2015-01-01 09:00:00' 
            AND datetime <= '2015-10-15 16:00:00'; 
        """, con)
    return df

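The table is rather large (> 300 million rows), but queries run inside the database complete within half a second, even when they return 300,000 rows. Retrieving the same data from Python with get_data1 above, however, is very slow: the same query goes from 0.5 seconds in MySQL Workbench to 34 seconds in Python. For comparison, here is a version that exports to a CSV file first: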

def get_data2():
    cur.execute(
        """
        SELECT datetime, open, high, low, close 
        FROM prices
        WHERE contract_id = 1 
            AND datetime >= '2015-01-01 09:00:00' 
            AND datetime <= '2015-10-15 16:00:00'
        INTO OUTFILE 'C:/Data/Temp.csv'
        FIELDS TERMINATED BY ','
        ENCLOSED BY '"'
        LINES TERMINATED BY "\n";
        """)
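    # INTO OUTFILE writes no header row, so supply the column names explicitly.
    return pd.read_csv('C:/Data/Temp.csv', header=None,
                       names=['datetime', 'open', 'high', 'low', 'close'])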

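I found that exporting the data from MySQL to a flat file and then reading it in Python, as get_data2 above does, is 23 times faster than querying the database directly with get_data1.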

How is this possible? I suspect it has something to do with data type conversion. Any idea how to make the function get_data1 faster without having to export to CSV first? Thanks.
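A minimal sketch of how timings like the ones quoted here could be measured. The helper name time_call is hypothetical, and the stand-in workload only exists so the snippet runs without a database; for a real comparison it would be swapped for get_data1 or get_data2.

```python
import time

def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

def _stand_in():
    # Placeholder workload (assumption): replace with get_data1 or get_data2.
    return sum(range(100_000))

result, seconds = time_call(_stand_in)
print(f"{_stand_in.__name__}: {seconds:.3f} s")
```

A single run is noisy, so repeating each call a few times and comparing the best or median time is usually more informative.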

1 Answer:

Answer 0: (score: 0)

The following solution is 3 times faster than the initial one (12 seconds versus 34 seconds):

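import mysql.connector
import mysql.connector.conversion

con = mysql.connector.connect(**CONFIG)
cur = con.cursor()

class MySQLConverter(mysql.connector.conversion.MySQLConverter):
    # Return DECIMAL columns as float instead of decimal.Decimal.
    def _DECIMAL_to_python(self, value, desc=None):
        return float(value)
    # NEWDECIMAL columns get the same treatment.
    _NEWDECIMAL_to_python = _DECIMAL_to_python

# Use the custom converter for rows fetched on this connection.
con.set_converter_class(MySQLConverter)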

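It converts MySQL decimal types to Python float instead of decimal.Decimal, which is faster. It is still much slower than the "CSV solution", which takes 1.57 seconds to complete. Still digging...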
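To illustrate the conversion cost mentioned above, here is a rough sketch that parses the same text values into decimal.Decimal and into float and times both. The sample values and the helper names to_decimals and to_floats are made up for illustration. It does not reproduce the connector's actual code path, and the timings depend on the machine and Python version.

```python
import timeit
from decimal import Decimal

# Hypothetical sample: 300,000 price-like values as raw bytes (assumption).
raw_values = [b"1234.5678900000"] * 300_000

def to_decimals(values):
    # Decode each value and build a decimal.Decimal from the text.
    return [Decimal(v.decode()) for v in values]

def to_floats(values):
    # Decode each value and build a float from the text.
    return [float(v.decode()) for v in values]

decimal_seconds = timeit.timeit(lambda: to_decimals(raw_values), number=1)
float_seconds = timeit.timeit(lambda: to_floats(raw_values), number=1)
print(f"Decimal: {decimal_seconds:.3f} s, float: {float_seconds:.3f} s")
```

Note that float trades exactness for speed: values stored as DECIMAL may no longer round-trip exactly once converted.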