Why is there a large insert performance difference between the SQLAlchemy Boolean and Integer types in Python?

Time: 2013-06-28 10:16:01

Tags: python sqlite sqlalchemy

Using Python and SQLAlchemy to store the same value in a SQLite database as either a Boolean or an Integer produces the following results.

Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 62.5009999275 secs
SqlAlchemy Core: Total time for 40000 records 56.0600001812 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 5.72099995613 secs
SqlAlchemy Core: Total time for 40000 records 0.770999908447 secs

Why does using the Boolean type cause such a performance problem?

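I know that SQLite has no native Boolean type and instead stores such values as the integers 1 (True) or 0 (False). I had assumed SQLAlchemy would simply map a Python bool to a SQLite integer.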
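The storage behaviour described above can be checked without SQLAlchemy. The following is a minimal sketch using only the standard-library sqlite3 module; the helper name roundtrip_bool and the table name t are made up for illustration and are not part of the script in this question.

```python
import sqlite3


def roundtrip_bool(value):
    """Insert a Python bool into a BOOLEAN column and read back
    the stored value together with SQLite's storage class."""
    conn = sqlite3.connect(":memory:")
    try:
        # SQLite accepts BOOLEAN as a declared type but has no
        # dedicated boolean storage class.
        conn.execute("CREATE TABLE t (value BOOLEAN)")
        conn.execute("INSERT INTO t (value) VALUES (?)", (value,))
        row = conn.execute("SELECT value, typeof(value) FROM t").fetchone()
        return row
    finally:
        conn.close()


if __name__ == "__main__":
    print(roundtrip_bool(True))   # (1, 'integer')
    print(roundtrip_bool(False))  # (0, 'integer')
```

This only shows what SQLite stores. It does not explain the timing difference, which depends on what SQLAlchemy does before the value reaches the driver.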

The script used to generate the output above (modified from another question):

import time
import sqlite3

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String,  create_engine, Boolean
from sqlalchemy.orm import scoped_session, sessionmaker

Base = declarative_base()
DBSession = scoped_session(sessionmaker())

class CustomerInteger(Base):
    __tablename__ = "customerInteger"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    value = Column(Integer)

class CustomerBoolean(Base):
    __tablename__ = "customerBoolean"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    value = Column(Boolean)

def init_sqlalchemy(dbname = 'sqlite:///sqlalchemy.db'):
    global engine
    engine = create_engine(dbname, echo=False)
    DBSession.remove()
    DBSession.configure(bind=engine, autoflush=False, expire_on_commit=False)
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)

def test_sqlalchemy_orm(n, table):
    init_sqlalchemy()
    t0 = time.time()
    for i in range(n):
        customer = table()
        customer.name = 'NAME ' + str(i)
        customer.value = True
        DBSession.add(customer)
        if i % 1000 == 0:
            DBSession.flush()
    DBSession.commit()
    print("SqlAlchemy ORM: Total time for " + str(n) + " records " + str(time.time() - t0) + " secs")


def test_sqlalchemy_core(n, table):
    init_sqlalchemy()
    t0 = time.time()
    engine.execute(
        table.__table__.insert(),
        [{"name":'NAME ' + str(i), "value":True } for i in range(n)]
    )
    print("SqlAlchemy Core: Total time for " + str(n) + " records " + str(time.time() - t0) + " secs")


if __name__ == '__main__':

    print("Value stored as Boolean:")
    test_sqlalchemy_orm(40000, CustomerBoolean)
    test_sqlalchemy_core(40000, CustomerBoolean)

    print("Value stored as Integer:")
    test_sqlalchemy_orm(40000, CustomerInteger)
    test_sqlalchemy_core(40000, CustomerInteger)

1 Answer:

Answer 0 (score: 3)

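I ran the test on three configurations. There is a difference in run time between Boolean and Integer, but it is nowhere near 10x. You may want to try switching to a different Python version.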

PS. I ran my tests on a machine with a Core i5 M430 CPU running Windows 8.

I would also suggest running a profiler to see where SQLAlchemy spends its time on your system.
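One way to follow the profiler suggestion is the standard-library cProfile module. This is a rough sketch, not a prescribed procedure: the helper name profile_call and the choice of showing the top 20 entries sorted by cumulative time are assumptions made for illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return a text report of the
    most expensive calls, sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args, **kwargs)

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show only the 20 entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(20)
    return stream.getvalue()
```

With the script from the question loaded, something like print(profile_call(test_sqlalchemy_core, 40000, CustomerBoolean)) followed by the same call for CustomerInteger would let you compare the two reports and see which functions account for the extra time in the Boolean case.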

1)

python: 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.7.8
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 8.84400010109 secs
SqlAlchemy Core: Total time for 40000 records 0.725000143051 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 8.0680000782 secs
SqlAlchemy Core: Total time for 40000 records 0.443000078201 secs

2)

python: 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.8.1
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 9.69299983978 secs
SqlAlchemy Core: Total time for 40000 records 0.572000026703 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 9.35899996758 secs
SqlAlchemy Core: Total time for 40000 records 0.40700006485 secs

3)

python: 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.8.1
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 8.531000137329102 secs
SqlAlchemy Core: Total time for 40000 records 0.7139999866485596 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 8.023000001907349 secs
SqlAlchemy Core: Total time for 40000 records 0.44099998474121094 secs