SQLAlchemy - specifying a relationship where the foreign key may not exist

Time: 2015-06-16 15:13:32

Tags: python sqlalchemy foreign-keys

I need to set up two tables in a database, and I am struggling to decide how to design them in SQLAlchemy.

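Table 1 contains raw address data and the source of each address. A raw address may appear more than once if it comes from different sources.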

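Table 2 contains geocoded versions of these addresses. Each address appears only once. An address should appear in this table only if it appears at least once in table 1.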

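When new addresses enter the system, they are first inserted into table 1. I will then have a script that looks for records in table 1 that are not in table 2, geocodes them, and inserts them into table 2.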

I have the following code:

class RawAddress(Base):
    __tablename__ = 'rawaddresses'

    id = Column(Integer, primary_key=True)
    source_of_address = Column(String(50))

    # Want something like a foreign key here, but address may not yet exist
    # in geocoded address table
    full_address = Column(String(400))


class GeocodedAddress(Base):
    __tablename__ = 'geocodedaddresses'

    full_address = Column(String(400), primary_key=True)
    lat = Column(Float)
    lng = Column(Float)

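Is there a way to set up a relationship between the full_address fields in SQLAlchemy? Or maybe my design is wrong: perhaps whenever I see a new raw address I should add it to the GeocodedAddress table, with a flag saying whether it has been geocoded yet?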

Many thanks for any help with this.

1 answer:

Answer 0: (score: 1)

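Given your comments, the code below should do the job: it allows this kind of data storage and covers the insert/update process. A few remarks first: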

  • A foreign key can be NULL, so your FK idea still works.
  • You can define the relationship on either model and use backref to name the other side.
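As a plain-SQL illustration of the first remark, here is a minimal sketch that uses Python's built-in sqlite3 module rather than SQLAlchemy. With foreign key enforcement switched on, the FK column accepts NULL but rejects a value that has no matching row in geocodedaddresses. The raw_text column and the 'feed-a' source are hypothetical additions for this example and are not part of the models in this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE geocodedaddresses ("
    " full_address TEXT PRIMARY KEY, lat REAL, lng REAL)"
)
conn.execute(
    "CREATE TABLE rawaddresses ("
    " id INTEGER PRIMARY KEY,"
    " source_of_address TEXT,"
    " raw_text TEXT NOT NULL,"  # hypothetical column holding the raw input
    " full_address TEXT REFERENCES geocodedaddresses(full_address))"
)

# A NULL foreign key is accepted: the row is simply "not geocoded yet".
conn.execute(
    "INSERT INTO rawaddresses (source_of_address, raw_text, full_address)"
    " VALUES ('feed-a', 'Paris', NULL)"
)

# A non-NULL value with no matching parent row is rejected.
try:
    conn.execute(
        "INSERT INTO rawaddresses (source_of_address, raw_text, full_address)"
        " VALUES ('feed-a', 'Paris', 'Paris')"
    )
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM rawaddresses").fetchone()[0]
print(dangling_rejected, row_count)
```

So on a database that enforces foreign keys, the FK column can only hold NULL or the key of an existing geocoded row; keeping the raw text in a separate column is one way to live with that.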

Code:

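# Model definitions

from sqlalchemy import Column, Float, ForeignKey, Integer, String
from sqlalchemy.orm import relationship

# Base is assumed to be the declarative base from the question's setup,
# e.g. Base = declarative_base().


class RawAddress(Base):
    __tablename__ = 'rawaddresses'

    id = Column(Integer, primary_key=True)
    source_of_address = Column(String(50))

    # Nullable FK to GeocodedAddress.full_address; the column type is
    # taken from the referenced column.
    full_address = Column(
        ForeignKey('geocodedaddresses.full_address'),
        nullable=True,
    )


class GeocodedAddress(Base):
    __tablename__ = 'geocodedaddresses'

    full_address = Column(String(400), primary_key=True)
    lat = Column(Float)
    lng = Column(Float)

    # One geocoded address to many raw addresses; the backref adds
    # RawAddress.geocoded_address on the other side.
    raw_addresses = relationship(RawAddress, backref="geocoded_address")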

Now:

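# logic

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# Assumed setup, not shown in the original answer: an in-memory SQLite
# database and a session with SQLAlchemy 1.4+ transaction behavior.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()


def get_geo(full_address):
    """Dummy geocoder: derive a fake (lat, lng) pair from hash(full_address)."""
    hs = hash(full_address)
    return (hs >> 8) & 0xff, hs & 0xff


def add_test_data(addresses):
    # Note: this stores the raw text in the FK column before a matching
    # geocoded row exists. SQLite accepts that because it does not enforce
    # foreign keys by default; a database that enforces them would reject it.
    with session.begin():
        for fa in addresses:
            session.add(RawAddress(full_address=fa))


def add_geo_info():
    with session.begin():
        # Raw addresses that have no related GeocodedAddress row yet
        q = (session
             .query(RawAddress)
             .filter(~RawAddress.geocoded_address.has())
             )
        for ra in q.all():
            print("Computing geo for: {}".format(ra))
            lat, lng = get_geo(ra.full_address)
            ra.geocoded_address = GeocodedAddress(
                full_address=ra.full_address, lat=lat, lng=lng)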
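The filter ~RawAddress.geocoded_address.has() asks for raw rows that have no related geocoded row, which in SQL terms is an anti-join. A rough equivalent, sketched with the standard sqlite3 module and hand-written SQL (not the exact statement SQLAlchemy emits), might look like this. The sample sources and the 0.0 placeholder coordinates are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geocodedaddresses ("
    " full_address TEXT PRIMARY KEY, lat REAL, lng REAL)"
)
conn.execute(
    "CREATE TABLE rawaddresses ("
    " id INTEGER PRIMARY KEY, source_of_address TEXT, full_address TEXT)"
)
conn.executemany(
    "INSERT INTO rawaddresses (source_of_address, full_address) VALUES (?, ?)",
    [
        ("feed-a", "Paris"),
        ("feed-b", "Paris"),
        ("feed-a", "somewhere in Nevada"),
    ],
)
# Only 'Paris' has been geocoded so far (placeholder coordinates).
conn.execute("INSERT INTO geocodedaddresses VALUES ('Paris', 0.0, 0.0)")

# Anti-join: raw addresses with no matching geocoded row.
pending = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT r.full_address"
        " FROM rawaddresses AS r"
        " WHERE NOT EXISTS ("
        "   SELECT 1 FROM geocodedaddresses AS g"
        "   WHERE g.full_address = r.full_address)"
        " ORDER BY r.full_address"
    )
]
print(pending)
```

The DISTINCT is a design choice in this sketch: if the same new address arrives from two sources in one batch, it is listed once, so it would be geocoded and inserted only once.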

And some tests:

# step-1: add some raw addresses
add_test_data(['Paris', 'somewhere in Nevada'])
print("-"*80)

# step-2: geocode the raw addresses that have no geo data yet
add_geo_info()
print("-"*80)

# step-1 again: one new address, one already seen
add_test_data(['Paris', 'somewhere in Chicago'])
print("-"*80)

# step-2 again: geocode the raw addresses that have no geo data yet
add_geo_info()
print("-"*80)


# check: the Paris geo row should be linked to both raw Paris rows
gp = session.query(GeocodedAddress).filter(GeocodedAddress.full_address == 'Paris').one()
assert 2 == len(gp.raw_addresses)
print(gp.raw_addresses)