How can I cache/memoize a pandas DataFrame (an expensive computation) between requests in the Pyramid framework?

Time: 2017-02-14 17:58:56

Tags: python-3.x pandas pyramid

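I have a class that computes some hits. The computation and data retrieval are fairly expensive, so for a given set of parameters I would like to keep the result of "calculate_hits". That way, when the results are needed again hours or more later, the computation and retrieval do not have to be repeated.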

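Naively, I tried putting the result into the session object in Pyramid. That did not work because the pandas DataFrame is too large (and there may be other problems as well).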

So how should I do this?

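[For additional context: I compute some values in a pandas DataFrame, then render the whole table in a jQuery DataTable. I use the front end to find the selected rows and send back a list of them. Now I want to go to those rows in the DataFrame and take information from it to save to the database.]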

1 answer:

Answer 0: (score: 1)

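You need to define some kind of synchronized data store that can be shared across requests. If it is not something external (redis, memcache, an RDBMS, ...), then you are probably asking about an in-memory store with a lock. You can attach such a store to the registry and access it from every request. It is your responsibility to understand the threading issues here and lock the store properly so that two or more requests do not update it at the same time.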

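import pandas as pd
from pyramid.config import Configurator

def main(global_config, **settings):
    config = Configurator(settings=settings)
    config.registry.mystore = {'frame': pd.DataFrame()}  # shared across requests
    return config.make_wsgi_app()

def view(request):
    frame = request.registry.mystore['frame']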
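
The "in-memory store with a lock" idea can be sketched roughly as follows. This is only a minimal illustration, not a Pyramid API: the class name LockedStore, the method get_or_compute, and the stand-in function expensive are all hypothetical, and the sketch assumes that holding the lock for the whole computation is acceptable.

```python
import threading


class LockedStore:
    """Hypothetical in-memory cache guarded by a lock (not part of Pyramid)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get_or_compute(self, key, compute):
        # Holding the lock while computing keeps the sketch simple: only one
        # thread computes at a time, and a given key is computed at most once.
        with self._lock:
            if key not in self._data:
                self._data[key] = compute()
            return self._data[key]


calls = []


def expensive():
    # Stand-in for the costly calculation; records how often it runs.
    calls.append(1)
    return {"hits": [1, 2, 3]}


store = LockedStore()
first = store.get_or_compute(("param_a", 10), expensive)
second = store.get_or_compute(("param_a", 10), expensive)  # served from the cache
```

Such an object could be attached in main as config.registry.mystore = LockedStore() and used from a view through request.registry.mystore. Note that the store lives in one process only, so each worker process would keep its own copy.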

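As a side note, I do not know whether pandas DataFrames are thread-safe, but I would bet they are not, so you will need to work around that somehow, for example by serializing to a more primitive form and deserializing into a new DataFrame on each request.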
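
One way to read the serialize/deserialize suggestion is sketched below, assuming the standard-library pickle module is an acceptable "more primitive form". The helper names freeze and thaw and the sample columns are hypothetical. The shared store would hold only the bytes, and each request would rebuild its own DataFrame.

```python
import pickle

import pandas as pd


def freeze(frame):
    # Turn the DataFrame into immutable bytes that are safe to share.
    return pickle.dumps(frame, protocol=pickle.HIGHEST_PROTOCOL)


def thaw(blob):
    # Build a fresh, independent DataFrame for the current request.
    return pickle.loads(blob)


original = pd.DataFrame({"hit_id": [1, 2, 3], "score": [0.9, 0.5, 0.1]})
blob = freeze(original)

copy_a = thaw(blob)
copy_a.loc[0, "score"] = 0.0  # mutates this request's copy only

copy_b = thaw(blob)  # unaffected by the change made to copy_a
```

Because every request works on its own copy, concurrent requests never touch the same DataFrame object. Pickle should only be used on data the application produced itself, since unpickling untrusted bytes can execute arbitrary code.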
