Optimizing the intersection of large sets

Time: 2019-02-06 14:04:02

Tags: python optimization

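The premise is simple: I have two integers a and b, and I want to find every integer i such that a + i and b + i are both in a given list. The list rs is very large (10e9 items). I have the following code: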

def getlist(a,b):
    a1 = set([i - a for i in rs if i>a])
    b1 = set([i-b for i in rs if i>b]) 

    tomp = list(a1.intersection(b1))
    return tomp

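The problem is that a1 and b1 are both fully computed up front, which causes memory issues. Can I optimize the code somehow? General comments on the approach are also welcome.

Example input: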

rs = [4,9,16]
a = 3
b = 8

Expected output:

getlist(3,8) = [1]

2 answers:

Answer 0: (score: 4)

You can reduce memory usage by skipping the creation of the second set (and of the intermediate lists):

def getlist(a, b):
    a1 = {i - a for i in rs if i > a}
    return [i - b for i in rs if i > b and i - b in a1]

This solution takes O(n) time (expected, given constant-time set lookups) and O(n) space.

Answer 1: (score: 2)
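A self-contained sketch of the approach above, checked against the example from the question. Passing rs as an explicit parameter is an assumption made here so the snippet runs on its own; the original reads rs from the enclosing scope.

```python
def getlist(rs, a, b):
    # Build only one set: every offset i > 0 such that a + i is in rs.
    a1 = {x - a for x in rs if x > a}
    # Second pass over rs: keep x - b only if it is also a valid offset for a.
    # No second set and no intermediate list are materialized.
    return [x - b for x in rs if x > b and x - b in a1]


rs = [4, 9, 16]
print(getlist(rs, 3, 8))  # [1], because 3 + 1 = 4 and 8 + 1 = 9 are both in rs
```

Note that if rs contains duplicate values, the returned list can contain duplicates as well; wrap the result in set() if that matters.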

If rs is already a set, this will be faster:

def getlist(a, b):
    return [i - a for i in rs if i > a and b + (i - a) in rs]

If it is not, you have to build the set first (otherwise the algorithm above will be very slow), and the performance is basically the same as before:

def getlist(a, b):
    rs_set = set(rs)
    return [i - a for i in rs_set if i > a and b + (i - a) in rs_set]

However, if you are going to call the same function many times for different values of a and b but the same rs, you can convert rs to a set once and reuse it each time.
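One way to convert rs to a set once and reuse it across calls is a small closure. This is only a sketch: the factory name make_getlist is hypothetical and not part of the original answer.

```python
def make_getlist(rs):
    # Hypothetical helper: pay the O(n) set construction once, up front.
    rs_set = frozenset(rs)

    def getlist(a, b):
        # For each x = a + i in the set (i > 0), check whether b + i is there too.
        return [x - a for x in rs_set if x > a and b + (x - a) in rs_set]

    return getlist


getlist = make_getlist([4, 9, 16])
print(getlist(3, 8))  # [1]
print(getlist(0, 5))  # [4], because 0 + 4 = 4 and 5 + 4 = 9 are both in rs
```

Each call still scans the whole set, so a query costs O(n) expected time, but the set itself is built only once no matter how many (a, b) pairs are queried.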