Using SQLite, how do I calculate the maximum year-on-year growth rate for each year?

Time: 2020-07-13 04:21:08

Tags: sql sqlite

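I'm learning SQL and working through the "World Populations SQL Practice" exercise on Codecademy. One table has three columns: country, population, and year. I'd like to find the country with the highest growth rate for each year. (Codecademy doesn't ask for this; I just thought it was an interesting idea.)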

I can calculate all the year-on-year growth rates with this query:

SELECT country,
       100.0 * ((SELECT population FROM population_years AS p2
                 WHERE p2.year = p1.year + 1
                 AND p2.country = p1.country)
                 - population) / population AS year_on_year_growth,
       year
FROM population_years AS p1
WHERE year_on_year_growth IS NOT NULL
ORDER BY year_on_year_growth;

And I can calculate the maximum year-on-year growth rate for a specific year (for example, 2005) with this query:

SELECT country,
       100.0 * ((SELECT population FROM population_years AS p2
                 WHERE p2.year = p1.year + 1
                 AND p2.country = p1.country)
                 - population) / population AS year_on_year_growth,
       year
FROM population_years AS p1
WHERE year = 2005
AND year_on_year_growth IS NOT NULL
ORDER BY year_on_year_growth DESC
LIMIT 1;

In Python, I can solve this using the first query, saved as yoy_query:

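yoy_result = c.execute(yoy_query).fetchall()
# Rows are (country, growth, year): keep each year's maximum-growth row(s), sorted by year.
max_by_year = {year: max(r[1] for r in yoy_result if r[2] == year) for year in {r[2] for r in yoy_result}}
sorted([r for r in yoy_result if r[1] == max_by_year[r[2]]], key=lambda r: r[2])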

I get the expected result:

[('Montserrat', 7.34177215189872, 2000), ('Montserrat', 13.4433962264151, 2001), ('Afghanistan', 5.803891762260126, 2002), ('Montserrat', 10.467706013363028, 2003), ('Liberia', 4.7976709085316545, 2004), ('Jordan', 7.088496587486171, 2005), ('Jordan', 6.764378108744186, 2006), ('Montserrat', 12.638580931263864, 2007), ('Liberia', 4.157111008408977, 2008), ('Niger', 3.737166190281749, 2009)]

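But I can't figure out a way to do this in SQL alone. Any ideas? I think it seems easier in Python because I can store the intermediate result and then run a second calculation on it.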

2 answers:

Answer 0: (score: 1)

You can do this with the window functions LAG() and RANK():

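select country, year_on_year_growth, year
from (
  select *, rank() over (partition by year order by year_on_year_growth desc) as rnk
  from (
    select *,
      -- 1.0 * guards against integer division in case population is stored as an integer
      100.0 * (1.0 * population / lag(population) over (partition by country order by year) - 1) as year_on_year_growth
    from population_years
  )
  where year_on_year_growth is not null -- skip each country's first year, which has no previous row
)
where rnk = 1 -- keep only the top-ranked country (or tied countries) for each year
order by year;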

The expression:

lag(population) over (partition by country order by year)

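returns the country's population for the previous year (assuming there are no gaps between years).
So I calculate the growth rate as:

((population in the current year) / (population in the previous year)) - 1

multiplied by 100 to give a percentage. Note that this assigns each growth figure to the later of the two years, whereas the query in the question assigns it to the earlier one.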
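
As a rough, self-contained way to try the LAG()/RANK() approach, the sketch below runs it from Python's sqlite3 module against a small in-memory table. The rows, the country names A and B, and the helper top_growth_per_year are made up for illustration and are not part of the Codecademy data. It assumes SQLite 3.25 or newer (the release that added window functions), keeps only rank 1, and drops the NULL growth of each country's first year.

```python
import sqlite3

# Hypothetical toy data (country, population, year); not the Codecademy dataset.
ROWS = [
    ("A", 100.0, 2000), ("A", 110.0, 2001), ("A", 121.0, 2002),
    ("B", 200.0, 2000), ("B", 210.0, 2001), ("B", 252.0, 2002),
]

QUERY = """
SELECT country, year_on_year_growth, year
FROM (
  SELECT *, RANK() OVER (PARTITION BY year ORDER BY year_on_year_growth DESC) AS rnk
  FROM (
    SELECT country, year,
           100.0 * (population / LAG(population) OVER (PARTITION BY country ORDER BY year) - 1)
             AS year_on_year_growth
    FROM population_years
  )
  WHERE year_on_year_growth IS NOT NULL  -- first year per country has no previous row
)
WHERE rnk = 1
ORDER BY year
"""


def top_growth_per_year(rows):
    """Load rows into an in-memory table and return the top-growth row(s) per year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE population_years (country TEXT, population REAL, year INTEGER)"
    )
    conn.executemany("INSERT INTO population_years VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for country, growth, year in top_growth_per_year(ROWS):
        print(year, country, round(growth, 2))
```

With this toy data, country A leads in 2001 (about 10%) and country B leads in 2002 (about 20%). Because LAG() looks backwards, each growth figure is attributed to the later year of the pair.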

Answer 1: (score: 0)

I suppose the simplest thing is actually just to use a view, like this:

CREATE VIEW yoy_growth
AS
SELECT country,
       100.0 * ((SELECT population FROM population_years AS p2
                 WHERE p2.year = p1.year + 1
                 AND p2.country = p1.country)
                 - population) / population AS year_on_year_growth,
       year
FROM population_years AS p1
WHERE year_on_year_growth IS NOT NULL
ORDER BY year_on_year_growth;

SELECT * FROM yoy_growth AS y1
WHERE year_on_year_growth = (
    SELECT MAX(year_on_year_growth)
    FROM yoy_growth AS y2
    WHERE y1.year = y2.year
)
ORDER BY year;

This gives me the desired result, although the query does seem somewhat slow.
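
If the view feels slow, one possible alternative is to compute the growth with a self-join inside a common table expression (CTE) rather than with a correlated subquery per row, and then join against the per-year maximum. The sketch below is an illustration under assumptions, not a measured benchmark: the toy rows, the index name idx_country_year, and the helper max_growth_per_year are hypothetical, and whether it is actually faster depends on the data and the SQLite version.

```python
import sqlite3

# Hypothetical toy data (country, population, year); not the Codecademy dataset.
ROWS = [
    ("A", 100.0, 2000), ("A", 110.0, 2001), ("A", 121.0, 2002),
    ("B", 200.0, 2000), ("B", 210.0, 2001), ("B", 252.0, 2002),
]

QUERY = """
WITH yoy AS (
  -- Pair each row with the same country's following year.
  SELECT p1.country, p1.year,
         100.0 * (p2.population - p1.population) / p1.population AS year_on_year_growth
  FROM population_years AS p1
  JOIN population_years AS p2
    ON p2.country = p1.country AND p2.year = p1.year + 1
),
best AS (
  -- Highest growth observed in each year.
  SELECT year, MAX(year_on_year_growth) AS max_growth
  FROM yoy
  GROUP BY year
)
SELECT yoy.country, yoy.year_on_year_growth, yoy.year
FROM yoy
JOIN best ON best.year = yoy.year AND best.max_growth = yoy.year_on_year_growth
ORDER BY yoy.year
"""


def max_growth_per_year(rows):
    """Return the (country, growth, year) row(s) with the highest growth for each year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE population_years (country TEXT, population REAL, year INTEGER)"
    )
    conn.executemany("INSERT INTO population_years VALUES (?, ?, ?)", rows)
    # Assumed helper index so the self-join can look up (country, year) directly.
    conn.execute("CREATE INDEX idx_country_year ON population_years (country, year)")
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for country, growth, year in max_growth_per_year(ROWS):
        print(year, country, round(growth, 2))
```

Here the growth is attributed to the earlier year of each pair, as in the question, so the toy data yields A for 2000 and B for 2001. Ties are kept, since every row equal to the year's maximum is returned.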