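I've built a simple OLAP cube with two date dimensions, one low-cardinality string dimension, and a single measure that just counts the rows in the fact table.
I'm trying to work out the best way to filter on the date dimensions. I have a query that works and returns the correct results, but it seems very inefficient. It looks like this: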
SELECT
    [measures].[user_count] on 0,
    [gender].members on 1
FROM
    profiles
WHERE
    NonEmptyCrossJoin(
        { [birthday].[1960].[1].[1] : [birthday].[1989].[12].[31] },
        { [created_date].[2013].[1].[1] : [created_date].[2013].[12].[31] }
    )
Mondrian executes more than 400 SQL queries like this one:
select
"dates"."day" as "c0"
from
"samtest"."dates" as "dates"
where
("dates"."month" = 8 and "dates"."year" = 1989)
group by
"dates"."day"
order by
CASE WHEN "dates"."day" IS NULL THEN 1 ELSE 0 END, "dates"."day" ASC
followed by about 60 queries that look like this:
select
"dates"."year" as "c0",
"dates"."month" as "c1",
"dates"."day" as "c2",
"dates_1"."year" as "c3",
"dates_1"."month" as "c4",
"dates_1"."day" as "c5",
"profiles"."gender" as "c6",
count("profiles"."profile_id") as "m0"
from
"samtest"."dates" as "dates",
"samtest"."profiles" as "profiles",
"samtest"."dates" as "dates_1"
where
"profiles"."created_date" = "dates"."date"
and
"dates"."year" = 2013
and
"profiles"."birthday" = "dates_1"."date"
and
"dates_1"."year" in (1978, 1979, 1980, 1981, 1982, 1983)
and
"profiles"."gender" = 'Unspecified'
group by
"dates"."year",
"dates"."month",
"dates"."day",
"dates_1"."year",
"dates_1"."month",
"dates_1"."day",
"profiles"."gender"
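The first time I run it, it takes about 12 minutes to complete. Most of that time appears to be spent on the SQL queries, but even when I run it again after everything has been cached, Mondrian still takes more than 3 minutes to compute the result. That seems odd to me, because I can get the same result directly from the SQL database in under a second.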
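For comparison, a direct query against the database has roughly the following shape. This is a sketch rather than the exact statement: it assumes the table and column names that appear in the generated SQL above, that `birthday` and `created_date` are plain date columns, and that every fact row has a matching row in `dates` (so skipping the joins does not change the counts).

```sql
-- Sketch only: names are taken from the SQL Mondrian generated above.
-- Assumes birthday and created_date are DATE columns on the fact table.
select
    p.gender,
    count(p.profile_id) as user_count
from samtest.profiles as p
where p.birthday between '1960-01-01' and '1989-12-31'      -- mirrors the [birthday] range in the MDX slicer
  and p.created_date between '2013-01-01' and '2013-12-31'  -- mirrors the [created_date] range
group by p.gender
order by p.gender;
```

Unlike the MDX query, this filters on the two date ranges directly and groups only by gender, so no per-day rows are returned.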
Am I doing something wrong? Is this a bug? Or is this just not a good use case for OLAP?
I'm using Mondrian 3.6.1. The SQL database is Redshift. Let me know if details about my configuration or schema would be useful.