Question

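I have a table with the following structure, which represents a file system.
Every item (either a file or a folder) has a unique id. If an item is a category (folder), it contains other items.
level indicates the directory depth.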

|id |parent_id|is_category|level|
|:-:|:-------:|:---------:|:---:|
|0  |   -1    |    true   |  0  |
|1  |    0    |    true   |  1  |
|2  |    0    |    true   |  1  |
|3  |    1    |    true   |  2  |
|4  |    2    |   false   |  2  |
|5  |    3    |    true   |  3  |
|6  |    5    |   false   |  4  |
|7  |    5    |   false   |  4  |
|8  |    5    |    true   |  4  |
|9  |    5    |   false   |  4  |

Task:
Starting from the folder with id == 1, extract all folders in its subtree whose level is <= 3.
The resulting ids should be [1,3,5].

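My current implementation uses recursive queries: for the example above, my program first fetches id == 1 and then looks up all items with is_category == true and level <= 3.

This doesn't feel like an efficient approach. Any suggestions would be greatly appreciated.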
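To make the cost of that approach concrete, here is a minimal sketch of what such application-side recursion might look like. It is only an assumption about the implementation described above: the table name `items`, the helper names, and the use of SQLite through Python's `sqlite3` module are all hypothetical. The point is that it issues one query per visited folder.

```python
import sqlite3

# Sample rows from the question: (id, parent_id, is_category, level)
ROWS = [
    (0, -1, 1, 0), (1, 0, 1, 1), (2, 0, 1, 1), (3, 1, 1, 2), (4, 2, 0, 2),
    (5, 3, 1, 3), (6, 5, 0, 4), (7, 5, 0, 4), (8, 5, 1, 4), (9, 5, 0, 4),
]


def make_db():
    """Build an in-memory database holding the sample table (hypothetical name: items)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table items (id integer primary key, parent_id integer, "
        "is_category integer, level integer)"
    )
    conn.executemany("insert into items values (?, ?, ?, ?)", ROWS)
    return conn


def subfolders_naive(conn, start_id, max_level):
    """Collect folder ids under start_id, issuing one query per visited folder."""
    found = []
    queries = 0

    def visit(folder_id):
        nonlocal queries
        found.append(folder_id)
        queries += 1
        children = conn.execute(
            "select id from items "
            "where parent_id = ? and is_category = 1 and level <= ?",
            (folder_id, max_level),
        ).fetchall()
        for (child_id,) in children:
            visit(child_id)

    visit(start_id)
    return sorted(found), queries


ids, query_count = subfolders_naive(make_db(), 1, 3)
print(ids, query_count)
```

The number of round trips grows with the number of folders visited, which is the part a single set-based query can avoid.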

Answer 1

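You didn't mention which database you are using, so I'll assume PostgreSQL.

You can retrieve the rows you need with a single query that uses a recursive CTE. Recursive CTEs are implemented by many database engines, such as Oracle, DB2, PostgreSQL, SQL Server, MariaDB, MySQL, HyperSQL, H2, Teradata, etc.

The query should look something like this: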

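with recursive x as (
  -- anchor member: the starting folder
  select * from t where id = 1
  union all
  -- recursive member: children of the rows found so far, capped at level 3
  select t.*
  from x
  join t on t.parent_id = x.id and t.level <= 3
)
select id from x;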

For the record, the data script I used for testing is:

create table t (
  id int,
  parent_id int,
  level int
);

insert into t (id, parent_id, level) values (0, -1, 0);
insert into t (id, parent_id, level) values (1, 0, 1);
insert into t (id, parent_id, level) values (2, 0, 1);
insert into t (id, parent_id, level) values (3, 1, 2);
insert into t (id, parent_id, level) values (4, 2, 2);
insert into t (id, parent_id, level) values (5, 3, 3);
insert into t (id, parent_id, level) values (6, 5, 4);
insert into t (id, parent_id, level) values (7, 5, 4);
insert into t (id, parent_id, level) values (8, 5, 4);
insert into t (id, parent_id, level) values (9, 5, 4);
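The test table above has no is_category column, so the query returns [1,3,5] only because no file happens to sit at level <= 3 under folder 1. As a hedged sketch, the same recursive CTE can be tried with that column added and a folder-only filter. This runs against SQLite through Python's `sqlite3` module rather than PostgreSQL; the extra column, the filter, and the `folders_under` helper are assumptions, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (id int, parent_id int, is_category int, level int);
insert into t values
  (0, -1, 1, 0), (1, 0, 1, 1), (2, 0, 1, 1), (3, 1, 1, 2), (4, 2, 0, 2),
  (5, 3, 1, 3), (6, 5, 0, 4), (7, 5, 0, 4), (8, 5, 1, 4), (9, 5, 0, 4);
""")

QUERY = """
with recursive x as (
  select * from t where id = :start
  union all
  select t.*
  from x
  join t on t.parent_id = x.id
        and t.level <= :max_level
        and t.is_category = 1   -- assumption: keep folders only
)
select id from x order by id
"""


def folders_under(start, max_level):
    """Run the recursive CTE once and return the matching ids."""
    rows = conn.execute(QUERY, {"start": start, "max_level": max_level})
    return [row[0] for row in rows]


print(folders_under(1, 3))
print(folders_under(0, 2))  # file 4 sits at level 2 but is filtered out
```

The whole subtree is resolved in one statement, so the application makes a single round trip regardless of depth.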

Answer 2

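As others have said, a recursive CTE is a fast and usually efficient way to retrieve the data you're looking for. If you want to avoid recursive CTEs, because they do not scale indefinitely and can therefore behave erratically in some use cases, you can also take a more direct approach and implement the recursive search with a WHILE loop. Note that this is not more efficient than a recursive CTE, but it does give you more control over what happens during the recursion. In my example I'm using Transact-SQL.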
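One failure mode that may be behind this concern (an assumption, since the answer does not spell it out) is a recursive CTE that keeps running on cyclic data until it hits an engine limit; SQL Server, for example, stops at 100 recursion levels by default. A minimal sketch of a guard, using SQLite through Python's `sqlite3` module with hypothetical corrupted data and a hypothetical `MAX_DEPTH` cap, carries a depth counter through the recursion:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (id int, parent_id int);
-- hypothetical corrupted data: 1 -> 2 -> 3 -> 1 forms a cycle
insert into t values (1, 3), (2, 1), (3, 2);
""")

MAX_DEPTH = 5  # hypothetical cap on recursion depth

rows = conn.execute(
    """
    with recursive x(id, depth) as (
      select id, 0 from t where id = ?
      union all
      select t.id, x.depth + 1
      from x
      join t on t.parent_id = x.id
      where x.depth < ?          -- stop descending once the cap is reached
    )
    select id, depth from x order by depth
    """,
    (1, MAX_DEPTH),
).fetchall()

print(rows)
```

Without the depth condition this query would never terminate on the cyclic rows; with it, the recursion ends after `MAX_DEPTH` steps.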

First, the setup code, similar to the one provided by @The Impaler:

drop table if exists
    dbo.folder_tree;

create table dbo.folder_tree 
    (
    id int not null constraint [PK_folder_tree] primary key clustered,
    parent_id int not null,
    fs_level int not null,
    is_category bit not null constraint [DF_folder_tree_is_category] default(0),
    constraint [UQ_folder_tree_parent_id] unique(parent_id, id)
    );

insert into dbo.folder_tree 
    (id, parent_id, fs_level, is_category)
values 
    (0, -1, 0, 1),  --|0  |   -1    |    true   |  0  |
    (1, 0, 1, 1),   --|1  |    0    |    true   |  1  |
    (2, 0, 1, 1),   --|2  |    0    |    true   |  1  |
    (3, 1, 2, 1),   --|3  |    1    |    true   |  2  |
    (4, 2, 2, 0),   --|4  |    2    |   false   |  2  |
    (5, 3, 3, 1),   --|5  |    3    |    true   |  3  |
    (6, 5, 4, 0),   --|6  |    5    |   false   |  4  |
    (7, 5, 4, 0),   --|7  |    5    |   false   |  4  |
    (8, 5, 4, 1),   --|8  |    5    |    true   |  4  |
    (9, 5, 4, 0);   --|9  |    5    |   false   |  4  |

Then the code that implements the recursive search of the table with a WHILE loop:

drop function if exists
    dbo.folder_traverse;
go

create function dbo.folder_traverse
    (
    @start_id int,
    @max_level int = null
    )
returns @result table
    (
    id int not null primary key,
    parent_id int not null,
    fs_level int not null,
    is_category bit not null
    )
as
    begin
        insert into 
            @result
        select
            id,
            parent_id,
            fs_level,
            is_category
        from
            dbo.folder_tree
        where
            id = @start_id;

        while @@ROWCOUNT > 0
            begin
                insert into 
                    @result
                select
                    f.id,
                    f.parent_id,
                    f.fs_level,
                    f.is_category
                from
                    @result r
                    inner join dbo.folder_tree f on
                        r.id = f.parent_id
                where
                    f.is_category = 1 and
                    (
                        @max_level is null or
                        f.fs_level <= @max_level
                    )
                except
                select
                    id,
                    parent_id,
                    fs_level,
                    is_category
                from
                    @result;
            end;

        return;
    end;
go

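Finally, the only reasons I would suggest this approach are if you have a very large number of recursive members, or if you need to add logging or some other processing between iterations. In most cases this approach is slower and adds complexity to the code, but it is an alternative to a recursive CTE and it meets your requirements.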
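The same level-by-level idea can also be sketched outside the database. The following is a rough Python analogue of the WHILE loop, not a translation of the T-SQL function: it uses SQLite through `sqlite3`, the `traverse` helper and its `log` callback are hypothetical, and it assumes the start id exists. It shows where per-iteration logging could hook in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table folder_tree (
  id integer primary key, parent_id integer not null,
  fs_level integer not null, is_category integer not null default 0
);
insert into folder_tree (id, parent_id, fs_level, is_category) values
  (0, -1, 0, 1), (1, 0, 1, 1), (2, 0, 1, 1), (3, 1, 2, 1), (4, 2, 2, 0),
  (5, 3, 3, 1), (6, 5, 4, 0), (7, 5, 4, 0), (8, 5, 4, 1), (9, 5, 4, 0);
""")


def traverse(conn, start_id, max_level=None, log=print):
    """Expand the folder subtree one level per iteration, logging each round."""
    result = {start_id}      # assumption: start_id exists in the table
    frontier = [start_id]    # folders whose children have not been fetched yet
    round_no = 0
    while frontier:
        marks = ",".join("?" * len(frontier))
        sql = (
            "select id from folder_tree "
            f"where parent_id in ({marks}) and is_category = 1"
        )
        params = list(frontier)
        if max_level is not None:
            sql += " and fs_level <= ?"
            params.append(max_level)
        found = [row[0] for row in conn.execute(sql, params)]
        # keep only ids not seen before, mirroring the EXCEPT in the T-SQL loop
        frontier = [i for i in found if i not in result]
        result.update(frontier)
        round_no += 1
        log(f"round {round_no}: added {sorted(frontier)}")
    return sorted(result)


print(traverse(conn, 1, 3))
```

This issues one query per tree level rather than one per folder, which keeps the number of round trips bounded by the depth of the subtree.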
