Optimizing a value-based search algorithm using LINQ

Time: 2013-11-26 03:53:54

Tags: c# sql algorithm linq

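I want to build a value-based search algorithm. That is, given a list of words, I want to search for entries in the database using those words. However, depending on which column/property a word matches, I want to change the value (score) of the returned result.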

Here is a naive algorithm that does this, but it is slow.

// search only active entries
var query = (from a in db.Jobs where a.StatusId == 7 select a);
List<SearchResult> baseResult = new List<SearchResult>();
foreach (var item in search)
{
    // if the company title is matched, results are worth 5 points
    var companyMatches = (from a in query where a.Company.Name.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 5 });

    // if the title is matched, results are worth 3 points
    var titleMatches = (from a in query where a.Title.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 3 });

    // if text within the body is matched, results are worth 2 points
    var bodyMatches = (from a in query where a.FullDescription.ToLower().Contains(item.ToLower()) select new SearchResult() { ID = a.ID, Value = 2 });

    // all results are then added
    baseResult = baseResult.Concat(companyMatches.Concat(titleMatches).Concat(bodyMatches)).ToList();
}

// the value gained for each entry is then summed and sorted from highest to lowest
List<SearchResult> result = baseResult.GroupBy(x => x.ID).Select(p => new SearchResult() { ID = p.First().ID, Value = p.Sum(i => i.Value) }).OrderByDescending(a => a.Value).ToList<SearchResult>();

// the query for the complete result set is built based on the sorted ID values of result
query = (from id in result join jbs in db.Jobs on id.ID equals jbs.ID select jbs).AsQueryable();

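I'm looking for ways to optimize this. I'm new to LINQ queries, so I was hoping to get some help. It would be great if I could write a LINQ query that does all of this in one go, instead of checking the company name, then the title, then the body text, combining them all, building a sorted list, and then hitting the database again to fetch the full records.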

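2 answers:

Answer 0: (score: 1)

I should have studied the problem more closely first. My previous answer optimized the wrong thing. The primary problem here is that the result list is traversed multiple times. We can change that: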

foreach (var a in query)
{
    foreach (var item in search)
    {
        // lower-case the search term once per comparison round
        var itemLower = item.ToLower();
        if (a.Company.Name.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 5 });
        if (a.Title.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 3 });
        if (a.FullDescription.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 2 });
    }
}

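After that, you have your base result and can continue with the rest of the processing.

That reduces it to a single query, rather than three queries per search term.

I'm not sure whether you want unique items in baseResult, or whether there's some reason you allow duplicates and then order the entries by the sum of their values. If you want unique items, you could make baseResult a Dictionary, with the ID as the key.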
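A minimal sketch of that Dictionary idea, under some assumptions: `JobRow` is a hypothetical in-memory stand-in for the `Job` entity (with the company name flattened into `CompanyName`), and `DictionaryScoring.Score` is an illustrative name, not something from the question. It keeps one running total per job ID, so no later GroupBy/Sum pass is needed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Job entity: only the fields used for scoring.
public class JobRow
{
    public int ID { get; set; }
    public string CompanyName { get; set; }
    public string Title { get; set; }
    public string FullDescription { get; set; }
}

public static class DictionaryScoring
{
    // Returns a map from job ID to total score; jobs with no match are absent.
    public static Dictionary<int, int> Score(IEnumerable<JobRow> jobs, IEnumerable<string> search)
    {
        var totals = new Dictionary<int, int>();
        foreach (var a in jobs)
        {
            foreach (var item in search)
            {
                var itemLower = item.ToLower();
                int val = 0;
                if ((a.CompanyName ?? "").ToLower().Contains(itemLower)) val += 5;
                if ((a.Title ?? "").ToLower().Contains(itemLower)) val += 3;
                if ((a.FullDescription ?? "").ToLower().Contains(itemLower)) val += 2;
                if (val == 0) continue;

                int current;
                totals.TryGetValue(a.ID, out current); // current is 0 when the key is missing
                totals[a.ID] = current + val;
            }
        }
        return totals;
    }
}
```

The totals can then be ordered with `totals.OrderByDescending(kv => kv.Value)` (with `System.Linq` imported) to get the ranked IDs.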

Edit after comment

You can reduce the number of items in the list by doing the following:

// accumulate the points for this job and search term, then add a single entry
int val = 0;
if (a.Company.Name.ToLower().Contains(itemLower))
    val += 5;
if (a.Title.ToLower().Contains(itemLower))
    val += 3;
if (a.FullDescription.ToLower().Contains(itemLower))
    val += 2;
if (val > 0)
{
    baseResult.Add(new SearchResult { ID = a.ID, Value = val });
}

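That doesn't eliminate duplicates entirely, though, because the company name can match one search term while the title matches another. But it will shrink the list somewhat.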
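If one entry per job is the goal, one possible variation is to move the accumulator outside the inner loop so that it sums over every search term before anything is added. This is only a sketch: `JobRow` is a hypothetical in-memory stand-in for the `Job` entity, `PerJobScoring` is an illustrative name, and `SearchResult` is assumed to have just the `ID` and `Value` members used in the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Job entity: only the fields used for scoring.
public class JobRow
{
    public int ID { get; set; }
    public string CompanyName { get; set; }
    public string Title { get; set; }
    public string FullDescription { get; set; }
}

// Assumed shape of the question's SearchResult type.
public class SearchResult
{
    public int ID { get; set; }
    public int Value { get; set; }
}

public static class PerJobScoring
{
    public static List<SearchResult> Score(IEnumerable<JobRow> jobs, IEnumerable<string> search)
    {
        // Lower-case the search terms once, not once per job.
        var terms = new List<string>();
        foreach (var item in search) terms.Add(item.ToLower());

        var baseResult = new List<SearchResult>();
        foreach (var a in jobs)
        {
            // Lower-case each field once per job, not once per term.
            string company = (a.CompanyName ?? "").ToLower();
            string title = (a.Title ?? "").ToLower();
            string body = (a.FullDescription ?? "").ToLower();

            int val = 0; // accumulates over every search term for this job
            foreach (var term in terms)
            {
                if (company.Contains(term)) val += 5;
                if (title.Contains(term)) val += 3;
                if (body.Contains(term)) val += 2;
            }
            if (val > 0)
                baseResult.Add(new SearchResult { ID = a.ID, Value = val });
        }
        return baseResult; // at most one entry per job, in input order
    }
}
```

With this shape the list only needs to be sorted by `Value`; the GroupBy/Sum step becomes unnecessary because each ID appears at most once.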

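Answer 1: (score: 0)

Thanks to Jim's answer and some tweaks on my side, I managed to cut the time it takes to complete the search by 80%.

Here is the final solution: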

// establish initial query
var nquery = (from a in db.Jobs where a.StatusId == 7 select a);

// instead of scoring all of the entities, first take only the possible candidates: jobs whose columns, taken together, contain every search term (search.All). This is the one and only query that will be run against the database
if (search.Count > 0)
{
    nquery = nquery.Where(job => search.All(y => (job.Title.ToLower() + " " + job.FullDescription.ToLower() + " " + job.Company.Name.ToLower() + " " + job.NormalLocation.ToLower() + " " + job.MainCategory.Name.ToLower() + " " + job.JobType.Type.ToLower()).Contains(y)));
}

// run the query and grab a list of baseJobs
List<Job> baseJobs = nquery.ToList<Job>();

// a list of SearchResult objects (these objects act as containers for job IDs and their search values)
List<SearchResult> baseResult = new List<SearchResult>();

// from here on Jim's algorithm comes into play: it assigns points depending on where the search term is found and adds them to a list of ID/value pairs
foreach (var a in baseJobs)
{
    foreach (var item in search)
    {
        var itemLower = item.ToLower();

        if (a.Company.Name.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 5 });
        if (a.Title.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 3 });
        if (a.FullDescription.ToLower().Contains(itemLower))
            baseResult.Add(new SearchResult { ID = a.ID, Value = 2 });
    }
}

// sum the points per job ID and sort from highest to lowest
List<SearchResult> result = baseResult.GroupBy(x => x.ID).Select(p => new SearchResult() { ID = p.First().ID, Value = p.Sum(i => i.Value) }).OrderByDescending(a => a.Value).ToList<SearchResult>();

// the ID/value pair list is then used to reorder the initial jobs
var NewQuery = (from id in result join jbs in baseJobs on id.ID equals jbs.ID select jbs).AsQueryable();
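
Once the candidates are in memory, the scoring, grouping and reordering steps could probably be folded into a single LINQ to Objects expression. The following is a sketch under assumptions, not the code measured above: `JobRow` is a hypothetical stand-in for `Job`, `LinqRanking`, `Has` and `Points` are illustrative names, and the case-insensitive test uses `IndexOf` with `StringComparison.OrdinalIgnoreCase` instead of `ToLower().Contains(...)`, which avoids allocating lowered copies but may behave differently for culture-specific casing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Job entity: only the fields used for scoring.
public class JobRow
{
    public int ID { get; set; }
    public string CompanyName { get; set; }
    public string Title { get; set; }
    public string FullDescription { get; set; }
}

public static class LinqRanking
{
    // Case-insensitive substring test (assumption: ordinal comparison is acceptable).
    private static bool Has(string text, string term)
    {
        return text != null && text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Points one search term earns for one job: 5 company, 3 title, 2 body.
    private static int Points(JobRow job, string term)
    {
        return (Has(job.CompanyName, term) ? 5 : 0)
             + (Has(job.Title, term) ? 3 : 0)
             + (Has(job.FullDescription, term) ? 2 : 0);
    }

    // Scores every job over all terms, drops non-matches, sorts by score descending.
    public static List<JobRow> Rank(IEnumerable<JobRow> baseJobs, IEnumerable<string> search)
    {
        var terms = search.ToList();
        return baseJobs
            .Select(job => new { Job = job, Score = terms.Sum(t => Points(job, t)) })
            .Where(x => x.Score > 0)
            .OrderByDescending(x => x.Score)
            .Select(x => x.Job)
            .ToList();
    }
}
```

Because each job is scored exactly once, the intermediate ID/value list and the final join are no longer needed; whether this is measurably faster than the version above would have to be checked on real data.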