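Matching many rows against many rows in SQL query and find percentile match?

Time: 2014-03-30 21:34:05

Tags: mysql sql database codeigniter active-record-query

Currently I am building a job board. Employers can post jobs and receive applications. Employers can set many skill requirements that jobseekers must match. Jobseekers can also add the many skills they have.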

What I would like to find out is how many matching skill_string values there are in employer_requirements for each jobseeker's jobseeker_skills, in order to show a percentage match in the view. I am looking to find matches based on the skill_string that exists in both the jobseeker_skills table and the employer_requirements table.

Below is the database arrangement for each of the 3 tables:

applications:

id | job_string | jobseeker_string | employer_string | application_string | date_created

employer_requirements:

id | skill_name | requirement_level | skill_string | job_string | employer_string | date_created

jobseeker_skills:

id | skill_name | level | jobseeker_string | skill_string | string | date_created

I have the following code, which gets all applications based on the '$job_str' that is passed in. The code below is only a simple get, and I am unsure where to go from here.

    function skills_match($job_str){
        $this->db->select('*')
                 ->from('applications')
                 ->where('job_string', $job_str)
                 ->join('users', 'users.string = applications.jobseeker_string', 'left');
        $applications = $this->db->get();
        return $applications;
    }

Applications table - sample data:

+----+------------------+------------------+------------------+
| id | job_string       | jobseeker_string | employer_string  |
+----+------------------+------------------+------------------+
| 1  | vs71FVTBb12DdGlf | uMIsuDJaBuDmo8iq | biQxyPekn6iayIgm |
| 2  | vs71FVTBb12DdGlf | x7phHsVnwJ1K1yHy | biQxyPekn6iayIgm |
| 3  | vs71FVTBb12DdGlf | Fm1TIJLxz6Xg6QPk | biQxyPekn6iayIgm |
+----+------------------+------------------+------------------+

Employer requirements - sample data:

+----+------------------+------------+------------------+------------------+
| id | job_string       | skill_name | skill_string     | employer_string  |
+----+------------------+------------+------------------+------------------+
| 1  | vs71FVTBb12DdGlf | PHP        | 9Y8XeCWqJXzkZ5dD | biQxyPekn6iayIgm |
| 2  | vs71FVTBb12DdGlf | JavaScript | O6es19t5CgcRHvct | biQxyPekn6iayIgm |
| 3  | vs71FVTBb12DdGlf | HTML       | wx4evsXC62BWiN7p | biQxyPekn6iayIgm |
| 4  | vs71FVTBb12DdGlf | Python     | jx15rH1vrGLmsVmq | biQxyPekn6iayIgm |
| 5  | vs71FVTBb12DdGlf | SQL        | EksP7mEip0Hs4zKd | biQxyPekn6iayIgm |
| 6  | vs71FVTBb12DdGlf | LESS       | fj40m4hkiuDGtbzr | biQxyPekn6iayIgm |
+----+------------------+------------+------------------+------------------+

Jobseeker skills - sample data:

+----+------------------+------------+------------------+
| id | jobseeker_string | skill_name | skill_string     |
+----+------------------+------------+------------------+
| 1  | uMIsuDJaBuDmo8iq | PHP        | 9Y8XeCWqJXzkZ5dD |
| 2  | uMIsuDJaBuDmo8iq | Backbone   | 4VIiAxZoL1VbPnTa |
| 3  | x7phHsVnwJ1K1yHy | LESS       | fj40m4hkiuDGtbzr |
| 2  | x7phHsVnwJ1K1yHy | Ruby       | gTZg4fwYuzMMFcBw |
| 3  | x7phHsVnwJ1K1yHy | SQL        | EksP7mEip0Hs4zKd |
| 1  | Fm1TIJLxz6Xg6QPk | PHP        | 9Y8XeCWqJXzkZ5dD |
| 2  | Fm1TIJLxz6Xg6QPk | Python     | jx15rH1vrGLmsVmq |
| 3  | Fm1TIJLxz6Xg6QPk | HTML       | wx4evsXC62BWiN7p |
| 3  | Fm1TIJLxz6Xg6QPk | Git        | aR9B9ns1sHlGrzFw |
+----+------------------+------------+------------------+

Based on the above, this should output the percentage or number of matching skills.

Applications - below is the number/percentage of matching skills per application:

uMIsuDJaBuDmo8iq - 1/6 (16.666%)
x7phHsVnwJ1K1yHy - 2/6 (33.333%)
Fm1TIJLxz6Xg6QPk - 3/6 (50%)
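
To make the matching rule explicit, here is a minimal sketch in Python that recomputes the expected figures by set intersection on skill_string. The data is copied from the sample tables in this question; the names `required`, `skills` and `match_counts` are illustrative only and not part of the schema.

```python
# skill_string values required by job vs71FVTBb12DdGlf (from the sample data)
required = {
    "9Y8XeCWqJXzkZ5dD", "O6es19t5CgcRHvct", "wx4evsXC62BWiN7p",
    "jx15rH1vrGLmsVmq", "EksP7mEip0Hs4zKd", "fj40m4hkiuDGtbzr",
}

# skill_string values held by each jobseeker (from the sample data)
skills = {
    "uMIsuDJaBuDmo8iq": {"9Y8XeCWqJXzkZ5dD", "4VIiAxZoL1VbPnTa"},
    "x7phHsVnwJ1K1yHy": {"fj40m4hkiuDGtbzr", "gTZg4fwYuzMMFcBw",
                         "EksP7mEip0Hs4zKd"},
    "Fm1TIJLxz6Xg6QPk": {"9Y8XeCWqJXzkZ5dD", "jx15rH1vrGLmsVmq",
                         "wx4evsXC62BWiN7p", "aR9B9ns1sHlGrzFw"},
}


def match_counts(required, skills):
    """Return {jobseeker_string: (matched, total_required)}."""
    return {
        seeker: (len(required & owned), len(required))
        for seeker, owned in skills.items()
    }


for seeker, (matched, total) in match_counts(required, skills).items():
    print(f"{seeker} - {matched}/{total} ({100 * matched / total:.1f}%)")
```

This yields 1/6, 2/6 and 3/6 for the three applicants, which is the result the SQL query should reproduce.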

If you have any questions, please fire away. Thanks for your help.

2 answers:

Answer 0: (score: 2)

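First of all, these are two questions:

  1. Which applicant best fits my job
  2. Which employer best fits my skills

These 2 questions might look the same, but they are not.

First question: I want all applicants that match my requirements, ordered by the number of my requirements they meet. First I get all the matches: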

    -- one row per (requirement, jobseeker skill) pair sharing a skill_string
    select *
    from employer_requirements r
    inner join jobseeker_skills j
    on r.skill_string = j.skill_string
    where r.job_string = 'vs71FVTBb12DdGlf';
    

Then I group them, count them, and so on:

    select
      j.jobseeker_string,
      -- matched requirements / all requirements of the job (a fraction between 0 and 1)
      count(1) / (select count(1) from employer_requirements where job_string = 'vs71FVTBb12DdGlf') as match_percentage
    from employer_requirements r
    inner join jobseeker_skills j
    on r.skill_string = j.skill_string
    where r.job_string = 'vs71FVTBb12DdGlf'
    group by j.jobseeker_string;
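
As a quick check of the grouped query, here is a hedged sketch that runs it against the question's sample data. It assumes Python's built-in sqlite3 as a stand-in for MySQL, the table names employer_requirements and jobseeker_skills from the question, and only the columns the query needs. Because SQLite divides integers as integers, the sketch multiplies by 100.0 before dividing.

```python
import sqlite3

JOB = "vs71FVTBb12DdGlf"
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employer_requirements (job_string TEXT, skill_name TEXT, skill_string TEXT);
CREATE TABLE jobseeker_skills (jobseeker_string TEXT, skill_name TEXT, skill_string TEXT);
""")
conn.executemany("INSERT INTO employer_requirements VALUES (?, ?, ?)", [
    (JOB, "PHP", "9Y8XeCWqJXzkZ5dD"), (JOB, "JavaScript", "O6es19t5CgcRHvct"),
    (JOB, "HTML", "wx4evsXC62BWiN7p"), (JOB, "Python", "jx15rH1vrGLmsVmq"),
    (JOB, "SQL", "EksP7mEip0Hs4zKd"), (JOB, "LESS", "fj40m4hkiuDGtbzr"),
])
conn.executemany("INSERT INTO jobseeker_skills VALUES (?, ?, ?)", [
    ("uMIsuDJaBuDmo8iq", "PHP", "9Y8XeCWqJXzkZ5dD"),
    ("uMIsuDJaBuDmo8iq", "Backbone", "4VIiAxZoL1VbPnTa"),
    ("x7phHsVnwJ1K1yHy", "LESS", "fj40m4hkiuDGtbzr"),
    ("x7phHsVnwJ1K1yHy", "Ruby", "gTZg4fwYuzMMFcBw"),
    ("x7phHsVnwJ1K1yHy", "SQL", "EksP7mEip0Hs4zKd"),
    ("Fm1TIJLxz6Xg6QPk", "PHP", "9Y8XeCWqJXzkZ5dD"),
    ("Fm1TIJLxz6Xg6QPk", "Python", "jx15rH1vrGLmsVmq"),
    ("Fm1TIJLxz6Xg6QPk", "HTML", "wx4evsXC62BWiN7p"),
    ("Fm1TIJLxz6Xg6QPk", "Git", "aR9B9ns1sHlGrzFw"),
])

# Matched requirements per jobseeker, as a count and as a percentage of the
# job's requirements, best match first.
QUERY = """
SELECT j.jobseeker_string,
       COUNT(*) AS matched,
       ROUND(100.0 * COUNT(*) /
             (SELECT COUNT(*) FROM employer_requirements
              WHERE job_string = :job), 1) AS match_percentage
FROM employer_requirements r
INNER JOIN jobseeker_skills j ON r.skill_string = j.skill_string
WHERE r.job_string = :job
GROUP BY j.jobseeker_string
ORDER BY match_percentage DESC
"""
rows = conn.execute(QUERY, {"job": JOB}).fetchall()
for seeker, matched, pct in rows:
    print(seeker, matched, pct)
```

Note that the inner join drops jobseekers with no matching skill at all, so they do not appear with 0%; that may or may not be what the view needs.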
    

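Second question: this is slightly harder, because an applicant may want to know what percentage of a job's required skills he/she meets, but also what share of his/her own skills the job uses (this may apply to the first question as well). The query looks like this: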

    select
      r.job_string,
      -- share of my own skills that this job asks for
      count(1) / (select count(1) from jobseeker_skills where jobseeker_string = 'uMIsuDJaBuDmo8iq') as my_match,
      -- share of the job's requirements that I meet
      count(1) / (select count(1) from employer_requirements r2 where r2.job_string = r.job_string) as job_match
    from employer_requirements r
    inner join jobseeker_skills j
    on r.skill_string = j.skill_string
    where j.jobseeker_string = 'uMIsuDJaBuDmo8iq'
    group by r.job_string;
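
The jobseeker-side query can be exercised the same way. The sketch below again assumes sqlite3 in place of MySQL and the question's table names. It takes the denominator of `my_match` from jobseeker_skills (the seeker's own skill count) and the denominator of `job_match` from employer_requirements; that is an interpretation of the answer's intent, not something the answer states outright.

```python
import sqlite3

JOB = "vs71FVTBb12DdGlf"
SEEKER = "uMIsuDJaBuDmo8iq"
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employer_requirements (job_string TEXT, skill_string TEXT);
CREATE TABLE jobseeker_skills (jobseeker_string TEXT, skill_string TEXT);
""")
conn.executemany("INSERT INTO employer_requirements VALUES (?, ?)", [
    (JOB, "9Y8XeCWqJXzkZ5dD"), (JOB, "O6es19t5CgcRHvct"),
    (JOB, "wx4evsXC62BWiN7p"), (JOB, "jx15rH1vrGLmsVmq"),
    (JOB, "EksP7mEip0Hs4zKd"), (JOB, "fj40m4hkiuDGtbzr"),
])
conn.executemany("INSERT INTO jobseeker_skills VALUES (?, ?)", [
    (SEEKER, "9Y8XeCWqJXzkZ5dD"),  # PHP, required by the job
    (SEEKER, "4VIiAxZoL1VbPnTa"),  # Backbone, not required
])

# my_match: matched skills / all of the seeker's skills
# job_match: matched skills / all of the job's requirements
QUERY = """
SELECT r.job_string,
       1.0 * COUNT(*) / (SELECT COUNT(*) FROM jobseeker_skills
                         WHERE jobseeker_string = :seeker) AS my_match,
       1.0 * COUNT(*) / (SELECT COUNT(*) FROM employer_requirements r2
                         WHERE r2.job_string = r.job_string) AS job_match
FROM employer_requirements r
INNER JOIN jobseeker_skills j ON r.skill_string = j.skill_string
WHERE j.jobseeker_string = :seeker
GROUP BY r.job_string
"""
rows = conn.execute(QUERY, {"seeker": SEEKER}).fetchall()
for job, my_match, job_match in rows:
    print(job, my_match, job_match)
```

For this jobseeker the single job comes back with half of the seeker's skills used (1 of 2) and one sixth of the job's requirements met (1 of 6).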
    

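Please note: the queries are written off the top of my head and may contain some typos.

If you want to order the results, you can do it like this:

    select * from
      ([[insert the above query here]]) t
    order by job_match desc;  -- or any other column of the inner query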
    

Combined:

    select
      r.job_string,
      j.jobseeker_string,
      count(1) / (select count(1) from jobseeker_skills j2 where j2.jobseeker_string = j.jobseeker_string) as seeker_match,
      count(1) / (select count(1) from employer_requirements r2 where r2.job_string = r.job_string) as job_match
    from employer_requirements r
    inner join jobseeker_skills j
    on r.skill_string = j.skill_string
    group by r.job_string, j.jobseeker_string;
    

Restricted to applications:

    select * from
      (select
        r.job_string,
        j.jobseeker_string,
        count(1) / (select count(1) from jobseeker_skills j2 where j2.jobseeker_string = j.jobseeker_string) as seeker_match,
        count(1) / (select count(1) from employer_requirements r2 where r2.job_string = r.job_string) as job_match
      from employer_requirements r
      inner join jobseeker_skills j
      on r.skill_string = j.skill_string
      group by r.job_string, j.jobseeker_string) t
    inner join applications a
    on t.job_string = a.job_string and t.jobseeker_string = a.jobseeker_string;
    

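Answer 1: (score: 1)

MySQL gives you a nice way to do this by combining grouping, IF, and AVG. Have you ever played with AVG(IF(...))?

Suppose you have two tables with a few columns.

Something like this (sorry, sqlfiddle is down):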

first_table:

id  category    element
1   number  two
2   number  three
3   number  four
4   number  five
5   number  eleven
6   fruit   banana
7   fruit   pineapple
8   fruit   pear
9   fruit   strawberry

second_table:

id  category    element
1   number  one
2   number  five
3   number  six
4   number  seven
5   number  three
6   fruit   apple
7   fruit   banana

1) You want to know how many elements of the first table can be found in the second table:

    select count(*) as total
    from first_table t1 
    join second_table t2 
    on t1.element = t2.element

will return

 total
 3

2) Using a left join, you can get more useful information:

    select 
        count(*) as total, 
        count(t2.element) as number_matching
    from first_table t1
    left join second_table t2
    on t1.element = t2.element

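This gives you the total number of elements and the number of matching elements. Divide one by the other and you have the percentage.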

  total    number_matching
  9        3
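
The two counts and the division can be reproduced with a small script. This is a sketch that assumes Python's built-in sqlite3 instead of MySQL, loaded with the first_table and second_table rows listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_table (id INTEGER, category TEXT, element TEXT);
CREATE TABLE second_table (id INTEGER, category TEXT, element TEXT);
INSERT INTO first_table VALUES
  (1, 'number', 'two'), (2, 'number', 'three'), (3, 'number', 'four'),
  (4, 'number', 'five'), (5, 'number', 'eleven'), (6, 'fruit', 'banana'),
  (7, 'fruit', 'pineapple'), (8, 'fruit', 'pear'), (9, 'fruit', 'strawberry');
INSERT INTO second_table VALUES
  (1, 'number', 'one'), (2, 'number', 'five'), (3, 'number', 'six'),
  (4, 'number', 'seven'), (5, 'number', 'three'), (6, 'fruit', 'apple'),
  (7, 'fruit', 'banana');
""")

# COUNT(*) counts every first_table row; COUNT(t2.element) skips the NULLs
# that the LEFT JOIN produces for rows without a partner in second_table.
total, number_matching = conn.execute("""
    SELECT COUNT(*), COUNT(t2.element)
    FROM first_table t1
    LEFT JOIN second_table t2 ON t1.element = t2.element
""").fetchone()

percent = round(100 * number_matching / total, 1)
print(total, number_matching, percent)
```

One caveat: if second_table held the same element twice, the join would duplicate the matching row and inflate both counts, so this relies on elements being unique in second_table.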

3) Using AVG and IF, we can get a proportion between 0 and 1 directly:

    select
        AVG(IF(t2.element IS NULL, 0, 1)) as proportion_matching
    from first_table t1
    left join second_table t2
    on t1.element = t2.element

returns

proportion_matching
0.33333

4) Format it as a percentage, at your convenience:

    select
        ROUND(AVG(IF(t2.element IS NULL, 0, 1)) * 100, 1) as percent_matching
    from first_table t1
    left join second_table t2
    on t1.element = t2.element

and you get

percent_matching
33.3

5) You can actually split the results by category:

    select
        t1.category,
        ROUND(AVG(IF(t2.element IS NULL, 0, 1)) * 100, 1) as percent_matching
    from first_table t1
    left join second_table t2
    on t1.element = t2.element
    group by t1.category

Keep in mind that this is really "the percentage of elements of table 1 that can be found in table 2":

category  percent_matching
fruit     25.0
number    40.0
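
The per-category figures can be checked with the sketch below. It assumes sqlite3 rather than MySQL, and it swaps MySQL's IF(...) for the standard CASE expression, which both engines accept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_table (id INTEGER, category TEXT, element TEXT);
CREATE TABLE second_table (id INTEGER, category TEXT, element TEXT);
INSERT INTO first_table VALUES
  (1, 'number', 'two'), (2, 'number', 'three'), (3, 'number', 'four'),
  (4, 'number', 'five'), (5, 'number', 'eleven'), (6, 'fruit', 'banana'),
  (7, 'fruit', 'pineapple'), (8, 'fruit', 'pear'), (9, 'fruit', 'strawberry');
INSERT INTO second_table VALUES
  (1, 'number', 'one'), (2, 'number', 'five'), (3, 'number', 'six'),
  (4, 'number', 'seven'), (5, 'number', 'three'), (6, 'fruit', 'apple'),
  (7, 'fruit', 'banana');
""")

# Each first_table row contributes 1 if it found a partner and 0 otherwise;
# averaging those flags per category gives the matching proportion.
rows = conn.execute("""
    SELECT t1.category,
           ROUND(AVG(CASE WHEN t2.element IS NULL THEN 0 ELSE 1 END) * 100, 1)
               AS percent_matching
    FROM first_table t1
    LEFT JOIN second_table t2 ON t1.element = t2.element
    GROUP BY t1.category
    ORDER BY t1.category
""").fetchall()

for category, percent_matching in rows:
    print(category, percent_matching)
```

Averaging a 0/1 flag is the same as dividing the matching count by the total, so no separate subquery for the denominator is needed.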

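6) Applying this to the applications and skill sets... You could look at things per jobseeker application, like this:

    SELECT
        a.job_string,
        a.jobseeker_string,
        ROUND(AVG(IF(js.skill_string IS NULL, 0, 1)) * 100, 1) as percent_matching
    FROM applications a
    JOIN employer_requirements er
    ON er.job_string = a.job_string
    -- a requirement counts as met only if this applicant holds the same skill_string
    LEFT JOIN jobseeker_skills js
    ON js.jobseeker_string = a.jobseeker_string
    AND js.skill_string = er.skill_string
    GROUP BY a.job_string, a.jobseeker_string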
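
Here is a hedged sketch of this step run against the question's sample data. It assumes sqlite3 in place of MySQL, the table names applications, employer_requirements and jobseeker_skills from the question, and a LEFT JOIN that matches on both jobseeker_string and skill_string. CASE replaces MySQL's IF for portability.

```python
import sqlite3

JOB = "vs71FVTBb12DdGlf"
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applications (job_string TEXT, jobseeker_string TEXT);
CREATE TABLE employer_requirements (job_string TEXT, skill_string TEXT);
CREATE TABLE jobseeker_skills (jobseeker_string TEXT, skill_string TEXT);
""")
conn.executemany("INSERT INTO applications VALUES (?, ?)", [
    (JOB, "uMIsuDJaBuDmo8iq"), (JOB, "x7phHsVnwJ1K1yHy"),
    (JOB, "Fm1TIJLxz6Xg6QPk"),
])
conn.executemany("INSERT INTO employer_requirements VALUES (?, ?)", [
    (JOB, "9Y8XeCWqJXzkZ5dD"), (JOB, "O6es19t5CgcRHvct"),
    (JOB, "wx4evsXC62BWiN7p"), (JOB, "jx15rH1vrGLmsVmq"),
    (JOB, "EksP7mEip0Hs4zKd"), (JOB, "fj40m4hkiuDGtbzr"),
])
conn.executemany("INSERT INTO jobseeker_skills VALUES (?, ?)", [
    ("uMIsuDJaBuDmo8iq", "9Y8XeCWqJXzkZ5dD"),
    ("uMIsuDJaBuDmo8iq", "4VIiAxZoL1VbPnTa"),
    ("x7phHsVnwJ1K1yHy", "fj40m4hkiuDGtbzr"),
    ("x7phHsVnwJ1K1yHy", "gTZg4fwYuzMMFcBw"),
    ("x7phHsVnwJ1K1yHy", "EksP7mEip0Hs4zKd"),
    ("Fm1TIJLxz6Xg6QPk", "9Y8XeCWqJXzkZ5dD"),
    ("Fm1TIJLxz6Xg6QPk", "jx15rH1vrGLmsVmq"),
    ("Fm1TIJLxz6Xg6QPk", "wx4evsXC62BWiN7p"),
    ("Fm1TIJLxz6Xg6QPk", "aR9B9ns1sHlGrzFw"),
])

# One row per (application, requirement); the LEFT JOIN leaves js.* NULL
# when the applicant lacks that skill, and AVG turns the flags into a share.
QUERY = """
SELECT a.jobseeker_string,
       ROUND(AVG(CASE WHEN js.skill_string IS NULL THEN 0 ELSE 1 END) * 100, 1)
           AS percent_matching
FROM applications a
JOIN employer_requirements er ON er.job_string = a.job_string
LEFT JOIN jobseeker_skills js
       ON js.jobseeker_string = a.jobseeker_string
      AND js.skill_string = er.skill_string
WHERE a.job_string = :job
GROUP BY a.jobseeker_string
ORDER BY percent_matching DESC
"""
rows = conn.execute(QUERY, {"job": JOB}).fetchall()
for seeker, pct in rows:
    print(seeker, pct)
```

Because the query starts from applications and uses a LEFT JOIN, an applicant with no matching skill would still be listed, at 0.0.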

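7) Of course, you can filter on job_string with a WHERE clause as needed. In fact, the join with the applications table above makes sure you only get results for jobs the user has actually applied for. But if you already have a job_string, you can get away with:

    SELECT
        er.job_string,
        ROUND(AVG(IF(js.skill_string IS NULL, 0, 1)) * 100, 1) as percent_matching
    FROM        employer_requirements er
    LEFT JOIN   jobseeker_skills js
    ON          js.skill_string = er.skill_string
    AND         js.jobseeker_string = ?   -- first parameter: the jobseeker to score
    WHERE       er.job_string = ?         -- second parameter: the job you already have
    GROUP BY    er.job_string

8) I leave it to you to turn this into an Active Record query (that's not the part I know best ;)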