Counting occurrences of multiple items per main item in a 2D array using Python

Asked: 2019-02-14 14:22:04

Tags: python list count

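I have a 2D list where each row contains a COMMON_NAME entry plus other information. For each COMMON_NAME, I want to find the total number of MYFR elements it is associated with.

For example, this is my list: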
[['SOME TEXT', 'COMMON_NAME1', None, 'CHOC', 'MYFR01'],
['SOME TEXT2', 'COMMON_NAME1', None, 'ABC',  'MYFR02'], 
['SOME TEXT3', 'COMMON_NAME1', None, 'XYZ',  'MYFR03'],
['SOME TEXT4', 'COMMON_NAME2', None, 'XYZ',  'STRAWBERRY'],
['SOME TEXT5', 'COMMON_NAME2', None, 'XYZ',  'MYFR01'],
['SOME TEXT6', 'COMMON_NAME2', None, 'XYZ',  'MYFR02'],
['SOME TEXT7', 'COMMON_NAME2', None, 'XYZ',  'APPLE']]

For each COMMON_NAME, I want to count the rows whose last element is in {'MYFR01', 'MYFR02', 'MYFR03'}.

In this example, I want to get COMMON_NAME1 = 3 and COMMON_NAME2 = 2.

Is there a simple way to achieve this?

Thanks

3 Answers:

Answer 0 (score: 3)

Here is a solution using collections.Counter:

>>> from collections import Counter
>>> data = [['SOME TEXT', 'COMMON_NAME1', None, 'CHOC', 'MYFR01'],
... ['SOME TEXT2', 'COMMON_NAME1', None, 'ABC',  'MYFR02'], 
... ['SOME TEXT3', 'COMMON_NAME1', None, 'XYZ',  'MYFR03'],
... ['SOME TEXT4', 'COMMON_NAME2', None, 'XYZ',  'STRAWBERRY'],
... ['SOME TEXT5', 'COMMON_NAME2', None, 'XYZ',  'MYFR01'],
... ['SOME TEXT6', 'COMMON_NAME2', None, 'XYZ',  'MYFR02'],
... ['SOME TEXT7', 'COMMON_NAME2', None, 'XYZ',  'APPLE']]

>>> c = Counter(i[1] for i in data if i[-1].startswith('MYFR'))
>>> c
Counter({'COMMON_NAME1': 3, 'COMMON_NAME2': 2})

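This assumes that your target values always start with MYFR. Reading your question more closely, you could also test membership in an explicit set of targets: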

>>> tgt = {'MYFR01', 'MYFR02', 'MYFR03'}
>>> c = Counter(i[1] for i in data if i[-1] in tgt)
>>> c
Counter({'COMMON_NAME1': 3, 'COMMON_NAME2': 2})

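An advantage of Counter (a subclass of dict) is that it accepts a generator expression. This means you do not need to materialize the filtered items into an intermediate data structure such as a list.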
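As a minimal sketch of that point, the filter can be wrapped in a small helper that feeds a generator expression straight into Counter. The function name count_by_name and its parameters are hypothetical and not part of the original answer; the column positions are assumed to match the question's layout.

```python
from collections import Counter


def count_by_name(rows, targets, name_idx=1, value_idx=-1):
    """Count rows per name whose value column is one of `targets`.

    Hypothetical helper: the default indices assume the question's row layout.
    """
    # Counter consumes the generator lazily, so no filtered list is built.
    return Counter(row[name_idx] for row in rows if row[value_idx] in targets)


data = [
    ['SOME TEXT', 'COMMON_NAME1', None, 'CHOC', 'MYFR01'],
    ['SOME TEXT2', 'COMMON_NAME1', None, 'ABC', 'MYFR02'],
    ['SOME TEXT3', 'COMMON_NAME1', None, 'XYZ', 'MYFR03'],
    ['SOME TEXT4', 'COMMON_NAME2', None, 'XYZ', 'STRAWBERRY'],
    ['SOME TEXT5', 'COMMON_NAME2', None, 'XYZ', 'MYFR01'],
    ['SOME TEXT6', 'COMMON_NAME2', None, 'XYZ', 'MYFR02'],
    ['SOME TEXT7', 'COMMON_NAME2', None, 'XYZ', 'APPLE'],
]

counts = count_by_name(data, {'MYFR01', 'MYFR02', 'MYFR03'})
print(counts)
```

Passing the targets as a set keeps each membership test cheap, which may matter if the list of rows is large.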

Answer 1 (score: 1)

You can also use pandas:

import pandas as pd

df = pd.DataFrame(data, columns=['text', 'cname', 'none', 'code', 'name'])

         text         cname  none  code        name
0   SOME TEXT  COMMON_NAME1  None  CHOC      MYFR01
1  SOME TEXT2  COMMON_NAME1  None   ABC      MYFR02
2  SOME TEXT3  COMMON_NAME1  None   XYZ      MYFR03
3  SOME TEXT4  COMMON_NAME2  None   XYZ  STRAWBERRY
4  SOME TEXT5  COMMON_NAME2  None   XYZ      MYFR01
5  SOME TEXT6  COMMON_NAME2  None   XYZ      MYFR02
6  SOME TEXT7  COMMON_NAME2  None   XYZ       APPLE


df.loc[df['name'].str.contains('MYFR'), ['name', 'cname']] \
  .groupby('cname', as_index=False) \
  .count()

          cname  name
0  COMMON_NAME1     3
1  COMMON_NAME2     2

We can also use itertools:

from itertools import groupby
from operator import itemgetter

second = itemgetter(1)
last = itemgetter(-1)

for k, v in groupby(data, key=second):
    print(k, len([last(i) for i in v if last(i).startswith('MYFR')]))

COMMON_NAME1 3
COMMON_NAME2 2

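The only caveat here is that the data must first be sorted by the grouping key (here the COMMON_NAME column), because groupby only groups consecutive rows with equal keys.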
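To illustrate the caveat, here is a small sketch that sorts by the key before grouping. The shuffled rows list is made up for this example and is not the question's data; the idea is only that sorted and groupby must use the same key function.

```python
from itertools import groupby
from operator import itemgetter

second = itemgetter(1)
last = itemgetter(-1)

# Hypothetical unsorted input: rows with the same name are not adjacent.
rows = [
    ['SOME TEXT5', 'COMMON_NAME2', None, 'XYZ', 'MYFR01'],
    ['SOME TEXT', 'COMMON_NAME1', None, 'CHOC', 'MYFR01'],
    ['SOME TEXT6', 'COMMON_NAME2', None, 'XYZ', 'MYFR02'],
    ['SOME TEXT2', 'COMMON_NAME1', None, 'ABC', 'MYFR02'],
    ['SOME TEXT4', 'COMMON_NAME2', None, 'XYZ', 'STRAWBERRY'],
]

# Sort by the same key that groupby uses, then count matching rows per group.
result = {
    name: sum(1 for row in group if last(row).startswith('MYFR'))
    for name, group in groupby(sorted(rows, key=second), key=second)
}
print(result)
```

Without the sorted call, groupby would emit COMMON_NAME2 more than once for this input, and later groups would overwrite earlier ones in the dictionary.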

Answer 2 (score: 0)

We'll keep a dictionary mapping COMMON_NAME values to sets of MYFR values, and then measure the sizes of those sets at the end. This determines the number of unique MYFR elements for each COMMON_NAME.
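The code for this answer did not survive in this copy, so the following is only a sketch of the approach described, not the answerer's original code. The names targets, seen, and unique_counts are assumptions introduced for illustration.

```python
from collections import defaultdict

data = [
    ['SOME TEXT', 'COMMON_NAME1', None, 'CHOC', 'MYFR01'],
    ['SOME TEXT2', 'COMMON_NAME1', None, 'ABC', 'MYFR02'],
    ['SOME TEXT3', 'COMMON_NAME1', None, 'XYZ', 'MYFR03'],
    ['SOME TEXT4', 'COMMON_NAME2', None, 'XYZ', 'STRAWBERRY'],
    ['SOME TEXT5', 'COMMON_NAME2', None, 'XYZ', 'MYFR01'],
    ['SOME TEXT6', 'COMMON_NAME2', None, 'XYZ', 'MYFR02'],
    ['SOME TEXT7', 'COMMON_NAME2', None, 'XYZ', 'APPLE'],
]

# Assumed target values, taken from the question.
targets = {'MYFR01', 'MYFR02', 'MYFR03'}

# Map each COMMON_NAME to the set of distinct MYFR values seen for it.
seen = defaultdict(set)
for row in data:
    name, value = row[1], row[-1]
    if value in targets:
        seen[name].add(value)

# The size of each set is the number of unique MYFR values per name.
unique_counts = {name: len(values) for name, values in seen.items()}
print(unique_counts)
```

Because sets discard duplicates, this counts distinct MYFR values per name. It would give a smaller number than the Counter approach if the same MYFR value appeared in several rows for one COMMON_NAME.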
