Create a list of unique elements from an array

Time: 2016-12-16 23:19:08

Tags: python arrays

I have a one-dimensional array like this (with duplicate values):

Administration   Oral ,Aged ,Area Under Curve ,Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use ,Circadian Rhythm/physiology ,Cross-Over Studies ,Delayed-Action Preparations ,Dose-Response Relationship   Drug ,Drug Administration Schedule ,Female ,Humans ,Mandelic Acids/adverse effects/blood/*pharmacokinetics/therapeutic use ,Metabolic Clearance Rate ,Middle Aged ,Urinary Incontinence/drug therapy ,Xerostomia/chemically induced ,

Adult ,Anti-Ulcer Agents/metabolism ,Antihypertensive Agents/metabolism ,Benzhydryl Compounds/administration & dosage/blood/*pharmacology ,Caffeine/*metabolism ,Central Nervous System Stimulants/metabolism ,Cresols/administration & dosage/blood/*pharmacology ,Cross-Over Studies ,Cytochromes/*pharmacology ,Debrisoquin/*metabolism ,Drug Interactions ,Humans ,Male ,Muscarinic Antagonists/pharmacology ,Omeprazole/*metabolism ,*Phenylpropanolamine ,Polymorphism   Genetic ,Tolterodine Tartrate ,Urinary Bladder Diseases/drug therapy ,
...
...

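I need a list containing all the unique categories, where the categories are separated by commas. For example, Administration   Oral is one category.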

2 answers:

Answer 0: (score: 2)

  

"I need a list containing all the unique categories"

Take any list and apply set() to it. Note: a set does not preserve the original order.
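If the original order matters, one common alternative is dict.fromkeys(), which keeps the first occurrence of each item (dict insertion order is guaranteed from Python 3.7 on). A minimal sketch; the sample list `items` is made up for illustration and is not from the question:

```python
# Hypothetical sample data, not taken from the question.
items = ["Humans", "Aged", "Humans", "Female", "Aged"]

# set() removes duplicates but does not preserve the original order.
unique_unordered = set(items)

# dict.fromkeys() keeps the first occurrence of each item, in order.
unique_ordered = list(dict.fromkeys(items))

print(unique_ordered)  # ['Humans', 'Aged', 'Female']
```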

  

"where the categories are separated by commas"

So split the string with split(",").
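Note that a plain split leaves the whitespace around each item and, when the string ends with a comma, an empty final entry. A small sketch of the cleanup, using a shortened sample string `row` (an assumption for illustration):

```python
# Shortened, hypothetical sample in the same shape as the question's data.
row = "Aged ,Cross-Over Studies ,Humans ,Aged ,"

raw_parts = row.split(",")
# raw_parts == ['Aged ', 'Cross-Over Studies ', 'Humans ', 'Aged ', '']

# Strip each piece and drop empty strings before deduplicating.
cleaned = [part.strip() for part in raw_parts if part.strip()]
unique = set(cleaned)

print(sorted(unique))  # ['Aged', 'Cross-Over Studies', 'Humans']
```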

For example:

s = '''Administration   Oral ,Aged ,Area Under Curve ,Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use ,Circadian Rhythm/physiology ,Cross-Over Studies ,Delayed-Action Preparations ,Dose-Response Relationship   Drug'''.strip()

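# Strip each item before deduplicating, so "Aged " and "Aged" count as the same category
for x in sorted({part.strip() for part in s.split(",") if part.strip()}):
    print(x)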

Output:

Administration   Oral
Aged
Area Under Curve
Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use
Circadian Rhythm/physiology
Cross-Over Studies
Delayed-Action Preparations
Dose-Response Relationship   Drug
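The question describes an array holding several such strings. A sketch of the same idea applied to a whole list of rows, assuming each element is one comma-separated string; the function name `collect_categories` and the shortened sample `rows` are hypothetical:

```python
def collect_categories(rows):
    """Return the sorted unique categories found across all rows."""
    seen = set()
    for row in rows:
        for part in row.split(","):
            name = part.strip()
            if name:  # skip empty pieces left by trailing commas
                seen.add(name)
    return sorted(seen)


# Shortened, hypothetical rows in the same shape as the question's data.
rows = [
    "Aged ,Cross-Over Studies ,Female ,Humans ,",
    "Adult ,Cross-Over Studies ,Humans ,Male ,",
]
print(collect_categories(rows))
# ['Adult', 'Aged', 'Cross-Over Studies', 'Female', 'Humans', 'Male']
```

Sorting at the end gives a stable result regardless of the set's internal order.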

Answer 1: (score: 0)

Here is an example:

categories = """Administration   Oral ,Aged ,Area Under Curve ,Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use ,Circadian Rhythm/physiology ,Cross-Over Studies ,Delayed-Action Preparations ,Dose-Response Relationship   Drug ,Drug Administration Schedule ,Female ,Humans ,Mandelic Acids/adverse effects/blood/*pharmacokinetics/therapeutic use ,Metabolic Clearance Rate ,Middle Aged ,Urinary Incontinence/drug therapy ,Xerostomia/chemically induced ,Adult ,Anti-Ulcer Agents/metabolism ,Antihypertensive Agents/metabolism ,Benzhydryl Compounds/administration & dosage/blood/*pharmacology ,Caffeine/*metabolism ,Central Nervous System Stimulants/metabolism ,Cresols/administration & dosage/blood/*pharmacology ,Cross-Over Studies ,Cytochromes/*pharmacology ,Debrisoquin/*metabolism ,Drug Interactions ,Humans ,Male ,Muscarinic Antagonists/pharmacology ,Omeprazole/*metabolism ,*Phenylpropanolamine ,Polymorphism   Genetic ,Tolterodine Tartrate ,Urinary Bladder Diseases/drug therapy ,"""

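category_list = [x.strip() for x in categories.split(',')]
# filter(None, ...) drops the empty string left by the trailing comma.
# list() is needed on Python 3, where filter() returns an iterator.
# The order of the result is arbitrary because it comes from a set.
unique_categories = list(filter(None, set(category_list)))
>>> unique_categories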
['Urinary Incontinence/drug therapy', 'Debrisoquin/*metabolism', 'Cresols/administration & dosage/blood/*pharmacology', 'Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use', 'Urinary Bladder Diseases/drug therapy', '*Phenylpropanolamine', 'Drug Administration Schedule', 'Tolterodine Tartrate', 'Middle Aged', 'Dose-Response Relationship   Drug', 'Polymorphism   Genetic', 'Adult', 'Anti-Ulcer Agents/metabolism', 'Caffeine/*metabolism', 'Mandelic Acids/adverse effects/blood/*pharmacokinetics/therapeutic use', 'Area Under Curve', 'Metabolic Clearance Rate', 'Muscarinic Antagonists/pharmacology', 'Drug Interactions', 'Delayed-Action Preparations', 'Circadian Rhythm/physiology', 'Male', 'Xerostomia/chemically induced', 'Administration   Oral', 'Cross-Over Studies', 'Benzhydryl Compounds/administration & dosage/blood/*pharmacology', 'Cytochromes/*pharmacology', 'Humans', 'Central Nervous System Stimulants/metabolism', 'Omeprazole/*metabolism', 'Female', 'Antihypertensive Agents/metabolism', 'Aged']