BeautifulSoup在同一个类别标签中找到多个属性

时间:2018-08-28 06:51:50

标签: beautifulsoup

我正在尝试编写python程序以从NIST的网站中提取CVE信息。这样,我发现我需要的所有数据都来自同一个类标记。但是我认为我的代码效率很低,因为我必须多次遍历HTML文件才能找到所有数据。如代码所示,我必须继续运行CVE_Information.find(attrs = {'data-testid':'something')

所以我的问题是有什么办法可以一次找到多个标签?

我在下面附了我的代码

import requests
from bs4 import BeautifulSoup

url = "https://nvd.nist.gov/vuln/detail/CVE-2018-8124"
page = requests.get(url)
html_status = page.status_code

if int(html_status) == 200:
    soup = BeautifulSoup(page.content, 'html.parser')

    # Search for elements by class
    CVE_Information = soup.find(class_ ="col-lg-9 col-md-7 col-sm-12")

    # Extract the Current Description
    Description = CVE_Information.find(attrs={'data-testid': 'vuln-description'}).get_text()

    # Extract the Base Score Information
    BS = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-base-score'}).get_text()

    # Extract the Base Score Severity Information
    BSS = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-base-score-severity'}).get_text()

    # Extract the Vector Information
    Vector = ''.join(CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-vector'}).get_text().split())

    # Extract Impact Score Information
    IS = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-impact-score'}).get_text().strip()

    # Extract Exploitability Score Information
    ES = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-exploitability-score'}).get_text().strip()

    # Extract Attack Vector Information
    AV = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-av'}).get_text().strip()

    # Extract Attack Complexity Information
    AC = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-ac'}).get_text().strip()

    # Extract Privileges Required Information
    PR = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-pr'}).get_text().strip()

    # Extract User Interaction Information
    UI = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-ui'}).get_text().strip()

    # Extract Scope Information
    S = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-s'}).get_text().strip()

    # Extract Confidentiality Information
    C = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-c'}).get_text().strip()

    # Extract Integrity Information
    I = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-i'}).get_text().strip()

    # Extract Availability Information
    A = CVE_Information.find(attrs={'data-testid': 'vuln-cvssv3-a'}).get_text().strip()

    print(Description)
    print(BS)
    print(BSS)
    print(Vector)
    print(IS)
    print(ES)
    print(AV)
    print(AC)
    print(PR)
    print(UI)
    print(S)
    print(C)
    print(I)
    print(A)

else:
    print("Error in connection or wrong URL")

1 个答案:

答案 0 :(得分:1)

您可以尝试:

在这里,我从类“ col-lg-9 col-md-7 col-sm-12”中获取'data-testid'属性值,并将其存储在一个数组中并使用循环,我正在打印所有结果:

import requests
from bs4 import BeautifulSoup

url = "https://nvd.nist.gov/vuln/detail/CVE-2018-8124"
page = requests.get(url)
html_status = page.status_code

if int(html_status) == 200:
    soup = BeautifulSoup(page.content, 'html.parser')
    CVE_Information = soup.find(class_ ="col-lg-9 col-md-7 col-sm-12")

    allEle=[elm['data-testid'] for elm in CVE_Information.find_all(attrs={"data-testid": True})] #storing 'data-testid' attribute values in the array

    for i in allEle:
      print(CVE_Information.find(attrs={'data-testid': i}).get_text().strip())
else:
    print("Error in connection or wrong URL")

以下是列表结果:

['vuln-description-title', 'vuln-description', 'vuln-description-source', 'vuln-description-last-modified', 'vuln-analysis-description-title', 'vuln-analysis-description', 'vuln-analysis-description-source', 'vuln-analysis-description-last-modified', 'vuln-cvss-container', 'vuln-cvssv3-score-container', 'vuln-cvssv3-base-score-link', 'vuln-cvssv3-base-score', 'vuln-cvssv3-base-score-severity', 'vuln-cvssv3-vector', 'vuln-cvssv3-impact-score', 'vuln-cvssv3-exploitability-score', 'vuln-cvssv3-metrics-container', 'vuln-cvssv3-av', 'vuln-cvssv3-ac', 'vuln-cvssv3-pr', 'vuln-cvssv3-ui', 'vuln-cvssv3-s', 'vuln-cvssv3-c', 'vuln-cvssv3-i', 'vuln-cvssv3-a', 'vuln-cvssv2-score-container', 'vuln-cvssv2-base-score-link', 'vuln-cvssv2-base-score', 'vuln-cvssv2-base-score-severity', 'vuln-cvssv2-vector', 'vuln-cvssv2-impact-subscore', 'vuln-cvssv2-exploitability-score', 'vuln-cvssv2-metrics-container', 'vuln-cvssv2-av', 'vuln-cvssv2-ac', 'vuln-cvssv2-au', 'vuln-cvssv3-c', 'vuln-cvssv2-i', 'vuln-cvssv2-a', 'vuln-cvssv2-additional', 'vuln-hyperlinks-table', 'vuln-hyperlinks-row-0', 'vuln-hyperlinks-link-0', 'vuln-hyperlinks-restype-0', 'vuln-hyperlinks-row-1', 'vuln-hyperlinks-link-1', 'vuln-hyperlinks-restype-1', 'vuln-hyperlinks-row-2', 'vuln-hyperlinks-link-2', 'vuln-hyperlinks-restype-2', 'vuln-technical-details-container', 'vuln-technical-details--1', 'vuln-technical-details-0-link', 'vuln-configurations-container', 'vuln-software-config-1', 'vuln-software-operator-1-0', 'vuln-software-cpe-1-0-0', 'vuln-software-cpe-1-0-1', 'vuln-software-cpe-1-0-2', 'vuln-software-cpe-1-0-3', 'vuln-software-cpe-1-0-4', 'vuln-software-cpe-1-0-5', 'vuln-software-config-2', 'vuln-software-operator-2-0', 'vuln-software-cpe-2-0-0', 'vuln-software-config-3', 'vuln-software-operator-3-0', 'vuln-software-cpe-3-0-0', 'vuln-software-config-4', 'vuln-software-operator-4-0', 'vuln-software-cpe-4-0-0', 'vuln-software-cpe-4-0-1', 'vuln-software-cpe-4-0-2', 'vuln-software-config-5', 'vuln-software-operator-5-0', 'vuln-software-cpe-5-0-0', 'vuln-software-cpe-5-0-1', 'vuln-software-config-6', 'vuln-software-operator-6-0', 'vuln-software-cpe-6-0-0', 'vuln-software-cpe-6-0-1', 'vuln-software-cpe-6-0-2', 'vuln-configurations-vulnerable-software-message', 'vuln-change-history-container', 'vuln-change-history-type-0', 'vuln-change-history-date-0', 'vuln-change-history-table', 'vuln-change-history-0', 'vuln-change-history-0-action', 'vuln-change-history-0-type', 'vuln-change-history-0-old', 'vuln-change-history-0-new', 'vuln-change-history-1', 'vuln-change-history-1-action', 'vuln-change-history-1-type', 'vuln-change-history-1-old', 'vuln-change-history-1-new', 'vuln-change-history-2', 'vuln-change-history-2-action', 'vuln-change-history-2-type', 'vuln-change-history-2-old', 'vuln-change-history-2-new', 'vuln-change-history-3', 'vuln-change-history-3-action', 'vuln-change-history-3-type', 'vuln-change-history-3-old', 'vuln-change-history-3-new', 'vuln-change-history-4', 'vuln-change-history-4-action', 'vuln-change-history-4-type', 'vuln-change-history-4-old', 'vuln-change-history-4-new', 'vuln-change-history-5', 'vuln-change-history-5-action', 'vuln-change-history-5-type', 'vuln-change-history-5-old', 'vuln-change-history-5-new', 'vuln-change-history-6', 'vuln-change-history-6-action', 'vuln-change-history-6-type', 'vuln-change-history-6-old', 'vuln-change-history-6-new', 'vuln-change-history-7', 'vuln-change-history-7-action', 'vuln-change-history-7-type', 'vuln-change-history-7-old', 'vuln-change-history-7-new', 'vuln-change-history-8', 'vuln-change-history-8-action', 'vuln-change-history-8-type', 'vuln-change-history-8-old', 'vuln-change-history-8-new', 'vuln-change-history-9', 'vuln-change-history-9-action', 'vuln-change-history-9-type', 'vuln-change-history-9-old', 'vuln-change-history-9-new', 'vuln-change-history-10', 'vuln-change-history-10-action', 'vuln-change-history-10-type', 'vuln-change-history-10-old', 'vuln-change-history-10-new', 'vuln-change-history-11', 'vuln-change-history-11-action', 'vuln-change-history-11-type', 'vuln-change-history-11-old', 'vuln-change-history-11-new', 'vuln-change-history-type-1', 'vuln-change-history-date-1', 'vuln-change-history-table', 'vuln-change-history-0', 'vuln-change-history-0-action', 'vuln-change-history-0-type', 'vuln-change-history-0-old', 'vuln-change-history-0-new', 'vuln-change-history-1', 'vuln-change-history-1-action', 'vuln-change-history-1-type', 'vuln-change-history-1-old', 'vuln-change-history-1-new']

希望这会对您有所帮助:)