Finding an href tag dynamically

Time: 2019-02-18 16:17:37

Tags: python beautifulsoup

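I am trying to extract "Information Technology" as the output of my Beautiful Soup search. I have not been able to figure it out, because "sector" in the URL is a dynamic value that changes with the ticker.

Can anyone suggest how I can extract this information?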

<a href="http://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&amp;sector=45">Information Technology</a>

My code:

import requests
from bs4 import BeautifulSoup

url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'

html = requests.get(url).text
detail_tags_sector = BeautifulSoup(html, 'lxml')
detail_tags_sector.find_all('a')

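3 answers:

Answer 0: (score: 1)

To get the text from an anchor element, you need to read the .text attribute of each anchor element.
So your code would change to: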

import requests
from bs4 import BeautifulSoup

url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'
contents = []

html = requests.get(url).text
# 'html.parser' is the parser bundled with Python's standard library
detail_tags_sector = BeautifulSoup(html, 'html.parser')
for anchor in detail_tags_sector.find_all('a'):
    contents.append(anchor.text)
print(contents)
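
As a small offline illustration of reading anchor text, here is a sketch that runs against an inline HTML string instead of the live page. The sample markup (apart from the anchor quoted in the question), the second "Help" link, and the helper name anchor_texts are assumptions made for this example. It uses get_text(strip=True), which returns the same text as .text but with surrounding whitespace removed.

```python
from bs4 import BeautifulSoup

# Assumption: a tiny stand-in for the downloaded page; the real page has many more links.
SAMPLE_HTML = """
<div>
  <a href="http://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&amp;sector=45">Information Technology</a>
  <a href="/help">  Help  </a>
</div>
"""


def anchor_texts(html):
    """Return the whitespace-stripped text of every <a> element in the markup."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text(strip=True) for a in soup.find_all("a")]


print(anchor_texts(SAMPLE_HTML))
```

Note that this still returns the text of every link, so on the real page the sector name would be one entry among many.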

Answer 1: (score: 0)

You can use any of the following options.

import requests
from lxml.html.soupparser import fromstring
url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'
html = requests.get(url).text
soup=fromstring(html)
findSearch = soup.xpath('//a[contains(text(), "Information Technology")]/text()')
print(findSearch[0])

from bs4 import BeautifulSoup
import requests
url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'

html = requests.get(url).text
# The 'lxml' parser requires the lxml package to be installed
detail_tags_sector = BeautifulSoup(html, 'lxml')
for link in detail_tags_sector.find_all('a'):
    print(link.text)

OR

from bs4 import BeautifulSoup    
import requests
url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'
html = requests.get(url).text
soup = BeautifulSoup(html, 'html.parser')
for link in soup.find_all('a'):
    print(link.text)

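Please let me know if this helps.

Answer 2: (score: 0)

The problem with these answers is that they collect the text of every link on the page, and there are a lot of links. If you only want the Field3 string, just add: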

Field1
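
The original snippet for this answer did not survive, so what follows is not the answerer's code. It is one possible sketch of narrowing the search to a single link, assuming the sector link can be recognised by its href (the sectors_in_market.jhtml path and a numeric sector= parameter, as in the anchor quoted in the question). The helper name find_sector and the sample markup around that anchor are hypothetical.

```python
import re
from urllib.parse import urlparse, parse_qs

from bs4 import BeautifulSoup

# Assumption: inline stand-in for the page; only the first anchor comes from the question.
SAMPLE_HTML = """
<a href="http://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&amp;sector=45">Information Technology</a>
<a href="/help">Help</a>
"""

# Assumption: the sector link always points at sectors_in_market.jhtml with a numeric sector id.
SECTOR_HREF = re.compile(r"sectors_in_market\.jhtml.*[?&]sector=\d+")


def find_sector(html):
    """Return (sector name, sector id) for the first matching link, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", href=SECTOR_HREF)  # regex is matched against the href value
    if link is None:
        return None
    query = parse_qs(urlparse(link["href"]).query)
    return link.get_text(strip=True), query["sector"][0]


print(find_sector(SAMPLE_HTML))
```

Matching on the href pattern rather than on the link text means the sector name does not have to be known in advance, which is the situation described in the question. Whether the live page keeps this URL shape is something to verify before relying on it.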
