I am trying to get all the restaurant names from the site https://www.yemeksepeti.com/en/istanbul/zeytinburnu-merkezefendi-mah-cevizlibag using HtmlAgilityPack:
Uri url = new Uri("https://www.yemeksepeti.com/en/istanbul/zeytinburnu-merkezefendi-mah-cevizlibag");
WebClient client = new WebClient();
string downloadString = client.DownloadString(url);
HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
document.LoadHtml(downloadString);
HtmlNodeCollection nodes = document.DocumentNode.SelectNodes("//a[@class='restaurantName withTooltip']");
foreach(var node in nodes) {
listBox1.Items.Add(node.InnerText);
}
This works fine!
However, what I really want to do is dig one level deeper and get the MainCuisineName:
<a class="restaurantName withTooltip" href="/en/meshur-merkez-kofte-zeytinburnu-merkezefendi-mah-istanbul" target="_parent" data-hasqtip="1">
<span data-tooltip="{"MainCuisineName":"Meatball""cc_genel.gif}">Meşhur Merkez Köfte, Zeytinburnu (Merkezefendi Mah.)</span>
</a>
How can I get the MainCuisineName, i.e. "Meatball", from the same URL? I tried:
HtmlNodeCollection nameNodes = doc.DocumentNode.SelectNodes("//*[@class='restaurantName withTooltip']/span='MainCuisineLabelName'");
foreach(var node in nameNodes) {
listBox1.Items.Add(node.InnerText);
}
But it obviously does not work.
Any suggestions?
Answer 0 (score: 0)
Here is what I came up with:
Uri filteredurl = new Uri("https://www.yemeksepeti.com/en/istanbul/zeytinburnu-merkezefendi-mah-cevizlibag#kt:b5ceacf5-9724-4751-a600-78d35cfcf72b,24ef27f9-32d5-44ff-993c-21e59b0f6f83");
HtmlNodeCollection nodes =
document.DocumentNode.SelectNodes("//a[@class='restaurantName withTooltip']");
foreach(var node in nodes) {
listBox1.Items.Add(node.InnerText);
}
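Then use the filter #kt:b5ceacf5-9724-4751-a600-78d35cfcf72b,24ef27f9-32d5-44ff-993c-21e59b0f6f83 and run the search again. Each category (meatball, fast food, etc.) has its own unique filter. Also have a look at this link and you will see what I mean.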
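As an alternative to filtering by category, the cuisine name could also be read straight from the data-tooltip attribute of the span shown in the question. The following is only a minimal sketch under a few assumptions: the attribute value is assumed to arrive HTML-encoded (with &quot; for the quotes), the sample markup is a simplified stand-in for the real page, and CuisineExtractor and ExtractCuisines are hypothetical names made up for illustration. A regular expression is used instead of a JSON parser because the tooltip text shown in the question is not valid JSON.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

static class CuisineExtractor
{
    // Matches "MainCuisineName":"<value>" inside the decoded tooltip text.
    private static readonly Regex CuisinePattern =
        new Regex("\"MainCuisineName\"\\s*:\\s*\"([^\"]*)\"");

    // Hypothetical helper: collects the cuisine name of every restaurant link found in the HTML.
    public static List<string> ExtractCuisines(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var cuisines = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection spans = document.DocumentNode.SelectNodes(
            "//a[@class='restaurantName withTooltip']/span[@data-tooltip]");
        if (spans == null)
        {
            return cuisines;
        }

        foreach (HtmlNode span in spans)
        {
            // Read the attribute, then turn entities such as &quot; back into characters.
            string raw = span.GetAttributeValue("data-tooltip", string.Empty);
            string decoded = HtmlEntity.DeEntitize(raw);

            Match match = CuisinePattern.Match(decoded);
            if (match.Success)
            {
                cuisines.Add(match.Groups[1].Value);
            }
        }
        return cuisines;
    }

    static void Main()
    {
        // Simplified sample markup; the &quot; encoding of the attribute is an assumption.
        string html =
            "<a class=\"restaurantName withTooltip\" href=\"#\">" +
            "<span data-tooltip=\"{&quot;MainCuisineName&quot;:&quot;Meatball&quot;}\">" +
            "Meşhur Merkez Köfte</span></a>";

        foreach (string cuisine in ExtractCuisines(html))
        {
            Console.WriteLine(cuisine);
        }
    }
}
```

In the WinForms code from the question, the string returned by client.DownloadString(url) could be passed to ExtractCuisines and each result added to listBox1.Items. If the live page fills in the tooltip with JavaScript after loading, the attribute will not be present in the downloaded HTML and this approach will return an empty list.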