How to remove \n from a string scraped from a website

Asked: 2019-05-25 17:53:00

Tags: python python-3.x

This is the text scraped from the website (itemHtml.text):

 dolar amerykański 1 USD 3.8436
 euro 1 EUR 4.2989
 funt szterling 1 GBP 4.8768

How can I remove the \n from this text? I tried this:

import requests
import urllib.request
import time
from bs4 import BeautifulSoup

url = "https://www.nbp.pl/home.aspx?f=/kursy/kursya.html"
response = requests.get(url)

soup = BeautifulSoup(response.text, "html.parser")
soup.findAll("tr")

for itemHtml in soup.select('.pad5 tr'):
    currency = ['amerykański', 'euro', 'szterling']
    if itemHtml.find('td'):
        if any (cur in itemHtml.text for cur in currency):
            dane_comma = itemHtml.text
            dane_dot = dane_comma.replace(',', '.')
            dane = dane_dot.replace('\n', ' ')
            print(dane)




Thanks for the help.

1 answer:

Answer 0: (score: 0)

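There are no newline characters (\n) left in that text.
What you are seeing is 3 separate print calls, and each print call ends its output with a newline, which gives you 3 lines of output.
For example: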

import requests
from bs4 import BeautifulSoup

url = "https://www.nbp.pl/home.aspx?f=/kursy/kursya.html"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

# Substrings that identify the rows of interest.
currency = ['amerykański', 'euro', 'szterling']
single_line = ""
cnt = 0

for itemHtml in soup.select('.pad5 tr'):
    # Skip rows without any <td> cell (for example header rows).
    if itemHtml.find('td'):
        if any(cur in itemHtml.text for cur in currency):
            dane = itemHtml.text
            dane = dane.replace(',', '.')  # decimal comma -> decimal point
            single_line += " " + dane      # accumulate every row in one string
            cnt += 1
            print("Print count", cnt, dane)  # one print call per row, so one output line per row
print(single_line.strip())

This gives:

Print count 1  dolar amerykański 1 USD 3.8436 
Print count 2  euro 1 EUR 4.2989 
Print count 3  funt szterling 1 GBP 4.8768 
dolar amerykański 1 USD 3.8436   euro 1 EUR 4.2989   funt szterling 1 GBP 4.8768

Nothing in this code tries to remove newline characters (single_line.strip() only removes leading and trailing whitespace).
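
If the goal is a single line with tidy spacing, one possible approach is to collapse every run of whitespace in each row and then join the rows yourself, rather than printing inside the loop. The sketch below is not part of the original answer. The names collapse_whitespace, rows_to_single_line, and sample_rows are hypothetical, and the sample strings are only an assumption about what itemHtml.text might look like, so that the example runs without a network request.

```python
def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace (spaces, tabs, newlines) with one space."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces.
    return " ".join(text.split())


def rows_to_single_line(rows, separator=" | "):
    """Convert decimal commas to points, clean each row, and join all rows."""
    return separator.join(
        collapse_whitespace(row.replace(",", ".")) for row in rows
    )


# Hypothetical stand-in for the itemHtml.text values of the three matching rows.
sample_rows = [
    "\ndolar amerykański\n1 USD\n3,8436\n",
    "\neuro\n1 EUR\n4,2989\n",
    "\nfunt szterling\n1 GBP\n4,8768\n",
]

single_line = rows_to_single_line(sample_rows)
print(single_line)  # a single print call, so a single line of output
```

In the scraping loop, this would mean appending each itemHtml.text to a list and calling the helper once after the loop. The " | " separator is an arbitrary choice; a plain space works equally well.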