How do I extract values from an HTML table into Python?

Time: 2018-01-01 08:32:38

Tags: html python-3.x html-parsing

I have the following HTML structure, but I don't know how to extract the values of text1 and text2 from a cell of the form <td><a href=".....">text1</a>text2</td>:

<tbody>
        <tr class="trBgGrey"><td nowrap="nowrap">1</td><td nowrap="nowrap">11</td><td class="tdAlignL font13 fontStyle" nowrap="nowrap"><a href="http://www.hkjc.com/english/racing/horse.asp?horseno=S205">SWEET BEAN</a>(S205)</td><td class="tdAlignL font13 fontStyle" nowrap="nowrap"><a href="http://www.hkjc.com/english/racing/jockeyprofile.asp?jockeycode=MOJ&amp;season=Current">J Moreira</a></td><td class="tdAlignL font13 fontStyle" nowrap="nowrap"><a href="http://www.hkjc.com/english/racing/trainerprofile.asp?trainercode=FC&amp;season=Current">C Fownes</a></td><td nowrap="nowrap">121</td><td nowrap="nowrap">1034</td><td nowrap="nowrap">7</td><td nowrap="nowrap">-</td><td align="center" nowrap="nowrap"><table width="80" border="0" cellSpacing="0" cellPadding="0"><tr><td width="16" align="center">8</td><td width="16" align="center">8</td><td width="16" align="center">8</td><td width="16" align="center">3</td><td width="16" align="center">1</td></tr></table></td><td nowrap="nowrap">1.51.13</td><td nowrap="nowrap">5.3</td></tr>
</tr><tr class="trBgGrey"><td nowrap="nowrap">3</td><td nowrap="nowrap">2</td><td class="tdAlignL font13 fontStyle" nowrap="nowrap"><a href="http://www.hkjc.com/english/racing/horse.asp?horseno=V311">CITY WINNER</a>(V311)</td><td class="tdAlignL font13 fontStyle" nowrap="nowrap"><a href="http://www.hkjc.com/english/racing/jockeyprofile.asp?jockeycode=RN&amp;season=Current">N Rawiller</a></td><td class="tdAlignL font13 fontStyle" nowrap="nowrap"><a href="http://www.hkjc.com/english/racing/trainerprofile.asp?trainercode=TYS&amp;season=Current">Y S Tsui</a></td><td nowrap="nowrap">132</td><td nowrap="nowrap">978</td><td nowrap="nowrap">6</td><td nowrap="nowrap">1</td><td align="center" nowrap="nowrap"><table width="80" border="0" cellSpacing="0" cellPadding="0"><tr><td width="16" align="center">9</td><td width="16" align="center">9</td><td width="16" align="center">9</td><td width="16" align="center">10</td><td width="16" align="center">3</td></tr></table></td><td nowrap="nowrap">1.51.30</td><td nowrap="nowrap">22</td></tr>
        </tbody>

I tried the following code, but it does not return the text values for those cells:

import requests
from bs4 import BeautifulSoup
import urllib.request

race_link = 'http://racing.hkjc.com/racing/info/meeting/Results/English/Local/20171227/HV'
sauce1 = urllib.request.urlopen(race_link).read()
soup1 = BeautifulSoup(sauce1, 'html.parser')

for link in soup1.find_all('tr', {'class': 'trBgGrey'}):
    for ilink in link.find_all('td'):
        print(ilink.string)

The output I get is:

1
11
None
J Moreira
C Fownes
121
1034
7
-
None
8
8
8
3
1
1.51.13
5.3
.....

My expected output is:

1
11
SWEET BEAN
(S205)
J Moreira
C Fownes
121
1034
7
-
None
8
8
8
3
1
1.51.13
5.3
......

I can get the values from an HTML structure like

<td>text1</td><td>text2</td>

but I don't know how to write code that gets the values from an HTML structure like

<td><a href="....">text1</a>text2</td>

How do I get the values from the second structure?

1 answer:

Answer 0: (score: 1)

Try something like:

public function userLogin(Request $request){
        Config::set('jwt.user', 'App\User'); 
        Config::set('auth.providers.users.model', \App\User::class);
        $credentials = $request->only('email', 'password');
        $token = null;
        try {
            if (!$token = JWTAuth::attempt($credentials)) {
                return response()->json([
                    'response' => 'error',
                    'message' => 'invalid_email_or_password',
                ]);
            }
        } catch (JWTAuthException $e) {
            return response()->json([
                'response' => 'error',
                'message' => 'failed_to_create_token',
            ]);
        }
        return response()->json([
            'response' => 'success',
            'result' => [
                'token' => $token,
                'message' => 'I am front user',
            ],
        ]);
    }

    public function adminLogin(Request $request){
        Config::set('jwt.user', 'App\Admin'); 
        Config::set('auth.providers.users.model', \App\Admin::class);
        $credentials = $request->only('email', 'password');
        $token = null;
        try {
            if (!$token = JWTAuth::attempt($credentials)) {
                return response()->json([
                    'response' => 'error',
                    'message' => 'invalid_email_or_password',
                ]);
            }
        } catch (JWTAuthException $e) {
            return response()->json([
                'response' => 'error',
                'message' => 'failed_to_create_token',
            ]);
        }
        return response()->json([
            'response' => 'success',
            'result' => [
                'token' => $token,
                'message' => 'I am Admin user',
            ],
        ]);
    }
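
For the BeautifulSoup question itself, here is a minimal sketch rather than a definitive fix. The `None` values appear because `.string` returns `None` whenever a tag has more than one child, which is the case for `<td><a>text1</a>text2</td>`. Iterating over `.stripped_strings` yields each text fragment separately. The sample row is a trimmed copy of the HTML in the question, and the helper name `cell_texts` is hypothetical. To stay self-contained the sketch parses an inline string instead of fetching the live page.

```python
from bs4 import BeautifulSoup

# Trimmed sample based on the first row in the question (most columns omitted).
html = (
    '<table><tbody>'
    '<tr class="trBgGrey"><td>1</td><td>11</td>'
    '<td><a href="http://www.hkjc.com/english/racing/horse.asp?horseno=S205">'
    'SWEET BEAN</a>(S205)</td>'
    '<td><a href="http://www.hkjc.com/english/racing/jockeyprofile.asp'
    '?jockeycode=MOJ&amp;season=Current">J Moreira</a></td>'
    '<td><table><tr><td>8</td><td>3</td></tr></table></td>'
    '<td>5.3</td></tr>'
    '</tbody></table>'
)

soup = BeautifulSoup(html, "html.parser")


def cell_texts(row):
    """Collect every non-empty text fragment from the row's own cells."""
    texts = []
    # recursive=False visits only the row's direct <td> children, so cells of
    # a nested table are reached once, through their outer <td>.
    for td in row.find_all("td", recursive=False):
        # .stripped_strings yields each text node with whitespace trimmed,
        # so "SWEET BEAN" and "(S205)" come out as two separate items.
        texts.extend(td.stripped_strings)
    return texts


rows = [cell_texts(tr) for tr in soup.find_all("tr", class_="trBgGrey")]
for row in rows:
    print(row)
# ['1', '11', 'SWEET BEAN', '(S205)', 'J Moreira', '8', '3', '5.3']
```

If one combined string per cell is preferable, `td.get_text(" ", strip=True)` joins the fragments with a space instead of returning them individually.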
