How do I POST this HTML form using Python?

Time: 2016-10-07 19:26:03

Tags: python html post python-requests

This is the form I want to POST from Python:

  <FORM METHOD="POST"
    ACTION="http://www.speech.cs.cmu.edu/cgi-bin/tools/lmtool/run"
    ENCTYPE="multipart/form-data">
    <INPUT NAME="formtype" TYPE="HIDDEN" value="simple">

    <p><b>Upload a sentence corpus file</b>:<br> 
      <INPUT NAME="corpus" TYPE="FILE" SIZE=60 VALUE="empty">
    </p>

      <INPUT TYPE="submit" VALUE="COMPILE KNOWLEDGE BASE">
    </form>

I tried it with requests:

import requests

url = "http://www.speech.cs.cmu.edu/cgi-bin/tools/lmtool/run"
#url = "http://httpbin.org/post"
files = {'file': open('testfile', 'rb')}
payload = {'NAME': 'fromtype', 'TYPE': 'HIDDEN', 'value': 'simple',
'NAME': 'corpus', 'TYPE': 'FILE', 'SIZE': '60', 'VALUE': 'empty'}
r = requests.post(url, data=payload)

print r.text

and got this response:

Running: /home/darthtoker/programs/post.py (Fri Oct  7 15:19:21 2016)

<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
<head>
<title>LMTool Error</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
<pre>Something went wrong. This is all I know: formtype
</pre>
</body>
</html>Content-Type: text/html; charset=ISO-8859-1

<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
<head>
<title>LMTool Error</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
<pre>Something went wrong. This is all I know: corpus
</pre>
</body>
</html>Status: 302 Found
Location: http://www.speech.cs.cmu.edu/tools/product/1475867962_31194

I need to automate this form for a project I'm working on. Please help!

1 answer:

Answer 0: (score: 1)

There may be other errors as well, but the error message you received is clear enough:

The form expects:

<INPUT NAME="formtype" ...>

What you sent:

'NAME': 'fromtype', ...

Do you see the difference between formtype and fromtype?
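Beyond the typo, the payload in the question uses the HTML attribute names (NAME, TYPE, VALUE) as keys, while the server's error messages mention fields called formtype and corpus. As a rough illustration, assuming Python 3 (for urllib.parse) and assuming that a dict passed as data= ends up form-encoded much like urlencode would do it, this sketch shows what that payload collapses to:

```python
from urllib.parse import urlencode

# The payload from the question, copied verbatim. A dict literal keeps only
# the last value for a repeated key, so the second 'NAME' and 'TYPE' win.
payload = {'NAME': 'fromtype', 'TYPE': 'HIDDEN', 'value': 'simple',
           'NAME': 'corpus', 'TYPE': 'FILE', 'SIZE': '60', 'VALUE': 'empty'}

print(payload)

# Approximation of the form-encoded body built from this dict.
encoded = urlencode(payload)
print(encoded)

# Neither the misspelled 'fromtype' nor the expected 'formtype' is a key.
print('formtype' in payload)
```

No key named formtype or corpus survives, which would be consistent with the two "This is all I know" errors in the response.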

According to the Requests documentation (More complicated POST requests, POST a Multipart-Encoded File), your code should be:

import requests

url = "http://www.speech.cs.cmu.edu/cgi-bin/tools/lmtool/run"
#url = "http://httpbin.org/post"
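payload = {'formtype': 'simple'}

# Field names must match the NAME attributes of the form's INPUT elements.
# The with block closes the file once the upload is done.
with open('testfile', 'rb') as corpus:
    r = requests.post(url, data=payload, files={'corpus': corpus})

print(r.text)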
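To check what would be sent without contacting the server, the request can be built and inspected locally. This is only a sketch: the file name corpus.txt and its contents are hypothetical stand-ins for the real testfile, and nothing here confirms how the LMTool server responds.

```python
import requests

url = "http://www.speech.cs.cmu.edu/cgi-bin/tools/lmtool/run"

# Hypothetical in-memory corpus standing in for the real 'testfile'.
files = {'corpus': ('corpus.txt', b'hello world\n')}
payload = {'formtype': 'simple'}

# Build the request without sending it, so nothing touches the network.
prepared = requests.Request('POST', url, data=payload, files=files).prepare()

# The header should announce multipart/form-data with a boundary.
print(prepared.headers['Content-Type'])

# The body should contain one part per form field: formtype and corpus.
print(prepared.body.decode('utf-8', errors='replace'))
```

If both name="formtype" and name="corpus" show up in the printed body, the field names match what the form declares.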