Not getting past the login page

Date: 2015-04-20 16:06:20

Tags: forms authentication login

I can't seem to get past the login page. Here is an abridged version of my login page (aciworldwide.com/support), taken from View Source in IE:



<html ...>

<head ...></head>

<body>...

  <form method="post" action="/support" id="mainform">
    <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
    <input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
    <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="<stuff>" />

    <script type="text/javascript">
      //<![CDATA[
      var theForm = document.forms['mainform'];
      if (!theForm) {
        theForm = document.mainform;
      }

      function __doPostBack(eventTarget, eventArgument) {
          if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
            theForm.__EVENTTARGET.value = eventTarget;
            theForm.__EVENTARGUMENT.value = eventArgument;
            theForm.submit();
          }
        }
        //]]>
    </script>
    ...
    <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="87894A7C" />
    <input type="hidden" name="__PREVIOUSPAGE" id="__PREVIOUSPAGE" value="<stuff>" />
    <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="<stuff>" />...
    <div id="maincontent_0_content_0_pnlLogin" onkeypress="javascript:return WebForm_FireDefaultButton(event, &#39;maincontent_0_content_0_butLogin&#39;)">

      <h2>HELP24 eSupport Portal</h2>
      <input type="hidden" name="startURL" value="" />
      <input type="hidden" name="loginURL" value="" />
      <input type="hidden" name="useSecure" value="true" />
      <input type="hidden" name="orgId" value="00D700000008gWM" />
      <input type="hidden" name="portalId" value="06070000000DZJN" />
      <input type="hidden" name="loginType" value="2" />
      <label for="username">Username:</label>
      <input type="text" id="username" name="username" maxlength="80" value="" class="captionblack" />
      <label for="password">Password:</label>
      <input type="password" id="password" name="password" maxlength="80" class="captionblack" />


      <input type="submit" name="maincontent_0$content_0$butLogin" value="Log in" onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;maincontent_0$content_0$butLogin&quot;, &quot;&quot;, false, &quot;&quot;, &quot;https://esupport.force.com/CommunityLogin&quot;, false, false))"
      id="maincontent_0_content_0_butLogin" />
    </div>
    ...
  </form>
</body>

</html>

I wrote this scraper to handle the login page:

import logging

import scrapy


class ACIspider(scrapy.Spider):
    name = "aci"
    allowed_domains = ["aciworldwide.com"]
    start_urls = [
        "http://aciworldwide.com/support.aspx"
    ]

    def parse(self, response):
        title = response.xpath('//title/text()').extract()
        print('Starting title is ' + title[0])
        # Fill in the credentials and submit the first form on the page,
        # simulating a click on its submit button.
        return scrapy.FormRequest.from_response(
            response,
            formdata={'username': 'myuser@my.com', 'password': 'mypass'},
            clickdata={'type': 'submit'},
            callback=self.after_login
        )

    def after_login(self, response):
        print('Hello next page')
        # Check that login succeeded before going on.
        if "authentication failed" in response.text:
            self.log("Login failed", level=logging.ERROR)
            return

        title = response.xpath('//title/text()').extract()
        print('Title is ' + title[0])

Here is an excerpt of my output:

[time] [aci] DEBUG: Redirecting (301) to https://www.aciworldwide.com/support.aspx from http://www.aciworldwide.com/support.aspx
[time] [aci] DEBUG: Crawled (200) https://www.aciworldwide.com/support.aspx (referer: None)
Starting title is Support
[time] [aci] DEBUG: Crawled (200) https://www.aciworldwide.com/support.aspx (referer: https://www.aciworldwide.com/support.aspx)
Hello next page
Title is Support

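Note that I print the page title in the initial callback and again in the callback that runs after the login request. It is the same page both times. What am I doing wrong, given that the response after logging in is not the next page after authentication?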
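
One detail in the markup above may be relevant: the button's onclick passes https://esupport.force.com/CommunityLogin as the fifth argument to WebForm_PostBackOptions, which in ASP.NET Web Forms is the action URL used for cross-page posting. A browser running that JavaScript would presumably post there, whereas a scraper that does not run JavaScript posts to the form's own action. Below is a minimal sketch, using only the standard library, of pulling that URL out of the onclick attribute. The function name postback_action_url and the naive comma split are my own assumptions for illustration, not part of Scrapy or ASP.NET, and I have not verified that posting to this URL actually logs in.

```python
import html
import re

# The onclick attribute of the "Log in" button, copied from the page source above.
ONCLICK = (
    'javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions('
    '&quot;maincontent_0$content_0$butLogin&quot;, &quot;&quot;, false, '
    '&quot;&quot;, &quot;https://esupport.force.com/CommunityLogin&quot;, '
    'false, false))'
)


def postback_action_url(onclick):
    """Return the fifth WebForm_PostBackOptions argument (the action URL), or None.

    Hypothetical helper: it splits the argument list on commas, so it would
    break if any argument itself contained a comma.
    """
    decoded = html.unescape(onclick)  # turn &quot; back into "
    match = re.search(r'WebForm_PostBackOptions\((.*?)\)', decoded)
    if match is None:
        return None
    args = [arg.strip().strip('"') for arg in match.group(1).split(',')]
    if len(args) > 4 and args[4]:
        return args[4]
    return None


if __name__ == "__main__":
    print(postback_action_url(ONCLICK))
```

If that reading is right, the request in parse would need to target this URL instead of /support. Recent Scrapy versions accept a url argument in FormRequest.from_response that overrides the form's action, and allowed_domains would also have to permit force.com, but I have not tested either.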

0 answers:

No answers