How to download files from multiple domains using PowerShell

Time: 2016-03-24 15:14:58

Tags: powershell login webrequest

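I have a script that logs in to a site and navigates to a page containing a bunch of PDF files that I want to download. A few of the PDF URLs come from a different domain than the rest. When I try to download them with Start-BitsTransfer (or Invoke-WebRequest), I get the following error: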

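Start-BitsTransfer : The operation being requested was not performed because the user has not logged on to the network. The specified service does not exist. (Exception from HRESULT: 0x800704DD)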

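This message makes me think the browser session is no longer logged in by the end of the script. Any leads would be greatly appreciated. Thanks.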
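One way to see which links are affected is to compare each URL's host with the site's own host. The following is a minimal sketch, assuming the links have already been collected as plain strings; `Get-ExternalUrl` and the sample PDF paths are hypothetical names used only for illustration.

```powershell
# Hypothetical helper: returns the URLs whose host differs from the site's host.
function Get-ExternalUrl {
    param(
        [string[]]$Url,      # absolute URLs to inspect
        [string]$SiteHost    # host the script logged in to
    )

    # [uri] parses each string; .Host is the DNS name without scheme or path
    $Url | Where-Object { ([uri]$_).Host -ne $SiteHost }
}

# Sample data (made-up paths, for illustration only)
$sampleUrls = @(
    'https://www.franklinamerican.com/docs/a.pdf',
    'https://cdn.example.com/docs/b.pdf'
)

$external = @(Get-ExternalUrl -Url $sampleUrls -SiteHost 'www.franklinamerican.com')
```

Listing the off-domain links separately may help confirm whether the failures are limited to those URLs or affect every download.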

# Create an instance of internet explorer
$ie = New-Object -com InternetExplorer.Application 

# Open IE and make it visible
$ie.visible = $true

# Navigate to home page
$ie.navigate("https://www.franklinamerican.com/ext/general?npage=home")

# Wait for page to finish loading
while ($ie.Busy -eq $true){Start-Sleep -Seconds 3}

# Populate username and password fields, and then click submit button
$ie.Document.getElementById("userName").value = "######"   #$WebUID
$ie.Document.getElementById("brokerPassword").value = "######"   #$WebPWD
$ie.Document.getElementsByName("Submit").item(0).Click()

# Wait 3 seconds
Start-Sleep -Seconds 3

# Go to download page
$ie.navigate("https://www.franklinamerican.com/ext/correspondent?npage=resourceCenter") 

# Wait 3 seconds
Start-Sleep -Seconds 3

# Create an object of iFrame so we can search the page for the element
$frame = $ie.Document.parentWindow.frames[0].document.getElementById("myframe")

# Creates array, $hrefValues
$hrefValues = @()

# Finds links whose HREF contains '.pdf' (dot escaped so it matches literally)
# and stores their HREF and InnerText in $hrefValues
$hrefValues = $frame.contentDocument.getElementsByTagName("a") | ? {$_.href -match '\.pdf'} | foreach{
                    $_ | Select HREF, InnerText
            }

# Loops through array, cleans href & text, builds file name, and downloads file to path
ForEach ($i in $hrefValues){

     # Remove whitespace from text
     $cleanText = $i.innerText.Trim()

     # Remove all characters after '('
     if ($i.innerText.Contains('(')){
            $cleanText = $cleanText.Substring(0,$cleanText.IndexOf('('))
        }

     # Removes illegal characters from file name
     $cleanText = $cleanText.Replace("/", "-")
     $cleanText = $cleanText.Replace("\", "-")
     $cleanText = $cleanText.Replace(":", "")
     $cleanText = $cleanText.Replace("*", "")
     $cleanText = $cleanText.Replace("?", "")
     $cleanText = $cleanText.Replace("<", "")
     $cleanText = $cleanText.Replace(">", "")
     $cleanText = $cleanText.Replace("|", "")

     # Builds file name
     $fileName = $baseFileName + '_' + $cleanText

     # Sets URL to href value
     $currentURL = $i.Href

     # Downloads the file; the quoted path is expanded as a single string
     Invoke-WebRequest -Uri $currentURL -OutFile "$historicalPath\$fileName.pdf"
}
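The file-name cleanup inside the loop could also be collected into one function. The following is a sketch under stated assumptions: `Get-CleanFileName` is a hypothetical name, and the double quote is removed in addition to the characters the script already handles.

```powershell
# Hypothetical helper: turns link text into a string that is safe to use as a Windows file name.
function Get-CleanFileName {
    param(
        [Parameter(Mandatory)]
        [string]$LinkText
    )

    # Remove surrounding whitespace
    $name = $LinkText.Trim()

    # Drop everything from the first '(' onward
    $parenIndex = $name.IndexOf('(')
    if ($parenIndex -ge 0) {
        $name = $name.Substring(0, $parenIndex)
    }

    # Slashes become hyphens; the other reserved characters are removed
    $name = $name -replace '[\\/]', '-'
    $name = $name -replace '[:*?<>|"]', ''

    # Trim again in case the cut left trailing whitespace
    return $name.Trim()
}

$example = Get-CleanFileName -LinkText '  Rate Sheet 3/24: Final? (PDF, 2 MB) '
```

With this sample input, `$example` holds `Rate Sheet 3-24 Final`. Inside the loop, the call would take `$i.innerText` as its argument.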

0 answers:

No answers yet.