How do I change a value if an error occurs in a for loop?

Time: 2013-11-03 17:12:21

Tags: r loops for-loop error-handling

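I have a loop that reads HTML table data from about 440 web pages. The code on each page is not exactly the same, so sometimes I need table node 1 and sometimes node 2. Right now I set the node numbers manually in a list and feed that list into the loop. My problem is that the page nodes have started changing, and keeping the node # list up to date is becoming cumbersome.

If the loop hits the wrong node # (i.e. 1 instead of 2, or vice versa), it throws an error and stops. When that happens, is there a way to have the loop replace the wrong node number with the correct one and then keep running as if nothing had happened?

Here is the readHTML part of the code in my loop, with a sample url: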

url <- "http://espn.go.com/nba/player/gamelog/_/id/2991280/year/2013/"

html.page <- htmlParse(url)

tableNodes <- getNodeSet(html.page, "//table")

x <- as.numeric(Players$Nodes[s])

tbl = readHTMLTable(tableNodes[[x]], colClasses = c("character"),stringsAsFactors = FALSE)

This is the error I get when the node # is wrong:

  

Error in readHTMLTable(tableNodes[[x]], colClasses = c("character"), stringsAsFactors = FALSE) : error in evaluating the argument 'doc' in selecting a method for function 'readHTMLTable': Error in tableNodes[[x]] : subscript out of bounds

Sample code:


library(XML)  # provides htmlParse, getNodeSet and readHTMLTable

A <- c("dog", "cat")

Nodes <- as.data.frame(1:1) 

# Nodes <- as.data.frame(1:2)  # <-- This works without errors

colnames(Nodes)[1] <- "Col1"

Nodes2 <- 2

url <-c("http://espn.go.com/nba/player/gamelog/_/id/6639/year/2013/","http://espn.go.com/nba/player/gamelog/_/id/6630/year/2013/")

for (i in 1:length(A))  
{ 


html.page <- htmlParse(url[i])

tableNodes <- getNodeSet(html.page, "//table")

x <- as.numeric(Nodes$Col1[i])

df = readHTMLTable(tableNodes[[x]], colClasses = c("character"),stringsAsFactors = FALSE)

#tryCatch(df) here.....no clue


assign(paste0("", A[i]), df)
}

1 Answer:

Answer 0: (score: 3)

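If you get the subscript out of bounds error message, then you should try a lower x. Here is a general demonstration of tryCatch based on the demo code you posted in your original question (although I have replaced x with 2, because I don't have Players and s):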

> msg <- tryCatch(readHTMLTable(tableNodes[[2]], colClasses = c("character"),stringsAsFactors = FALSE), error = function(e)e)
> str(msg)
List of 2
 $ message: chr "error in evaluating the argument 'doc' in selecting a method for function 'readHTMLTable': Error in tableNodes[[2]] : subscript"| __truncated__
 $ call   : language readHTMLTable(tableNodes[[2]], colClasses = c("character"), stringsAsFactors = FALSE)
 - attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
> msg$message
[1] "error in evaluating the argument 'doc' in selecting a method for function 'readHTMLTable': Error in tableNodes[[2]] : subscript out of bounds\n"
> grepl('subscript out of bounds', msg$message)
[1] TRUE
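
Building on the tryCatch demonstration above, the retry could be wrapped in a small helper so the loop keeps going when the stored node number is wrong. This is only a sketch under some assumptions: `read_with_fallback`, its `candidates` argument, and the `fake_nodes` stand-in data are hypothetical names introduced for illustration, and the reader function is passed in so the example runs without network access or the XML package.

```r
# Hypothetical helper (not part of the XML package): try the preferred index
# first, then the remaining candidates, and return the first result that
# does not raise an error.
read_with_fallback <- function(nodes, preferred, reader, candidates = 1:2) {
  for (idx in unique(c(preferred, candidates))) {
    result <- tryCatch(
      reader(nodes[[idx]]),
      error = function(e) NULL  # swallow the error and try the next index
    )
    if (!is.null(result)) {
      return(list(index = idx, value = result))
    }
  }
  NULL  # none of the candidate indices worked
}

# Stand-in for getNodeSet() output: a "page" with a single table, for which
# the stored node number (2) is wrong.
fake_nodes <- list(data.frame(PTS = c("21", "30"), stringsAsFactors = FALSE))

res <- read_with_fallback(fake_nodes, preferred = 2, reader = identity)
res$index  # the index that actually worked

# Inside the real loop, the reader would presumably be something like:
#   function(node) readHTMLTable(node, colClasses = c("character"),
#                                stringsAsFactors = FALSE)
```

The returned `index` could be written back to the node list so that it corrects itself over time. Note that this only helps when the wrong index actually raises an error; if the wrong node exists and parses cleanly, no error is thrown and the wrong table is returned silently. A reader that legitimately returns NULL would also be treated as a failure here.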