Getting "Access Denied" when trying to access an airport RSS feed

Time: 2016-08-12 17:51:48

Tags: c curl rss libcurl

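I'm trying to access an airport RSS feed using the curl library in C. However, whenever I try to access it, I get an Access Denied error. Below is the code I'm using. It is almost identical to the sample code at https://curl.haxx.se/libcurl/c/simple.html: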

#include <stdio.h>
#include <curl/curl.h>

int main(int argc, char **argv)
{
        CURL *curl;
        CURLcode res;
        char *feed_addr = "http://w1.weather.gov/xml/current_obs/KUCP.rss";
        //airport not in the state I live

        curl = curl_easy_init();
        if(curl) {
                curl_easy_setopt(curl, CURLOPT_URL, feed_addr);

                curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);

                // Perform the request, res will get the return code
                res = curl_easy_perform(curl);
                // Check for errors
                if(res != CURLE_OK) {
                        fprintf(stderr, "curl_easy_perform() failed: %s\n",curl_easy_strerror(res));
                }

                // always cleanup
                curl_easy_cleanup(curl);
        }
        return 0;
}

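I've tried other sites besides weather.gov (google, youtube, mit.edu), and they all work just fine. But when I try this one, even with other airport RSS feeds (which can be found at http://w1.weather.gov/xml/current_obs/seek.php?state=pa&Find=Find), I get the same Access Denied error. I also get the same error when I set feed_addr to any weather.gov page.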

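To make this stranger, when I try to access the feed using Python 3's urllib.request module, it works just fine. I can also access it easily with Google Chrome. So I can rule out the idea that the site doesn't want me accessing the feed.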

Is there something I'm missing? Is there a way to get the feed with the curl library? Or is there a way to do it with a different library?

1 Answer:

Answer 0: (score: 1)

Answering my own question (I hadn't planned to, but I solved the problem before anyone else answered):

So here is the code I got working:

#include <stdio.h>
#include <curl/curl.h>

int main(int argc, char **argv)
{
        CURL *curl;
        CURLcode res;
        char *feed_addr = "http://w1.weather.gov/xml/current_obs/KUCP.rss";
        //airport not in the state I live

        curl = curl_easy_init();
        if(curl) {
                curl_easy_setopt(curl, CURLOPT_URL, feed_addr);

                curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);

                //line that solved my issue
                curl_easy_setopt(curl, CURLOPT_USERAGENT, <string>);
                /**
                 * The line as written will not work.
                 * I'm only using <string> as a stand-in for some personal information
                 * If you're having the same issue as I did, I explain how to solve it under the code
                 */

                // Perform the request, res will get the return code
                res = curl_easy_perform(curl);
                // Check for errors
                if(res != CURLE_OK) {
                        fprintf(stderr, "curl_easy_perform() failed: %s\n",curl_easy_strerror(res));
                }

                // always cleanup
                curl_easy_cleanup(curl);
        }
        return 0;
}

The way I solved this was to find a website that prints out my HTTP request. In this case, it was http://rve.org.uk/dumprequest. I visited the site normally in one tab, then used the program to fetch the page source and opened that in another tab. That's when I saw that several fields were present when I opened the page manually but missing when I used the code.

So I looked into curl_easy_setopt to see whether there was a way to set those fields. It turns out curl_easy_setopt is documented, along with its many options, at https://curl.haxx.se/libcurl/c/curl_easy_setopt.html. On the advice of one of the comments, I looked at CURLOPT_USERAGENT first.

Since that option takes a char *, I found the line in the HTTP request that begins with User-Agent:, copied the rest of it, and pasted it in place of <string> in the marked line of the code above.

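So, if the request contained the following line:

User-Agent: Lord Voldemort (Tom Marvolo Riddle)

the line I included would be:

curl_easy_setopt(curl, CURLOPT_USERAGENT, "Lord Voldemort (Tom Marvolo Riddle)");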