Scraping only some columns from multiple tables

Time: 2017-11-06 18:49:49

Tags: r xpath web-scraping rvest

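I want to scrape the candidate names from these tables, together with the votes reported in the third column (after the image and the candidate name). This is as far as I have gotten: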

#include<stdio.h>
#include<string.h>
#include<stdlib.h>

int main()
 {
   char a[][10]={"milk","eggs","bread","cheese"};
   char *b[4],**p[4],*temp;
   int length[4];
   int i,j;

   for(i=0;i<4;i++)
   {
      length[i]=strlen(a[i]);
      b[i]=(char *)calloc(length[i],sizeof(char));
      strcpy(b[i],a[i]);
      p[i]=&b[i];
    }

  for(j=0;j<4;j++)
   {
     for(i=0; i<3-j; i++)
      {
        if(strcmp(*p[i],*p[i+1])>0)
          { /*         
            strcpy(temp,*p[i]);    This is the part of the code where the  
            strcpy(*p[i],*p[i+1]); program crashes, can someone please point            
            strcpy(*p[i+1],temp);  out the logical flaw
             */
          }
       }
    }  

  for(i=0;i<4;i++)
   {
    puts(*p[i]);
   }  


}

1 answer:

Answer 0: (score: 0)

Although this does not really use XPath, here is one approach:

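library(rvest)  # read_html(), html_nodes(), html_table(), %>%
library(purrr)  # map()

# ndp_leadership is assumed to hold the URL of the page being scraped
results <- read_html(ndp_leadership) %>%
  html_nodes(".wikitable") %>%   # select every table with class "wikitable"
  html_table(fill = TRUE) %>%    # convert each table to a data frame
  map(~ .[, 2]) %>%              # keep only the second column (candidate names)
  unlist %>% 
  setdiff(., c("Candidate", "Total"))  # drop header/total labels (also removes duplicates)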
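
The pipeline above keeps only the second column, so it returns the candidate names but not the votes. The following is a minimal sketch of how the same idea might be extended to keep both, assuming the names sit in column 2 and the votes in column 3 of every table. The inline HTML is a made-up stand-in for the real page, and the column names `candidate` and `votes` are chosen here for illustration only.

```r
library(rvest)  # read_html(), html_nodes(), html_table(), %>%

# Hypothetical stand-in for the real page: two small tables laid out as
# image, candidate, votes. Replace with read_html(<your URL>) in practice.
page <- read_html('
<table class="wikitable">
  <tr><th>Image</th><th>Candidate</th><th>Votes</th></tr>
  <tr><td></td><td>Alice</td><td>120</td></tr>
  <tr><td></td><td>Total</td><td>120</td></tr>
</table>
<table class="wikitable">
  <tr><th>Image</th><th>Candidate</th><th>Votes</th></tr>
  <tr><td></td><td>Bob</td><td>80</td></tr>
</table>')

tables <- page %>%
  html_nodes(".wikitable") %>%
  html_table(fill = TRUE)

# Keep columns 2 and 3 of each table by position, as character vectors,
# so tables with differing header names can still be stacked.
pick_columns <- function(tbl) {
  data.frame(candidate = as.character(tbl[[2]]),
             votes     = as.character(tbl[[3]]),
             stringsAsFactors = FALSE)
}

results <- do.call(rbind, lapply(tables, pick_columns))

# Drop summary rows; assumes they are labelled "Total" as in the answer above.
results <- results[results$candidate != "Total", ]
```

Selecting by position rather than by name mirrors the original `.[, 2]` and avoids relying on identical headers across tables. The votes are left as character strings because real pages often include thousands separators or percentages that need cleaning before conversion to numbers.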