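DefaultHttpClient is deprecated in Scala

Asked: 2019-06-11 19:52:59

Tags: scala web-scraping jsoup httpclient ldjson

I am trying to call an external API (which requires authentication) with LD-JSON data scraped from another URL. I can't figure out how to supply the key and password for that external API. I tried DefaultHttpClient, but it is deprecated.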

First approach:

import java.nio.charset.StandardCharsets
import java.util.Base64

import org.apache.http.auth.{AuthScope, UsernamePasswordCredentials}
import org.apache.http.client.methods.{HttpGet, HttpPost}
import org.apache.http.entity.StringEntity
import org.apache.http.impl.client.{BasicCredentialsProvider, DefaultHttpClient, HttpClientBuilder}
import org.jsoup.Jsoup

class Scraper(url: String) {
  def getJson(url: String) = {
    val doc = Jsoup.connect(url).get() // fetch the page; Jsoup.parse(url) would parse the URL string itself as HTML
    val api_url = "external_api"
    val username = "my_username"
    val password = "my_password"
    val ldJsons = doc.select("script[type=\"application/ld+json\"]")
    val base64EncodedDoc = Base64.getEncoder.encodeToString(doc.toString().getBytes(StandardCharsets.UTF_8))

    val post = new HttpPost(api_url)
    post.setHeader("Content-type", "application/json")
    post.setEntity(new StringEntity(base64EncodedDoc))
    val response = (new DefaultHttpClient).execute(post)
//need an alternative for this in Scala
  }
}

Second approach:

import java.nio.charset.StandardCharsets
import java.util.Base64

import org.apache.http.auth.{AuthScope, UsernamePasswordCredentials}
import org.apache.http.client.methods.{HttpGet, HttpPost}
import org.apache.http.entity.StringEntity
import org.apache.http.impl.client.{BasicCredentialsProvider, DefaultHttpClient, HttpClientBuilder}
import org.jsoup.Jsoup

class Scraper(url: String) {
  def getJson(url: String) = {
    val doc = Jsoup.connect(url).get() // fetch the page; Jsoup.parse(url) would parse the URL string itself as HTML
    val api_url = "external_api"
    val username = "my_username"
    val password = "my_password"
    val ldJsons = doc.select("script[type=\"application/ld+json\"]")
    val base64EncodedDoc = Base64.getEncoder.encodeToString(doc.toString().getBytes(StandardCharsets.UTF_8))
    val credentialsProvider = new BasicCredentialsProvider()
    credentialsProvider.setCredentials(
      AuthScope.ANY,
      new UsernamePasswordCredentials(username, password)
    )

    val httpClient =
      HttpClientBuilder.create()
        .setDefaultCredentialsProvider(credentialsProvider)
        .build()

    val httpResponse = new HttpGet(api_url)
    httpResponse.setHeader("Content-type", "application/json")
    httpClient.execute(httpResponse)
//how to pass my LD-Json data here?
  }
}

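This is my first question, so please forgive me if it is too trivial. I am trying to write a scraper class in Scala that gets LD-JSON from a URL and posts it to an external API.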

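1 answer:

Answer 0: (score: 0)

You should use HttpClientBuilder instead of the deprecated DefaultHttpClient.

A simple example (Java; the same calls work unchanged from Scala):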

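import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.HttpClientBuilder;

String url = "http://example.com";

// HttpClientBuilder replaces the deprecated DefaultHttpClient
HttpClient client = HttpClientBuilder.create().build();
HttpGet request = new HttpGet(url);

HttpResponse response = client.execute(request);

System.out.println("Result: " + response.getStatusLine().getStatusCode());

// Read the response body line by line
BufferedReader bufferedReader = new BufferedReader(
        new InputStreamReader(response.getEntity().getContent()));

StringBuilder result = new StringBuilder();
String line;
while ((line = bufferedReader.readLine()) != null) {
        result.append(line);
}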
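
The question also asks how to supply the username and password. If the external API accepts HTTP Basic authentication (an assumption; the question does not say which scheme it uses), one option is to build the Authorization header by hand. The sketch below uses only the JDK. The class name BasicAuthSketch and the method basicAuthHeader are hypothetical helpers for illustration, not part of Apache HttpClient.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not part of Apache HttpClient): builds the value of an
// HTTP Basic "Authorization" header from a username and a password.
final class BasicAuthSketch {

    static String basicAuthHeader(String username, String password) {
        // Basic auth sends "username:password", Base64-encoded
        String pair = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder credentials taken from the question
        System.out.println(basicAuthHeader("my_username", "my_password"));
    }
}
```

The returned string could then be set on the request from the first approach with post.setHeader("Authorization", ...), next to the existing Content-type header. Note that HttpGet cannot carry a request body, so the LD-JSON payload would need to go on an HttpPost via setEntity(new StringEntity(...)), as the first approach already does.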