How to process this kind of log that contains JSON

Asked: 2016-05-27 08:09:37

Tags: json scala apache-spark apache-spark-sql

Here is the log:

"V2|ip-10-203-5-16|PSDK_KINESIS_LOG|PSDK_DEFAULT|2016-05-12 00:00:05|VP|aa2ddfdb-4387-49c0-9651-92cc83b8e905|{"vid":"2237031","fdn":"FDNB1995115","type":"0","version":"1.0","device_id":"SCL-TL00_a0-8d-16-e6-39-13_868145026279574","ip":"58.254.0.157","timestamp":1462982395}"

My code:

import java.text.SimpleDateFormat
import java.util.Date

import com.fasterxml.jackson.annotation.JsonProperty
import com.fasterxml.jackson.core.JsonParseException
import com.fasterxml.jackson.databind.ObjectMapper
import org.apache.spark.{SparkContext, SparkConf}
import org.slf4j.LoggerFactory

class JsonLong{
  @JsonProperty var fdn: String = null
  @JsonProperty("type") var typ: String = null
  @JsonProperty var vid: String = null
  @JsonProperty var version: String = null
  @JsonProperty var device_id: String = null
  @JsonProperty var ip: String = null
  @JsonProperty var timestamp: Long = 0L
  override def toString = s"JsonLong(fdn=$fdn, typ=$typ, vid=$vid,version=$version, device_id=$device_id, ip=$ip, timestamp=$timestamp)"
}



def jsonString(args: String): JsonLong = {
  val mapper = new ObjectMapper()
  val record = mapper.readValue(args, classOf[JsonLong])
  record
}

case class log(log_version: String,log_ip: String,log_from: String,SDK: String,action_time: Date,action: String,sn: String,post_code: JsonLong)

// MM is the month; lowercase mm would be parsed as minutes
val df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
val input = sc.textFile("input.snappy")
val RDD = input.map { line => 
    val p = line.split("\\|")
    val log_version = p(0)
    val log_ip = p(1)
    val log_from = p(2)
    val SDK = p(3)
    val action_time = df.parse(p(4))
    val action = p(5)
    val sn = p(6)
    val post_code = if(p.length==8){
      // read the trailing JSON field
      jsonString(p(7))
    } else("null")
    log(log_version,log_ip,log_from,SDK,new Date(action_time.getTime()),action,sn,post_code)}.toDF()


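I have a problem with the JSON in the last field. I use def jsonString to return class JsonLong, but post_code ends up typed as Object. How should I handle the trailing JSON?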

1 Answer:

Answer 0 (score: 0)

The expression used to initialize post_code does not always return a JsonLong:

if(p.length==8) {
   jsonString(p(7)) // returns JsonLong
} else("null")      // returns String

The compiler therefore infers the most specific type that fits both branches, which in this case is java.lang.Object.
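The inference described above can be reproduced in isolation. The following is a minimal sketch, not the asker's code: `Payload` and `InferenceDemo` are hypothetical names standing in for `JsonLong` so the example runs without Jackson or Spark.

```scala
// Hypothetical stand-in for the question's JsonLong.
class Payload(val vid: String)

object InferenceDemo {
  // One branch yields Payload, the other String. Their only common
  // supertype is Object (AnyRef), so declaring the result as Payload
  // here would be rejected by the compiler.
  def mixed(hasJson: Boolean): Object =
    if (hasJson) new Payload("2237031") else "null"

  // One branch yields Payload, the other the null literal (type Null).
  // Null conforms to every reference type, so the result can be Payload.
  def nullable(hasJson: Boolean): Payload =
    if (hasJson) new Payload("2237031") else null
}
```

Only the second method can be passed where a `Payload` is expected, which mirrors the `post_code` argument of the `log` case class.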

I assume you meant to use null rather than the string "null", which fixes the problem:

if(p.length==8) {
   jsonString(p(7)) // returns JsonLong
} else { null }     // has type Null, a subtype of every reference type, including JsonLong

This makes the compiler infer the type JsonLong for post_code, which should solve your problem.
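As an alternative to null, the missing field could be modelled with `Option`. This is only a sketch under assumptions: `PostCode`, `parsePostCode`, and `OptionalField` are hypothetical names, the parser is a placeholder for the question's Jackson-based `jsonString`, and how an `Option` field behaves with `toDF()` is not checked here.

```scala
// Hypothetical stand-in for JsonLong; it just keeps the raw JSON text.
case class PostCode(raw: String)

object OptionalField {
  // Placeholder for the question's jsonString (which uses Jackson).
  def parsePostCode(json: String): PostCode = PostCode(json)

  // Some(...) when the eighth pipe-separated field is present, None otherwise.
  def postCode(line: String): Option[PostCode] = {
    val p = line.split("\\|")
    if (p.length == 8) Some(parsePostCode(p(7))) else None
  }
}
```

With this shape both branches share the type `Option[PostCode]`, so nothing widens to Object, and callers must handle the missing case explicitly instead of risking a NullPointerException.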