Hive string function to split delimited key-value pairs into two columns

Time: 2015-12-16 05:53:27

Tags: scala

How can I use a Hive function to split this data, T_32_P_1_A_420_H_60_R_0.30841494477846165_S_0, into two columns? For example:

T 32
P 1
A 420
H 60
R 0.30841494477846165
S 0

3 answers:

Answer 0: (score: 2)

You can do this with a regular expression:

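import java.util.regex.Pattern

object SplitPairs {
  def main(args: Array[String]): Unit = {
    val s = "T_32_P_1_A_420_H_60_R_0.30841494477846165_S_0"
    // One uppercase key, an underscore, then an integer or decimal value
    val pattern = "[A-Z]_\\d+\\.?\\d*"
    val buff = new StringBuilder
    val m = Pattern.compile(pattern).matcher(s)
    // Append each "KEY_VALUE" match on its own line
    while (m.find()) {
      buff.append(m.group(0)).append("\n")
    }
    // Turn the underscore inside each pair into a space
    println("output:\n" + buff.toString.replace("_", " "))
  }
}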

Output:

output:
T 32
P 1
A 420
H 60
R 0.30841494477846165
S 0
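
As a variation on the regex approach, the sketch below uses Scala's own `scala.util.matching.Regex` with capture groups, so the key and the value come back as separate strings instead of one printed buffer. The names `SplitPairsRegex`, `pairPattern`, and `splitPairs` are made up for illustration, and the pattern assumes single uppercase-letter keys and non-negative numeric values, as in the sample data.

```scala
import scala.util.matching.Regex

object SplitPairsRegex {
  // Assumption: a key is one uppercase letter; a value is an integer or decimal.
  val pairPattern: Regex = "([A-Z])_(\\d+(?:\\.\\d+)?)".r

  // Returns the (key, value) pairs in the order they appear in the input.
  def splitPairs(s: String): List[(String, String)] =
    pairPattern.findAllMatchIn(s).map(m => (m.group(1), m.group(2))).toList

  def main(args: Array[String]): Unit =
    splitPairs("T_32_P_1_A_420_H_60_R_0.30841494477846165_S_0")
      .foreach { case (k, v) => println(s"$k $v") }
}
```

Returning tuples rather than a formatted string keeps the two columns available for later processing.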

Answer 1: (score: 2)

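If you need to collect the data for further processing, and the input is guaranteed to always be correctly paired, you can do something like this.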

scala> val str = "T_32_P_1_A_420_H_60_R_0.30841494477846165_S_0"
str: String = T_32_P_1_A_420_H_60_R_0.30841494477846165_S_0

scala> val data = str.split("_").sliding(2,2)
data: Iterator[Array[String]] = non-empty iterator

scala> data.toList   // just to see it
res29: List[Array[String]] = List(Array(T, 32), Array(P, 1), Array(A, 420), Array(H, 60), Array(R, 0.30841494477846165), Array(S, 0))
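
If the pairing guarantee might not hold, one possible way to harden the same idea is sketched here: `grouped(2)` behaves like `sliding(2,2)` for this purpose, and `collect` keeps only complete two-element groups, so a trailing key without a value is silently dropped. The names `SplitPairsGrouped` and `toPairs` are hypothetical, and dropping the unpaired token is an assumption about the desired behavior.

```scala
object SplitPairsGrouped {
  // Pairs up tokens two at a time; a trailing token with no partner is dropped.
  def toPairs(s: String): List[(String, String)] =
    s.split("_").grouped(2).collect { case Array(k, v) => (k, v) }.toList

  def main(args: Array[String]): Unit = {
    val pairs = toPairs("T_32_P_1_A_420_H_60_R_0.30841494477846165_S_0")
    // Converting to a Map allows lookup by key (assumes keys are unique).
    val byKey = pairs.toMap
    println(byKey("A")) // prints 420
  }
}
```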

Answer 2: (score: 1)

You can split the string into an array, apply zipWithIndex, and filter on the index to get two arrays, col1 and col2, which you can then use for printing:

val str = "T_32_P_1_A_420_H_60_R_0.30841494477846165_S_0"
val tmp = str.split('_').zipWithIndex
val col1 = tmp.filter( p => p._2 % 2 == 0 ).map( p => p._1)
val col2 = tmp.filter( p => p._2 % 2 != 0 ).map( p => p._1)

//col1: Array[String] = Array(T, P, A, H, R, S)
//col2: Array[String] = Array(32, 1, 420, 60, ...
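
The printing step mentioned above could look roughly like the following sketch, which zips the two column arrays back together row by row. It uses `partition` to make a single pass instead of two `filter` calls; `PrintColumns` and `columns` are illustrative names, not part of the original answer.

```scala
object PrintColumns {
  // Splits on '_' and separates even-indexed tokens (keys) from odd-indexed ones (values).
  def columns(s: String): (Array[String], Array[String]) = {
    val indexed = s.split('_').zipWithIndex
    val (even, odd) = indexed.partition { case (_, i) => i % 2 == 0 }
    (even.map(_._1), odd.map(_._1))
  }

  def main(args: Array[String]): Unit = {
    val (col1, col2) = columns("T_32_P_1_A_420_H_60_R_0.30841494477846165_S_0")
    // zip stops at the shorter array, so an unpaired trailing key is not printed.
    col1.zip(col2).foreach { case (k, v) => println(s"$k $v") }
  }
}
```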