How do I generate random data with Spark?

Time: 2015-03-20 09:02:46

Tags: hadoop apache-spark hdfs spark-streaming apache-spark-sql

One row:

// Scala code for one row
val rndcolumn1 = randomInt(3, 8)
val rndcolumn2 = randomInt(3, 8)

val column1 = randomAlphaStr(rndcolumn1)
val column2 = randomAlphaStr(rndcolumn2)
// randomAlphaStr takes an int and returns a random string of that length.
// String lengths fall in the range (3, 8), e.g.: sfjhjk, ljl, rtddhjks
val column3 = randomInt(11111111, 99999999).toString
val column4 = randomInt(11111, 99999).toString

val str = column1 + "," + column2 + "," + column3 + "," + column4 + "\n"

I want to generate a large amount of data. How can I generate this data and save it on HDFS using Spark?
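One common approach (a sketch, not the only way) is to wrap the row-building logic in a function, parallelize an index range across the cluster, map each index to a random row, and let `saveAsTextFile` write the `part-0000N` files to HDFS. The snippet below assumes the helper names `randomInt` and `randomAlphaStr` from the question and fills in plausible implementations with `scala.util.Random`; the Spark portion is commented out because it needs a live `SparkContext` (`sc`) and an HDFS path of your choosing:

```scala
import scala.util.Random

// Random Int in the inclusive range [min, max]
def randomInt(min: Int, max: Int): Int =
  min + Random.nextInt(max - min + 1)

// Random lowercase string of the given length, e.g. "sfjhjk"
def randomAlphaStr(len: Int): String =
  Seq.fill(len)(('a' + Random.nextInt(26)).toChar).mkString

// One CSV row matching the layout in the question:
// two alpha strings of length 3..8, an 8-digit number, a 5-digit number
def makeRow(): String = {
  val column1 = randomAlphaStr(randomInt(3, 8))
  val column2 = randomAlphaStr(randomInt(3, 8))
  val column3 = randomInt(11111111, 99999999).toString
  val column4 = randomInt(11111, 99999).toString
  s"$column1,$column2,$column3,$column4"
}

// With a SparkContext `sc`, generate n rows in parallel and write
// them to HDFS; numSlices controls how many part-files you get.
// (Paths and counts here are placeholders.)
// val n = 100000000L
// sc.parallelize(0L until n, numSlices = 200)
//   .map(_ => makeRow())
//   .saveAsTextFile("hdfs:///user/me/output")
```

Each partition becomes one `part-0000N` file, so `numSlices` directly determines how many output files appear under `output/`.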

The result I want to see:

output/
   ---part-00000
   ---part-00001
    ....

0 answers:

No answers