Converting a map of (key, value) pairs into a DataFrame

Date: 2016-07-14 15:58:41

Tags: scala apache-spark streaming apache-spark-sql rdd

I am streaming data from Twitter, and each record arrives in the following format:

Map(UserLang -> hi, 
    UserName -> CarterWyatt,  
    UserScreenName -> CarterWyatt1,  
    HashTags -> ,  
    UserVerification -> false,  
    Spam -> true,  
    UserFollowersCount -> 121,  
    UserLocation -> null,  
    UserStatusCount -> 146405,  
    UserCreated -> 2013-03-04T16:44:27.000+0530,  
    UserDescription -> null,  
    TextLength -> 113,  
    Text -> abcd.,  
    UserFollowersRatio -> 121.0,  
    UserFavouritesCount -> 0,  
    UserFriendsCount -> 1,  
    StatusCreatedAt -> 2016-07-14T20:52:52.000+0530,  
    UserID -> 1241101146)

I want to use a case class like the following:

  case class Foo(UserLang: String, UserName: String, UserScreenName: String, HashTags: String,
              UserVerification: String, Spam: String, UserFollowersCount: String,
              UserLocation: String, UserStatusCount: String, UserCreated: String, UserDescription: String,
              TextLength: String, Text: String, UserFollowersRatio: String, UserFavouritesCount: String,
              UserFriendsCount: String, StatusCreatedAt: String, UserID: String)

Now I want the case-class field names to become the column names of a Spark SQL table, with the column values taken from the map. In short, I want to populate the table from the streamed values.

I am not sure how exactly to do this; any pointers would be appreciated.
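Since the question received no answer, here is a minimal sketch of one common approach, assuming Spark 2.x with Spark Streaming: convert each `Map[String, String]` into a `Foo` with a small helper, then call `toDF()` on the resulting RDD inside `foreachRDD`. The column names are taken from the case-class field names automatically. The helper `fooFromMap`, the stream name `tweetMapStream`, and the view name `tweets` are illustrative, not from the original post; on Spark 1.6 you would use a `SQLContext` and `registerTempTable` instead.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.dstream.DStream

// Must be defined at the top level (not inside a method) so that
// Spark's reflection-based encoder can find it.
case class Foo(UserLang: String, UserName: String, UserScreenName: String, HashTags: String,
               UserVerification: String, Spam: String, UserFollowersCount: String,
               UserLocation: String, UserStatusCount: String, UserCreated: String,
               UserDescription: String, TextLength: String, Text: String,
               UserFollowersRatio: String, UserFavouritesCount: String,
               UserFriendsCount: String, StatusCreatedAt: String, UserID: String)

// Illustrative helper: build a Foo from one tweet map,
// defaulting any missing key to the empty string.
def fooFromMap(m: Map[String, String]): Foo = Foo(
  m.getOrElse("UserLang", ""),
  m.getOrElse("UserName", ""),
  m.getOrElse("UserScreenName", ""),
  m.getOrElse("HashTags", ""),
  m.getOrElse("UserVerification", ""),
  m.getOrElse("Spam", ""),
  m.getOrElse("UserFollowersCount", ""),
  m.getOrElse("UserLocation", ""),
  m.getOrElse("UserStatusCount", ""),
  m.getOrElse("UserCreated", ""),
  m.getOrElse("UserDescription", ""),
  m.getOrElse("TextLength", ""),
  m.getOrElse("Text", ""),
  m.getOrElse("UserFollowersRatio", ""),
  m.getOrElse("UserFavouritesCount", ""),
  m.getOrElse("UserFriendsCount", ""),
  m.getOrElse("StatusCreatedAt", ""),
  m.getOrElse("UserID", "")
)

val spark = SparkSession.builder.appName("TweetsToDF").getOrCreate()
import spark.implicits._  // enables rdd.toDF() for RDDs of case classes

// tweetMapStream is assumed to be the DStream[Map[String, String]]
// produced by the Twitter receiver in the original setup.
def registerBatches(tweetMapStream: DStream[Map[String, String]]): Unit =
  tweetMapStream.foreachRDD { rdd =>
    val df = rdd.map(fooFromMap).toDF()   // columns named after Foo's fields
    df.createOrReplaceTempView("tweets")  // query via spark.sql("SELECT ... FROM tweets")
  }
```

With the view registered, each micro-batch can be queried with plain SQL, e.g. `spark.sql("SELECT UserName, Text FROM tweets WHERE Spam = 'true'")`. If stronger typing is wanted, the numeric and boolean fields of `Foo` could be declared as `Int`/`Boolean` and parsed in `fooFromMap` instead of being kept as `String`.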

0 Answers:

There are no answers yet