ImportTsv命令在Hbase

时间:2015-08-17 08:27:52

标签: hadoop hbase

我正在使用HBase 0.98.1-cdh5.1.3。我正在尝试将位置/user/hdfs/exp的hdfs中存在的csv文件提取到Hbase。我的文件包含以下格式的数据:

1,abc,xyz

2,def,uvw

3,ghi,rst

我正在使用以下命令:

bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv '-Dimporttsv.separator=,' -Dimporttsv.columns=HBASE_ROW_KEY,CF:firstname,CF:lastname tablename /user/hdfs/exp

我也使用了不同的组合,比如

bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,CF:firstname,CF:lastname tablename /user/hdfs/exp '-Dimporttsv.separator=,'

bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,CF:firstname,CF:lastname '-Dimporttsv.separator=,' tablename /user/hdfs/exp 

但没有任何作用。它无法检测分隔符,即在我的情况下,并没有正确解析。任何人都可以帮我弄清楚我哪里出错了。

这只是一行数据集:

10000064202896309897,1000006420,2896309897,10180,HDFS:// btc5x015:8020 /用户/ mr_test / logsJan / log_jan20_29 / 10180_log201501260000.log,3.2.3.1,9,2015-01-26,15:46:12.12,REF肩部4,N,N,肩,60,17.0,M 487093458,[study_16004_16004 _],exam_16004_16004,[patient_16004_1 _],Schulter STD,SCHULTERGELENK RECHTS,8.10,NOT_EXIST,-8.1,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST, NOT_EXIST,N,NOT_EXIST,NOT_EXIST,Y,N,HF,NOT_EXIST,NOT_EXIST,NOT_EXIST ,,, NOT_EXIST,NOT_EXIST,N,NOT_EXIST,1,NOT_EXIST,IMAG,FFE,T1FFE,4.0,0.72,34,NOT_EXIST,NOT_EXIST, NOT_EXIST,4.0,COR ,, NOT_EXIST,无,O,N,NOT_EXIST,1,0.0 ,,,,,,, NOT_EXIST,102,NOT_EXIST,NOT_EXIST,NOT_EXIST,15:45:29.28,15:46:12.12,15 :46:12.12,9.5,9.4,1.002,NOT_EXIST,TRUE,没有,NOT_EXIST,0.0,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,0.0,NOT_EXIST,NOT_EXIST,0.0,0.09,0.3,3.3,0,FALSE,FALSE,正中,n,26.1,LT,0.33,0.03,NOT_EXIST,NULL,1,Dres.GrafKernHausmann,E:\导出\ DataMonitoring \ p_i_20150126_154530.frame,HDFS://btc5x015.code1.emi.philips.com:8020 /用户/ mr_test / logsJan / log_jan 20_29 / 10180_log201501260000.log,317774,883,0,0,1,8,6,2,0,0,0,0,0,0,6014,0,15:44:08.15,15:44:59.93, 15:45:23.14,15:45:29.28,00:00:00.00,00:00:00.00,15:45:30.57,15:45:38.45,15:45:29.28,15:46:12.12,15: 45:38.45,15:46:12.12,42984,33967,0,7988,6014,00:00:00.00,00:00:00.00,0,00:00:00.00,00:00:00.00,0,169,102,SENSE- SHOULDER8 ,, SENSE-SHOULDER8 /(19)BODY-QUAD,190,190,CLINICAL,0,0,94166709,Radiologische Gemeinschaftspraxis,Dr。 MED。迈克尔格拉夫博士。 MED。安德烈亚斯克恩博士。 MED。豪斯曼,韦茨拉尔,35578,Hausertorstr。 47,6,NOT_EXIST,1,NOT_EXIST,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,1,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,最短NOT_EXIST,最短1,NO,YES,DEFAULT,FFE,NOT_EXIST,NOT_EXIST, NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,3,NOT_EXIST,CARTESIAN,YES,NO,NOT_EXIST,NO,NOT_EXIST,低,FULL,NOT_EXIST,NOT_EXIST,NOT_EXIST,3D,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,YES, NOT_EXIST,不,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,USER_DEF,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,DEF, H,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,MPU_MTC_MODE_NO,NO,NOT_EXIST,T1,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,450405,YES,仰卧,HF, NOT_EXIST,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,DEFAULT,NOT_EXIST,2,SENSE-SHOULDER8 BODY-QUAD,F,400 400 100 ,,, 100,5.625,7.23214293,4,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST ,NOT_EXIST,NOT_EX IST,NO,NO,NOT_EXIST,NOT_EXIST,80%,PARALLEL,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,0,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST, NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,OFF,NOT_EXIST,NO,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST, NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,0,5.625,NOT_EXIST,NOT_EXIST,NOT_EXIST,10180,15:45:29.29,15:46:13.76,空,,, 96,,1 ,,, 0,10180 ,PATTERN_SRN,肩,SCHULTER,MATCHED_SHOULDER,肩,上肢,ANATOMY_GROUP_MAPPED,10180,10180,10180,3.2.3,3.0T的Achieva,Achieva的,T30,3.0T,NO,F2000,Watercooled2,274-d,万事达,NONE ,,, 0,16,0,1,S26_128,NONE,8,空,CDAS,LOGFOLDER_SYSFOLDER_MATCHED_RELEASE_NOT_CHECKED,FALSE,NULL,NULL,NULL,NULL,12.15,SENSE-SHOULDER8 /(19)Q-BODY,0,SCAN_PARSE_SUCCESS, SHOULDER,-4.64757729 5.37445641,

2 个答案:

答案 0 :(得分:0)

我刚刚使用ImportTSV命令将一行代码加载到hbase表中,提供了414列,它对我来说非常合适。这是我使用的命令。

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,CF:c1,CF:c2,CF:c3,CF:c4,CF:c5,CF:c6,CF:c7,CF:c8,CF:c9,CF:c10,CF:c11,CF:c12,CF:c13,CF:c14,CF:c15,CF:c16,CF:c17,CF:c18,CF:c19,CF:c20,CF:c21,CF:c22,CF:c23,CF:c24,CF:c25,CF:c26,CF:c27,CF:c28,CF:c29,CF:c30,CF:c31,CF:c32,CF:c33,CF:c34,CF:c35,CF:c36,CF:c37,CF:c38,CF:c39,CF:c40,CF:c41,CF:c42,CF:c43,CF:c44,CF:c45,CF:c46,CF:c47,CF:c48,CF:c49,CF:c50,CF:c51,CF:c52,CF:c53,CF:c54,CF:c55,CF:c56,CF:c57,CF:c58,CF:c59,CF:c60,CF:c61,CF:c62,CF:c63,CF:c64,CF:c65,CF:c66,CF:c67,CF:c68,CF:c69,CF:c70,CF:c71,CF:c72,CF:c73,CF:c74,CF:c75,CF:c76,CF:c77,CF:c78,CF:c79,CF:c80,CF:c81,CF:c82,CF:c83,CF:c84,CF:c85,CF:c86,CF:c87,CF:c88,CF:c89,CF:c90,CF:c91,CF:c92,CF:c93,CF:c94,CF:c95,CF:c96,CF:c97,CF:c98,CF:c99,CF:c100,CF:c101,CF:c102,CF:c103,CF:c104,CF:c105,CF:c106,CF:c107,CF:c108,CF:c109,CF:c110,CF:c111,CF:c112,CF:c113,CF:c114,CF:c115,CF:c116,CF:c117,CF:c118,CF:c119,CF:c120,CF:c121,CF:c122,CF:c123,CF:c124,CF:c125,CF:c126,CF:c127,CF:c128,CF:c129,CF:c130,CF:c131,CF:c132,CF:c133,CF:c134,CF:c135,CF:c136,CF:c137,CF:c138,CF:c139,CF:c140,CF:c141,CF:c142,CF:c143,CF:c144,CF:c145,CF:c146,CF:c147,CF:c148,CF:c149,CF:c150,CF:c151,CF:c152,CF:c153,CF:c154,CF:c155,CF:c156,CF:c157,CF:c158,CF:c159,CF:c160,CF:c161,CF:c162,CF:c163,CF:c164,CF:c165,CF:c166,CF:c167,CF:c168,CF:c169,CF:c170,CF:c171,CF:c172,CF:c173,CF:c174,CF:c175,CF:c176,CF:c177,CF:c178,CF:c179,CF:c180,CF:c181,CF:c182,CF:c183,CF:c184,CF:c185,CF:c186,CF:c187,CF:c188,CF:c189,CF:c190,CF:c191,CF:c192,CF:c193,CF:c194,CF:c195,CF:c196,CF:c197,CF:c198,CF:c199,CF:c200,CF:c201,CF:c202,CF:c203,CF:c204,CF:c205,CF:c206,CF:c207,CF:c208,CF:c209,CF:c210,CF:c211,CF:c212,CF:c213,CF:c214,CF:c215,CF:c216,CF:c217,CF:c218,CF:c219,CF:c220,CF:c221,CF:c222,CF:c223,CF:c224,CF:c225,CF:c226,CF:c227,CF:c228,CF:c229,CF:c230,CF:c231,CF:c232,CF:c233,CF:c234,CF:c235,CF:c236,CF:c237,CF:c238,CF:c239,CF:c240,CF:c241,CF:c242,CF:c243,CF:c244,CF:c245,CF:c246,CF:c247,CF:c248,CF:c249,CF:c250,CF:c251,CF:c252,CF:c253,CF:c254,CF:c255,CF:c256,CF:c257,CF:c258,CF:c259,CF:c260,CF:c261,CF:c262,CF:c263,CF:c264,CF:c265,CF:c266,CF:c267,CF:c268,CF:c269,CF:c270,CF:c271,CF:c272,CF:c273,CF:c274,CF:c275,CF:c276,CF:c277,CF:c278,CF:c279,CF:c280,CF:c281,CF:c282,CF:c283,CF:c284,CF:c285,CF:c286,CF:c287,CF:c288,CF:c289,CF:c290,CF:c291,CF:c292,CF:c293,CF:c294,CF:c295,CF:c296,CF:c297,CF:c298,CF:c299,CF:c300,CF:c301,CF:c302,CF:c303,CF:c304,CF:c305,CF:c306,CF:c307,CF:c308,CF:c309,CF:c310,CF:c311,CF:c312,CF:c313,CF:c314,CF:c315,CF:c316,CF:c317,CF:c318,CF:c319,CF:c320,CF:c321,CF:c322,CF:c323,CF:c324,CF:c325,CF:c326,CF:c327,CF:c328,CF:c329,CF:c330,CF:c331,CF:c332,CF:c333,CF:c334,CF:c335,CF:c336,CF:c337,CF:c338,CF:c339,CF:c340,CF:c341,CF:c342,CF:c343,CF:c344,CF:c345,CF:c346,CF:c347,CF:c348,CF:c349,CF:c350,CF:c351,CF:c352,CF:c353,CF:c354,CF:c355,CF:c356,CF:c357,CF:c358,CF:c359,CF:c360,CF:c361,CF:c362,CF:c363,CF:c364,CF:c365,CF:c366,CF:c367,CF:c368,CF:c369,CF:c370,CF:c371,CF:c372,CF:c373,CF:c374,CF:c375,CF:c376,CF:c377,CF:c378,CF:c379,CF:c380,CF:c381,CF:c382,CF:c383,CF:c384,CF:c385,CF:c386,CF:c387,CF:c388,CF:c389,CF:c390,CF:c391,CF:c392,CF:c393,CF:c394,CF:c395,CF:c396,CF:c397,CF:c398,CF:c399,CF:c400,CF:c401,CF:c402,CF:c403,CF:c404,CF:c405,CF:c406,CF:c407,CF:c408,CF:c409,CF:c410,CF:c411,CF:c412,CF:c413,CF:c414 '-Dimporttsv.separator=,' tablename /user/hdfs/exp

我已经提供了随机列名称,您可以根据需要更新它。

注意:确保您通过命令的列数与输入数据源匹配。当我通过412列而不是414列时,我甚至遇到了Bad Line问题。

希望这会有所帮助。:)

答案 1 :(得分:0)

  1. 在指定分隔符时,单引号似乎放错了位置。尝试使用这个:-Dimporttsv.separator=',' 而不是 '-Dimporttsv.separator=,'

  2. 如果输入文件的任何列值都由字段分隔符(在本例中为逗号)组成,则它将失败。准备 CSV 文件时最好保留不同的分隔符(例如 |)