Error importing a gz file into BigQuery

Time: 2015-04-30 08:39:59

Tags: google-bigquery

I get an error when importing a gzipped, tab-delimited file into BigQuery.

This is the output I get:

root@a20c6fbdf9b5:/opt/batch/jobs# bq show -j bqjob_r5720e2f2267a5a5b_0000014d09571f27_1
Job infra-bedrock-861:bqjob_r5720e2f2267a5a5b_0000014d09571f27_1

  Job Type    State      Start Time      Duration   Bytes Processed
 ---------- --------- ----------------- ---------- -----------------
  load       FAILURE   30 Apr 08:00:44   0:02:05

Errors encountered during job execution. Bad character (ASCII 0) encountered: field starts with: <H:|\ufc0f\ufffd(>
Failure details:
 - File: 1 / Line:1 / Field:1: Bad character (ASCII 0) encountered:
   field starts with: <\ufff>
 - File: 1 / Line:3 / Field:1: Bad character (ASCII 0) encountered:
   field starts with: <\u0475\ufffd=\ufffd\ufffd\u03d6>
 - File: 1 / Line:4 / Field:1: Bad character (ASCII 0) encountered:
   field starts with: <-\ufffd\ufffdY\u049a\ufffd>
 - File: 1 / Line:6 / Field:1: Bad character (ASCII 0) encountered:
   field starts with: <\u018e\ufffd\ufffd\ufffd\ufffd>

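I tried downloading the file manually, decompressing it, and uploading it again. The uncompressed file imports into BigQuery without any problems.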

This looks like a bug in how BigQuery handles gzipped files.

1 answer:

Answer 0: (score: 1)

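Looking at the job configuration, the first URI in the load request is a non-gzip file, the one ending in .../20150426/_SUCCESS. BigQuery uses the first file to determine whether compression is enabled, so the gzipped files that follow are read as plain text, which produces the "Bad character (ASCII 0)" errors.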

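Assuming this file is empty, you can fix the problem by removing it from the load request. If the file does contain data, either append a ".gz" suffix to it or reorder the URI list so that it is not the first entry.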
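
As a rough illustration of the fix, the URI list could be cleaned up before the load job is submitted. This is only a sketch: the helper name `prepare_source_uris`, the bucket name `example-bucket`, and the `part-0000N.gz` file names are hypothetical and not taken from the original job. It drops marker files such as `_SUCCESS` and moves `.gz` URIs to the front so that the first file is a compressed one.

```python
from posixpath import basename


def prepare_source_uris(uris, marker_names=("_SUCCESS",)):
    """Drop marker files and list .gz URIs first (hypothetical helper)."""
    # Remove marker files such as the empty _SUCCESS file left by Hadoop jobs.
    kept = [u for u in uris if basename(u) not in marker_names]
    # sorted() is stable: .gz URIs (key False) come first, and the
    # relative order within each group is preserved.
    return sorted(kept, key=lambda u: not u.endswith(".gz"))


# Hypothetical URIs, for illustration only.
uris = [
    "gs://example-bucket/20150426/_SUCCESS",
    "gs://example-bucket/20150426/part-00000.gz",
    "gs://example-bucket/20150426/part-00001.gz",
]
cleaned = prepare_source_uris(uris)
print(cleaned)
```

The resulting list can then be passed as the source URIs of the load request. If the job was created from a wildcard, listing the files explicitly and filtering them this way is one option for keeping the marker file out of the request.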