Error importing a gzipped tab-delimited file into BigQuery
The output I get is:
root@a20c6fbdf9b5:/opt/batch/jobs# bq show -j bqjob_r5720e2f2267a5a5b_0000014d09571f27_1
Job infra-bedrock-861:bqjob_r5720e2f2267a5a5b_0000014d09571f27_1
Job Type State Start Time Duration Bytes Processed
---------- --------- ----------------- ---------- -----------------
load FAILURE 30 Apr 08:00:44 0:02:05
Errors encountered during job execution. Bad character (ASCII 0) encountered: field starts with: <H:|\ufc0f\ufffd(>
Failure details:
- File: 1 / Line:1 / Field:1: Bad character (ASCII 0) encountered:
field starts with: <\ufff>
- File: 1 / Line:3 / Field:1: Bad character (ASCII 0) encountered:
field starts with: <\u0475\ufffd=\ufffd\ufffd\u03d6>
- File: 1 / Line:4 / Field:1: Bad character (ASCII 0) encountered:
field starts with: <-\ufffd\ufffdY\u049a\ufffd>
- File: 1 / Line:6 / Field:1: Bad character (ASCII 0) encountered:
field starts with: <\u018e\ufffd\ufffd\ufffd\ufffd>
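I tried downloading the file manually, decompressing it, and uploading it again. The uncompressed file imports into BigQuery without any problem.
This looks like a bug in how BigQuery handles gzipped files.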
Answer 0 (score: 1)
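Looking at the job configuration, a non-gzip file is included as the first URI, the one ending in .../20150426/_SUCCESS. BigQuery uses the first file to determine whether compression is enabled.
Assuming this file is empty, you can remove it from the load request to fix the problem. If the file does contain data, either append a ".gz" suffix to it or reorder the URIs so that it is not first in the list.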
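
As a rough illustration of the fix, here is a minimal Python sketch that cleans up a URI list before it goes into the load request. The helper name `prepare_source_uris`, its `drop_non_gzip` parameter, and the `gs://example-bucket/...` paths are all hypothetical and are not part of the BigQuery API. It also assumes that gzip files can be recognised by a ".gz" suffix.

```python
def prepare_source_uris(uris, drop_non_gzip=True):
    """Return a URI list in which a gzip file comes first.

    Hypothetical helper, not part of any BigQuery client library.
    Assumes gzip files are identified by a ".gz" suffix.
    """
    gz = [u for u in uris if u.endswith(".gz")]
    other = [u for u in uris if not u.endswith(".gz")]
    if drop_non_gzip:
        # Case 1: the non-gzip files (e.g. an empty _SUCCESS marker)
        # carry no data, so leave them out of the load request.
        return gz
    # Case 2: keep every file, but move gzip files to the front so
    # the first URI is a compressed one. Relative order is preserved.
    return gz + other


# Hypothetical bucket and object names, for illustration only.
uris = [
    "gs://example-bucket/data/_SUCCESS",
    "gs://example-bucket/data/part-00000.gz",
    "gs://example-bucket/data/part-00001.gz",
]
print(prepare_source_uris(uris))
```

The returned list would then be passed as the source URIs of the load job. Whether dropping or reordering is appropriate depends on whether the marker file actually contains data, as described above.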