Cassandra 1.2.5 - invalid UTF8 bytes

Time: 2013-07-03 15:43:57

Tags: utf-8 cassandra

I am reading and writing a large amount of data from and to a CF (column family).

After a while, I get the following error:

INFO [MemoryMeter:1] 2013-07-03 09:41:34,438 Memtable.java (line 238) CFS(Keyspace='amlear', ColumnFamily='tmp2_rpt_rptStats_popkeywrd_sp_G') liveRatio is 4.12192 (just-counted was 4.12192).  calculation took 168ms for 2048 columns
ERROR [ReadStage:706] 2013-07-03 09:41:56,187 CassandraDaemon.java (line 175) Exception in thread Thread[ReadStage:706,5,main]
java.lang.RuntimeException: org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 37464646464646464646464638333943c08074656c65666f6e6f73206170617261746f7320792061636365736f72696f73
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 37464646464646464646464638333943c08074656c65666f6e6f73206170617261746f7320792061636365736f72696f73
    at org.apache.cassandra.db.marshal.UTF8Type.getString(UTF8Type.java:54)
    at org.apache.cassandra.dht.AbstractBounds.format(AbstractBounds.java:103)
    at org.apache.cassandra.dht.AbstractBounds.getString(AbstractBounds.java:96)
    at org.apache.cassandra.db.ColumnFamilyStore.getSequentialIterator(ColumnFamilyStore.java:1387)
    at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1443)
    at org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:46)
    at org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1076)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
    ... 3 more

Note that I recently upgraded from Cassandra 1.1.4 to Cassandra 1.2.5 (I don't know whether that is related). Java version: 1.6.0_32

Does anyone know how to fix this?

1 answer:

Answer 0: (score: 1)

  

Caused by: org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 37464646464646464646464638333943c08074656c65666f6e6f73206170617261746f7320792061636365736f72696f73

There are invalid UTF-8 bytes in the middle of that value.

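Specifically, the 2-byte sequence c080 starting at the 17th byte is invalid. I'm not sure which character it is meant to be; possibly NUL, which in UTF-8 should be just 00. The lowest valid 2-byte sequence in UTF-8 is c280, which corresponds to Unicode U+0080.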
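The byte-level reasoning above can be checked with a small script. This is only a sketch in Python, not anything Cassandra itself runs: the hex string is copied from the exception message, while the helper name `find_invalid_utf8` and the "repair" step (treating c0 80 as an intended NUL) are assumptions made for illustration.

```python
# Hex payload copied from the MarshalException message in the question.
HEX_KEY = (
    "37464646464646464646464638333943c08074656c65666f6e6f73"
    "206170617261746f7320792061636365736f72696f73"
)


def find_invalid_utf8(data: bytes):
    """Hypothetical helper: return (offset, two offending bytes), or None if valid."""
    try:
        data.decode("utf-8")  # Python's UTF-8 decoder is strict by default
    except UnicodeDecodeError as err:
        return err.start, data[err.start:err.start + 2]
    return None


raw = bytes.fromhex(HEX_KEY)

# Locate the first sequence that a strict decoder rejects.
problem = find_invalid_utf8(raw)

# Assumption: c0 80 was meant to be NUL, so substitute the canonical 00 byte.
repaired = raw.replace(b"\xc0\x80", b"\x00").decode("utf-8")

# Reference point from the answer: U+0080 is the lowest code point
# whose UTF-8 encoding needs two bytes.
first_two_byte = "\u0080".encode("utf-8")

if __name__ == "__main__":
    print("first invalid sequence:", problem)
    print("repaired key:", repr(repaired))
    print("U+0080 in UTF-8:", first_two_byte.hex())
```

The offset reported by `find_invalid_utf8` is zero-based, so the "17th byte" corresponds to offset 16. One known producer of c0 80 is Java's "modified UTF-8" (used for example by `DataOutputStream.writeUTF`), which encodes U+0000 that way; whether that is what happened here cannot be determined from the question alone.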

Is the UTF-8 encoder broken?
