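HBase: filter data by column value

Date: 2014-12-29 23:29:33

Tags: hbase

I want to filter an HBase table scan by a list of values for a specific column.

Example: for the Employee table shown below, I want to fetch the records of the employees whose id is one of (123, 789).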

 ROW                   COLUMN+CELL

 row1                 column=emp:name, timestamp=1321296699190, value=TestName1
 row1                 column=emp:id, timestamp=1321296715892, value=123

 row2                 column=emp:name, timestamp=1321296699190, value=TestName2
 row2                 column=emp:id, timestamp=1321296715892, value=456

 row3                 column=emp:name, timestamp=1321296699190, value=TestName3
 row3                 column=emp:id, timestamp=1321296715892, value=789

 row4                 column=emp:name, timestamp=1321296699190, value=TestName4
 row4                 column=emp:id, timestamp=1321296715892, value=101

 row5                 column=emp:name, timestamp=1321296699190, value=TestName5
 row5                 column=emp:id, timestamp=1321296715892, value=102

I tried using SingleColumnValueFilter, but it only fetches one record from the table. My code is below. Please tell me where I am going wrong:

HTableInterface empTableObj = service.openTable("employee");
Scan scan = new Scan(startRow, endRow);            

FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);

Integer[] idArray = {123, 789};
for(int i=0;i<idArray.length;i++){
    SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes("emp"), Bytes.toBytes("id"), CompareOp.EQUAL, Bytes.toBytes(idArray[i].toString()));
    filterList.addFilter(filter);
}
scan.setFilter(filterList);
ResultScanner rs = empTableObj.getScanner(scan); 

Thanks

3 Answers:

Answer 0 (score: 1)

Try the other constructor:

SingleColumnValueFilter filter = new SingleColumnValueFilter(family, qualifier, compareOp, empBytes); 

where

compareOp = CompareFilter.CompareOp.EQUAL;

and family and qualifier are byte arrays, and empBytes is Bytes.toBytes("emp")

Or you can create 2 filters that bound the id range:

SingleColumnValueFilter filterLower = setFilterByCol(CompareOp.GREATER_OR_EQUAL,123);  
SingleColumnValueFilter filterUpper = setFilterByCol(CompareOp.LESS_OR_EQUAL,789);  

and a helper function:

private static SingleColumnValueFilter setFilterByCol(CompareOp compareOp,int emp) {


        byte[] family = "col_fam_name".getBytes();
        byte[] qualifier = "col_qualifier".getBytes();
        // Assumption: ids are stored as text, as in the question's table (value=123).
        byte[] empByte = Bytes.toBytes(String.valueOf(emp));

        SingleColumnValueFilter filter = new SingleColumnValueFilter (family,qualifier,compareOp, empByte );
        filter.setFilterIfMissing(true);
        return filter;
    }

Note that there is also SingleColumnValueExcludeFilter, which lets you exclude the column used for filtering from the scan results.
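One caveat with the two range filters above: the comparison is done on raw bytes, lexicographically, so the result depends on how the id was encoded when it was written. The following is a minimal sketch in plain Java (no HBase dependency) that illustrates this. The class and method names are made up for this illustration, and the fixed-width encoding is assumed to match the 8-byte big-endian layout that Bytes.toBytes(long) is understood to produce.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteOrderDemo {
    // Unsigned lexicographic comparison of two byte arrays, the kind of
    // ordering a binary comparison of cell values is based on.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return Integer.compare(a.length, b.length);
    }

    // Encode an id as text, as in the question's table (value=123).
    static byte[] asText(long id) {
        return Long.toString(id).getBytes(StandardCharsets.UTF_8);
    }

    // Encode an id as 8 big-endian bytes (fixed width).
    static byte[] asFixedWidth(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    public static void main(String[] args) {
        // Text encoding: "1000" sorts before "200" because '1' < '2'.
        System.out.println(compareBytes(asText(1000), asText(200)) < 0);
        // Fixed-width encoding keeps numeric order for non-negative ids.
        System.out.println(compareBytes(asFixedWidth(1000), asFixedWidth(200)) > 0);
    }
}
```

So if ids are stored as text, GREATER_OR_EQUAL / LESS_OR_EQUAL only behave numerically when all ids have the same number of digits. Also, a value encoded one way will not match an EQUAL filter built with the other encoding.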

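Answer 1 (score: 0)

Since filters are evaluated lazily, I guess you have to keep calling next() to go through all the values.

If you know there are 2 values, try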

rs.next(); // first matching row (row1, id 123)
rs.next(); // second matching row (row3, id 789)

If you are not sure how many you will get, run it in a loop.
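To make the expected outcome concrete, here is a small in-memory model of the question's scan in plain Java. It is only a sketch: the map stands in for the employee table, the OR-combined predicate models what a MUST_PASS_ONE filter list of equality filters is meant to do, and all names are hypothetical. With the real client the same idea applies: keep reading from the ResultScanner (it can be used in a for-each loop) until it is exhausted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ScanLoopDemo {
    // Hypothetical in-memory stand-in for the table: row key -> emp:id value.
    static Map<String, String> employeeIds() {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("row1", "123");
        rows.put("row2", "456");
        rows.put("row3", "789");
        rows.put("row4", "101");
        rows.put("row5", "102");
        return rows;
    }

    // OR-combination of equality checks: a value passes if it equals
    // at least one of the given ids.
    static Predicate<String> anyOf(String... ids) {
        Predicate<String> combined = value -> false;
        for (String id : ids) {
            combined = combined.or(id::equals);
        }
        return combined;
    }

    // Reads the iterator to the end instead of stopping after the first hit.
    static List<String> matchingRows(Map<String, String> table, Predicate<String> filter) {
        List<String> matches = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = table.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> row = it.next();
            if (filter.test(row.getValue())) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // prints [row1, row3]
        System.out.println(matchingRows(employeeIds(), anyOf("123", "789")));
    }
}
```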

Answer 2 (score: -1)

public void testFilterList() {  
    LOG.info("Entering testFilterList.");  

    Table table = null;  
    ResultScanner rScanner = null;  
    try {  
       table = conn.getTable(tableName);  
       Scan scan = new Scan();  
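       scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"));
       // The filtered column must also be part of the scan; otherwise the
       // filter treats it as missing.
       scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("EmpId"));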

       // Instantiate a FilterList object in which filters have "and"  
       // relationship with each other.  
       FilterList list = new FilterList(Operator.MUST_PASS_ALL);  
       // Obtain data with EmpId of greater than or equal to 200.  
       list.addFilter(new SingleColumnValueFilter(Bytes.toBytes("info"),
           Bytes.toBytes("EmpId"), CompareOp.GREATER_OR_EQUAL, Bytes.toBytes(200L)));
       // Obtain data with EmpId of less than or equal to 1000.
       list.addFilter(new SingleColumnValueFilter(Bytes.toBytes("info"),
           Bytes.toBytes("EmpId"), CompareOp.LESS_OR_EQUAL, Bytes.toBytes(1000L)));

       scan.setFilter(list);  

       // Submit a scan request.  
       rScanner = table.getScanner(scan);  
       // Print query results.  
       for (Result r = rScanner.next(); r != null; r = rScanner.next()) {  
         for (Cell cell : r.rawCells()) {  
           LOG.info(Bytes.toString(CellUtil.cloneRow(cell)) + ":"  
               + Bytes.toString(CellUtil.cloneFamily(cell)) + ","  
               + Bytes.toString(CellUtil.cloneQualifier(cell)) + ","  
               + Bytes.toString(CellUtil.cloneValue(cell)));  
         }  
       }  
       LOG.info("Filter list successfully.");  
     } catch (IOException e) {  
       LOG.error("Filter list failed ", e);  
     } finally {  
         if (rScanner != null) {  
             // Close the scanner object.  
             rScanner.close();  
           }  
       if (table != null) {  
         try {  
           // Close the HTable object.  
           table.close();  
         } catch (IOException e) {  
           LOG.error("Close table failed ", e);  
         }  
       }  
     }  
     LOG.info("Exiting testFilterList.");  
}