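Reading Data Using Scan
Function Description
To read data from a table, first obtain the Table instance for that table, then create a Scan object and set its parameters according to the query conditions. To improve query efficiency, specify StartRow and StopRow where possible. The rows returned by the query are held in a ResultScanner object. Each row is stored as a Result object, and each Result contains multiple Cells.
Sample Code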
// conn, tableName and LOG are assumed to be fields of the enclosing class.
public void testScanData() {
    LOG.info("Entering testScanData.");
    Table table = null;
    // Declare the ResultScanner so that it can be closed in the finally block.
    ResultScanner rScanner = null;
    try {
        // Get the Table instance for the target table.
        table = conn.getTable(tableName);
        // Instantiate a Scan object and restrict it to the column info:name.
        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"));
        // Set the cache size (number of rows fetched per RPC).
        scan.setCaching(1000);
        // Submit the scan request.
        rScanner = table.getScanner(scan);
        // Print the query results, one line per Cell.
        for (Result r = rScanner.next(); r != null; r = rScanner.next()) {
            for (Cell cell : r.rawCells()) {
                LOG.info(Bytes.toString(CellUtil.cloneRow(cell)) + ":"
                    + Bytes.toString(CellUtil.cloneFamily(cell)) + ","
                    + Bytes.toString(CellUtil.cloneQualifier(cell)) + ","
                    + Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
        LOG.info("Scan data successfully.");
    } catch (IOException e) {
        LOG.error("Scan data failed ", e);
    } finally {
        if (rScanner != null) {
            // Close the scanner object.
            rScanner.close();
        }
        if (table != null) {
            try {
                // Close the Table object.
                table.close();
            } catch (IOException e) {
                LOG.error("Close table failed ", e);
            }
        }
    }
    LOG.info("Exiting testScanData.");
}
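The sample above scans the whole table. When the rows of interest share a row-key prefix, the scan can be bounded instead. The following is a minimal sketch, not part of the original sample, of deriving an exclusive stop row from such a prefix. The class name PrefixScanRange, the method stopRowForPrefix and the row key "user_1001" are hypothetical, and the code uses only the JDK so that it can be run without a cluster.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrefixScanRange {
    /**
     * Returns the smallest row key that sorts after every key starting with
     * the given prefix. An empty array is returned when no such key exists
     * (empty prefix or all bytes 0xFF); an empty stop row is assumed to mean
     * "scan to the end of the table".
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        // Copy the remaining bytes and increment the last one.
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] startRow = "user_1001".getBytes(StandardCharsets.UTF_8);
        byte[] stopRow = stopRowForPrefix(startRow);
        // Prints "user_1002"
        System.out.println(new String(stopRow, StandardCharsets.UTF_8));
    }
}
```

The two arrays could then be set on the Scan, for example with withStartRow and withStopRow in HBase 2.x clients or setStartRow and setStopRow in older ones. Which of these methods is available depends on the client version in use.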
Precautions
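- Specify StartRow and StopRow for a Scan where possible. A Scan over a well-defined range performs better.
- The key parameters Batch and Caching can also be set.
  Batch: the maximum number of cells returned in each Result when next is called on the scanner. It limits the number of columns read at a time.
  Caching: the maximum number of Results returned to next by one RPC request. It determines the number of rows fetched per RPC.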
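To make the difference between Batch and Caching concrete, the following is a rough back-of-the-envelope model in plain Java. It is a simplified sketch under stated assumptions, not HBase's actual accounting. Every row is assumed to hold the same number of cells, the server-side result size limit is ignored, and the RPCs for opening and closing the scanner are not counted. The class ScanCostModel and its methods are hypothetical names.

```java
final class ScanCostModel {
    /**
     * Number of Result objects handed back by next(): with a positive batch,
     * each row is split into chunks of at most `batch` cells.
     */
    static long resultCount(long rows, int cellsPerRow, int batch) {
        if (batch <= 0) {
            return rows; // no batch limit: one Result per row
        }
        long chunksPerRow = (cellsPerRow + batch - 1) / batch; // ceiling division
        return rows * chunksPerRow;
    }

    /** Number of data-carrying RPCs: each one returns at most `caching` Results. */
    static long rpcCount(long results, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return (results + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        // 10 rows of 20 cells each, Batch = 10, Caching = 5.
        long results = resultCount(10, 20, 10);
        long rpcs = rpcCount(results, 5);
        // Prints "20 Results, 4 RPCs"
        System.out.println(results + " Results, " + rpcs + " RPCs");
    }
}
```

Under this model, a larger Caching value lowers the number of round trips at the cost of more client memory per fetch, while Batch mainly matters for very wide rows. On a real Scan object the two values are set with setBatch and setCaching.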