From the API docs dynamo db does support pagination for scan and query operations. The catch here is to set the ExclusiveStartIndex of current request to the value of the LastEvaluatedIndex of previous request to get next set (logical page) of results.

I'm trying to implement the same but I'm using DynamoDBMapper, which seems to have lot more advantages like tight coupling with data models. So if I wanted to do the above, I'm assuming I would do something like below:

// Mapping of hashkey of the last item in previous query operation
Map<String, AttributeValue> lastHashKey = .. 
DynamoDBQueryExpression expression = new DynamoDBQueryExpression();

...
expression.setExclusiveStartKey();
List<Table> nextPageResults = mapper.query(Table.class, expression);

I hope my above understanding is correct on paginating using DynamoDBMapper. Secondly, how would I know that I've reached the end of results. From the docs if I use the following api:

QueryResult result = dynamoDBClient.query((QueryRequest) request);
boolean isEndOfResults = StringUtils.isEmpty(result.getLastEvaluatedKey());

Coming back to using DynamoDBMapper, how can I know if I've reached end of results in this case.


Solution 1:

You have a couple different options with the DynamoDBMapper, depending on which way you want go.

  • query - returns a PaginatedQueryList
  • queryPage - returns a QueryResultPage
  • scan - returns a PaginatedScanList
  • scanPage - returns a ScanResultPage

The part here is understanding the difference between the methods, and what functionality their returned objects encapsulate.

I'll go over PaginatedScanList and ScanResultPage, but these methods/objects basically mirror each other.

The PaginatedScanList says the following, emphasis mine:

Implementation of the List interface that represents the results from a scan in AWS DynamoDB. Paginated results are loaded on demand when the user executes an operation that requires them. Some operations, such as size(), must fetch the entire list, but results are lazily fetched page by page when possible.

This says that results are loaded as you iterate through the list. When you get through the first page, the second page is automatically fetched with out you having to explicitly make another request. Lazy loading the results is the default method, but it can be overridden if you call the overloaded methods and supply a DynamoDBMapperConfig with a different DynamoDBMapperConfig.PaginationLoadingStrategy.

This is different from the ScanResultPage. You are given a page of results, and it is up to you to deal with the pagination yourself.

Here is quick code sample showing an example usage of both methods that I ran with a table of 5 items using DynamoDBLocal:

final DynamoDBMapper mapper = new DynamoDBMapper(client);

// Using 'PaginatedScanList'
final DynamoDBScanExpression paginatedScanListExpression = new DynamoDBScanExpression()
        .withLimit(limit);
final PaginatedScanList<MyClass> paginatedList = mapper.scan(MyClass.class, paginatedScanListExpression);
paginatedList.forEach(System.out::println);

System.out.println();
// using 'ScanResultPage'
final DynamoDBScanExpression scanPageExpression = new DynamoDBScanExpression()
        .withLimit(limit);
do {
    ScanResultPage<MyClass> scanPage = mapper.scanPage(MyClass.class, scanPageExpression);
    scanPage.getResults().forEach(System.out::println);
    System.out.println("LastEvaluatedKey=" + scanPage.getLastEvaluatedKey());
    scanPageExpression.setExclusiveStartKey(scanPage.getLastEvaluatedKey());

} while (scanPageExpression.getExclusiveStartKey() != null);

And the output:

MyClass{hash=2}
MyClass{hash=1}
MyClass{hash=3}
MyClass{hash=0}
MyClass{hash=4}

MyClass{hash=2}
MyClass{hash=1}
LastEvaluatedKey={hash={N: 1,}}
MyClass{hash=3}
MyClass{hash=0}
LastEvaluatedKey={hash={N: 0,}}
MyClass{hash=4}
LastEvaluatedKey=null