Pagination of List Resources¶
https://blueprints.launchpad.net/craton/+spec/pagination-of-resources
Craton is intended to manage large quantities of devices and other objects without sacrificing performance. Craton needs to add pagination support in order to efficiently handle queries on large collections.
Problem description¶
In the current implementation, a request to one of our collection resources will attempt to return all of the values that can be returned (based on authentication, etc.). For example, if a user and project have access to 5000 hosts, then making a GET request against /v1/hosts would return all 5000. Such large result sets can, and likely will, slow down Craton’s response times and make it unusable.
Proposed change¶
We propose adding pagination query parameters to all collection endpoints. The new parameters would assume defaults if the user does not include them.
We specifically propose that:
- Craton choose a default page size of 30 and limit it to being at least 10 items and at most 100 items,
- Craton make the next page both discoverable and calculable. In other words, responses use “link” hypermedia relations to indicate the first, previous, next, and last page URLs, which the server generates for the client,
- Craton should assume the defaults for requests that have no query parameters. For example, if someone makes a GET request to /v1/hosts, it would imply a page size of 30 and that the first 30 results should be returned.
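The limits above can be sketched as follows. This is a minimal illustration, not Craton's implementation: the function name resolve_limit and the constants are hypothetical, and the choice to clamp out-of-range values (rather than reject them with an error) is an assumption that this specification does not settle.

```python
DEFAULT_LIMIT = 30  # page size used when the request omits "limit"
MIN_LIMIT = 10      # smallest page size the proposal allows
MAX_LIMIT = 100     # largest page size the proposal allows


def resolve_limit(query):
    """Return the effective page size for a dict of raw query parameters.

    Hypothetical helper: out-of-range values are clamped, which is an
    assumption; rejecting them with a 400 response is equally plausible.
    """
    raw = query.get("limit")
    if raw is None:
        return DEFAULT_LIMIT
    limit = int(raw)  # raises ValueError for non-numeric input
    return max(MIN_LIMIT, min(MAX_LIMIT, limit))
```

Under these assumptions, a request with no parameters resolves to 30, limit=5 resolves to 10, and limit=500 resolves to 100.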
To provide pagination to users, it is suggested that we use limit and marker parameters to indicate the page size and last seen ID. This allows users to begin pagination after an item, rather than at a particular page. For example, if a user is checking for new hosts in the listing and they know the ID of the last host they encountered, they can provide marker=:id&limit=30 to get the newer hosts. If we instead used page and per_page, there’s the possibility they’d miss items, since hosts may have been deleted, changing the page number of the last host.
This implies that the default limit value would be 30 and the default marker would be null (to indicate that no last ID has been seen).
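A minimal sketch of how limit and marker could select a page, using an in-memory list instead of a database query. The paginate helper is hypothetical, and the sketch assumes that results are ordered by an integer id, which this specification does not state explicitly.

```python
def paginate(hosts, limit=30, marker=None):
    """Return up to `limit` hosts whose id comes after `marker`.

    Assumes results are ordered by an integer `id` (an assumption made
    for this sketch). A marker of None means "start from the beginning".
    """
    ordered = sorted(hosts, key=lambda host: host["id"])
    if marker is not None:
        ordered = [host for host in ordered if host["id"] > marker]
    return ordered[:limit]


# 100 illustrative hosts with ids 1..100.
hosts = [{"id": i, "name": f"host{i}"} for i in range(1, 101)]

first_page = paginate(hosts)  # defaults: limit=30, marker=None
second_page = paginate(hosts, marker=first_page[-1]["id"])
```

The client resumes from the last ID it saw, so the second request does not depend on how many items preceded that ID.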
This combination of parameters is practically the standard in OpenStack. Operators familiar with OpenStack’s existing Compute, Images, etc. APIs will be familiar with these parameters.
In addition to pagination parameters, this spec proposes adding link relations in the response body, as defined by JSON Hyper-Schema and favored by the API WG.
This makes API usage easier for everyone, including people using the API directly and people writing API wrappers such as python-cratonclient. This does, however, have the downside of affecting our response bodies and JSON Schema.
Finally, I’d like to strongly propose that we include these links in each response. Which relation types we include would depend on where in the pagination the user is, but it would do something like this:
- Include a self relation for every page that tells the user exactly what page they’re presently on.
- If there is a page prior to the current one, we would include the prev and first relations. These tell the user what the previous page is and what the first page is.
- If there is a page after the current one, we would include the next and last relations. These are the opposites to prev and first respectively.
It is worth noting that, without properly implemented caching, the last relation could become computationally expensive to calculate for every pagination query.
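The rules for choosing relation types can be sketched as below. The helper names page_url and build_links are hypothetical, and the sketch assumes the caller has already worked out the markers of the neighbouring pages; how the server computes those (in particular the marker for last) is left open here.

```python
from urllib.parse import urlencode


def page_url(base_url, limit, marker=None):
    """Build a collection URL; the marker is omitted for the first page."""
    params = {"limit": limit}
    if marker is not None:
        params["marker"] = marker
    return base_url + "?" + urlencode(params)


def build_links(base_url, limit, marker=None, has_previous=False,
                prev_marker=None, next_marker=None, last_marker=None):
    """Return the link relations for one page of results.

    Hypothetical helper. `next_marker` is None when no later page exists;
    `last_marker` should be supplied whenever `next_marker` is.
    """
    # Every page describes itself.
    links = [{"rel": "self", "href": page_url(base_url, limit, marker)}]
    if has_previous:
        # The first page never needs a marker.
        links.append({"rel": "first", "href": page_url(base_url, limit)})
        links.append(
            {"rel": "prev", "href": page_url(base_url, limit, prev_marker)})
    if next_marker is not None:
        links.append(
            {"rel": "next", "href": page_url(base_url, limit, next_marker)})
        links.append(
            {"rel": "last", "href": page_url(base_url, limit, last_marker)})
    return links
```

For the first page of a longer collection this yields only self, next, and last; a page in the middle carries all five relations.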
Alternatives¶
Alternative query parameters to limit and marker are:
- Use page and per_page parameters to indicate the 1-indexed “page number” and number of items on each page respectively. This means that users can change how many items they get on each page request and can resume in arbitrary places by specifying the page parameter.
  This would imply that the default page value would be 1 and the default per_page would be 30.
  These two parameters are presently used by a significant number of large APIs but are not common in OpenStack itself. They provide simplicity in that if the API user wants to, they can just constantly increment the page number to get the next page in the simplest way possible. They don’t have to calculate the next value from a combination of values in the response of the last request.
  This does, however, prevent users from being able to resume iteration from the last item they received in a list. Further, this adds the potential that users may miss objects due to deletions or other changes in the corresponding collection. Finally, these parameters only provide users an opaque idea as to where in a paginated resource they are and how to resume pagination.
- Use limit and offset parameters to provide similar functionality and opacity to per_page and page respectively.
  The default limit would, again, be 30 and the default offset would be 0.
  This combination of parameters is also present in a small number of OpenStack projects but has some of the same negative implications as the page and per_page parameters when compared to limit and marker.
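The risk of missed items with position-based parameters can be illustrated with a small sketch. Both helpers are hypothetical and operate on a plain list of integer host IDs; a page size of 3 is used only to keep the example short and ignores the proposed minimum of 10.

```python
def page_by_offset(ids, limit, offset):
    """Position-based paging, as with limit/offset or page/per_page."""
    return ids[offset:offset + limit]


def page_by_marker(ids, limit, marker=None):
    """Marker-based paging: everything after the last ID the client saw."""
    remaining = [i for i in ids if marker is None or i > marker]
    return remaining[:limit]


ids = list(range(1, 11))                       # hosts 1..10
seen = page_by_offset(ids, limit=3, offset=0)  # client receives hosts 1, 2, 3

ids.remove(2)  # host 2 is deleted between the two requests

# Every later host has shifted down one position, so host 4 is skipped.
by_offset = page_by_offset(ids, limit=3, offset=3)

# Resuming after the last seen ID is unaffected by the deletion.
by_marker = page_by_marker(ids, limit=3, marker=seen[-1])
```

Here by_offset starts at host 5, silently skipping host 4, while by_marker starts at host 4.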
An alternative way to provide pagination links is:
- Link headers and their relation types, as defined in RFC 5988 (RFC 6903 registers additional relation types).
These are also commonly used outside of OpenStack and were popular prior to the practice of including the relations in the response body. The benefit to Craton of using this method is that it doesn’t affect our JSON Schema or existing response bodies. A major problem with this approach is that a relation type can be repeated in a Link header, and the HTTP library used by the majority of the Python world - Requests - does not parse such links correctly. Further, widespread support for parsing these header values is not known to the author of this specification.
Data model impact¶
This should have no impact on our data model.
REST API impact¶
This specification will have two impacts on our REST API:
- It will add limit and marker query parameters that are identical across a number of existing and future endpoints.
- It will change the fundamental structure of our list responses in order to accommodate the link relations.
At the moment, for example, a GET request made to /v1/hosts has a response body that looks like:

    [
        {
            "active": true,
            "cell_id": null,
            "device_type": "Computer",
            "id": 1,
            "ip_address": "12.12.12.15",
            "name": "foo2Host",
            "note": null,
            "parent_id": null,
            "region_id": 1
        },
        {
            "active": true,
            "cell_id": null,
            "device_type": "Phone",
            "id": 2,
            "ip_address": "11.11.11.14",
            "name": "fooHost",
            "note": null,
            "parent_id": null,
            "region_id": 1
        }
    ]

This would need to transform to:

    {
        "items": [
            {
                "active": true,
                "cell_id": null,
                "device_type": "Computer",
                "id": 1,
                "ip_address": "12.12.12.15",
                "name": "foo2Host",
                "note": null,
                "parent_id": null,
                "region_id": 1
            },
            {
                "active": true,
                "cell_id": null,
                "device_type": "Phone",
                "id": 2,
                "ip_address": "11.11.11.14",
                "name": "fooHost",
                "note": null,
                "parent_id": null,
                "region_id": 1
            }
        ],
        "links": [
            {
                "rel": "first",
                "href": "https://craton.environment.com/v1/hosts?limit=30"
            },
            {
                "rel": "next",
                "href": "https://craton.environment.com/v1/hosts?limit=30&marker=2"
            },
            {
                "rel": "self",
                "href": "https://craton.environment.com/v1/hosts?limit=30&marker=1"
            }
        ]
    }
Security impact¶
Pagination support reduces the potential attack surface for denial of service attacks aimed at Craton. It alone, however, is not sufficient to prevent DoS attacks, and additional measures should be taken by deployers to further mitigate those possibilities.
Notifications impact¶
Craton does not yet have notifications.
Other end user impact¶
This will have a minor effect on python-cratonclient. The list calls it implements will need to become smarter so they can handle pagination for the user automatically.
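One way a client could hide pagination from its users is to follow the next relation until it disappears. This is only a sketch: iterate_collection is a hypothetical name rather than an existing python-cratonclient function, and fetch stands in for whatever call performs the GET request and decodes the JSON body into the items/links structure proposed above.

```python
def iterate_collection(fetch, url):
    """Yield every item in a paginated collection.

    `fetch` is a hypothetical callable that takes a URL and returns the
    decoded response body, a dict with "items" and "links" keys.
    """
    while url is not None:
        body = fetch(url)
        for item in body["items"]:
            yield item
        # Follow the "next" relation; its absence marks the final page.
        url = next(
            (link["href"] for link in body["links"] if link["rel"] == "next"),
            None,
        )


# Canned responses standing in for the API, keyed by URL.
first_url = "/v1/hosts?limit=10"
second_url = "/v1/hosts?limit=10&marker=10"
pages = {
    first_url: {
        "items": [{"id": i} for i in range(1, 11)],
        "links": [{"rel": "self", "href": first_url},
                  {"rel": "next", "href": second_url}],
    },
    second_url: {
        "items": [{"id": i} for i in range(11, 14)],
        "links": [{"rel": "self", "href": second_url}],
    },
}

all_ids = [host["id"] for host in iterate_collection(pages.__getitem__, first_url)]
```

Because the generator only requests the next page when the caller asks for more items, a user who stops early does not pay for pages they never read.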
Performance Impact¶
The pagination code itself should not add any noticeable overhead to the service, although it will be called frequently.
Other deployer impact¶
None
Developer impact¶
None
Implementation¶
Work Items¶
- Add basic pagination support with tests to ensure that functionality works independent of the other features proposed in this specification
- Add link relation support to response bodies
Dependencies¶
N/A
Testing¶
This should be tested on different levels, but at a minimum on a functional level.
Documentation Impact¶
This will impact our API reference documentation.