My iPhone application connects to my PHP web service to retrieve data from a MySQL database, a request can return up to 500 results.
What is the best way to implement paging and retrieve 20 items at a time?
Let's say I receive the first 20 entries from my database, how can I now request the next 20 entries?
The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants (except when using prepared statements). With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1): SELECT * FROM tbl LIMIT 5,10; # Retrieve rows 6-15 To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter. This statement retrieves all rows from the 96th row to the last: SELECT * FROM tbl LIMIT 95,18446744073709551615; With one argument, the value specifies the number of rows to return from the beginning of the result set: SELECT * FROM tbl LIMIT 5; # Retrieve first 5 rows In other words, LIMIT row_count is equivalent to LIMIT 0, row_count.
For 500 records efficiency is probably not an issue, but if you have millions of records then it can be advantageous to use a WHERE clause to select the next page:
SELECT *
FROM yourtable
WHERE id > 234374
ORDER BY id
LIMIT 20
The "234374" here is the id of the last record from the prevous page you viewed.
This will enable an index on id to be used to find the first record. If you use LIMIT offset, 20
you could find that it gets slower and slower as you page towards the end. As I said, it probably won't matter if you have only 200 records, but it can make a difference with larger result sets.
Another advantage of this approach is that if the data changes between the calls you won't miss records or get a repeated record. This is because adding or removing a row means that the offset of all the rows after it changes. In your case it's probably not important - I guess your pool of adverts doesn't change too often and anyway no-one would notice if they get the same ad twice in a row - but if you're looking for the "best way" then this is another thing to keep in mind when choosing which approach to use.
If you do wish to use LIMIT with an offset (and this is necessary if a user navigates directly to page 10000 instead of paging through pages one by one) then you could read this article about late row lookups to improve performance of LIMIT with a large offset.
limit 1000000, 10
and hoping that it will work won't get you anywhere.
Define OFFSET for the query. For example
page 1 - (records 01-10): offset = 0, limit=10;
page 2 - (records 11-20) offset = 10, limit =10;
and use the following query :
SELECT column FROM table LIMIT {someLimit} OFFSET {someOffset};
example for page 2:
SELECT column FROM table
LIMIT 10 OFFSET 10;
There's literature about it:
Optimized Pagination using MySQL, making the difference between counting the total amount of rows, and pagination.
Efficient Pagination Using MySQL, by Yahoo Inc. in the Percona Performance Conference 2009. The Percona MySQL team provides it also as a Youtube video: Efficient Pagination Using MySQL (video),
The main problem happens with the usage of large OFFSET
s. They avoid using OFFSET
with a variety of techniques, ranging from id
range selections in the WHERE
clause, to some kind of caching or pre-computing pages.
There are suggested solutions at Use the INDEX, Luke:
"Paging Through Results".
"Pagination done the right way".
This tutorial shows a great way to do pagination. Efficient Pagination Using MySQL
In short, avoid to use OFFSET or large LIMIT
OFFSET
but instead use ORDER BY
and put an index on the column that is used for ordering. Now we can filter/paginate with WHERE indexedColumn > lastSeenValue ORDER BY indexedColumn DESC LIMIT pageSize
. Requests to the webserver must then contain a lastSeen value.
you can also do
SELECT SQL_CALC_FOUND_ROWS * FROM tbl limit 0, 20
The row count of the select statement (without the limit) is captured in the same select statement so that you don't need to query the table size again. You get the row count using SELECT FOUND_ROWS();
*
results in more columns than necessary being fetched, and the SQL_CALC_FOUND_ROWS
results in those columns being read from all rows in the table, even though they are not included in the result. It would be a lot more efficient to calculate the number of rows in a separate query which doesn't read all those columns. Then your main query can stop after reading 20 rows.
Query 1: SELECT * FROM yourtable WHERE id > 0 ORDER BY id LIMIT 500
Query 2: SELECT * FROM tbl LIMIT 0,500;
Query 1 run faster with small or medium records, if number of records equal 5,000 or higher, the result are similar.
Result for 500 records:
Query1 take 9.9999904632568 milliseconds
Query2 take 19.999980926514 milliseconds
Result for 8,000 records:
Query1 take 129.99987602234 milliseconds
Query2 take 160.00008583069 milliseconds
id
.
id > 0
useful?
offset
(the first argument to limit is offset), you are still selecting all the data to the limit, then discarding that amount of the offset, then returning the section which is between offset
and limit
. with where
clause on the other hand, you are setting a kind of a start point for the query and query ONLY
that specific part.
Success story sharing
limit X, Y
, what essentially happens is that X+Y rows are retrieved and then X rows from the beginning are dropped and whatever left is returned. To reiterate:limit X, Y
results in scan of X+Y rows.OFFSET
;-)