MySQL Data - Best way to implement paging?


Community

The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants (except when using prepared statements).

With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1):

SELECT * FROM tbl LIMIT 5,10; # Retrieve rows 6-15

To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter. This statement retrieves all rows from the 96th row to the last:

SELECT * FROM tbl LIMIT 95,18446744073709551615;

With one argument, the value specifies the number of rows to return from the beginning of the result set:

SELECT * FROM tbl LIMIT 5; # Retrieve first 5 rows

In other words, LIMIT row_count is equivalent to LIMIT 0, row_count.
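The two forms of LIMIT described above can be checked with a small script. This is only a sketch: it assumes Python's built-in sqlite3 module as a stand-in for MySQL (SQLite accepts the same "LIMIT offset, row_count" form), and it adds an ORDER BY so the row order is defined. The table contents are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; both accept "LIMIT offset, row_count".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tbl (id) VALUES (?)", [(i,) for i in range(1, 101)])

# Offset 5 skips rows 1-5; row_count 10 then returns rows 6-15.
rows_6_to_15 = [r[0] for r in conn.execute(
    "SELECT id FROM tbl ORDER BY id LIMIT 5, 10")]

# With a single argument the value is the row count,
# so LIMIT 5 behaves like LIMIT 0, 5.
first_five = [r[0] for r in conn.execute(
    "SELECT id FROM tbl ORDER BY id LIMIT 5")]
same_five = [r[0] for r in conn.execute(
    "SELECT id FROM tbl ORDER BY id LIMIT 0, 5")]
```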

When using LIMIT for paging you should also specify an ORDER BY.

@shylent: Nothing wrong with quoting the documentation, but I agree that he should have mentioned that he was copying the docs and provided a link to the original source. Also I'm surprised that the documentation would include examples of using LIMIT without an ORDER BY... that seems like a bad practice to be encouraging. Without an ORDER BY there's no guarantee that the order will be the same between calls.

Anyway, when paginating large result sets (and that's what pagination is for: breaking large result sets into smaller chunks, right?), keep in mind that with LIMIT X, Y, what essentially happens is that X+Y rows are retrieved, the first X rows are dropped, and whatever is left is returned. To reiterate: LIMIT X, Y results in a scan of X+Y rows.
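To get a feel for what that means over a whole browsing session, here is a rough back-of-the-envelope calculation. It takes the "X+Y rows per query" model from this comment at face value and, as a further assumption, treats a seek-based query (WHERE id > last_id) as touching only the rows it returns. The function names are hypothetical and the numbers are arithmetic, not measurements.

```python
def rows_scanned_with_offset(page_count: int, page_size: int) -> int:
    """Rows touched when fetching pages 1..page_count with LIMIT offset, page_size,
    assuming each query scans offset + page_size rows."""
    total = 0
    for page in range(1, page_count + 1):
        offset = (page - 1) * page_size
        total += offset + page_size
    return total


def rows_scanned_with_seek(page_count: int, page_size: int) -> int:
    """Rows touched under the assumption that each query reads only its own page."""
    return page_count * page_size


offset_cost = rows_scanned_with_offset(page_count=100, page_size=20)
seek_cost = rows_scanned_with_seek(page_count=100, page_size=20)
```

Under these assumptions, walking through 100 pages of 20 rows touches 101,000 rows with offsets versus 2,000 with seeks.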

I don't like your LIMIT 95, 18446744073709551615 idea.. take a look at OFFSET ;-)

This is not efficient when working with large data. Check codular.com/implementing-pagination for multiple ways which are suitable for specific scenarios.


Mark Byers

For 500 records efficiency is probably not an issue, but if you have millions of records then it can be advantageous to use a WHERE clause to select the next page:

SELECT *
FROM yourtable
WHERE id > 234374
ORDER BY id
LIMIT 20

The "234374" here is the id of the last record from the previous page you viewed.

This will enable an index on id to be used to find the first record. If you use LIMIT offset, 20 you could find that it gets slower and slower as you page towards the end. As I said, it probably won't matter if you have only 200 records, but it can make a difference with larger result sets.
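A minimal sketch of how a caller might drive this query page by page. It assumes Python's sqlite3 module in place of a MySQL connector (a MySQL driver would typically use %s placeholders instead of ?), and the title column and the fetch_page_after helper are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the SQL itself is valid in both.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO yourtable (id, title) VALUES (?, ?)",
    [(i, f"ad {i}") for i in range(1, 51)],
)


def fetch_page_after(conn, last_id, page_size=20):
    """Return up to page_size rows whose id is greater than last_id."""
    return conn.execute(
        "SELECT id, title FROM yourtable WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()


pages = []
last_id = 0  # smaller than any id in this table, so the first call returns page 1
while True:
    page = fetch_page_after(conn, last_id)
    if not page:
        break
    pages.append([row[0] for row in page])
    last_id = page[-1][0]  # remember the last id seen for the next request
```

In a web application the last_id value would travel with the request (for example as a query parameter) rather than live in a loop variable.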

Another advantage of this approach is that if the data changes between the calls you won't miss records or get a repeated record. This is because adding or removing a row means that the offset of all the rows after it changes. In your case it's probably not important - I guess your pool of adverts doesn't change too often and anyway no-one would notice if they get the same ad twice in a row - but if you're looking for the "best way" then this is another thing to keep in mind when choosing which approach to use.
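The effect of a change between calls can be reproduced with a tiny table. The sketch below again assumes sqlite3 as a stand-in for MySQL and uses made-up data: a row on the first page is deleted before the second page is requested.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO yourtable (id) VALUES (?)", [(i,) for i in range(1, 11)])

PAGE_SIZE = 5
page1 = [r[0] for r in conn.execute(
    "SELECT id FROM yourtable ORDER BY id LIMIT ?", (PAGE_SIZE,))]

# A row from the page already viewed disappears before page 2 is requested.
conn.execute("DELETE FROM yourtable WHERE id = 3")

# Offset paging: every later row has shifted up by one, so id 6 is skipped.
offset_page2 = [r[0] for r in conn.execute(
    "SELECT id FROM yourtable ORDER BY id LIMIT ? OFFSET ?", (PAGE_SIZE, PAGE_SIZE))]

# Seek paging: continues right after the last id actually shown.
seek_page2 = [r[0] for r in conn.execute(
    "SELECT id FROM yourtable WHERE id > ? ORDER BY id LIMIT ?", (page1[-1], PAGE_SIZE))]
```

Here the offset query returns ids 7 to 10 and never shows id 6, while the seek query returns ids 6 to 10.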

If you do wish to use LIMIT with an offset (and this is necessary if a user navigates directly to page 10000 instead of paging through pages one by one) then you could read this article about late row lookups to improve performance of LIMIT with a large offset.
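As far as I understand the technique, a late row lookup pages over the ids alone in a derived table and only then joins back to fetch the full rows. The sketch below shows that general shape; it is not taken from the linked article, it assumes sqlite3 in place of MySQL, and the payload column and helper name are hypothetical. Any performance benefit depends on the storage engine and indexes, which this toy example does not measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO yourtable (id, payload) VALUES (?, ?)",
    [(i, "x" * 100) for i in range(1, 1001)],
)


def fetch_page_late_lookup(conn, offset, page_size):
    """Skip over ids only, then look up the full rows for the selected ids."""
    return conn.execute(
        """
        SELECT t.id, t.payload
        FROM (SELECT id FROM yourtable ORDER BY id LIMIT ? OFFSET ?) AS page
        JOIN yourtable AS t ON t.id = page.id
        ORDER BY t.id
        """,
        (page_size, offset),
    ).fetchall()


late_page = fetch_page_late_lookup(conn, offset=900, page_size=20)
late_ids = [row[0] for row in late_page]
```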

This is more like it :P While I absolutely disapprove of the implication that 'newer' ids are always larger than 'older' ones, most of the time this will indeed be the case, so I think this is 'good enough'. Anyway, yes, as you demonstrated, proper pagination (without severe performance degradation on large result sets) is not particularly trivial, and writing limit 1000000, 10 and hoping that it will work won't get you anywhere.

The late lookup link is very useful.

This pagination works backwards if you just use "DESC" for id ordering. I like it!
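For descending order the comparison has to flip as well: the next page holds ids smaller than the last one seen. A short sketch, assuming sqlite3 as a stand-in for MySQL and a hypothetical fetch_page_before helper:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO yourtable (id) VALUES (?)", [(i,) for i in range(1, 51)])


def fetch_page_before(conn, last_id=None, page_size=20):
    """Newest-first paging: return ids smaller than last_id, largest first."""
    if last_id is None:
        # First page: no boundary yet, just take the largest ids.
        cursor = conn.execute(
            "SELECT id FROM yourtable ORDER BY id DESC LIMIT ?", (page_size,))
    else:
        cursor = conn.execute(
            "SELECT id FROM yourtable WHERE id < ? ORDER BY id DESC LIMIT ?",
            (last_id, page_size))
    return [row[0] for row in cursor]


newest = fetch_page_before(conn)
older = fetch_page_before(conn, last_id=newest[-1])
```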

but how often do people want to order by ID or, by insinuation, by "date created" in the real world?

This only works if you want to order by a unique property, like the primary key. As soon as you order by something like say, date, this won't work at all.
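One workaround that is not spelled out in this thread is to append a unique column as a tie-breaker, so that the pair (sort column, id) identifies a position even when the sort column repeats. The sketch below is an illustration under that assumption; the posts table, its created column and the helper are hypothetical, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, created TEXT)")
# Several rows share a date, so created alone cannot mark where a page ended.
conn.executemany(
    "INSERT INTO posts (id, created) VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-01"), (3, "2024-01-01"),
     (4, "2024-01-02"), (5, "2024-01-02"), (6, "2024-01-03")],
)


def fetch_after(conn, last_created, last_id, page_size):
    """Rows strictly after (last_created, last_id) in (created, id) order."""
    return conn.execute(
        """
        SELECT id, created FROM posts
        WHERE created > ? OR (created = ? AND id > ?)
        ORDER BY created, id
        LIMIT ?
        """,
        (last_created, last_created, last_id, page_size),
    ).fetchall()


first_page = conn.execute(
    "SELECT id, created FROM posts ORDER BY created, id LIMIT 2").fetchall()
last_row_id, last_row_created = first_page[-1]
second_page = fetch_after(conn, last_row_created, last_row_id, 2)
third_page = fetch_after(conn, second_page[-1][1], second_page[-1][0], 2)
```

The request then has to carry both values from the last row shown, not just one.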


Danton Heuer

Define an OFFSET for the query. For example:

page 1 (records 01-10): offset = 0, limit = 10

page 2 (records 11-20): offset = 10, limit = 10

and use the following query:

SELECT column FROM table LIMIT {someLimit} OFFSET {someOffset};

example for page 2:

SELECT column FROM table
LIMIT 10 OFFSET 10;
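The offset for any page follows from the page number: offset = (page - 1) * limit. A small sketch of that mapping, assuming sqlite3 as a stand-in for MySQL; the items table and the helper names are hypothetical (the words table and column in the query above are placeholders).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 96)])


def page_to_offset(page, page_size):
    """Pages are numbered from 1, offsets from 0."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page - 1) * page_size


def fetch_page(conn, page, page_size=10):
    offset = page_to_offset(page, page_size)
    return [row[0] for row in conn.execute(
        "SELECT id FROM items ORDER BY id LIMIT ? OFFSET ?", (page_size, offset))]


second_page = fetch_page(conn, page=2)
last_page = fetch_page(conn, page=10)  # only 5 rows remain on the final page
```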

Don't you mean offset = 10 for page-2 ?

I did limit 10 offset 0 to get the first 10 results, then limit 10 offset 1 to get the second.. etc. I like this, but how can you tell the total amount of pages or offsets?


Luchostein

There's literature about it:

Optimized Pagination using MySQL, which distinguishes between counting the total number of rows and pagination.

Efficient Pagination Using MySQL, by Yahoo Inc., presented at the Percona Performance Conference 2009. The Percona MySQL team also provides it as a YouTube video: Efficient Pagination Using MySQL (video).

The main problem arises with large OFFSETs. These sources avoid OFFSET with a variety of techniques, ranging from id range selections in the WHERE clause to some kind of caching or pre-computing of pages.

There are suggested solutions at Use the INDEX, Luke:

"Paging Through Results".

"Pagination done the right way".

Getting the max id for each paging query of a complex query would be impractical for production use. Does paging based on rank, row number and a BETWEEN clause help performance?

That strategy is taken into consideration and properly evaluated in the provided links. It's not that simple at all.

The provided link only seems to cover basic pivot, unpivot, cross apply, multi-CTE or derived-table mechanics? Again, I stand by my point: rewriting queries of that magnitude just to get the max id is architectural overkill, and then there are the permutations and combinations of n columns with sort orders!

Am I misunderstanding that "Pagination done the right way" link, or is it simply impractical in any query that involves filtering?

@contactmatt I share your apprehension. In the end, it seems there's no way to implement the full requirement efficiently, only relaxed variations of it.


Bao Le

This tutorial shows a great way to do pagination. Efficient Pagination Using MySQL

In short, avoid using OFFSET or a large LIMIT.

maybe give a summary?

Yeah, I would appreciate more effort in the answer.

That's a slide show, not a tutorial. Limited usefulness.

The essence is: Don't use OFFSET but instead use ORDER BY and put an index on the column that is used for ordering. Now we can filter/paginate with WHERE indexedColumn < lastSeenValue ORDER BY indexedColumn DESC LIMIT pageSize (or, for ascending order, WHERE indexedColumn > lastSeenValue ORDER BY indexedColumn LIMIT pageSize). Requests to the webserver must then contain a lastSeen value.


HoldOffHunger

You can also do:

SELECT SQL_CALC_FOUND_ROWS * FROM tbl limit 0, 20

The row count of the select statement (without the limit) is captured in the same select statement so that you don't need to query the table size again. You get the row count using SELECT FOUND_ROWS();

This is particularly inefficient. The * results in more columns than necessary being fetched, and the SQL_CALC_FOUND_ROWS results in those columns being read from all rows in the table, even though they are not included in the result. It would be a lot more efficient to calculate the number of rows in a separate query which doesn't read all those columns. Then your main query can stop after reading 20 rows.
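The two-query alternative described in this comment might look like the sketch below. It assumes sqlite3 as a stand-in for MySQL; the title and body columns and the helper name are hypothetical. As a side note, MySQL 8.0.17 and later mark SQL_CALC_FOUND_ROWS and FOUND_ROWS() as deprecated, which points in the same direction.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO tbl (id, title, body) VALUES (?, ?, ?)",
    [(i, f"title {i}", "long text " * 50) for i in range(1, 46)],
)


def fetch_page_with_total(conn, page_size, offset):
    # Query 1: count only, so no column data has to be returned.
    (total_rows,) = conn.execute("SELECT COUNT(*) FROM tbl").fetchone()
    # Query 2: name only the columns the page needs and stop after page_size rows.
    rows = conn.execute(
        "SELECT id, title FROM tbl ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    total_pages = math.ceil(total_rows / page_size)
    return rows, total_rows, total_pages


rows, total_rows, total_pages = fetch_page_with_total(conn, page_size=20, offset=0)
```

The count also answers the "how many pages are there" question: total pages is the row count divided by the page size, rounded up.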

Are you sure? I timed the query against a large table with SQL_CALC_FOUND_ROWS and another query without it, and saw no time difference. Anyway, it is faster than doing two queries: first select * from atable limit 0, 20, and then select count(*) from atable.

Yes I'm sure - here's more info. In all cases when you are using an index to filter rows, SQL_CALC_FOUND_ROWS is significantly slower than doing 2 separate queries. On the rare occasion you are not using an index, or (as in this simplified example) you have no WHERE clause and it's a MyISAM table, it makes little difference (it's around the same speed).

Also here's a discussion about it on Stack Overflow.


Huy

Query 1: SELECT * FROM yourtable WHERE id > 0 ORDER BY id LIMIT 500

Query 2: SELECT * FROM tbl LIMIT 0,500;

Query 1 runs faster with a small or medium number of records; with 5,000 records or more, the results are similar.

Result for 500 records:

Query 1 takes 9.9999904632568 milliseconds

Query 2 takes 19.999980926514 milliseconds

Result for 8,000 records:

Query 1 takes 129.99987602234 milliseconds

Query 2 takes 160.00008583069 milliseconds

You need to put an index on id.

How is id > 0 useful?

Like Maarten said, those two queries appear fundamentally the same, and probably break down into the same machine-level commands either way. You must have an indexing problem or a really old version of MySQL.

Thanks. I hadn't seen your answer; I just needed to see the order in which WHERE, ORDER BY and LIMIT come.

The wrong example has been used. With an offset (the first argument to LIMIT is the offset), you still select all the data up to the limit, discard the rows covered by the offset, and then return the section between offset and limit. With a WHERE clause, on the other hand, you set a kind of starting point for the query and query ONLY that specific part.
