ChatGPT解决这个技术问题 Extra ChatGPT

MySQL Query GROUP BY day / month / year

Is it possible to make a simple query to count how many records I have in a determined period of time like a year, month, or day, having a TIMESTAMP field, like:

SELECT COUNT(id)
FROM stats
WHERE record_date.YEAR = 2009
GROUP BY record_date.YEAR

Or even:

SELECT COUNT(id)
FROM stats
GROUP BY record_date.YEAR, record_date.MONTH

To have a monthly statistic.

Thanks!

I guess it's supposed to be GROUP BY record_date.MONTH in your first code snippet?

c
codelogic
GROUP BY YEAR(record_date), MONTH(record_date)

Check out the date and time functions in MySQL.


You may want to add an extra column for added clarity in some cases such as where records span several years. SELECT COUNT(event_id), DATE_FORMAT(event_start, '%Y/%m')
Simple complete example: SELECT count(*), record_date FROM anytable WHERE anytable.anycolumn = 'anycondition' GROUP BY YEAR(record_date), month(record_date); note: record_date is a date type TIMESTAMP
Probably worth mentioning this didn't run on my MySQL 5.7 with a COUNT aliased column (no error, I got zero results). When I changed to select those fields with alias, I could then group by the alias. This is standard MySQL 5.7 docker image running in a local environment so I've no idea why it didn't error or return results.
Oh god, if I knew this earlier...so many lines of PHP to do something mysql can do in one line.
A
Andriy M
GROUP BY DATE_FORMAT(record_date, '%Y%m')

Note (primarily, to potential downvoters). Presently, this may not be as efficient as other suggestions. Still, I leave it as an alternative, and a one, too, that can serve in seeing how faster other solutions are. (For you can't really tell fast from slow until you see the difference.) Also, as time goes on, changes could be made to MySQL's engine with regard to optimisation so as to make this solution, at some (perhaps, not so distant) point in future, to become quite comparable in efficiency with most others.


I have a feeling that this would not perform well because a format function wouldn't be able to use an index on the date column.
@Stv: You might want to consider @fu-chi's answer then. As far as I can tell, the grouping expressions in both that answer and mine evaluate to the same thing but EXTRACT() may be more efficient than DATE_FORMAT(). (I don't have a MySQL for proper testing, though.)
I like grouping by the date and time functions year, month, etc. But I love using date_format in the select to reconstitute the date for the groups date_format(concat(year(timestamp), "-", month(timestamp), "-", day(timestamp), " ", hour(timestamp), ":00"), '%Y-%m-%d')
f
fu-chi

try this one

SELECT COUNT(id)
FROM stats
GROUP BY EXTRACT(YEAR_MONTH FROM record_date)

EXTRACT(unit FROM date) function is better as less grouping is used and the function return a number value.

Comparison condition when grouping will be faster than DATE_FORMAT function (which return a string value). Try using function|field that return non-string value for SQL comparison condition (WHERE, HAVING, ORDER BY, GROUP BY).


P
Peter O.

I tried using the 'WHERE' statement above, I thought its correct since nobody corrected it but I was wrong; after some searches I found out that this is the right formula for the WHERE statement so the code becomes like this:

SELECT COUNT(id)  
FROM stats  
WHERE YEAR(record_date) = 2009  
GROUP BY MONTH(record_date)

f
fragilewindows

If your search is over several years, and you still want to group monthly, I suggest:

version #1:

SELECT SQL_NO_CACHE YEAR(record_date), MONTH(record_date), COUNT(*)
FROM stats
GROUP BY DATE_FORMAT(record_date, '%Y%m')

version #2 (more efficient):

SELECT SQL_NO_CACHE YEAR(record_date), MONTH(record_date), COUNT(*)
FROM stats
GROUP BY YEAR(record_date)*100 + MONTH(record_date)

I compared these versions on a big table with 1,357,918 rows (), and the 2nd version appears to have better results.

version1 (average of 10 executes): 1.404 seconds version2 (average of 10 executes): 0.780 seconds

(SQL_NO_CACHE key added to prevent MySQL from CACHING to queries.)


Consider including @fu-chi's suggestion into your tests, it may prove even more efficient. Also, you tested GROUP BY YEAR(record_date)*100 + MONTH(record_date), but why not test GROUP BY YEAR(record_date), MONTH(record_date) as well?
If you use COUNT(1) insteed COUNT(*) it will be even faster, and result data are same.
What is that *100 on the versión #2? Thanks in advance.
*100 to YEAR(record_date)*100 + MONTH(record_date) == DATE_FORMAT(record_date, '%Y%m')
S
Salman A

If you want to filter records for a particular year (e.g. 2000) then optimize the WHERE clause like this:

SELECT MONTH(date_column), COUNT(*)
FROM date_table
WHERE date_column >= '2000-01-01' AND date_column < '2001-01-01'
GROUP BY MONTH(date_column)
-- average 0.016 sec.

Instead of:

WHERE YEAR(date_column) = 2000
-- average 0.132 sec.

The results were generated against a table containing 300k rows and index on date column.

As for the GROUP BY clause, I tested the three variants against the above mentioned table; here are the results:

SELECT YEAR(date_column), MONTH(date_column), COUNT(*)
FROM date_table
GROUP BY YEAR(date_column), MONTH(date_column)
-- codelogic
-- average 0.250 sec.

SELECT YEAR(date_column), MONTH(date_column), COUNT(*)
FROM date_table
GROUP BY DATE_FORMAT(date_column, '%Y%m')
-- Andriy M
-- average 0.468 sec.

SELECT YEAR(date_column), MONTH(date_column), COUNT(*)
FROM date_table
GROUP BY EXTRACT(YEAR_MONTH FROM date_column)
-- fu-chi
-- average 0.203 sec.

The last one is the winner.


H
Haijerome

If you want to group by date in MySQL then use the code below:

 SELECT COUNT(id)
 FROM stats
 GROUP BY DAYOFMONTH(record_date)

Hope this saves some time for the ones who are going to find this thread.


It's important to note that you'd also need to group by MONTH(record_date) as well to account for multiple months.
l
lreeder

You can do this simply Mysql DATE_FORMAT() function in GROUP BY. You may want to add an extra column for added clarity in some cases such as where records span several years then same month occurs in different years.Here so many option you can customize this. Please read this befor starting. Hope it should be very helpful for you. Here is sample query for your understanding

SELECT
    COUNT(id),
    DATE_FORMAT(record_date, '%Y-%m-%d') AS DAY,
    DATE_FORMAT(record_date, '%Y-%m') AS MONTH,
    DATE_FORMAT(record_date, '%Y') AS YEAR

FROM
    stats
WHERE
    YEAR = 2009
GROUP BY
    DATE_FORMAT(record_date, '%Y-%m-%d ');

Great. Working well
C
Cees Timmerman

Complete and simple solution with similarly performing yet shorter and more flexible alternative currently active:

SELECT COUNT(*) FROM stats
-- GROUP BY YEAR(record_date), MONTH(record_date), DAYOFMONTH(record_date)
GROUP BY DATE_FORMAT(record_date, '%Y-%m-%d')

k
kero

If you want to get a monthly statistics with row counts per month of each year ordered by latest month, then try this:

SELECT count(id),
      YEAR(record_date),
      MONTH(record_date) 
FROM `table` 
GROUP BY YEAR(record_date),
        MONTH(record_date) 
ORDER BY YEAR(record_date) DESC,
        MONTH(record_date) DESC

M
Minisha

The following query worked for me in Oracle Database 12c Release 12.1.0.1.0

SELECT COUNT(*)
FROM stats
GROUP BY 
extract(MONTH FROM TIMESTAMP),
extract(MONTH FROM TIMESTAMP),
extract(YEAR  FROM TIMESTAMP);

A
Arth

I prefer to optimize the one year group selection like so:

SELECT COUNT(*)
  FROM stats
 WHERE record_date >= :year 
   AND record_date <  :year + INTERVAL 1 YEAR;

This way you can just bind the year in once, e.g. '2009', with a named parameter and don't need to worry about adding '-01-01' or passing in '2010' separately.

Also, as presumably we are just counting rows and id is never NULL, I prefer COUNT(*) to COUNT(id).


N
NiKiZe

I wanted to get similar data per day, after experimenting a bit, this is the fastest I could find for my scenario

SELECT COUNT(id)
FROM stats
GROUP BY record_date DIV 1000000;

If you want to have it per month, add additional zeroes (00) I would not recommend this from "make the code readable" perspective, it might also break in different versions. But in our case this took less then half the time compared to some other more clearer queries that I tested.

This is a MySQL answer (as MySQL is tagged in the question) and well documented in the manual https://dev.mysql.com/doc/refman/8.0/en/date-and-time-type-conversion.html


Why does this work? A quirk of MySQL. While fulfilling this query, it implicitly converts 2021-03-04 05:06:07 to the number 20,210,304,050,607. (Commas added for clarity). DIV 1,000,000 of that number yields the number 20,210,304, representing the day. It's probably best to avoid this particular quirk when handling other peoples' data.
Converting dates to numeric in MySQL is documented dev.mysql.com/doc/refman/8.0/en/… "Conversion of TIME and DATETIME values to numeric form (for example, by adding +0)" so not a quirk, however it does make the query less then clear.
It's a quirk in the sense that other dbms makes and models don't work this way.
Updated answer to make it clear that this is a feature of MySQL that is well documented, and it should not be expected to work with anything else. The question has MySQL in the title and is also tagged with mysql, so it was found and answered in that context.
a
aromita sen

.... group by to_char(date, 'YYYY') --> 1989

.... group by to_char(date,'MM') -->05

.... group by to_char(date,'DD') --->23

.... group by to_char(date,'MON') --->MAY

.... group by to_char(date,'YY') --->89


This would be very very slow.
O
O. Jones

Here's one more approach. This uses [MySQL's LAST_DAY() function][1] to map each timestamp to its month. It also is capable of filtering by year with an efficient range-scan if there's an index on record_date.

  SELECT LAST_DAY(record_date) month_ending, COUNT(*) record_count
    FROM stats
   WHERE record_date >= '2000-01-01'
     AND record_date <  '2000-01-01' + INTERVAL 1 YEAR
   GROUP BY LAST_DAY(record_date) 

If you want your results by day, use DATE(record_date) instead.

If you want your results by calendar quarter, use YEAR(record_date), QUARTER(record_date).

Here's a writeup. https://www.plumislandmedia.net/mysql/sql-reporting-time-intervals/ [1]: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_last-day