
MongoDB {aggregation $match} vs {find} speed

I have a MongoDB collection with millions of documents and I'm trying to optimize my queries. I'm currently using the aggregation framework to retrieve data and group it the way I want. My typical aggregation query looks something like: $match > $group > $group > $project

However, I noticed that the later stages only take a few ms; the first stage is the slowest.

I tried to perform a query with only the $match filter, and then to perform the same query with collection.find. The aggregation query takes ~80ms while the find query takes 0 or 1ms.

I have indexes on pretty much every field, so I guess this isn't the problem. Any idea what could be going wrong? Or is it just a "normal" drawback of the aggregation framework?

I could use find queries instead of aggregation queries, but I would then have to do a lot of processing after the request, and that processing can be done quickly with $group etc., so I would rather keep the aggregation framework.

Thanks,

EDIT:

Here are my criteria:

{
    "action" : "click",
    "timestamp" : {
            "$gt" : ISODate("2015-01-01T00:00:00Z"),
            "$lt" : ISODate("2015-02-01T00:00:00Z")
    },
    "itemId" : "5"
}
Can you post your $match and find? In most usages, a $match and a find should be equivalent but I'd like to see exactly what statements you are comparing in order to make a precise answer. Also, did you run the aggregation first and then the find? What happens if you repeat the two over and over and compare the times? The difference could have been the cost of moving the results into memory from disk.
I added the criteria to the first post, however even without the timestamp criteria I see a big gap. But now I wonder if it isn't related to the fact that the find() returns a cursor and only shows the first results.
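To rule out the cursor effect mentioned in this comment, both calls can be timed only after their cursors are fully drained. The following is a minimal sketch for mongosh, where toArray() returns the documents directly; the helper names and the db.events collection are hypothetical and not taken from the thread.

```javascript
// Sketch for mongosh. `collection` is a handle such as db.events (hypothetical name).
function timeFully(run) {
  const start = Date.now();
  const docs = run(); // must return an array, i.e. a fully drained cursor
  return { ms: Date.now() - start, count: docs.length };
}

function compareFindAndMatch(collection, filter) {
  return {
    // toArray() forces every batch to be fetched, not just the first one
    find: timeFully(() => collection.find(filter).toArray()),
    aggregate: timeFully(() =>
      collection.aggregate([{ $match: filter }]).toArray()
    ),
  };
}

// Usage in mongosh (not executed here):
// compareFindAndMatch(db.events, { action: "click", itemId: "5" });
```

Running the comparison several times, in both orders, also helps separate a cold cache from a real difference between the two code paths.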
OK, I had a lot of useless indexes, so I cleaned everything up and created just one compound index (on the fields of my $match filter). Now performance is good, and find and aggregate with $match perform the same :) Problem solved.
It probably also depends heavily on the MongoDB version.
$match and find() differ in that you cannot apply a limit inside the $match stage; it has to be a separate stage, which can make it less efficient.
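A compound index over the three filter fields could look like the sketch below. The comment does not give the field order, so placing the equality fields (action, itemId) before the range field (timestamp) is an assumption here, as are the function name and the db.events collection.

```javascript
// Assumed field order: equality fields first, the range field last.
const matchIndexSpec = { action: 1, itemId: 1, timestamp: 1 };

// `collection` is a mongosh handle such as db.events (hypothetical name).
function ensureMatchIndex(collection) {
  return collection.createIndex(matchIndexSpec);
}

// Usage in mongosh (not executed here):
// ensureMatchIndex(db.events);
```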
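For reference, the pipeline counterpart of find(filter).limit(n) places the limit in its own stage right after $match. This is only a sketch of the shape; the function name is hypothetical, and how the server plans the two stages may depend on the MongoDB version.

```javascript
// Builds the pipeline shape that corresponds to collection.find(filter).limit(n).
function matchWithLimit(filter, n) {
  return [
    { $match: filter }, // same filter document as in find()
    { $limit: n },      // the limit is a separate stage
  ];
}

// Usage in mongosh (not executed here):
// db.events.aggregate(matchWithLimit({ action: "click" }, 20));
```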

vladzam

The main purpose of the aggregation framework is to make it easier to query a large number of entries and produce a small number of results that are valuable to you.

As you have said, you can also use multiple find queries, but remember that you can not create new fields with find queries. On the other hand, the $group stage allows you to define your new fields.

If you wanted to reproduce the functionality of the aggregation framework with find, you would most likely have to run an initial find (or chain several), pull that data, and manipulate it further in a programming language.

The aggregation pipeline might seem to take longer, but at least you only have to take into account the performance of one system: the MongoDB engine.

With find, by contrast, the post-processing happens in your application code, so the overall complexity and performance also depend on the intricacies of the programming language you choose.
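To make the trade-off concrete, here is a small sketch that counts clicks per item both ways: once as a $group stage that runs inside MongoDB, and once as client-side code applied to the documents a find would return. The field name clicks and the function name are hypothetical.

```javascript
// Server side: the grouping is expressed as a pipeline stage.
const groupByItem = { $group: { _id: "$itemId", clicks: { $sum: 1 } } };

// Client side: the same count computed in application code over find() results.
function countByItem(docs) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.itemId, (counts.get(doc.itemId) || 0) + 1);
  }
  return counts;
}

// Usage in mongosh (not executed here):
// db.events.aggregate([{ $match: { action: "click" } }, groupByItem]);
// countByItem(db.events.find({ action: "click" }).toArray());
```

In the client-side variant every matching document has to travel to the application first, whereas the pipeline returns only one document per group.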


Thanks for the information. However, I still don't understand why an aggregate query with only a $match filter isn't as fast as a simple find query with the same filter.
@Owumaro I have the exact same issue as the one in your comment. Did you manage to find the answer?
Since MongoDB 4.4, projections in a normal find operation support aggregation expressions and syntax. It is therefore now possible to create new fields in find queries with a projection: mongodb.com/docs/manual/reference/method/db.collection.find/…
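As an illustration of this comment, a find projection on MongoDB 4.4 or later can compute a new field with an aggregation expression. The sketch below derives a day string from the timestamp field of the question; the field name day, the function name, and the db.events collection are hypothetical.

```javascript
// Projection that keeps itemId and adds a computed `day` field.
const dayProjection = {
  itemId: 1,
  day: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } },
};

// `collection` is a mongosh handle such as db.events (hypothetical name).
function findWithComputedDay(collection, filter) {
  return collection.find(filter, dayProjection);
}

// Usage in mongosh (not executed here):
// findWithComputedDay(db.events, { action: "click", itemId: "5" });
```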
harshad

Have you tried running explain() on your find queries? It will give you a good idea of how long the find() query actually takes. You can do the same for $match by running the aggregation with the explain option, and see whether there is any difference in index usage and other parameters.

Also, the $group stage of the aggregation framework doesn't use indexes, so it has to process all the documents returned by the $match stage. To better understand how your query works, look at the result set it returns and whether it fits into memory to be processed by MongoDB.
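A minimal mongosh sketch of that comparison follows. It requests the "executionStats" verbosity for both the find and the single-stage aggregation; the function name and the db.events collection are hypothetical, and the exact layout of the returned documents differs between the two calls and between server versions.

```javascript
// `collection` is a mongosh handle such as db.events (hypothetical name).
function explainBoth(collection, filter) {
  return {
    // Explain plan for the plain find
    find: collection.find(filter).explain("executionStats"),
    // Explain plan for the aggregation with only a $match stage
    aggregate: collection
      .explain("executionStats")
      .aggregate([{ $match: filter }]),
  };
}

// Usage in mongosh (not executed here):
// const plans = explainBoth(db.events, { action: "click", itemId: "5" });
```

Comparing which index each plan selects and how many keys and documents it examines shows whether the two queries really take the same path.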
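If the documents passed to $group may not fit in memory, the aggregation can be allowed to spill to temporary files with the allowDiskUse option. This is a sketch under assumptions: the grouping key, the clicks field, the function name, and the db.events collection are illustrative, and the default memory behaviour depends on the server version.

```javascript
// `collection` is a mongosh handle such as db.events (hypothetical name).
function groupWithDiskSpill(collection, filter) {
  const pipeline = [
    { $match: filter }, // can use an index
    { $group: { _id: "$itemId", clicks: { $sum: 1 } } }, // works on $match output
  ];
  // allowDiskUse lets memory-hungry stages write temporary files
  return collection.aggregate(pipeline, { allowDiskUse: true });
}

// Usage in mongosh (not executed here):
// groupWithDiskSpill(db.events, { action: "click" });
```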


Ghazanfar Ali

If you are concerned with performance, then an aggregation is undoubtedly more time-consuming than a find. When you are fetching records on multiple conditions, with lookups, grouping, and a limited (paginated) number of records, aggregate is the best approach. A find query, on the other hand, is fast when you have to fetch a very large data set. If you have some population and projection and no pagination, I suggest using a find query, which is fast.