
MySQL vs MongoDB 1000 reads

I have been very excited about MongoDB and have been testing it lately. I had a table called posts in MySQL with about 20 million records, indexed only on a field called 'id'.

I wanted to compare speed with MongoDB, so I ran a test which would get and print 15 records randomly from our huge databases. I ran the query about 1,000 times each for MySQL and MongoDB, and I am surprised that I do not notice a lot of difference in speed. Maybe MongoDB is 1.1 times faster. That's very disappointing. Is there something I am doing wrong? I know that my tests are not perfect, but is MySQL on par with MongoDB when it comes to read-intensive chores?

Note:

I have a dual-core (2 threads per core) i7 CPU and 4 GB of RAM

I have 20 partitions on MySQL, each of 1 million records

Sample Code Used For Testing MongoDB

<?php
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
$time_taken = 0;
$tries = 100;
// connect
$time_start = microtime_float();

for($i=1;$i<=$tries;$i++)
{
    $m = new Mongo();
    $db = $m->swalif;
    $cursor = $db->posts->find(array('id' => array('$in' => get_15_random_numbers())));
    foreach ($cursor as $obj)
    {
        //echo $obj["thread_title"] . "<br><Br>";
    }
}

$time_end = microtime_float();
$time_taken = $time_taken + ($time_end - $time_start);
echo $time_taken;

function get_15_random_numbers()
{
    $numbers = array();
    for($i=1;$i<=15;$i++)
    {
        $numbers[] = mt_rand(1, 20000000);
    }
    return $numbers;
}

?>

Sample Code For Testing MySQL

<?php
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
$BASE_PATH = "../src/";
include_once($BASE_PATH  . "classes/forumdb.php");

$time_taken = 0;
$tries = 100;
$time_start = microtime_float();
for($i=1;$i<=$tries;$i++)
{
    $db = new AQLDatabase();
    $sql = "select * from posts_really_big where id in (".implode(',',get_15_random_numbers()).")";
    $result = $db->executeSQL($sql);
    while ($row = mysql_fetch_array($result) )
    {
        //echo $row["thread_title"] . "<br><Br>";
    }
}
$time_end = microtime_float();
$time_taken = $time_taken + ($time_end - $time_start);
echo $time_taken;

function get_15_random_numbers()
{
    $numbers = array();
    for($i=1;$i<=15;$i++)
    {
        $numbers[] = mt_rand(1, 20000000);
    }
    return $numbers;
}
?>
I'm not a DBA so this is a comment not an answer, but speed should not be the main consideration when choosing between MySQL and MongoDB. Things like schemaless vs. schema (i.e. how often does your data schema need to change) and scaling in size (i.e. how easy is it to shard your data so that a typical read requires data from only one server) are more important for a choice like this.
How can it be faster at reading? It reads from a mechanical device, same as MySQL. It depends on the speed of the device itself; you can't employ some weird magic via code in order to break through the limits of the hardware.
@ImranOmarBukhsh, indeedy. My perspective came from not recommending a change for change's sake - and suggesting a way that you can improve performance with your existing technology :)
This question just reminds me of this: mongodb-is-web-scale.com
People are mistaken when they feel they must go with one or the other. You will need both a microwave and an oven in your kitchen; you cannot say you will only use one or the other. The use cases for the two systems are different. If you need ACID for part of your app, then use an RDBMS; if you don't care about consistency and constraints, and your entities can be stored all in one (collections), then use MongoDB. You will end up using a hybrid system; the key point is deciding what to store where.

Sean Reilly

MongoDB is not magically faster. If you store the same data, organised in basically the same fashion, and access it exactly the same way, then you really shouldn't expect your results to be wildly different. After all, MySQL and MongoDB are both GPL, so if Mongo had some magically better IO code in it, then the MySQL team could just incorporate it into their codebase.

People see real-world MongoDB performance gains largely because MongoDB allows you to query in a different manner that is more sensible for your workload.

For example, consider a design that persisted a lot of information about a complicated entity in a normalised fashion. This could easily use dozens of tables in MySQL (or any relational db) to store the data in normal form, with many indexes needed to ensure relational integrity between tables.

Now consider the same design with a document store. If all of those related tables are subordinate to the main table (and they often are), then you might be able to model the data such that the entire entity is stored in a single document. In MongoDB you can store this as a single document, in a single collection. This is where MongoDB starts enabling superior performance.

In MongoDB, to retrieve the whole entity, you have to perform:

One index lookup on the collection (assuming the entity is fetched by id)

Retrieve the contents of one database page (the actual binary json document)

So a b-tree lookup, and a binary page read. Log(n) + 1 IOs. If the indexes can reside entirely in memory, then 1 IO.

In MySQL with 20 tables, you have to perform:

One index lookup on the root table (again, assuming the entity is fetched by id)

With a clustered index, we can assume that the values for the root row are in the index

20+ range lookups (hopefully on an index) for the entity's pk value

These probably aren't clustered indexes, so the same 20+ data lookups once we figure out what the appropriate child rows are.

So the total for mysql, even assuming that all indexes are in memory (which is harder since there are 20 times more of them) is about 20 range lookups.

These range lookups are likely comprised of random IO — different tables will definitely reside in different spots on disk, and it's possible that different rows in the same range in the same table for an entity might not be contiguous (depending on how the entity has been updated, etc).

So for this example, the final tally is about 20 times more IO with MySQL per logical access, compared to MongoDB.

This is how MongoDB can boost performance in some use cases.
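The IO arithmetic above can be sketched with a toy model (hypothetical post/comments/tags data, not the asker's schema): the normalized layout needs one lookup per table to assemble the entity, while the document layout returns everything in a single lookup.

```python
# Toy model of the two layouts. All names here are hypothetical.

# Normalized, relational-style: the entity is split across several
# "tables" (dicts keyed by post id), so assembling it takes one
# lookup per table, like the 20+ range lookups described above.
posts = {42: {"id": 42, "title": "MySQL vs MongoDB"}}
comments = {42: [{"author": "a", "body": "first"}]}
tags = {42: ["databases", "benchmarks"]}

def load_entity_relational(post_id):
    lookups = 0
    entity = dict(posts[post_id]); lookups += 1           # root table
    entity["comments"] = comments[post_id]; lookups += 1  # child table 1
    entity["tags"] = tags[post_id]; lookups += 1          # child table 2
    return entity, lookups

# Document-style: the whole entity is embedded in one record, so a
# single index lookup (plus one page read) returns everything.
documents = {42: {"id": 42, "title": "MySQL vs MongoDB",
                  "comments": [{"author": "a", "body": "first"}],
                  "tags": ["databases", "benchmarks"]}}

def load_entity_document(post_id):
    return documents[post_id], 1

rel, rel_lookups = load_entity_relational(42)
doc, doc_lookups = load_entity_document(42)
assert rel == doc  # same logical entity either way
print(rel_lookups, doc_lookups)  # 3 1
```

With 20 child tables instead of two, the gap is the roughly 20x IO difference the answer describes.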


What if we just put one main table in MySQL?
@ariso: This is optimisation by denormalisation. It can provide a performance boost. However, if you do this, then you're throwing away your clean design, and all of the power (not to mention most of the features) of a relational database. And it only really works until you hit the column limit.
@SeanReilly Your example with entities (it should be edited to say objects; there is no entity-oriented programming :) ) is invalid. Like ariso said, you could serialize an object, store it in the DB, and deserialize it when needed (any form of serialisation). The true power of persisting objects is held in OODBMSs, not document DB systems. But I agree that each has its own purpose and strengths (though your example obfuscates the vision and relevance of this topic).
A 20-join query is, I would say, most likely not the best query on the best possible database schema.
@SeanReilly I found your example very helpful. You could build a special interface to MySQL that automatically serializes and deserializes objects to tables and behaves the way mongodb does. But then, why not just use something specifically designed to be used that way? Also your use of "entity" makes sense. The point is that you're organizing the data as documents rather than as fields in a table. Whether or not the document is an object composed in an OO language is irrelevant to the example.
theAndroid

Do you have concurrency, i.e. simultaneous users? If you just run the query 1,000 times straight, with just one thread, there will be almost no difference. That's too easy for these engines :)

BUT I strongly suggest that you build a true load-testing session, which means using an injector such as JMeter with 10, 20 or 50 users AT THE SAME TIME so you can really see a difference (try to embed this code inside a web page that JMeter could query).

I just did it today on a single server (and a simple collection/table) and the results are quite interesting and surprising (MongoDB was really faster on writes & reads, compared to the MyISAM and InnoDB engines).

This really should be part of your test: concurrency & the MySQL engine. Then data/schema design & application needs are of course huge requirements, beyond response times. Let me know when you get results; I'm also in need of input about this!
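The single-thread-vs-concurrency point can be sketched like this (a stand-in sleep instead of a real query; user counts are arbitrary): the same total number of requests completes much sooner when issued concurrently, and concurrency is the regime where engines actually start to differ.

```python
import threading
import time

def fake_query():
    # Stand-in for the real SELECT/find; the sleep simulates server work.
    time.sleep(0.01)

def run_load_test(concurrent_users, requests_per_user):
    """Return wall-clock time for all users to finish their requests."""
    def worker():
        for _ in range(requests_per_user):
            fake_query()
    threads = [threading.Thread(target=worker) for _ in range(concurrent_users)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start

# 20 requests from one thread serialize; 20 concurrent users overlap.
single = run_load_test(1, 20)
parallel = run_load_test(20, 1)
print(single > parallel)  # True
```

A real injector like JMeter does the same thing against an HTTP endpoint, with ramp-up, percentiles, and error reporting on top.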


Can you share your results?
Yeah, results on that would be very helpful.
Surely this would just scale... if it was apples to apples, like they have been saying in the rest of this topic. So if on average it performs at x, now simulate from multiple sources; please explain why Mongo would now be faster. I.e., let's say for agreement's sake that MySQL was on average faster for a single request... why would Mongo become faster for multiple? I don't find this very scientific. I'm saying the test is valid... but I'm not so sure how the difference would be huge if you were comparing apples to apples, as the rest of the topic explains.
Aritz

Source: https://github.com/webcaetano/mongo-mysql

10 rows

mysql insert: 1702ms
mysql select: 11ms

mongo insert: 47ms
mongo select: 12ms

100 rows

mysql insert: 8171ms
mysql select: 10ms

mongo insert: 167ms
mongo select: 60ms

1000 rows

mysql insert: 94813ms (1.58 minutes)
mysql select: 13ms

mongo insert: 1013ms
mongo select: 677ms

10,000 rows

mysql insert: 924695ms (15.41 minutes)
mysql select: 144ms

mongo insert: 9956ms (9.95 seconds)
mongo select: 4539ms (4.539 seconds)

15 min to insert 10,000 rows? That's a very anemic MySQL database. In my experience, if such an operation approaches 1s in duration, my phone lights up with complaints. :)
A few points: 1) MySQL needs to be optimized and configured properly; there are a lot of different ways to insert big amounts of data, and done properly it can take 0.1% of the 15 minutes (see this page for example). 2) MongoDB doesn't write the data to disk straight away, which is why it "looks" faster, but if your computer crashes, the data is lost. 3) Reading is much faster in MySQL.
15 min for 10,000 rows? Did you type each row by hand? =))))
For 10,000 rows: drop the indexes, insert, then re-index.
Anyone who believes a claim that it takes 1.7 seconds to insert ten rows into MySQL deserves the pain they get from Mongo.
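The batched-insert advice from these comments can be sketched as follows (hypothetical posts table and columns): one multi-row INSERT replaces many single-row statements, which is the main reason properly configured MySQL bulk loads are nowhere near 15 minutes.

```python
rows = [(1, "first post"), (2, "second post"), (3, "third post")]

# Row-by-row: one statement (and one client/server round trip) per row.
per_row = ["INSERT INTO posts (id, title) VALUES (%d, '%s');" % r
           for r in rows]

# Batched: a single multi-row statement. With a real driver you would
# use parameterized queries (e.g. cursor.executemany) rather than
# building SQL strings like this.
values = ", ".join("(%d, '%s')" % r for r in rows)
batched = "INSERT INTO posts (id, title) VALUES %s;" % values

print(len(per_row))   # number of statements the naive approach issues
print(batched)        # the single batched statement
```

Wrapping many inserts in one transaction (or using LOAD DATA INFILE) has a similar effect: the per-statement and per-commit overhead, not the row data, dominates naive benchmarks.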
reoxey

https://github.com/reoxey/benchmark

benchmark

Speed comparison of MySQL & MongoDB in Go 1.6 & PHP 5.

System used for the benchmark: Dell, 4th-gen i5 CPU @ 1.70GHz x 4, 4GB RAM, 2GB GPU RAM.

Speed comparison of RDBMS vs NoSQL for INSERT, SELECT, UPDATE and DELETE, executing different numbers of rows: 10, 100, 1000, 10000, 100000, 1000000.

Languages used: PHP 5 & Go 1.6.

________________________________________________
GOLANG with MySQL (engine = MyISAM)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            INSERT
------------------------------------------------
num of rows             time taken
------------------------------------------------
10                      1.195444ms
100                     6.075053ms
1000                    47.439699ms
10000                   483.999809ms
100000                  4.707089053s
1000000                 49.067407174s


            SELECT
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 872.709µs


        SELECT & DISPLAY
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 20.717354746s


            UPDATE
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 2.309209968s
100000                  257.411502ms
10000                   26.73954ms
1000                    3.483926ms
100                     915.17µs
10                      650.166µs


            DELETE
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 6.065949ms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


________________________________________________
GOLANG with MongoDB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            INSERT
------------------------------------------------
num of rows             time taken
------------------------------------------------
10                      2.067094ms
100                     8.841597ms
1000                    106.491732ms
10000                   998.225023ms
100000                  8.98172825s
1000000                 1m 29.63203158s


            SELECT
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 5.251337439s


        FIND & DISPLAY (with index declared)
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 21.540603252s


            UPDATE
------------------------------------------------
num of rows             time taken
------------------------------------------------
1                       1.330954ms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

________________________________________________
PHP5 with MySQL (engine = MyISAM)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            INSERT
------------------------------------------------
num of rows             time taken
------------------------------------------------
 10                     0.0040680000000001s
 100                    0.011595s
 1000                   0.049718s
 10000                  0.457164s
 100000                 4s
 1000000                42s


            SELECT
------------------------------------------------
num of rows             time taken
------------------------------------------------
 1000000                <1s


            SELECT & DISPLAY
------------------------------------------------
num of rows             time taken
------------------------------------------------
  1000000               20s
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

________________________________________________
PHP5 with MongoDB 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            INSERT
------------------------------------------------
num of rows             time taken
------------------------------------------------
10                      0.065744s
100                     0.190966s
1000                    0.2163s
10000                   1s
100000                  8s
1000000                 78s


            FIND
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 <1s


            FIND & DISPLAY
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 7s


            UPDATE
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 9s

MyISAM isn't InnoDB; also, which MongoDB version and storage engine?
It's important to specify the MySQL and MongoDB versions.
Don't use MyISAM. Do use batched inserts!
MySQL is faster than MongoDB at insert queries?! That seems unlikely, given that MySQL needs to prepare columns and relationships. MySQL selects are faster than MongoDB selects, but for insert queries, Mongo is faster.
Gabe Rainbow

Man... the answer is that you're basically testing PHP and not a database.

Don't bother iterating the results; whether the echo is commented out or not, it costs a chunk of time:

   foreach ($cursor as $obj)
    {
        //echo $obj["thread_title"] . "<br><Br>";
    }

while another chunk is spent churning out a bunch of random numbers:

function get_15_random_numbers()
{
    $numbers = array();
    for($i=1;$i<=15;$i++)
    {
        $numbers[] = mt_rand(1, 20000000) ;

    }
    return $numbers;
}

Then there's a major difference between implode() and $in.

And finally, what is going on here? It looks like a connection is being created each time, so you're testing the connection time plus the query time:

$m = new Mongo();

vs

$db = new AQLDatabase();

So your "1.1 times faster" might turn out to be 1000% faster for the underlying query, stripped of the jazz.

urghhh.
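A sketch of the benchmark hygiene this answer is asking for (stand-in connect/query functions with artificial sleeps, not the real drivers): connect once, pre-generate the random IDs, and start the timer only around the queries.

```python
import random
import time

def connect():
    time.sleep(0.005)  # stand-in for connection setup cost
    return object()

def run_query(conn, ids):
    time.sleep(0.001)  # stand-in for the actual SELECT/find
    return ids

TRIES = 50

# Flawed version: connection created (and IDs generated) inside the
# timed loop, like the original benchmark.
start = time.time()
for _ in range(TRIES):
    conn = connect()
    run_query(conn, [random.randint(1, 20_000_000) for _ in range(15)])
flawed = time.time() - start

# Cleaner version: connect once, pre-generate the IDs, time queries only.
conn = connect()
batches = [[random.randint(1, 20_000_000) for _ in range(15)]
           for _ in range(TRIES)]
start = time.time()
for ids in batches:
    run_query(conn, ids)
clean = time.time() - start

print(flawed > clean)  # True: setup cost dominated the "query" timing
```

The same restructuring applies directly to the PHP code in the question: hoist new Mongo() / new AQLDatabase() and get_15_random_numbers() out of the timed loop.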


Naturally, coding practice can make a big difference in any situation, but this isn't specific to any type of language, API, or extension. Generating the random numbers before starting the timer will make a difference, but the majority of the time within the process is no doubt spent on the database transactions. Random number generation is trivial; SQL and NoSQL databases are not.
Don't pick on the random numbers; clearly you missed the connection being created each time. All the issues add up to testing something other than what was intended.
Nope, didn't miss it. MySQL won't close the connection until the script finishes unless mysqli_close() is called. Otherwise, repeat calls to mysqli_connect() will only pull the existing mysql resource from the current resource table, rather than committing to a new connection procedure. I'm not exactly sure what the AQLDatabase object is, but if it uses the mysql lib (which it likely does) it will have the same behavior. The MongoDB extension uses connection pooling, so the same basic thing occurs when creating a mongodb 'connection' more than once in a script.
I agree that his benchmark could have been done differently, but it reflects the same basic results as other MySQL vs Mongo benches that I've seen. Mongo is typically faster when inserting (much faster for more simple inserts) and MySQL is typically faster when selecting.
Admittedly, I was too surly; it was that HTML string concat of "<br><Br>" that really 'urghed' me out. You don't need pretty printing in tests. Even iterating it seems like a PHP test and not a database test. Overall, that AQLDatabase 'possibly/maybe' moment... more ingredients means more unknowns.
Jason Hitchings

Here is a little research that explored RDBMS vs NoSQL using MySQL vs Mongo; the conclusions were in line with @Sean Reilly's response. In short, the benefit comes from the design, not from some raw speed difference. Conclusion on pages 35-36:

RDBMS vs NoSQL: Performance and Scaling Comparison

The project tested, analysed and compared the performance and scalability of the two database types. The experiments done included running different numbers and types of queries, some more complex than others, in order to analyse how the databases scaled with increased load. The most important factor in this case was the query type used, as MongoDB could handle more complex queries faster due mainly to its simpler schema at the sacrifice of data duplication, meaning that a NoSQL database may contain large amounts of data duplicates. Although a schema directly migrated from the RDBMS could be used, this would eliminate the advantage of MongoDB’s underlying data representation of subdocuments, which allowed the use of fewer queries towards the database as tables were combined. Despite the performance gain which MongoDB had over MySQL in these complex queries, when the benchmark modelled the MySQL query similarly to the MongoDB complex query by using nested SELECTs, MySQL performed best, although at higher numbers of connections the two behaved similarly. The last type of query benchmarked, which was the complex query containing two JOINs and a subquery, showed the advantage MongoDB has over MySQL due to its use of subdocuments. This advantage comes at the cost of data duplication, which causes an increase in the database size. If such queries are typical in an application then it is important to consider NoSQL databases as alternatives, while taking into account the cost in storage and memory size resulting from the larger database size.


user1034912

Honestly, even if MongoDB is slower, MongoDB definitely makes you code faster... no need to worry about silly table columns or row/entity migrations...

With MongoDB, you just instantiate a class and save!


Disagree; you must choose the most suitable technology, not the one that's easiest to code against. After all, you could save everything into a text file and then there'd be no need to worry about the database at all :)
Disagreed. That's a no-brainer move. MongoDB cannot perform transactions, whereas MySQL can. That saves me a lot of coding; in Mongo I'd have to write a function block in the application to do the same. So you really need to consider the design of each system very carefully. Both have their pros and cons.
Coding faster is also debatable; it's subjective. Some people prefer ORMs and micro-ORMs to manage model changes and/or migrations automatically, which means the developer only manages the schema in one place (the database), just the same as they otherwise would in the application layer.
Fawaz Ahmed

From the MongoDB official site:

Observing some of the high-level query behaviors of the two systems, we can see that MySQL is faster at selecting a large number of records, while MongoDB is significantly faster at inserting or updating a large number of records.

Reference


zhuomin chen

On a single server, MongoDB would not be any faster than MySQL MyISAM for either reads or writes, given small table/document sizes of 1 GB to 20 GB. MongoDB will be faster at parallel reduce on multi-node clusters, where MySQL cannot scale horizontally.


Can you provide some evidence or more detail to back that up?
Cannot scale horizontally? What about NDB? DRBD backed MySQL?
This is not true. MongoDB has a 16MB document limit. MySQL can store much more per row if you like.