ChatGPT解决这个技术问题 Extra ChatGPT

mysql - how many columns is too many?

I'm setting up a table that might have upwards of 70 columns. I'm now thinking about splitting it up as some of the data in the columns won't be needed every time the table is accessed. Then again, if I do this I'm left with having to use joins.

At what point, if any, is it considered too many columns?

We don't have to use SELECT * all the time. We always have the option to select just the columns we need for a given situation.
70 columns?! How many of those can't be null?
The big question is... are you normalizing your tables? 70 is an unusual amount unless you are deliberately denormalizing for performance (very few things have 70 unique attributes). If you are denormalizing for the sake of performance then I would agree with ChssPly76 that you can use whatever the database will let you get away with.
@KM. is that supposed to be a joke? I'm new to MySQL and cannot get it, did you mean JOIN is a good thing or something to try and avoid?
As much as joins are a core part of SQL, joining for the sake of joining will probably degrade performance and maintainability for whatever application you have.

b
bubakazouba

It's considered too many once it's above the maximum limit supported by the database.

The fact that you don't need every column to be returned by every query is perfectly normal; that's why SELECT statement lets you explicitly name the columns you need.

As a general rule, your table structure should reflect your domain model; if you really do have 70 (100, what have you) attributes that belong to the same entity there's no reason to separate them into multiple tables.


@ChssPly76, it is a relational database not an object model. there are tables, rows and columns, work within that constraint if you want max performance, mimic your objects for convenience at the sake of performance. So should every piece of information about a person be stored within the same row? no, break them out and group them into different tables (using my example form my previous comment): "Person", "Activities" "HealthRecords". Storing a SUM for performance reasons is a completely different issue than keeping all data in 70 columns to avoid joins.
Should "numberOfTeethPulled" be a part of Person record? No, it probably shouldn't be stored at all - you'll get that info from "ToothExtractionRecord" if your domain model requires such level of detail. But that's YOUR (and, dare I say, rather contrived) example - it has nothing to do with my point: large number of columns in a table does NOT mean table is denormalized. Think real estate contracts / purchase orders / other financial documents just to name a few examples. Can they be further split up into multiple tables? Yes. Any reason to do so? Not really.
+1, that was hilarious. If you are creating another table, and it's just going to be a 1:1 relationship you should probably just include it in the main table. It's not going to save space, It's not going to perform that much better if you don't request the data vs it not being in the table at all. The only legit reason that comes to mind for me right now, is if there is sensitive information in there such as SSN, credit card info, etc...
If I have one table have 15 cols, and another one have 300 cols, the primary key of the two tables are the same. Select one column in the two tables, will the performance significantly differ?
j
jonstjohn

There are some benefits to splitting up the table into several with fewer columns, which is also called Vertical Partitioning. Here are a few:

If you have tables with many rows, modifying the indexes can take a very long time, as MySQL needs to rebuild all of the indexes in the table. Having the indexes split over several table could make that faster. Depending on your queries and column types, MySQL could be writing temporary tables (used in more complex select queries) to disk. This is bad, as disk i/o can be a big bottle-neck. This occurs if you have binary data (text or blob) in the query. Wider table can lead to slower query performance.

Don't prematurely optimize, but in some cases, you can get improvements from narrower tables.


Why does MySQL needs to rebuild all of the indexes in the table if only single one is modified?
I was wondering the same . Why MySQL rebuild all of the indexes in the table ? Is the above mentioned statement correct ?
J
JohnFx

It is too many when it violates the rules of normalization. It is pretty hard to get that many columns if you are normalizing your database. Design your database to model the problem, not around any artificial rules or ideas about optimizing for a specific db platform.

Apply the following rules to the wide table and you will likely have far fewer columns in a single table.

No repeating elements or groups of elements No partial dependencies on a concatenated key No dependencies on non-key attributes

Here is a link to help you along.


It is pretty hard to get that many columns if you are normalizing your database. Not as hard as it seems.
Definitely not that hard. People don't seem to really understand normal forms around these here parts. You can have 10000 columns and STILL be normalized (even to the highest normal form).
@foljs And that is exactly where the accepted practice of denormalization comes in. If you're at an intersection and a car is about to drive into you, it would be stupid to wait for the light to turn green. You have to get out of the way. While going through the red light might not technically be legal, you're doing what you should obviously do given the situation = denormalization
You lost me when you started talking about cars. No idea what the relevance is.
However, how do you do complex queries in this scenario with single data table, you can’t, you have to heavily rely programming language and variety of other stuff to make this work! So, I might as well go back to having a table with 170 columns, because having "JOIN" queries and extra complex programming that's require to making separate tables work seems to me like a waste of time. I guess I am a big fan of KISS principle.
W
Willie Cheng

That's not a problem unless all attributes belong to the same entity and do not depend on each other. To make life easier you can have one text column with JSON array stored in it. Obviously, if you don't have a problem with getting all the attributes every time. Although this would entirely defeat the purpose of storing it in an RDBMS and would greatly complicate every database transaction. So its not recommended approach to be followed throughout the database.


T
Today

Having too many columns in the same table can cause huge problems in the replication as well. You should know that the changes that happen in the master will replicate to the slave.. for example, if you update one field in the table, the whole row will be w