
What is the difference between char, nchar, varchar, and nvarchar in SQL Server?

What is meant by nvarchar?

shA.t

Just to clear up... or sum up...

nchar and nvarchar can store Unicode characters.

char and varchar cannot store Unicode characters.

char and nchar are fixed-length, which means they reserve storage space for the number of characters you specify even if you don't use all of that space.

varchar and nvarchar are variable-length, which means they only use storage for the characters you actually store. They do not reserve storage the way char and nchar do.

nchar and nvarchar will take up twice as much storage space, so it may be wise to use them only if you need Unicode support.
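To make the Unicode point concrete, here is a minimal sketch (assuming a database whose default collation uses a typical single-byte Latin code page; the exact replacement behaviour depends on your collation):

DECLARE @v  VARCHAR(20)  = N'Ω漢字',
        @nv NVARCHAR(20) = N'Ω漢字';

SELECT @v  AS VarcharValue,              -- '???' : the non-Latin characters are lost
       @nv AS NvarcharValue,             -- 'Ω漢字' : preserved
       DATALENGTH(@v)  AS VarcharBytes,  -- 3  (one byte per stored character)
       DATALENGTH(@nv) AS NvarcharBytes; -- 6  (two bytes per character)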


char and varchar aren't meant to store Unicode, but with some additional coding tricks and extra logic, you can still misuse a [var]char field for Unicode storage.
It is collation-dependent whether or not the n... versions take up twice as much storage space, as my answer shows.
What's the advantage to reserving storage?
On the last point: using Unicode nchar and nvarchar is still better in most cases: better collation support, flexibility for the users, and fewer future compatibility issues. And by the way, storage space is not an issue for this case, as working around a non-Unicode collation is a lot of hassle, and storage prices will continue to decrease in future.
@BenCaine char(20) will use 20 bytes (assuming an 8-bit collation); varchar(20) will use len(data)+2 bytes, i.e. 22 for 20 bytes of data, but only 12 for 10 bytes of data. The extra two bytes are the length record. If your data will always be the full length, then use a char, as it saves space and may be faster. Please don't ever use a varchar(1), or indeed anything smaller than a varchar(4): a single character in varchar format uses three bytes, so a char(3) will never use more space than a varchar(3).
Community

All the answers so far indicate that varchar is single byte, nvarchar is double byte. The first part of this actually depends on collation as illustrated below.

DECLARE @T TABLE
(
    C1 VARCHAR(20)  COLLATE Chinese_Traditional_Stroke_Order_100_CS_AS_KS_WS,
    C2 NVARCHAR(20) COLLATE Chinese_Traditional_Stroke_Order_100_CS_AS_KS_WS
)

INSERT INTO @T 
    VALUES (N'中华人民共和国',N'中华人民共和国'),
           (N'abc',N'abc');

SELECT C1,
       C2,
       LEN(C1)        AS [LEN(C1)],
       DATALENGTH(C1) AS [DATALENGTH(C1)],
       LEN(C2)        AS [LEN(C2)],
       DATALENGTH(C2) AS [DATALENGTH(C2)]
FROM   @T  

Returns

https://i.stack.imgur.com/jhMMM.png

Note that the 华 and 国 characters were still not represented in the VARCHAR version and were silently replaced with ? (that code page contains only their traditional forms).

There are actually still no Chinese characters that can be represented by a single byte in that collation. The only single-byte characters are the typical western ASCII set.

Because of this it is possible for an insert from an nvarchar(X) column into a varchar(X) column to fail with a truncation error (where X denotes a number that is the same in both instances).
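A minimal sketch of that failure mode, reusing the same collation (the exact error number and message vary by SQL Server version):

DECLARE @Source NVARCHAR(10) = N'中中中中中中中中中中';  -- 10 characters, all present in the code page

DECLARE @Dest TABLE
(
    C VARCHAR(10) COLLATE Chinese_Traditional_Stroke_Order_100_CS_AS_KS_WS
);

-- Each character needs 2 bytes in the VARCHAR column, so the 10-character string
-- needs 20 bytes and no longer fits into VARCHAR(10):
INSERT INTO @Dest VALUES (@Source);
-- String or binary data would be truncated.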

SQL Server 2012 adds SC (Supplementary Character) collations that support UTF-16. In these collations a single nvarchar character may take 2 or 4 bytes.


The kind of answer I was looking for. Also, to save time for the likes of me - the non-English text translates to "People's Republic of China" translate.google.com/#auto/en/…
Community

nchar and char pretty much operate in exactly the same way as each other, as do nvarchar and varchar. The only difference between them is that nchar/nvarchar store Unicode characters (essential if you require the use of extended character sets) whilst char/varchar do not.

Because Unicode characters require more storage, nchar/nvarchar fields take up twice as much space (so for example in earlier versions of SQL Server the maximum size of an nvarchar field is 4000).

This question is a duplicate of this one.


You forget one thing: nchar uses a fixed length, so nchar(10) always stores exactly ten characters (padding with spaces as needed). And nvarchar(10) is indeed Unicode and will accept any number of characters, up to 10. Also see msdn.microsoft.com/en-us/library/ms186939.aspx
Dimuthu

Just to add something more: nchar - adds trailing spaces to the data. nvarchar - does not add trailing spaces to the data.

So, if you are going to filter your dataset by an nchar field, you may want to use RTRIM to remove the spaces. E.g., an nchar(10) field called BRAND stores the word NIKE. It adds 6 spaces to the right of the word, so when filtering, the expression should read: RTRIM(Fields!BRAND.Value) = "NIKE"
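A small T-SQL sketch of the padding behaviour (table and column names are made up for illustration):

DECLARE @Products TABLE (BRAND NCHAR(10));
INSERT INTO @Products VALUES (N'NIKE');

SELECT '[' + BRAND + ']' AS Padded,    -- [NIKE      ] : padded to 10 characters
       DATALENGTH(BRAND) AS Bytes      -- 20 : 10 characters * 2 bytes
FROM   @Products
WHERE  RTRIM(BRAND) = N'NIKE';

Note that T-SQL itself ignores trailing spaces in = comparisons, but client-side expressions such as the report expression above compare them literally, which is why the RTRIM matters there.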

Hope this helps someone out there because I was struggling with it for a bit just now!


every option is adding trailing space in my SQL Server... :/
PeterAllenWebb

My attempt to summarize and correct the existing answers:

First, char and nchar will always use a fixed amount of storage space, even when the string to be stored is smaller than the available space, whereas varchar and nvarchar will use only as much storage space as is needed to store that string (plus two bytes of overhead, presumably to store the string length). So remember, "var" means "variable", as in variable space.

The second major point to understand is that nchar and nvarchar store strings using exactly two bytes per character, whereas char and varchar use an encoding determined by the collation code page, which will usually be exactly one byte per character (though there are exceptions, see below). By using two bytes per character, a very wide range of characters can be stored, so the basic thing to remember here is that nchar and nvarchar tend to be a much better choice when you want internationalization support, which you probably do.

Now for some finer points.

First, nchar and nvarchar columns always store data using UCS-2. This means that exactly two bytes per character will be used, and any Unicode character in the Basic Multilingual Plane (BMP) can be stored by an nchar or nvarchar field. However, it is not the case that any Unicode character can be stored. For example, according to Wikipedia, the code points for Egyptian hieroglyphs fall outside of the BMP. There are, therefore, Unicode strings that can be represented in UTF-8 and other true Unicode encodings that cannot be stored in a SQL Server nchar or nvarchar field, and strings written in Egyptian hieroglyphs would be among them. Fortunately your users probably don't write in that script, but it's something to keep in mind!
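As an illustrative sketch of that limitation (assuming a non-SC database collation, so the surrogate pair has to be built by hand with NCHAR):

-- U+13000, an Egyptian hieroglyph, lies outside the BMP, so NVARCHAR stores it
-- as a UTF-16 surrogate pair, i.e. 4 bytes:
DECLARE @glyph NVARCHAR(10) = NCHAR(0xD80C) + NCHAR(0xDC00);

SELECT DATALENGTH(@glyph) AS Bytes,  -- 4
       LEN(@glyph)        AS Chars;  -- 2 under a non-SC collation; an _SC collation counts it as 1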

Another confusing but interesting point that other posters have highlighted is that char and varchar fields may use two bytes per character for certain characters if the collation code page requires it. (Martin Smith gives an excellent example in which he shows how Chinese_Traditional_Stroke_Order_100_CS_AS_KS_WS exhibits this behavior. Check it out.)

UPDATE: As of SQL Server 2012, there are finally supplementary character (SC) collations, for example Latin1_General_100_CI_AS_SC, under which nvarchar data is treated as UTF-16 and can truly cover the entire Unicode range.


sth

char: fixed-length character data with a maximum length of 8000 characters.

nchar: fixed-length Unicode data with a maximum length of 4000 characters.

Char = 8 bit length

NChar = 16 bit length


char couldn't have an 8-bit length. It doesn't have to store the length, and the fixed length can be up to 8000 characters.
Rasel

nchar[(n)] (national character)

Fixed-length Unicode string data.

n defines the string length and must be a value from 1 through 4,000.

The storage size is two times n bytes.

nvarchar [(n | max)] (national character varying.)

Variable-length Unicode string data.

n defines the string length and can be a value from 1 through 4,000.

max indicates that the maximum storage size is 2^31-1 bytes (2 GB).

The storage size, in bytes, is two times the actual length of data entered + 2 bytes.

char [(n)] (character)

Fixed-length, non-Unicode string data.

n defines the string length and must be a value from 1 through 8,000.

The storage size is n bytes.

varchar [(n | max)] (character varying)

Variable-length, non-Unicode string data.

n defines the string length and can be a value from 1 through 8,000.

max indicates that the maximum storage size is 2^31-1 bytes (2 GB).

The storage size is the actual length of the data entered + 2 bytes.
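A quick sketch that checks these storage rules (DATALENGTH reports the data bytes; the extra 2 bytes of overhead for the variable-length types are row-storage overhead and are not included):

DECLARE @c   CHAR(10)     = 'abc',
        @vc  VARCHAR(10)  = 'abc',
        @nc  NCHAR(10)    = N'abc',
        @nvc NVARCHAR(10) = N'abc';

SELECT DATALENGTH(@c)   AS [char(10)],      -- 10 : always n bytes
       DATALENGTH(@vc)  AS [varchar(10)],   --  3 : actual length
       DATALENGTH(@nc)  AS [nchar(10)],     -- 20 : always 2 * n bytes
       DATALENGTH(@nvc) AS [nvarchar(10)];  --  6 : 2 * actual length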


Jason Kresowaty

nchar(10) is a fixed-length Unicode string of length 10. nvarchar(10) is a variable-length Unicode string with a maximum length of 10. Typically, you would use the former if all data values are 10 characters and the latter if the lengths vary.


Wrong comparison - question relates to nchar and varchar, not nchar and nvarchar.
Wim ten Brink

The differences are:

n[var]char stores Unicode while [var]char just stores single-byte characters. [n]char requires a fixed number of characters of the exact length while [n]varchar accepts a variable number of characters up to and including the defined length.

Another difference is length. Both nchar and nvarchar can be up to 4,000 characters long, and char and varchar can be up to 8,000 characters long. But for SQL Server you can also use [n]varchar(max), which can store up to 2,147,483,647 bytes (2 GB, the maximum of a signed 4-byte integer); that is roughly one billion characters for nvarchar(max) and two billion for varchar(max).
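A small sketch of the max types (the CAST matters: without it, REPLICATE truncates its result at 8,000 bytes):

DECLARE @big NVARCHAR(MAX) = REPLICATE(CAST(N'x' AS NVARCHAR(MAX)), 100000);

SELECT LEN(@big)        AS Characters,  -- 100000, well past the 4,000 limit of nvarchar(n)
       DATALENGTH(@big) AS Bytes;       -- 200000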


Venkataraman R

nchar requires more space than nvarchar when the stored values are shorter than the declared length.

eg,

An nchar(100) will always store 100 characters even if you only enter 5; the remaining 95 characters will be padded with spaces. Storing 5 characters in an nvarchar(100) will store only those 5 characters.


Not completely true: a char(100) always occupies the full 100 characters. You would use this when you, e.g., store phone numbers or order numbers with a fixed length in your database. Because the field length is fixed, you have no choice but to fill it up to the maximum number of characters. But when all your data is 100 characters per record, a char(100) will take less storage than a varchar(100) because it doesn't need a length indicator: every value is exactly 100 characters.
Manu

nchar is fixed-length and can hold Unicode characters; it uses two bytes of storage per character.

varchar is variable-length and cannot hold Unicode characters; it uses one byte of storage per character.


Wrong. Unicode can use 1 to 4 bytes (in general) for every character. Also, a varchar can hold unicode, but it's not recognised as unicode. As a result, a varchar is considered unreliable for unicode storage. (Especially since there's a risk that the code that accesses the field will translate it incorrectly.)
@Alex: I think you made your point but I still do not agree with you. What you are saying is that an int CAN hold a long if the long happens to be smaller than 2^32. This is not only 'unreliable', it is an inherent limitation which makes it impossible to cover the whole value range.
@Workshop Alex: Wrong. Unicode encoded as UCS-2 (which happens to be the encoding used by SQL Server) stores every character in exactly two bytes, see msdn.microsoft.com/en-us/library/bb330962%28v=sql.90%29.aspx: SQL Server stores Unicode in the UCS-2 encoding scheme... UCS-2 is a fixed-length encoding that represents all characters as a 16-bit value (2 bytes). SQL Server 2008 can use SCSU compression, but is still compression of the UCS-2 encoded Unicode strings: msdn.microsoft.com/en-us/library/ee240835.aspx
Gustavo Rubio

NVARCHAR can store Unicode characters and takes 2 bytes per character.


WRONG! Unicode uses between 1 and 4 bytes per character! Many people forget this! Even the use of UTF-16 might result in some characters taking 4 bytes instead of 2, although the common length will be 2 bytes. Certain other subformats of Unicode might take even more than 4 bytes!
@WimtenBrink - The question is about SQL Server and nvarchar always takes 2 bytes per character.
@Wim, you are correct there are several encodings for Unicode that can produce a different number of bytes. But SQL Server does not give you a choice about Unicode encoding. SQL Server prior to 2012 only used UCS-2, two bytes wide, so Martin was correct at the time he wrote the answer. As other answers above have said, SQL Server 2012 now provides UTF-16, so two bytes for many characters (those in the Unicode Basic Multilingual Plane), four bytes for others.