
Signed versus Unsigned Integers

Am I correct to say the difference between a signed and unsigned integer is:

Unsigned can hold a larger positive value and no negative value. Unsigned uses the leading bit as a part of the value, while the signed version uses the left-most-bit to identify if the number is positive or negative. Signed integers can hold both positive and negative numbers.

Any other differences?

Because 0 is neither positive nor negative, it is more appropriate to use the term non-negative value instead of positive value for unsigned integers.

Sabito 錆兎 stands with Ukraine

Unsigned can hold a larger positive value and no negative value.

Yes.

Unsigned uses the leading bit as a part of the value, while the signed version uses the left-most-bit to identify if the number is positive or negative.

There are different ways of representing signed integers. The easiest to visualise is to use the leftmost bit as a flag (sign and magnitude), but more common is two's complement. Both are in use in most modern microprocessors — floating point uses sign and magnitude, while integer arithmetic uses two's complement.

Signed integers can hold both positive and negative numbers.

Yes.


compie

I'll go into differences at the hardware level, on x86. This is mostly irrelevant unless you're writing a compiler or using assembly language. But it's nice to know.

Firstly, x86 has native support for the two's complement representation of signed numbers. You can use other representations but this would require more instructions and generally be a waste of processor time.

What do I mean by "native support"? Basically I mean that there is one set of instructions you use for unsigned numbers and another set that you use for signed numbers. Unsigned numbers can sit in the same registers as signed numbers, and indeed you can mix signed and unsigned instructions without worrying the processor. It's up to the compiler (or assembly programmer) to keep track of whether a number is signed or not, and use the appropriate instructions.

Secondly, two's complement numbers have the property that addition and subtraction are just the same as for unsigned numbers. It makes no difference whether the numbers are positive or negative. (So you just go ahead and ADD and SUB your numbers without a worry.)

The differences start to show when it comes to comparisons. x86 has a simple way of differentiating them: above/below indicates an unsigned comparison and greater/less than indicates a signed comparison. (E.g. JAE means "Jump if above or equal" and is unsigned.)

There are also two sets of multiplication and division instructions to deal with signed and unsigned integers.

Lastly: if you want to check for, say, overflow, you would do it differently for signed and for unsigned numbers.
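
As a rough C-level sketch of that last point, assuming 32-bit operands (the helper names uadd_overflows and sadd_overflows are made up for illustration): for unsigned addition the wrapped sum comes out smaller than an operand, while for signed addition the check has to happen before adding, because signed overflow is undefined behaviour in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned: the sum wraps modulo 2^32, so overflow shows up as sum < a. */
bool uadd_overflows(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b) < a;
}

/* Signed: compare against the limits before adding, never after. */
bool sadd_overflows(int32_t a, int32_t b)
{
    if (b > 0)
        return a > INT32_MAX - b;
    return a < INT32_MIN - b;
}
```

On x86 these two cases correspond to the carry flag (unsigned) and the overflow flag (signed).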


Garbit

He only asked about signed and unsigned. Don't know why people are adding extra stuff in this. Let me tell you the answer.

Unsigned: it consists of only non-negative values, i.e. 0 to 255. Signed: it consists of both negative and non-negative values, split into two ranges: 0 to +127 and -1 to -128.

And this explanation is about the 8-bit number system.


Nice. Simple. Concise. Excellent job.
Leejay Schmidt

According to what we learned in class, signed integers can represent both positive and negative numbers, while unsigned integers are only non-negative.

For example, looking at an 8-bit number:

unsigned values range from 0 to 255

signed values range from -128 to 127
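
These ranges follow directly from the bit width. A small sketch, assuming two's complement for the signed case; the helper names unsigned_max, signed_min and signed_max are hypothetical:

```c
#include <stdint.h>

/* Largest n-bit unsigned value: 2^n - 1 (valid for 1 <= n <= 63). */
uint64_t unsigned_max(unsigned n)
{
    return (UINT64_C(1) << n) - 1;
}

/* Smallest n-bit two's complement value: -2^(n-1). */
int64_t signed_min(unsigned n)
{
    return -(INT64_C(1) << (n - 1));
}

/* Largest n-bit two's complement value: 2^(n-1) - 1. */
int64_t signed_max(unsigned n)
{
    return (INT64_C(1) << (n - 1)) - 1;
}
```

With n = 8 these give 255, -128 and 127, matching UINT8_MAX, INT8_MIN and INT8_MAX from <stdint.h>.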


Michael Burr

Just a few points for completeness:

this answer is discussing only integer representations. There may be other answers for floating point;

the representation of a negative number can vary. The most common by far (it's nearly universal today) is two's complement. Other representations include one's complement (quite rare) and signed magnitude (vanishingly rare, probably only used on museum pieces), which simply uses the high bit as a sign indicator with the remaining bits representing the absolute value of the number.

When using two's complement, the variable can represent a larger range (by one) of negative numbers than positive numbers. This is because zero is included in the 'positive' numbers (since the sign bit is not set for zero), but not the negative numbers. This means that the absolute value of the smallest negative number cannot be represented.

when using one's complement or signed magnitude you can have zero represented as either a positive or negative number (which is one of a couple of reasons these representations aren't typically used).


If I write unsigned int a = -2 and signed int b = -2, would the underlying representation be the same? I know it is not good to give an unsigned number a negative value, but if I do, what will the underlying representation be?
Minor niggle: sign and magnitude is used in IEEE floating point, so it's actually quite common. :-)
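
One way to check the question about unsigned int a = -2 is to compare the bytes directly. This is only a sketch, assuming a two's complement platform with no padding bits; the function names are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* True when `unsigned int a = -2` and `int b = -2` occupy identical bytes. */
bool same_representation(void)
{
    unsigned int a = -2;   /* converted modulo UINT_MAX + 1 */
    int b = -2;
    return memcmp(&a, &b, sizeof a) == 0;
}

/* The conversion itself is defined by the C standard: -2 becomes UINT_MAX - 1. */
bool converts_to_uint_max_minus_one(void)
{
    unsigned int a = -2;
    return a == UINT_MAX - 1u;
}
```

On such a platform both functions return true: the bits are the same, only their interpretation differs.
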
Jasper Bekkers

Everything except for point 2 is correct. There are many different notations for signed ints, some implementations use the first, others use the last and yet others use something completely different. That all depends on the platform you're working with.


Is that the little-endian and big-endian thing?
little vs. big endian has to do with the order of the bytes on the platform. Little endian might do 0xFF 0xFE 0x7F while big endian will do 0x7F 0xFE 0xFF.
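
To see the byte order on a given machine, one can copy an integer's bytes out of memory. A minimal sketch (bytes_in_memory and is_little_endian are hypothetical helper names); byte order is independent of whether the type is signed.

```c
#include <stdint.h>
#include <string.h>

/* Copies the in-memory bytes of v into out, lowest address first. */
void bytes_in_memory(uint32_t v, unsigned char out[4])
{
    memcpy(out, &v, 4);
}

/* Returns 1 when the least significant byte is stored first. */
int is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(UINT32_C(0x0A0B0C0D), b);
    return b[0] == 0x0D;
}
```
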
Mike Gleen

Another difference is when you are converting between integers of different sizes.

For example, if you are extracting an integer from a byte stream (say 16 bits for simplicity), with unsigned values, you could do:

i = ((int) b[j]) << 8 | b[j+1]

(should probably cast the 2nd byte, but I'm guessing the compiler will do the right thing)

With signed values you would have to worry about sign extension and do:

i = (((int) b[j]) & 0xFF) << 8 | ((int) b[j+1]) & 0xFF
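
A self-contained sketch of the two cases, assuming a big-endian 16-bit field and 8-bit bytes (the function names are hypothetical):

```c
#include <stdint.h>

/* Read from unsigned bytes: no masking needed. */
uint16_t read_u16_unsigned(const unsigned char *b)
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Read from signed bytes: mask each byte to undo sign extension to int. */
uint16_t read_u16_signed(const signed char *b)
{
    return (uint16_t)(((b[0] & 0xFF) << 8) | (b[1] & 0xFF));
}
```

For the bytes 0x12 0x80 both functions return 0x1280. Without the masks, the signed version would yield 0xFF80, because the second byte is sign-extended to a negative int before the OR.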

Jonathan Leffler

Over and above what others have said, in C, you cannot overflow an unsigned integer; the behaviour is defined to be modular arithmetic. You can overflow a signed integer and, in theory (though not in practice on current mainstream systems), the overflow could trigger a fault (perhaps similar to a divide by zero fault).


Note that signed integer overflow does trigger undefined behaviour, and modern compilers are ultra-aggressive about spotting this and exploiting it to modify your program in unexpected but technically legitimate ways because they're allowed to assume undefined behaviour won't occur — roughly speaking. This is much more of a problem now than it was 7 years ago.
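
A small illustration of the defined wrap-around for unsigned int (the function names are made up for this sketch):

```c
#include <limits.h>

/* Unsigned addition wraps modulo UINT_MAX + 1. */
unsigned int wrapping_add(unsigned int a, unsigned int b)
{
    return a + b;
}

/* Unsigned subtraction wraps the same way. */
unsigned int wrapping_sub(unsigned int a, unsigned int b)
{
    return a - b;
}

/* Well defined for every x: returns 0 only when x == UINT_MAX. */
int unsigned_next_is_greater(unsigned int x)
{
    return x + 1u > x;
}
```

The signed analogue of the last function, x + 1 > x with int x, is undefined at INT_MAX, so a compiler may assume it is always true.
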
toddk

Generally speaking that is correct. Without knowing anything more about why you are looking for the differences I can't think of any other differentiators between signed and unsigned.


supercat

Signed integers in C represent numbers. If a and b are variables of signed integer types, the standard will never require that a compiler make the expression a+=b store into a anything other than the arithmetic sum of their respective values. To be sure, if the arithmetic sum would not fit into a, the processor might not be able to put it there, but the standard would not require the compiler to truncate or wrap the value, or do anything else for that matter, if values exceed the limits for their types. Note that while the standard does not require it, C implementations are allowed to trap arithmetic overflows with signed values.

Unsigned integers in C behave as abstract algebraic rings of integers which are congruent modulo some power of two, except in scenarios involving conversions to, or operations with, larger types. Converting an integer of any size to a 32-bit unsigned type will yield the member corresponding to things which are congruent to that integer mod 4,294,967,296. The reason subtracting 3 from 2 yields 4,294,967,295 is that adding something congruent to 3 to something congruent to 4,294,967,295 will yield something congruent to 2.

Abstract algebraic ring types are often handy things to have; unfortunately, C uses signedness as the deciding factor for whether a type should behave as a ring. Worse, unsigned values are treated as numbers rather than ring members when converted to larger types, and unsigned values smaller than int get converted to numbers when any arithmetic is performed upon them. If v is a uint32_t which equals 4,294,967,294, then v*=v; should make v=4. Unfortunately, if int is 64 bits, then there's no telling what v*=v; could do.

Given the standard as it is, I would suggest using unsigned types in situations where one wants the behavior associated with algebraic rings, and signed types when one wants to represent numbers. It's unfortunate that C drew the distinctions the way it did, but they are what they are.
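
A sketch of both behaviours, with hypothetical function names. The squaring helper goes through uint64_t so that it stays well defined whatever the width of int is.

```c
#include <stdint.h>

/* 2 - 3 in the ring of integers mod 2^32. */
uint32_t two_minus_three(void)
{
    uint32_t a = 2, b = 3;
    return a - b;
}

/* v * v reduced mod 2^32, computed in 64 bits to avoid any int promotion. */
uint32_t square_mod_2_32(uint32_t v)
{
    return (uint32_t)((uint64_t)v * v);
}

/* uint8_t operands are promoted to int, so this is ordinary arithmetic. */
int narrow_difference(uint8_t a, uint8_t b)
{
    return a - b;
}

/* Converting the result back to uint8_t reduces it mod 256 again. */
uint8_t narrow_difference_wrapped(uint8_t a, uint8_t b)
{
    return (uint8_t)(a - b);
}
```

Here two_minus_three() gives 4,294,967,295 and square_mod_2_32(4294967294) gives 4, while narrow_difference(2, 3) is -1 and only becomes 255 after the conversion back to uint8_t.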


bhavesh

Yes, an unsigned integer can store a larger value. No, there are different ways to represent positive and negative values. Yes, a signed integer can contain both positive and negative values.


exitcode

(in answer to the second question) By only using a sign bit (and not 2's complement), you can end up with -0. Not very pretty.


Just to add to this answer, basically it means that 10 == 00 where both those numbers are base 2.
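
The two zeros can be shown with a small decoder for 8-bit patterns. This is a sketch with made-up function names, decoding the same bits under both conventions:

```c
#include <stdint.h>

/* Sign and magnitude: bit 7 is the sign, bits 0..6 are the magnitude. */
int decode_sign_magnitude(uint8_t bits)
{
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;
}

/* Two's complement: patterns with bit 7 set stand for bits - 256. */
int decode_twos_complement(uint8_t bits)
{
    return (bits & 0x80) ? (int)bits - 256 : (int)bits;
}
```

Under sign and magnitude both 0x00 and 0x80 decode to 0, whereas under two's complement 0x80 is -128 and zero has a single pattern.
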
Matthew

Unsigned integers are far more likely to catch you in a particular trap than are signed integers. The trap comes from the fact that while 1 & 3 above are correct, both types of integers can be assigned a value outside the bounds of what it can "hold" and it will be silently converted.

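#include <stdio.h>

int main(void)
{
    unsigned int ui = -1;   /* silently converted to UINT_MAX */
    signed int si = -1;

    if (ui < 0) {           /* never true for an unsigned value */
        printf("unsigned < 0\n");
    }
    if (si < 0) {
        printf("signed < 0\n");
    }
    if (ui == si) {         /* si is converted to unsigned before comparing */
        printf("%d == %d\n", (int)ui, si);
        printf("%u == %u\n", ui, (unsigned int)si);
    }
    return 0;
}

When you run this (assuming a 32-bit int), you'll get the following output, even though both values were assigned -1 and were declared differently.

signed < 0
-1 == -1
4294967295 == 4294967295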

Clearer

The only guaranteed difference between a signed and an unsigned value in C is that the signed value can be negative, 0 or positive, while an unsigned can only be 0 or positive. The problem is that C doesn't define the format of types (so you don't know that your integers are in two's complement). Strictly speaking the first two points you mentioned are incorrect.


Fahad Naeem

You should use unsigned integers when programming on embedded systems. In loops, when there is no need for signed integers, using unsigned integers will save the space needed when designing such systems.


Aaron A.

The best answer I found on this was thanks to IBM quoting the XDR standard:

Integer: An XDR signed integer is a 32-bit piece of data that encodes an integer in the range [-2147483648,2147483647]. The integer is represented in two's complement notation. The most and least significant bytes are 0 and 3, respectively. The data description of integers is integer.

Unsigned integer: An XDR unsigned integer is a 32-bit piece of data that encodes a nonnegative integer in the range [0,4294967295]. It is represented by an unsigned binary number whose most and least significant bytes are 0 and 3, respectively. The data description of unsigned integers is unsigned.

see XDR Standard on Wikipedia
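
The byte layout in that quote can be sketched as follows. The names xdr_put_u32, xdr_get_u32 and xdr_put_i32 are hypothetical helpers written for this illustration, not functions from any XDR library, and the sketch assumes 8-bit bytes.

```c
#include <stdint.h>

/* Writes v as 4 bytes with the most significant byte at index 0. */
void xdr_put_u32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Reads the 4 bytes back into a 32-bit unsigned value. */
uint32_t xdr_get_u32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* A signed value is written as its two's complement bit pattern. */
void xdr_put_i32(unsigned char out[4], int32_t v)
{
    xdr_put_u32(out, (uint32_t)v);
}
```

Encoding -2 this way produces the bytes FF FF FF FE, the same bytes as the unsigned value 4294967294.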


hl037_

This is all about modeling: when you design a computer, you need to adopt conventions for how you represent data and how you compute with it, and of course provide different models with different operations and properties (performance, memory space required, hardware implementation complexity, etc.).

It turns out that, with computation based on electricity (thus electronics), the most convenient way we found to represent a piece of information is to use the voltage level. And the most convenient way to compute with these voltage levels is to consider two states: presence of a voltage and absence of a voltage. Here comes the "bit".

This is why we use binary to represent numbers: a succession of electronic pins with either a high voltage (1) or a low voltage (0).

However, if you count using binary, you can only represent natural numbers (0, 1, 2, ...): exactly 2^n of them, where n is the number of bits you have.

This permits you to do addition, multiplication, division, and subtraction, if you ensure the first operand is greater than the second and if you check that the result would not exceed the number of bits you have.

Then, some smart guys came and thought: "What happens when you do n - m with m > n, using exactly the same algorithm?"

...And what happens is that it actually kind of works: you just have to add one to your number if you have a carry (wrap-around) afterwards, and consider that both 0...0 and 1...1 represent 0. That's ones' complement. However, by doing so, you have to reserve one bit for the sign. Technically, with n bits you can represent the values x with -(2^(n-1)-1) ≤ x ≤ 2^(n-1)-1, which is (2^n)-1 distinct values (there are two representations for 0). In this representation, you just have to swap all the bits to negate a number.

Then even smarter guys came and said: "What if we consider there is always a wrap-around when we negate the number?" That means you add one after you have swapped the bits, and you get two's complement. Using it, your zero has only one representation, and you can again represent 2^n numbers (the values x with -2^(n-1) ≤ x ≤ 2^(n-1)-1). Plus, the computation of a-b really just is a+(-b), which only requires two kinds of operation: add(a, add(swap(b), 1))

Another nice thing about two's complement is that the addition algorithm is the same as the unsigned one. Therefore you get the same properties, and use the same hardware to do both. This is why it's the representation used in most computers.
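
The swap-then-add-one rule can be tried out on 8-bit values. This is a sketch only; negate8 and sub8 are names invented here, and the bit patterns are held in uint8_t so that all the arithmetic stays well defined.

```c
#include <stdint.h>

/* Negation in 8-bit two's complement: flip every bit, then add one. */
uint8_t negate8(uint8_t b)
{
    return (uint8_t)(~b + 1u);
}

/* a - b computed as a + (-b), using only bit-flip and the unsigned adder. */
uint8_t sub8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + negate8(b));
}
```

For example, negate8(5) is 0xFB (the pattern for -5) and sub8(3, 5) is 0xFE (the pattern for -2), while negate8(0) is still 0, so zero has a single representation.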

In short, signed and unsigned can represent the same count of numbers, but on a different range, and now, you know which precisely and why. For more detail about the algebraic structure obtained, read this response : https://stackoverflow.com/a/23304179/1745291

Then use one or the other depending on the context (note, however, that for some operations, like <, the treatment is different when casting: ((signed) -1) < 5 but ((unsigned) -1) > 5).
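
A short sketch of that comparison difference, with hypothetical function names:

```c
/* Both operands are int: -1 < 5 holds. */
int signed_less(int a, int b)
{
    return a < b;
}

/* Both operands are unsigned: -1 arrives as UINT_MAX, which is not below 5. */
int unsigned_less(unsigned int a, unsigned int b)
{
    return a < b;
}
```

signed_less(-1, 5) returns 1, while unsigned_less((unsigned int)-1, 5) returns 0. When an int is compared directly with an unsigned int, C converts the int operand to unsigned first, so the second behaviour applies.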