
Why doesn't C have unsigned floats?

I know, the question seems strange. Programmers sometimes think too much. Please read on...

In C I use signed and unsigned integers a lot. I like the fact that the compiler warns me if I do things like assigning a signed integer to an unsigned variable. I get warnings if I compare signed with unsigned integers and much much more.

I like these warnings. They help me to keep my code correct.

Why don't we have the same luxury for floats? A square-root will definitely never return a negative number. There are other places as well where a negative float value has no meaning. Perfect candidate for an unsigned float.

Btw - I'm not really keen on the single extra bit of precision that I could get by removing the sign bit from floats. I'm super happy with floats as they are right now. I'd just like to mark a float as unsigned sometimes and get the same kind of warnings that I get with integers.

I'm not aware of any programming language that supports unsigned floating-point numbers.

Any idea why they don't exist?

EDIT:

I know that the x87 FPU has no instructions to deal with unsigned floats. Let's just use the signed float instructions. Misuse (e.g. going below zero) could be considered undefined behaviour, in the same way that overflow of signed integers is undefined.

Interesting, can you post an example of a case where signedness typechecking was helpful?
litb, was your comment directed at me? If so, I don't get it.
Iraimbilanja yeah :) fabs can't return a negative number, because it returns the absolute value of its argument
Right. I didn't ask how a hypothetical unsigned float could help correctness. What I asked was: in what situation did pipenbrinck find int signedness typechecking helpful (leading him to seek the same mechanism for floats)? The reason I ask is that I find unsigned types entirely useless with regard to type safety.
There is an unsigned micro-optimisation for point-in-range check: ((unsigned)(p-min))<(max-min), which only has one branch, but, as always, it's best to profile to see if it really helps (I mostly used it on 386 cores so I don't know how modern CPUs cope).
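
For readers unfamiliar with the trick, a minimal sketch in C++ (the function name is made up for illustration, and it assumes p - min does not overflow a signed int):

#include <cassert>

// Branchless range check: tests min <= p < max with one comparison.
// If p < min, (p - min) is negative and converts to a huge unsigned
// value, so the single unsigned comparison fails as desired.
bool in_range(int p, int min, int max)
{
    return (unsigned)(p - min) < (unsigned)(max - min);
}

int main()
{
    assert(in_range(5, 0, 10));
    assert(!in_range(-1, 0, 10));
    assert(!in_range(10, 0, 10));  // max itself is excluded
}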

Brian R. Bondy

The reason C++ doesn't support unsigned floats is that there are no equivalent machine code operations for the CPU to execute, so it would be very inefficient to support them.

If C++ did support it, then you would sometimes be using an unsigned float without realizing that your performance had just been killed. If C++ supported it, then every floating-point operation would need to be checked to see whether it is signed or not, and for programs that do millions of floating-point operations this is not acceptable.

So the question becomes: why don't hardware implementers support it? I think the answer is that no unsigned float standard was defined originally. Since languages like to be backwards compatible, even if it were added now, languages couldn't make much use of it. To see the floating-point spec, you should look at the IEEE 754 floating-point standard.

You can get around not having an unsigned floating-point type, though, by creating an unsigned float class that encapsulates a float or double and complains if you try to pass in a negative number. This is less efficient, but if you aren't using them intensively you probably won't care about the slight performance loss.
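
A minimal sketch of such a wrapper, using run-time asserts in place of warnings (the class name and checking policy are illustrative, not a standard facility):

#include <cassert>

// Illustrative wrapper: stores a float and asserts non-negativity on
// construction and assignment. Unlike the compile-time warnings the
// question asks for, these checks cost something at run time.
class unsigned_float
{
public:
    explicit unsigned_float(float v) : value(v) { assert(value >= 0.0f); }

    unsigned_float& operator=(float v)
    {
        assert(v >= 0.0f);
        value = v;
        return *this;
    }

    float get() const { return value; }

private:
    float value;
};

int main()
{
    unsigned_float len(3.5f);
    len = 1.25f;     // fine
    // len = -1.0f;  // would trip the assert at run time
    return 0;
}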

I definitely see the usefulness of having an unsigned float. But C/C++ tends to choose the efficiency that works best for everyone over safety.


C/C++ does not require specific machine code operations to implement the language. Early C/C++ compilers could generate floating-point code for the 386 - a CPU with no FPU! The compiler would generate library calls to emulate FPU instructions. Therefore, a ufloat could be done without CPU support.
Skizz, while that is correct, Brian already addressed this - that because there is no equivalent machine code the performance will be horrible by comparison.
@Brian R. Bondy: I lost you here: "because there is no equivalent machine code operations for the CPU to execute...". Can you please explain, in simpler terms?
The reason OP wanted support for unsigned floats was for warning messages, so really it has nothing to do with the code generation phase of the compiler - only with how it does the type checking beforehand - so support for them in machine code is irrelevant, and (as has been added to the bottom of the question) normal floating-point instructions could be used for actual execution.
I'm not sure I see why this should affect performance. Just like with int's, all the sign-related type-checking could happen at compile time. OP suggests that unsigned float would be implemented as a regular float with compile-time checks to ensure that certain non-meaningful operations are never performed. The resulting machine code and performance could be identical, regardless of whether your floats are signed or not.
Tim Cooper

There is a significant difference between signed and unsigned integers in C/C++:

value >> shift

signed values leave the top bit unchanged (sign extension), while unsigned values shift in zeros at the top.
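
A small demonstration of the difference (note, as the comments below point out, that right-shifting a negative signed value is actually implementation-defined; the sign-extending behaviour shown here is merely what most compilers do):

#include <cstdio>

int main()
{
    int s = -8;                // bit pattern 0xFFFFFFF8 with 32-bit 2's complement
    unsigned u = 0xFFFFFFF8u;  // the same bit pattern, unsigned

    // Implementation-defined for negative signed values, but most
    // compilers emit an arithmetic shift that preserves the sign bit:
    printf("%d\n", s >> 1);    // typically -4

    // Fully defined for unsigned values: zeros are shifted in at the top:
    printf("%u\n", u >> 1);    // 2147483644 (0x7FFFFFFC)
}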

The reason there is no unsigned float is that you quickly run into all sorts of problems if there are no negative values. Consider this:

float a = 2.0f, b = 10.0f, c;
c = a - b;

What value does c have? -8. But what would that mean in a system without negative numbers? FLT_MAX - 8, perhaps? Actually, that doesn't work, as FLT_MAX - 8 is FLT_MAX due to precision effects, so things are even more screwy. What if it were part of a more complex expression:

float a = 2.0f, b = 10.0f, c = 20.0f, d = 3.14159f, e;
e = (a - b) / d + c;

This isn't a problem for integers due to the nature of the 2's complement system.
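
For contrast, a minimal sketch of why the same subtraction is unproblematic for unsigned integers (assuming a 32-bit unsigned int):

#include <cstdio>

int main()
{
    unsigned a = 2, b = 10;

    // Unsigned subtraction is defined modulo 2^N: 2 - 10 wraps to
    // 2^32 - 8, and adding 10 back recovers 2 exactly. Floats have
    // no equivalent modular system to fall back on.
    unsigned c = a - b;
    printf("%u\n", c);      // 4294967288
    printf("%u\n", c + b);  // 2 again
}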

Also consider standard mathematical functions: sin, cos and tan would only work for half their input values, you couldn't find the log of values < 1, you couldn't solve quadratic equations: x = (-b ± √(b² - 4ac)) / 2a, and so on. In fact, it probably wouldn't work for any complex function, as these tend to be implemented as polynomial approximations which use negative values somewhere.

So, unsigned floats are pretty useless.

But that doesn't mean a class that range-checks float values isn't useful: you may want to clamp values to a given range, for example in RGB calculations.
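
A minimal clamping sketch using C++17's std::clamp (the [0, 1] RGB channel range is just an example):

#include <algorithm>
#include <cstdio>

int main()
{
    // Clamp an out-of-range intermediate RGB channel value into [0, 1].
    float channel = -0.25f;
    float clamped = std::clamp(channel, 0.0f, 1.0f);
    printf("%f\n", clamped);  // 0.000000
}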


@Skizz: if representation is the problem, do you mean that if someone could devise a method to store floats that is as efficient as 2's complement, there would be no problem with having unsigned floats?
"value >> shift for signed values leave the top bit unchanged (sign extend)" - are you sure about that? I thought that was implementation-defined behavior, at least for negative signed values.
@Dan: Just looked at the recent standard and it does indeed state that it is implementation defined - I guess that's just in case there's a CPU that has no shift right with sign extend instruction.
floating point traditionally saturates (to -/+Inf) instead of wrapping. You might expect unsigned subtraction overflow to saturate to 0.0, or possibly Inf or NaN. Or just be Undefined Behaviour, like the OP suggested in an edit to the question. Re: trig functions: so don't define unsigned-input versions of sin and so on, and make sure to treat their return value as signed. The question wasn't proposing replacing float with unsigned float, just adding unsigned float as a new type.
ephemient

(As an aside, Perl 6 lets you write

subset Nonnegative::Float of Float where { $_ >= 0 };

and then you can use Nonnegative::Float just like you would any other type.)

There's no hardware support for unsigned floating point operations, so C doesn't offer it. C is mostly designed to be "portable assembly", that is, as close to the metal as you can be without being tied down to a specific platform.

[edit]

C is like assembly: what you see is exactly what you get. An implicit "I'll check that this float is nonnegative for you" goes against its design philosophy. If you really want it, you can add assert(x >= 0) or similar, but you have to do that explicitly.
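
A minimal sketch of such an explicit check (checked_sqrt is a hypothetical helper, not a standard function):

#include <cassert>
#include <cmath>

// Explicit precondition in the "what you see is what you get" spirit:
// the non-negativity requirement is stated by the caller, not by a type.
double checked_sqrt(double x)
{
    assert(x >= 0.0);
    return std::sqrt(x);
}

int main()
{
    double r = checked_sqrt(2.0);
    (void)r;
    return 0;
}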


svn.perl.org/parrot/trunk/languages/perl6/docs/STATUS says yes, but of ... doesn't parse.
Community

I believe the unsigned int was created because of the need for a larger value range than the signed int could offer.

A float has a much larger range, so there was never a 'physical' need for an unsigned float. And as you point out yourself in your question, the additional 1 bit of precision is nothing to kill for.

Edit: After reading the answer by Brian R. Bondy, I have to modify my answer: He is definitely right that the underlying CPUs did not have unsigned float operations. However, I maintain my belief that this was a design decision based on the reasons I stated above ;-)


Also, addition and subtraction of integers is the same signed or unsigned -- floating point, not so much. Who would do the extra work to support both signed and unsigned floats given the relatively low marginal utility of such a feature?
Johannes Schaub - litb

I think Treb is on the right track. It's more important for integers that you have a corresponding unsigned type. Those are the ones that are used in bit shifting and in bit maps. A sign bit just gets in the way. For example, when right-shifting a negative value, the result is implementation-defined in C++. Doing that with an unsigned integer, or overflowing one, has perfectly defined semantics because there is no such bit in the way.

So for integers at least, the need for a separate unsigned type is stronger than just getting warnings. None of the above points need to be considered for floats. So there is, I think, no real need for hardware support for them, and consequently no support in C.


Pete Kirkham

A square-root will definitely never return a negative number. There are other places as well where a negative float value has no meaning. Perfect candidate for an unsigned float.

C99 supports complex numbers, and a type-generic form of sqrt, so sqrt(1.0 * I) will be negative.

The commenters highlighted a slight gloss above, in that I was referring to the type-generic sqrt macro rather than the function, and it will return a scalar floating-point value by truncation of the complex result to its real component:

#include <complex.h>
#include <tgmath.h>

int main ()
{
    complex double a = 1.0 + 1.0 * I;

    /* tgmath.h maps sqrt to csqrt for a complex argument; the
       assignment to a double then discards the imaginary part,
       keeping only the real component. */
    double f = sqrt(a);

    (void)f;  /* silence unused-variable warnings */
    return 0;
}

It also contains a brain-fart, as the real part of the sqrt of any complex number is positive or zero, and sqrt(1.0*I) is sqrt(0.5) + sqrt(0.5)*I not -1.0.


Yes, but you call a function with a different name if you work with complex numbers. Also the return type is different. Good point though!
The result of sqrt(i) is a complex number. And since the complex numbers aren't ordered, you can't say a complex number is negative (i.e. < 0).
quinmars, are you sure it's not csqrt? Or are you talking about math instead of C? I agree anyway that it's a good point :)
Indeed, I was talking about math. I've never dealt with complex numbers in C.
" square-root will definately never return a negative number." --> sqrt(-0.0) often produces -0.0. Of course -0.0 is not a negative value.
phuclv

I guess it's because the IEEE floating-point specifications are signed-only, and most programming languages use them.

Wikipedia article on IEEE-754 floating-point numbers

Edit: Also, as noted by others, most hardware does not support unsigned floats, so the normal kind of floats are more efficient, since there is hardware support.


C was introduced long before the IEEE-754 standard appeared
@phuclv Neither was floating-point hardware commonplace. It was adopted into standard C "a few" years later. There is probably some documentation floating around the internet about it. (Also, the Wikipedia article mentions C99.)
I don't understand what you mean. There's no "hardware" in your answer, and IEEE-754 was born after C, so floating-point types in C can't depend on the IEEE-754 standard, unless those types were introduced into C much later.
@phuclv C is/was also known as portable assembly, so it can be pretty close to hardware. Languages gain features over the years; even if (before my time) float was implemented in C, it was probably a software-based operation and quite expensive. At the time of answering this question, I obviously had a better grasp of what I was trying to explain than I do now. And if you look at the accepted answer you might understand why I mentioned the IEEE-754 standard. What I don't understand is why you nitpicked a 10-year-old answer which isn't the accepted one.
supercat

Unsigned integer types in C are defined in such a way as to obey the rules of an abstract algebraic ring. For example, for any values X and Y, adding X-Y to Y will yield X. Unsigned integer types are guaranteed to obey these rules in all cases which do not involve conversion to or from any other numeric type [or unsigned types of different sizes], and that guarantee is one of the most important features of such types. In some cases, it's worthwhile to give up the ability to represent negative numbers in exchange for the extra guarantees only unsigned types can provide.

Floating-point types, whether signed or not, cannot abide by all the rules of an algebraic ring [e.g. they cannot guarantee that X+Y-Y will equal X], and indeed IEEE doesn't even allow them to abide by the rules of an equivalence class [by requiring that certain values compare unequal to themselves]. I don't think an "unsigned" floating-point type could abide by any axioms which an ordinary floating-point type could not, so I'm not sure what advantages it would offer.
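
A small demonstration of both claims, assuming a 32-bit unsigned int and IEEE-754 floats (typical, though not guaranteed by the language alone):

#include <cstdio>
#include <limits>

int main()
{
    // Unsigned ring property: (x - y) + y == x holds for all values,
    // even when x - y wraps around.
    unsigned x = 3, y = 1000000000;
    printf("%u\n", (x - y) + y);     // 3, guaranteed

    // Floats make no such promise: rounding loses the 3 entirely.
    float fx = 3.0f, fy = 1e20f;
    printf("%g\n", (fx + fy) - fy);  // 0, not 3

    // And NaN breaks even the reflexivity of equality.
    float nan = std::numeric_limits<float>::quiet_NaN();
    printf("%d\n", nan == nan);      // 0 (false)
}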


Ferruccio

I think the main reason is that unsigned floats would have really limited uses compared to unsigned ints. I don't buy the argument that it's because the hardware doesn't support it. Older processors had no floating point capabilities at all, it was all emulated in software. If unsigned floats were useful they would have been implemented in software first and the hardware would have followed suit.


The PDP-7, C's first platform, had an optional hardware floating-point unit. The PDP-11, C's next platform, had 32-bit floats in hardware. The 80x86 came a generation later, with some technology that was a generation behind.
phuclv

IMHO it's because supporting both signed and unsigned floating-point types in either hardware or software would be too troublesome.

For integer types we can utilize the same logic unit for both signed and unsigned integer operations in most situations, thanks to the nice properties of 2's complement: the results are identical for add, sub, non-widening mul and most bitwise operations. For operations that differentiate between signed and unsigned versions we can still share the majority of the logic. For example:

Arithmetic and logical shift need only a slight change in the filler for the top bits

Widening multiplication can use the same hardware for the main part and then some separate logic to adjust the result for signedness. Not that this is necessarily used in real multipliers, but it's possible to do.

Signed comparison can be converted to unsigned comparison, and vice versa, easily by toggling the top bit or adding INT_MIN. Also just theoretically possible and probably not used in hardware, yet it's useful on systems that support only one type of comparison (like the 8080 or 8051); see the sketch after this list.

Systems that use 1's complement also need just a little modification to the logic, because it's simply the carry bit wrapped around to the least significant bit. I'm not sure about sign-magnitude systems, but it seems they use 1's complement internally, so the same thing applies.
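
A sketch of the comparison trick mentioned in the list, assuming a 32-bit 2's complement int (bias is a made-up helper name):

#include <cassert>
#include <climits>

// Maps a signed int onto an unsigned int so that an unsigned
// comparison reproduces the signed ordering: flipping the top bit
// (equivalent to subtracting INT_MIN modulo 2^32) shifts
// INT_MIN..INT_MAX monotonically onto 0..UINT_MAX.
unsigned bias(int v)
{
    return (unsigned)v ^ 0x80000000u;  // assumes 32-bit int
}

bool signed_less_via_unsigned(int a, int b)
{
    return bias(a) < bias(b);
}

int main()
{
    assert(signed_less_via_unsigned(-5, 3));
    assert(!signed_less_via_unsigned(7, -2));
    assert(signed_less_via_unsigned(INT_MIN, INT_MAX));
    return 0;
}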

Unfortunately we don't have that luxury for floating-point types. Simply freeing up the sign bit would give us the unsigned version. But then what should we use that bit for?

Increase the range by adding it to the exponent

Increase the precision by adding it to the mantissa. This is often more useful, as we generally need more precision than range

But both choices need a bigger adder to accommodate the wider value range. That increases the complexity of the logic while the adder's top bit sits there unused most of the time. Even more circuitry would be needed for multiplication, division and other complex operations.

On systems that use software floating point, you'd need two versions of each function, which wasn't acceptable at a time when memory was so expensive, or you'd have to find some "tricky" way to share parts of the signed and unsigned functions.

However, floating-point hardware existed long before C was invented, so I believe the choice in C was due to the lack of hardware support, for the reasons I mentioned above.

That said, there exist several specialized unsigned floating-point formats, mainly for image-processing purposes, like the Khronos group's 10- and 11-bit floating-point types.


ABaumstumpf

Good Question.

If, as you say, it is only for compile-time warnings, with no change in behavior otherwise, then the underlying hardware is not affected, and as such it would only be a C++/compiler change.

I have wondered the same previously, but the thing is: it would not help much. At best the compiler can catch static assignments.

unsigned float uf { 0 };
uf = -1.0f;

Or, slightly longer:

unsigned float uf { 0 };
float f { 2 };
uf -= f;

But that's about it. With unsigned integer types you also get defined wraparound behavior; they follow modular arithmetic.

unsigned char uc { 0 };
uc -= 1;

After this, 'uc' holds the value 255.

Now, what would a compiler do in the same scenario with an unsigned float type? If the values are not known at compile time, it would need to generate code that first executes the calculation and then does a sign check. But what if the result of such a computation is, say, "-5.5"? Which value should be stored in a float declared unsigned? One could try modular arithmetic as for integral types, but that comes with its own problems: the largest value is unarguably infinity, and that does not work; you cannot have "infinity - 1". Going for the largest distinct value it can hold won't really work either, as there you run into its precision. "NaN" would be a candidate, but then you lose any and all information about what the number originally contained - not really helpful, as you would now need to check for that specifically, so you might as well check whether the number is positive yourself.
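
To make the dilemma concrete, here is a hypothetical run-time check of the kind such a compiler would have to emit, with NaN chosen as the sentinel (purely illustrative; store_unsigned is a made-up name):

#include <cmath>
#include <cstdio>

// Hypothetical lowering of "unsigned float uf = a - b;": compute,
// then sign-check. Storing NaN records "this went negative" but loses
// the magnitude, which is exactly the information loss described above.
float store_unsigned(float result)
{
    return result < 0.0f ? std::nanf("") : result;
}

int main()
{
    float uf = store_unsigned(2.0f - 7.5f);  // mathematically -5.5
    printf("%f\n", uf);                      // nan
}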

Lastly, this would not be a problem with fixed-point numbers, as modulo is well defined there.


Dwayne Robinson

I'm not aware of any programming language that supports unsigned floating-point numbers. Any idea why they don't exist?

Unsigned floats exist. See the unsigned float16 (11 fraction bits, 5 exponent bits, 0 sign bits) used by GPU hardware in the HDR format DXGI_FORMAT_BC6H. It's just that they're uncommon enough across most computing hardware that mainstream programming languages omit them. In this usage, the sign is omitted because colors darker than black make no sense anyway.

Even the far more common IEEE half, the signed float16_t, which is used quite frequently in graphics and machine learning for HDR images and lower-bandwidth tensors, hasn't received the honor of being incorporated into C/C++ (though more domain-specific languages like CUDA/HLSL do have half/float16_t, and there have been C++ proposals too). So if even signed float16 can't make it into C++ outside of compiler-specific extensions (e.g. gcc's __fp16), then an unsigned float16 has little hope :b, and not even CUDA or HLSL have the unsigned type in the language, just in the texture definition itself (found in a .DDS file or in GPU texture memory). Until then, we'll have to continue implementing more exotic types without compiler help, via helper libraries.


Brian Ensink

I suspect it is because the underlying processors targeted by C compilers don't have a good way of dealing with unsigned floating point numbers.


Did the underlying processors have a good way of dealing with signed floating-point numbers? C was getting popular when floating-point auxiliary processors were idiosyncratic and hardly universal.
I don't know all the historical timelines but there was emerging hardware support for signed floats, although rare as you point out. Language designers could incorporate support for it while compiler backends had varying levels of support depending on the targeted architecture.