DEV Community

Mohamed Ayman

Floating Point Representation (IEEE 754 ISSUE)

How do computers understand floating-point numbers?

To understand how computers interpret decimal numbers like 18.50, we need to think like a computer. For us it is straightforward: we simply see 18.50. For a computer it is more complex. Computers use binary, where 5 is represented as 101. However, 101 represents 5, not 0.5. We might think of storing 0.5 as 0.101, but memory holds only bits, and there is no bit pattern for the point character itself. (Even read as a binary fraction, 0.101 would be 0.625, not 0.5.) The point is merely a visual aid for us, marking the separation between the whole part and the fraction. Understanding how computers store and retrieve this data is key to understanding how they process floating-point numbers.
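To make the positional idea concrete, here is a minimal sketch in Python (used only for illustration; the post itself contains no code). The helper name `binary_to_fraction` is made up for this example. It evaluates a binary string by giving each digit after the point the weight 1/2, 1/4, 1/8, and so on.

```python
from fractions import Fraction


def binary_to_fraction(text):
    """Evaluate a binary string such as '10010.1' using positional weights."""
    whole, _, frac = text.partition(".")
    value = Fraction(int(whole, 2)) if whole else Fraction(0)
    # Digit k after the point carries the weight 2**-k (1/2, 1/4, 1/8, ...).
    for k, digit in enumerate(frac, start=1):
        if digit == "1":
            value += Fraction(1, 2 ** k)
    return value


print(float(binary_to_fraction("10010.1")))  # 18.5
print(float(binary_to_fraction("0.1")))      # 0.5
print(float(binary_to_fraction("0.101")))    # 0.625
```

So 18.50 is 10010.1 in binary, and 0.5 is 0.1. The open question is how to store where the point sits.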

How many bits for a floating-point number?

This is the second problem. If I represent a fractional number in a computer, how many bits should go to the integer part and how many to the fractional part? An integer data type takes 4 bytes, but a number in the float data type has two parts: an integer part and a fractional part. The float data type also takes 32 bits (4 bytes), and there is more than one way to split them. One person might say: in my application the integer part needs more room than the fraction, so I will take 20 bits for the integer part, 11 for the fractional part, and the remaining bit for the sign (positive or negative).
Another might say: I will take 24 bits for the integer part, 7 for the fractional part, and the remaining bit for the sign. Clearly, different implementations are possible. When scientists and manufacturers saw this, they quickly moved toward a standard that unifies the implementation, so that developers do not have to learn a different representation of the data type on every system.
The question now is: how do we make a standard?
We need engineers who work with electronics and electricity, because the binary system is ultimately just voltage: when a transistor carries a voltage its value is 1, and when it does not its value is 0. The body that brings such engineers together is the IEEE (Institute of Electrical and Electronics Engineers). A standard it publishes is named after the institute, followed by a number that identifies the standard. Floating-point representation is described in the IEEE 754 standard.

To avoid confusion, here is a summary of the two problems. First, how does a computer understand a fractional number? (If you answer "with the point character", remember that the point is a visual representation for humans, who know what it means and can do the calculation in their heads.) Second, how many bits go to the integer part and how many to the fractional part? We need a standard so that not everyone invents their own implementation.
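The second problem can be sketched with the two hypothetical 32-bit layouts mentioned earlier (1 sign bit plus 20/11 or 24/7 bits). This is an illustration in Python, not something taken from any standard; `to_fixed` and `from_fixed` are names invented for the example. It shows that the same value is stored differently, and with different accuracy, depending on the split.

```python
def to_fixed(value, int_bits, frac_bits):
    """Pack value as: 1 sign bit | int_bits integer bits | frac_bits fraction bits."""
    sign = 1 if value < 0 else 0
    # Scale by 2**frac_bits so the fraction becomes part of an integer.
    magnitude = round(abs(value) * (1 << frac_bits))
    if magnitude >= 1 << (int_bits + frac_bits):
        raise OverflowError("value does not fit in this layout")
    return (sign << (int_bits + frac_bits)) | magnitude


def from_fixed(word, int_bits, frac_bits):
    """Undo to_fixed for the same layout."""
    sign = word >> (int_bits + frac_bits)
    magnitude = word & ((1 << (int_bits + frac_bits)) - 1)
    value = magnitude / (1 << frac_bits)
    return -value if sign else value


# 18.5 survives both layouts, because 0.5 is an exact binary fraction.
print(from_fixed(to_fixed(18.5, 20, 11), 20, 11))  # 18.5
print(from_fixed(to_fixed(18.5, 24, 7), 24, 7))    # 18.5

# 0.1 does not, and each layout gives a different approximation.
print(from_fixed(to_fixed(0.1, 20, 11), 20, 11))   # 0.10009765625
print(from_fixed(to_fixed(0.1, 24, 7), 24, 7))     # 0.1015625
```

Two programs that disagree about the split would read the same 32 bits as different numbers, which is the reason a shared standard is needed.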

Next, I will explain a few separate topics that we need in order to explain IEEE 754.

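Mantissa and exponent

This is how a computer stores and retrieves a floating-point number.
The number 6000.11 can also be written as 0.600011 x 10^4, where 0.600011 is the mantissa and 4 is the exponent.
A computer can work with this representation, because the position of the point is now just a number (the exponent) stored next to the digits. IEEE 754 does the same thing in base 2, with a power of 2 instead of a power of 10.
Retrieving the number is easy as well, and the representation is in a form that the computer's logic can process for operations such as addition and multiplication.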
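As a small illustration of the same split in base 2, Python's standard `math.frexp` returns a mantissa and an exponent for a number. Note the assumptions: Python's `float` is a 64-bit double rather than the 32-bit float discussed in this post, and `frexp` normalizes the mantissa to the range 0.5 to 1, which is a different convention from the 1.xxx form IEEE 754 stores.

```python
import math

x = 6000.11

# frexp splits x so that x == mantissa * 2**exponent and 0.5 <= mantissa < 1.
mantissa, exponent = math.frexp(x)
print(mantissa, exponent)

# ldexp puts the two parts back together without losing anything.
print(math.ldexp(mantissa, exponent) == x)  # True
```

The exponent here is 13, because 6000.11 lies between 2^12 = 4096 and 2^13 = 8192.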

Why does this cause bad behavior in floating-point numbers?

[image: the example, which uses the float values 9.1 and 0.1]

Let's work through this example step by step:

Convert 9 to binary: 1001. [image]
Convert 0.1 to binary. This is where the problem starts. The picture converts both 0.1 and 0.5, so that it is clear how a fractional part is converted to binary. [image] The number 9.1 now looks like 1001.0001100... and the conversion of 0.1 never terminates: the pattern 0011 repeats forever.
The standard says that in the float data type the mantissa takes 23 bits, the exponent takes 8 bits, and the remaining bit holds the sign.
Put the number in mantissa-and-exponent form: 1001.000110011001100110011... -> 1.001000110011001100110011 x 2^3, which has 24 bits after the point. The rule used here (a simplified form of IEEE 754 round-to-nearest) is: when there are more than 23 bits, look at the 24th bit. If it is 1, add 1 to the 23-bit mantissa; if it is 0, keep the first 23 bits unchanged. Here the 24th bit is 1, so the stored mantissa is 00100011001100110011010. [image]
Convert 0.1 to the same format in the same way. [image]
Next, I used an online binary-to-decimal converter on the two stored bit patterns. This is the conversion of 0.1 from its binary representation back to decimal [image] and this is the one for 9.1 [image]. Both stored values are slightly larger than the numbers we started with, because both mantissas were rounded up.
The result is: [image]
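The steps above can be reproduced with a short Python sketch. It is an illustration rather than the method used for the screenshots: `fraction_bits`, `float32_fields`, and `float32_value` are names invented here, and the 32-bit float is emulated with the standard `struct` module because Python's own `float` is a 64-bit double.

```python
import struct
from fractions import Fraction


def fraction_bits(frac, count):
    """First `count` binary digits of 0 <= frac < 1, by repeated doubling."""
    bits = []
    for _ in range(count):
        frac *= 2
        if frac >= 1:
            bits.append("1")
            frac -= 1
        else:
            bits.append("0")
    return "".join(bits)


def float32_fields(value):
    """Split the float32 encoding of value into (sign, exponent, mantissa) bits."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    bits = f"{word:032b}"
    return bits[0], bits[1:9], bits[9:]


def float32_value(value):
    """Round value to float32 and return the number that is actually stored."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


print(fraction_bits(Fraction(1, 2), 4))    # 1000 (0.5 terminates)
print(fraction_bits(Fraction(1, 10), 12))  # 000110011001 (0.1 keeps repeating)

print(float32_fields(9.1))  # ('0', '10000010', '00100011001100110011010')
print(float32_fields(0.1))  # ('0', '01111011', '10011001100110011001101')

# Both stored values ended up slightly above the intended numbers.
print(float32_value(9.1) > 9.1, float32_value(0.1) > 0.1)  # True True
```

The exponent field 10000010 is 130, which is the real exponent 3 plus the bias of 127 that IEEE 754 uses for the float type.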

Conclusion

Java addresses this behavior of floating-point representation with another data type called BigDecimal. It handles this behavior, but it is slower than float.
The choice is yours: prefer BigDecimal when your code needs more precision and you can accept slower performance, or choose the float data type when you do not need high precision.
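The same trade-off can be sketched in Python, whose standard `decimal.Decimal` type plays a role similar to Java's BigDecimal. This is an analogy, not Java code, and Python's `float` is a 64-bit double, so the exact digits differ from Java's 32-bit float.

```python
from decimal import Decimal

# Binary floating point: 0.1, 0.2 and 0.3 are all stored inexactly,
# so the sum does not compare equal to the literal 0.3.
print(0.1 + 0.2 == 0.3)  # False

# Decimal values built from strings keep the decimal digits exactly.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```

Building the decimal values from strings matters: a decimal built from an existing binary float inherits that float's rounding error.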