You have to understand data types.
Binary presents either a 0 or a 1, so doing addition you may immediately encounter overflow . Counting in binary starts at zero and the strings of 0's and 1's just get longer and longer:
- O
- 1
- 1O
- 11
- 100
- 101
- 110
- 111
- 100O
- 10O1
- 1010
This may not appear obvious, but adding binary 1 + 1 you get overflow, which gives you the result 10, in binary. This is similar to doing the decimal addition of 9 + 1; it won't fit into one character so you get overflow, giving you ten [10 in decimal].
This all comes back to binary arithmetic:
These three rules apply to any binary integer, irrespective of the number of bits used as long as there is a spare bit in the signed case (eg. -1 doesn't exist when you only have 1 bit to play with).
Integers come in various sizes: 8, 16, 32 and 64 bits (usually powers of 2), whether they are unsigned or unsigned .
Various architectures have used integers of strange sizes [36 bits], but most of those are now dead, or should be.
Neither of these is very difficult and the following example should suffice (we shall assume an 8 bit binary representation):
The process is quite simple.
- 000000001 - the value 1
- 111111110 - the value 1 [one's complement]
- 111111111 - the value -1 [two's complement]
From a binary value to one's complement the bits are flipped . Flipping turns zeroes into ones and ones into zeros. This is a common operation in machine code and some 3rd generation languages, say C.
From one's complement to two's complement 1 is added. At this point you have a signed [negative] integer. The leftmost bit being 1 indicates that the number is negative. Overflow here becomes a problem because a very large negative number can be be turned in 0 or any positive number < N or even some negative number <= -N.
An 8 bit number was chosen as an example because many machines do not natively support signed 8 bit numbers.
Floating point is used for measuring beer.The goal is to represent uncountable objects [beer], to a reasonable precision, using a countable object [9mm rounds] method. Such a method falls to the manipulation of integer values, but in an interesting way.
Of course, this may sound like madness, but it is simple in principle .
The integer is split into three pieces:
The real problems start because the representation is an approximation and is further complicated by:
Because of the above problems all sorts of caveats should be thought about. The worst are probably roundoff and comparison. Comparison causes problems because two approximate values are being compared and heuristics are commonly used to resolve this problem, depending on the calcuation involved.
Further reading is encouraged if floating point is to be understood fully.
Strings are used to represent and manipulate text.
This cuts down the choice of 3rd generation languages to a few. Limbo and Python both have strings as basic types. This removes all the woe involved in string allocation, garbage collection and overflows that have provoked more than one security problem.
Limbo is a better choice because the set of operators is small and well designed. However, Python seems to to designed from the point of view that every possible string operator should be added. This style of thinking just adds complexity and complexity leads to bugs.
Pascal has strings, but they are unusable in practice. A (short) defence of such a statement is why are there numerous versions of Pascal to get strings right?
C does support strings. It views them as arrays of
nul
terminated 8 bit integers. If you count wrong, you lose. The use of C's
sizeof
operator can be beneficial but gives no guarantees. Some good C programmers write
infinite string
libraries. However, most don't and they assume that any given array of 8 bit integers
will suffice. They don't.
Inherited data types and inheritence should be avoided, unless wading through a gruesome mess is one of your pastimes.
A curious form of an abstract data type is called a union [in C] where two different data types occupy the same memory, but comprise the the same entity.
They are named constants, usually having a signed integer value. Use of such them is encouraged to avoid literal integers in the code. Pick a name, assign it a value, use it everywhere.
Should you need to change its value it only needs to be done in one place and the effects will be global.
Enums [the pluralised name from C] can be used as union discriminators and constants. If you are writing C the preprocessor should be avoided, in favour of enums or other constants.
Your comments are welcome and if you liked this page you can donate via: