Data Types | Part I - C Language

THE BUILT IN DATA TYPES IN C

Introduction

Data types are provided to store the various kinds of data that are processed in real life. A student's record might contain the following data: name, roll number, and grade percentage. For example, a student named Anil might be assigned roll number 5 and have a grade percentage of 78.67. The roll number is an integer without a decimal point, the name consists of alphabetic characters, and the grade percentage is a number with a decimal point. C supports representation of this data and provides statements for processing it. In general, data is stored in the program in variables, and the kind of data a variable can hold is specified by its data type. In this example, grade percentage has a float data type, and roll number has an integer data type. The data type is attached to the variable at the time of declaration, and it remains attached to the variable for the lifetime of the program. The data type indicates what information is stored in the variable, the amount of memory allocated for storing the data in the variable, and the operations that can be performed on the variable. For example, the operation S1 * S2, where S1 and S2 are character strings, is not valid because character strings cannot be multiplied.

Program

// the program gives the maximum value of int and the value after overflow
#include <stdio.h>
int main(void)
{
    int i, j; // A
    i = 1;
    while (i > 0)
    {
        j = i;
        i++; // signed overflow is formally undefined in C; most machines wrap around
    }
    printf("the maximum value of integer is %d\n", j);
    printf("the value of integer after overflow is %d\n", i);
    return 0;
}

Explanation

In this program there are two variables, i and j, of the type integer, which are declared in statement A.
The variables should be declared in the declaration section at the beginning of the block (C99 and later also allow declarations after statements).
If you use variables without declaring them, the compiler returns an error.

Points to Remember

C supports various data types such as float, int, char, etc., for storing data.
The variables should be declared by specifying the data type.
The data type determines the number of bytes to be allocated to the variable and the valid operations that can be performed on the variable.

VARIOUS DATA TYPES IN C

Introduction

C supports various data types for processing information. There is a family of integer data types and floating-point data types. Characters are stored internally as integers, and they are interpreted according to the character set. The most commonly used character set is ASCII. In the ASCII character set, A is represented by the number 65.

Program/Examples

The data type families are as follows:

Integer family
 char data type
 int data type
 short int data type
 long int data type

These data types differ in the amount of storage space allocated to their respective variables. Additionally, each type has two variants, signed and unsigned, which will be discussed later.

Float family (real numbers with decimal points)
 float data type
 double data type

(ANSI has also specified long double, which occupies at least as much storage space as double; the exact size depends on the implementation.)

Explanation

Data type determines how much storage space is allocated to variables.
Data type determines the permissible operations on variables.

Points to Remember

C has two main data type families: integer for representing whole numbers and characters of text data, and float for representing real numbers.
Each family has sub-data types that differ in the amount of storage space allocated to them.
In general, the data types that are allocated more storage space can store larger values.

THE INTEGER DATA TYPE FAMILY

Introduction

Integer data types are used for storing whole numbers and characters. The integers are internally stored in binary form.

Program/Example

Here is an example that shows how integers are stored in the binary form.

Number =13

Decimal representation = 1*10^1 + 3*10^0
Binary representation = 1101 = 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0

Each 1 or 0 is called a bit, thus the number 13 requires 4 bits.

In the same way, the number 130 is 1000 0010 in binary.

If the general data type is char, 8 bits are allocated. Using 8 bits, you can normally represent decimal numbers from 0 to 255 (0000 0000 to 1111 1111). This is the case when the data type is unsigned char. However, with signed char, the leftmost bit is used to represent the sign of the number. If the sign bit is 0, the number is positive, but if it is 1, the number is negative.

Binary representation of the following numbers in signed char is as follows:

Number = 127 Binary representation = 0111 1111 (leftmost bit is 0, indicating positive.)

Number = −128 Binary representation = 1000 0000 (leftmost bit is 1, indicating negative.)

The negative numbers are stored in a special form called "2's complement". It can be explained as follows:

Suppose you want to represent −127:

Convert 127 to binary form, i.e. 0111 1111.
Complement each bit: put a 0 wherever there is 1 and for 0 put 1. So you will get 1000 0000.

Add 1 to the above number

 1000 0000
 + 1
-------------
 1000 0001 (−127)

Thus in the signed char you can have the range −128 to +127, i.e. (−2^7 to 2^7−1).

The binary representation also indicates the values in the case of overflow. Suppose you start with value 1 in char and keep adding 1. You will get the following values in binary representation:

0000 0001 (1)
0111 1111 (127)
1000 0000 (-128)
1000 0001 (-127)

In the case of unsigned char you will get

0000 0001 (1)
0111 1111 (127)
1000 0000 (128)
1000 0001 (129)
1111 1111 (255)
0000 0000 (0)

This concept is useful in finding out the behavior of the integer family data types.

The storage typically allocated to the integer family data types (1 byte = 8 bits) is shown in Table 2.1.

Table 2.1: Integer data type storage allocations
Data Type	Allocation	Range
signed `char`	1 byte	−2^7 to 2^7−1 (−128 to 127)
unsigned `char`	1 byte	0 to 2^8−1 (0 to 255)
`short`	2 bytes	−2^15 to 2^15−1 (−32768 to 32767)
unsigned `short`	2 bytes	0 to 2^16−1 (0 to 65535)
`long int`	4 bytes	−2^31 to 2^31−1 (−2,147,483,648 to 2,147,483,647)
`int`	2 or 4 bytes depending on implementation	Range for 2 or 4 bytes as given above

Explanation

In C, the range of the number depends on the number of bytes allocated and whether the number is signed.
If the data type is unsigned the lower value is 0 and the upper depends on the number of bytes allocated.
If the data type is signed then the leftmost bit is used as a sign bit.
The negative number is stored in 2's complement form.
The overflow behavior is determined by the binary representation and its interpretation, that is, whether or not the number is signed.

Points to Remember

The behavior of a data type can be analyzed according to its binary representation.
To interpret a binary representation, you have to know whether the data type is signed or unsigned.

OVERFLOW IN char AND UNSIGNED char DATA TYPES

Introduction

Overflow means you are carrying out an operation such that the value either exceeds the maximum value or is less than the minimum value of the data type.

Program

// the program gives the maximum value of char and the value after overflow
#include <stdio.h>
int main(void)
{
    char i, j; // assumes plain char is signed on this implementation
    i = 1;
    while (i > 0) // A
    {
        j = i; // B
        i++; // C
    }
    printf("the maximum value of char is %d\n", j);
    printf("the value of char after overflow is %d\n", i);
    return 0;
}

Explanation

This program is used to calculate the maximum positive value of char data type and the result of an operation that tries to exceed the maximum positive value.
The while loop is terminated when the value of i is negative, as given in statement A. This is because if you try to add 1 to the maximum value you get a negative value, as explained previously (127 + 1 gives −128).
The variable j stores the previous value of i as given in statement B.
The program determines the maximum value as 127. The value after overflow is -128.
The initial value of i is 1 and it is incremented by 1 in the while loop. After i reaches 127, the next value is -128 and the loop is terminated.

Points to Remember

In the case of signed char, if you continue adding 1 then you will get the maximum value, and if you add 1 to the maximum value then you will get the most negative value.
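You can try this program for short and int, but be careful when you are using int. If the implementation is 4 bytes, the while loop needs about two billion iterations before it terminates. Also note that overflow of a signed int is formally undefined in C, so the wraparound is what most machines do rather than a guarantee.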
You can try this program for unsigned char. Here you will get the maximum value, 255. The value after overflow is 0.

THE char TYPE

Introduction

Alpha characters are stored internally as integers. Since each character can have 8 bits, you can have 256 different character values (0–255). Each integer is associated with a character using a character set. The most commonly used character set is ASCII. In ASCII, "A" is represented as decimal value 65, octal value 101, or hexadecimal value 41.

Explanation

If you declare c as a character:

char c;

then you can assign A in any of the following ways:

c = 'A';
c = 65;
c = '\x41'; // Hexadecimal representation
c = '\101'; // Octal representation

You cannot write c = "A" because "A" is interpreted as a string.

Escape Sequence

Certain characters are not printable but can be used to give directives to functions such as printf. For example, to move printing to the next line you can use the character "\n". These characters are called escape sequences. Though the escape sequences look like two characters, each represents only a single character.

The complete selection of escape sequences is shown here.

\a	alert (bell) character	\\	backslash
\b	backspace	\?	question mark
\f	form feed	\'	single quote
\n	new line	\"	double quote
\r	carriage return	\ooo	octal number
\t	horizontal tab	\xhh	hexadecimal number
\v	vertical tab

Points to Remember

Characters are stored as a set of 256 integers (0–255) and the integer value is interpreted according to the character set.
The most common character set is ASCII.
You can give directive to functions such as printf by using escape sequence characters.

OCTAL NUMBERS

Introduction

You can represent a number by using the octal number system; that is, base 8. For example, if the number is 10, it can be represented in octal as 12, that is, 1*8^1 + 2*8^0.

Explanation

Octal numbers are printed by using the format "%o". An octal constant in a program is written with a leading 0, for example 012.

HEXADECIMAL NUMBERS

Introduction

Hexadecimal numbers use base 16. The characters used in hexadecimal numbers are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. For example, if the decimal number is 22, it is represented as 16 in the hexadecimal representation: 1*16^1 + 6*16^0.

Explanation

You can print numbers in hexadecimal form by using the format "%x". A hexadecimal constant in a program is written with the prefix "0x", for example 0x16.

REPRESENTATION OF FLOATING POINT NUMBERS

Introduction

Floating-point numbers are represented by two components: an exponent and a fraction. For example, the number 200.07 can be represented as 0.20007*10^3, where 0.20007 is the fraction and 3 is the exponent. In binary form, they are represented similarly. There are two types of representation: short or single-precision floating-point number and long or double-precision floating-point number. short occupies 4 bytes or 32 bits while long occupies 8 bytes or 64 bits.

Program/Example

In C, short or single-precision floating point is represented by the data type float and appears as:

float f ;

A single-precision floating-point number is represented as follows:

[Figure: layout of a single-precision floating-point number, showing the sign bit, the exponent, and the fraction]

Here the fractional part occupies 23 bits from 0 to 22. The exponent part occupies 8 bits from 23 to 30 (it is stored as a biased exponent, that is, exponent + 0111 1111). The sign bit occupies the 31st bit.

Suppose the decimal number is 100.25. It can be converted as follows:

Convert 100.25 into its equivalent binary representation: 1100100.01.
Then represent this number so that there is only 1 bit on the left side of the binary point: 1.10010001*2^6
In a binary representation, exponent 6 means the number 110. Now add the bias, 0111 1111, to get the biased exponent: 1000 0101

Since the number is positive, the sign bit is 0. The significand, or fractional, part is:

1001 0001 0000 0000 0000 000

Note that in the fractional part, only those bits that are on the right side of the binary point are present; the leading 1 is not stored. The 0s are added to the right side to make the fractional part take up 23 bits.

Special rules are applied for some numbers:

The number 0 is stored as all 0s in the exponent and the fractional part; the sign bit may be 0 or 1 (positive and negative zero).
Positive infinity is represented as all 1s in the exponent and all 0s in the fractional part with the sign bit 0.
Negative infinity is represented as all 1s in the exponent and all 0s in the fractional part with the sign bit 1.
A NaN (not a number) is an invalid floating number in which all the exponent bits are 1 and the fractional part is not all 0s.

The range of the float data type is 10^−38 to 10^38 for positive values and −10^38 to −10^−38 for negative values.

The values are accurate to 6 or 7 significant digits depending on the actual implementation.

Conversion of a number in the floating-point form to a decimal number

Suppose the number has the following components:

Sign bit: 1
Exponent: 1000 0011
Significand or fractional part: 1001 0010 0000 0000 0000 000

Since the exponent is biased, find the unbiased exponent by subtracting the bias:
1000 0011 − 0111 1111 = 100 (the number 4)

Represent the number as 1.1001001*2^4

Represent the number without the exponent as 11001.001

Convert the binary number to decimal and apply the sign (the sign bit is 1, so the number is negative): −25.125

For double precision, you can declare the variable as double d; it is represented as

[Figure: layout of a double-precision floating-point number, showing the sign bit, the exponent, and the fraction]

Here the fractional part occupies 52 bits from 0 to 51. The exponent part occupies 11 bits from 52 to 62 (the biased exponent is the exponent plus 011 1111 1111). The sign bit occupies bit 63. The range of double representation is +10^−308 to +10^308 and −10^308 to −10^−308. The precision is 10 or more digits (typically about 15).

Formats for representing floating points

Following are the valid representations of floating points:

0.23456
2.3456E-1
 .23456
 .23456e-2
2.3456E-4
-.232456E-4
2345.6
23.456E2
-23456
23456e3

Following are the invalid formats:

e1
 2.5e-.5
25.2-e5
 2.5.3

You can determine whether a format is valid or invalid based on the following rules:

The value can include a sign, it must include a numerical part, and it may or may not have exponent part.
The numerical part can be of following form:

d.d, d., .d, d, where d is a set of digits.
If the exponent part is present, it should be represented by ‘e’ or ‘E’, which is followed by a positive or negative integer. It should not have a decimal point and there should be at least 1 digit after ‘E’.
All floating numbers have decimal points or ‘e’ (or both).
When ‘e’ or ‘E’ is used, it is called scientific notation.
When you write a constant, such as 50, it is interpreted as an integer. To interpret it as floating point you have to write it as 50.0 or 50. or 50e0.

You can use the format %f for printing floating numbers. For example, printf("%f\n", f);

%f prints output with 6 decimal places. If you want to print output with a width of 8 columns and 3 decimal places, you can use the format %8.3f. For printing double with printf you can also use %f, because a float argument is promoted to double; %lf is accepted as well.

Floating-point computation may give incorrect results in the following situations:

If the calculated value has a precision that exceeds the precision limit of the type;
If the calculated value exceeds the range allowable for the type;
If the two calculated values involve approximation then their operation may involve approximation.

Points to Remember

C provides two main floating-point representations: float (single precision) and double (double precision).
A floating-point number has a fractional part and a biased exponent.
Float occupies 4 bytes and double occupies 8 bytes.

TYPE CONVERSION

Introduction

Type conversion occurs when the expression has data of mixed data types, for example, converting an integer value into a float value, or assigning the value of the expression to a variable with different data types.

Program/Example

In type conversion, the data type is promoted from lower to higher because converting higher to lower involves loss of precision and value.

For type conversion, C maintains a hierarchy of data types using the following rules:

Integer types are lower than floating-point types.
Signed types are lower than unsigned types.
Short whole-number types are lower than longer types.
The hierarchy of data types is as follows: double, float, long, int, short, char.

These general rules are accompanied by specific rules, as follows:

If either operand in the mixed expression is of the double data type, the other operand is also converted to double and the result will be double.
If either operand is of the unsigned long data type, then the other operand is also converted to unsigned long and the result will be unsigned long.
Float is promoted to double.
If the expression includes long and unsigned integer data types, the unsigned integer is converted to unsigned long and the result will be unsigned long.
If the expression contains long and any other data type, that data type is converted to long and the result will be long.
If the expression includes unsigned integer and any other data type, the other data type is converted to an unsigned integer and the result will be unsigned integer.
Character and short data are promoted to integer.
Unsigned char and unsigned short are converted to unsigned integer.

FORCED CONVERSION

Introduction

Forced conversion occurs when you convert a value of a larger data type to a smaller data type. For example, suppose the declaration is char c; and you use the expression c = 300; Since the maximum possible value for c is 127, the value 300 cannot be accommodated in c. In such a case, the integer 300 is converted to char using forced conversion.

Program/Example

In general, forced conversion occurs in the following cases:

When an expression gives a larger data type but the variable has a smaller data type.
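When a function is written using a smaller data type but you call the function by using a larger data type. Note that this applies only when the compiler knows the parameter type; in printf, if you specify %d but provide a floating-point value, no conversion takes place and the printed value is unpredictable.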

Forced conversion is performed according to following rules:

Normally, when floating points are converted to integers, truncation occurs. For example, 10.76 is converted to 10.
When double is converted to float, the values are rounded or truncated, depending on implementation.
When longer integers are converted to shorter ones, only the lower bits are preserved and high-order bits are skipped. For example, the bit representation of 300 is 1 0010 1100. If it is assigned to character, the lower bits are preserved since a character can have 8 bits. So you will get the number 0010 1100 (44 in decimal).

In the case of type conversion, lower data types are converted to higher data types, so it is better to write a function using higher data types such as int or double even if you call the function with char or float. C provides built-in mathematical functions such as sqrt (square root) which take the argument as double data type. Suppose you want to call the function by using the integer variable ‘k’. You can call the function

sqrt((double) k)

This is called type casting, that is, converting the data type explicitly. Here the value ‘k’ is properly converted to the double data type value.

Points to Remember

C makes forced conversion when it converts from higher data type to lower data type.
Forced conversion may decrease the precision or convert the value to one that doesn't have a relation with the original value.
Type casting is the preferred method of forced conversion.

TYPE CASTING

Introduction

Type casting is used when you want to convert the value of a variable from one type to another. Suppose you want to print the value of a double data type in integer form. You can use type casting to do this. Type casting is done using the cast operator, which is the name of the target data type in parentheses.

Program


#include <stdio.h>
int main(void)
{
    double d1 = 123.56; // A
    int i1 = 456; // B

    // C and E pass an argument whose type does not match the placeholder;
    // this is undefined behavior and is shown only for illustration
    printf("the value of d1 as int without cast operator %d\n", d1); // C
    printf("the value of d1 as int with cast operator %d\n", (int)d1); // D
    printf("the value of i1 as double without cast operator %f\n", i1); // E
    printf("the value of i1 as double with cast operator %f\n", (double)i1); // F
    i1 = 10;
    printf("effect of multiple unary operator %f\n", (double)++i1); // G
    i1 = 10; // H
    //printf("effect of multiple unary operator %f\n", (double) ++ -i1); // error: I
    i1 = 10;
    printf("effect of multiple unary operator %f\n", (double)- ++i1); // J
    i1 = 10; // K
    printf("effect of multiple unary operator %f\n", (double)- -i1); // L
    i1 = 10; // M
    printf("effect of multiple unary operator %f\n", (double)-i1++); // N
    return 0;
}

Explanation

Statement A defines variable d1 as double.
Statement B defines variable i1 as int.
Statement C tries to print the value of d1 using the placeholder %d without a cast. The argument type does not match the placeholder, so the behavior is undefined, and you will typically see some random value printed.
Statement D prints the value of d1 using a cast operator. You will see that it prints the value correctly (the fractional part is truncated).
Statement E tries to print i1 using the placeholder %f without a cast, which again gives an unpredictable value. Statement F prints the value of i1 using a cast operator, and it prints correctly.
Statements from G onwards give you the effects of multiple unary operators. A cast operator is also a unary operator.
Unary operators are associated from right to left, that is, the left unary operator is applied to the right value.
Statement G gives the effect of the cast operator double combined with the increment operator. In this case i1 is first incremented and then type casting is done.
If you do not comment out statement I you will get errors. In ++ -i1 the increment operator is applied to the result of -i1, which is a value and not a variable, so it cannot be incremented. Sequences of + and − signs written without spaces can also be misread: for example, +++i is taken as the increment operator ++ followed by unary +, not as unary + followed by ++.
Statement J does not produce an error because the increment operator is applied to the variable i1 first and unary − is applied to the result. The space keeps − and ++ as separate operators; in statement L the space is what prevents - - from being read as the decrement operator --.

Points to Remember

Type casting is used when you want to convert the value of one data type to another.
Type casting does not change the actual value of the variable, but the resultant value may be put in temporary storage.
Type casting is done using a cast operator that is also a unary operator.
The unary operators are associated from right to left.
