Working with Files

Introduction to files:

Many applications require that information be written to or read from an auxiliary storage device. Such information is stored on the device in the form of a data file. Data files therefore allow us to store information permanently and to access and alter it whenever necessary.
Previously, we used console-oriented I/O functions. A “console application” is an application that has a text-based interface.
Most applications work with a large amount of data. Entering this data through the console is time consuming, and the main drawback of console I/O is that the data is temporary: it is not available when the program is executed again.
File handling solves this problem. The data is stored on the disk and can be retrieved whenever required, and the output of a program can be stored on the disk as well. C provides many functions for file handling. A file is a collection of bytes stored on a secondary storage device (generally a disk).

Types of Files:

A file is simply a machine-readable storage unit in which programs and data are kept for use by the machine. A file can be of two types:

•    Text file
•    Binary file

The best example of a text file is a C source program, which consists of human-readable characters. A binary file, on the other hand, consists of bytes whose values need not correspond to printable characters. When a C program is compiled, it is converted to a binary file so that the machine can execute it. The two major differences between a text file and a binary file are:

•    A text file is usually divided into lines, and each line ends with one or more special characters that mark the end of the line. A binary file is not divided into lines.
•    On some systems a text file contains a special end-of-file character that marks the point where the file ends. A binary file has no end-of-file marker.

Text files:

A text file is a stream of characters that a computer can process sequentially. A text stream in C is a special kind of file. It is usually processed not only sequentially but also in the forward direction only. For this reason a text file is usually opened for only one kind of operation (reading, writing, or appending) at any given time.
Similarly, since text files deal only with characters, the data is conceptually read or written one character at a time, even when library functions such as fgets or fprintf handle several characters in one call.

Binary files:

A binary file differs from a text file: it is simply a collection of bytes. In C, a byte and a character are equivalent, so a binary file is also referred to as a character stream, but there are two essential differences. First, no special processing of the data occurs: each byte is transferred to or from the disk unprocessed. C imposes no structure on the file, and it may be read from, or written to, in any manner chosen by the programmer.
Binary files can be processed sequentially or, depending on the needs of the application, using random access techniques. In C, random access involves moving the current file position to an appropriate place in the file before reading or writing data. This leads to the second characteristic of binary files: they are often opened for reading and writing at the same time.

File pointers:
A file pointer is a pointer to a structure, which contains information about the file, including its name, current position of the file, whether the file is being read or written, and whether errors or end of the file have occurred. The user does not need to know the details, because the definitions obtained from stdio.h include a structure declaration called FILE. The only declaration needed for a file pointer is
 FILE *ptvar;
Where FILE (uppercase letters are required) is a special structure type that establishes the buffer area, and ptvar is a pointer variable that indicates the beginning of the buffer area. The structure type FILE is defined within a system include file, typically stdio.h. The pointer ptvar is often referred to as a stream pointer, or simply a stream.

Opening and Closing files:
Opening a file:
A file must be opened before it can be created or processed. This associates the file name with the buffer area, i.e., with a stream. It also specifies how the file will be utilized, i.e., as a read-only file, a write-only file, or a read/write file, in which both operations are permitted. The library function fopen is used to open a file. This is written as
ptvar = fopen (file-name, file-type);
Where file-name and file-type are strings that represent the name of the file and the manner in which the file will be utilized. The name chosen for the file-name must be consistent with the rules for naming files, as determined by the computer’s operating system.
File-type specifications:
"r" – open an existing file for reading only.
"w" – create a new file for writing only. If a file with the specified file-name already exists, it will be destroyed and a new file created in its place.
"a" – open an existing file for appending, i.e., for adding new information at the end of the file. A new file will be created if the file with the specified file-name does not exist.
"r+" – open an existing file for both reading and writing.
"w+" – create a new file for both reading and writing. If a file with the specified file-name already exists, it will be destroyed and a new file created in its place.
"a+" – open an existing file for both reading and appending. A new file will be created if the file with the specified file-name does not exist.
The fopen function returns a pointer to the beginning of the buffer area associated with the file. A null value is returned if the file cannot be opened as, for example, when an existing file cannot be found.
Reading from a file:
After a file is opened, the file pointer (called fp here, for a file such as file.txt) refers to the first character of the file. In order to read data from the file, we can use the fgetc () function.
Syntax: fgetc (file pointer);
The character read should be saved in a variable of type int, because fgetc returns an int so that EOF can be distinguished from every valid character.
int ch;
ch = fgetc (fp);
The fgetc () function reads the character at the current position of the file pointer. After reading, the position moves on to the next character, so all the characters are read one by one. This process continues until EOF (end of file) is reached.
Writing to a file:
After opening a file, we can write to it by using the fprintf () function.
fprintf (file pointer, "format specifiers", identifiers);
int x=4;
fprintf (fp, "%d", x);
Closing a file:
Finally, a file must be closed when the program has finished with it. There is usually a limit on the number of files that can be open at one time, so it is important to close a file once it has been used. This frees system resources and reduces the risk of exceeding the limit. The fclose function closes a stream that was opened by a call to fopen. It writes any data still remaining in the buffer to the file. The syntax is simply
fclose (ptvar);
The function fclose returns the integer value 0 for successful closure; any other value (EOF in standard C) indicates an error. fclose generally fails when a disk has been prematurely removed from the drive or there is no more space on the disk.
Another function used for closing streams is fcloseall. It is not part of standard C but is provided as an extension by some libraries, so its availability and return value depend on the implementation. It is useful when many open streams have to be closed at the same time. In the form described here, it closes all open streams and returns the number of streams closed, or EOF if any error is detected.
FILE *fpt;
fpt = fopen ("sample.dat", "w");
fclose (fpt);

Modifying and Deleting files:
Most file applications require that a data file be altered as it is being processed. For example, in an application involving the processing of customer records, it may be desirable to add new records to the file either at the end of the file or interspersed among the existing records, to delete existing records, to modify the contents of existing records, or to rearrange the records. These requirements in turn suggest several different computational strategies.

Removing or deleting a file:
With function remove () we can delete a file. Let us take a look at an example:
#include <stdio.h>
int main (void)
{
    char buffer [101];
    printf ("Name of file to delete:  ");
    gets_s (buffer, sizeof buffer);    /* reads at most 100 characters */
    if (remove (buffer) == 0)
        printf ("File %s deleted.\n", buffer);
    else
        fprintf (stderr, "Error deleting the file %s.\n", buffer);
    return 0;
}
First a buffer is made with room for 100 characters, plus the terminating '\0'. After the question “Name of file to delete:” is printed on the screen, gets_s waits for input of a maximum of 100 characters. In the “if statement” the function remove is used to remove the file. If the file is successfully deleted, a success message is printed. If something goes wrong, for example because the file doesn’t exist, an error message is printed on the standard error stream.

Renaming a File:
With the function rename () we can rename a file.
#include <stdio.h>
int main (void)
{
    char buffer_old[101], buffer_new[101];
    printf ("Current filename: ");
    gets_s (buffer_old, sizeof buffer_old);    /* at most 100 characters */
    printf ("New filename: ");
    gets_s (buffer_new, sizeof buffer_new);
    if (rename (buffer_old, buffer_new) == 0)
        printf ("%s has been renamed to %s.\n", buffer_old, buffer_new);
    else
        fprintf (stderr, "Error renaming %s.\n", buffer_old);
    return 0;
}
As in the previous example, a buffer of 100 characters, plus one character for '\0', is created for the filename. This time a second buffer is also created to hold the new name.
In the “if statement” the function rename () is used. The function rename () takes two arguments, one for the old name and one for the new file name. A message is printed whether the renaming succeeds or fails.

Interacting with Text files:

Text files contain textual data and may be saved in plain text or rich text formats. While most text files are documents created and saved by users, they can also be used by software developers to store program data. Examples of text files include word processing documents, log files, and saved email messages. Common text file extensions include .TXT, .RTF, .LOG, and .DOCX.
•    A plain text file commonly has the .txt extension.
•    The text file format contains very little formatting.
•    The precise definition of the .txt format is not specified, but it typically matches the format accepted by the system terminal or a simple text editor.
•    Files with the .txt extension can easily be read or opened by any program that reads text and, for that reason, are considered universal or platform independent.
•    A text file consists mostly of printable characters.
.TXT file extension: Standard text document that contains unformatted text; recognized by any text editing or word processing program; can also be processed by most other software programs. Generic text files with filenames ending in ".txt" are created by Notepad for Windows.
.DOCX file extension: Document created by word processing software; contains text, images, formatting, styles, drawn objects, and other document elements; used for authoring, business, academic, and personal documents; is one of the most popular word processing document formats.
Non-text files:
Data files or non-text files are the most common type of computer files. They are also called binary files. They may be installed with applications or created by users. Most data files are saved in a binary format, though some store data as plain text. Examples of data files include libraries, project files, and saved documents. Common data file extensions include .DAT, .XML, and .VCF.
•    Binary files contain information coded mostly in binary format.
•    Binary files are difficult for humans to read.
•    Binary files can be processed only by the applications or processors that understand their format.
•    Only such programs can interpret the complex formatting information stored in binary format.
•    Humans can read binary files only after they have been processed.
•    All executable files are binary files.
.DAT file extension: Generic data file created by a specific application; typically accessed only by the application that created the file; may contain data in text or binary format, and text-based DAT files can be viewed in a text editor. Many programs create, open, or reference DAT files. Additionally, many DAT files are only used for application support and are not meant to be opened manually by the user.
.XML file extension: XML (Extensible Markup Language) data file that uses tags to define objects and object attributes; formatted much like an .HTML document, but uses custom tags to define objects and the data within each object; can be thought of as a text-based database. XML files have become a standard way of storing and transferring data between programs and over the Internet. Because they are formatted as text documents, they can be edited with a basic text editor.

Command Line Arguments:
Most versions of C permit two arguments that allow parameters to be passed to main from the operating system, which are traditionally called argc and argv, respectively. The first of these, argc, must be an integer variable, while the second, argv, is an array of pointers to characters; i.e., an array of strings. Each string in this array will represent a parameter that is passed to main. The value of argc will indicate the number of parameters passed.
argc     - number of arguments on the command line, including the program name
argv [] - array of strings holding all the arguments, starting with the program name
The following indicates how the arguments argc and argv are defined within main.
 int main (int argc, char *argv [])
A program is normally executed by specifying the name of the program within a menu-driven environment. Many systems also allow a program to be executed by specifying the name of the program at the operating system level. The program name is then interpreted as an operating system command. Hence, the line in which it appears is generally referred to as the command line. In order to pass one or more parameters to the program when it is executed from the operating system, the parameters must follow the program name on the command line:
program-name parameter 1 parameter 2 …… parameter n
The program name will be stored as the first item in argv, followed by each of the parameters. Hence, if the program name is followed by n parameters, there will be (n+1) entries in argv, ranging from argv [0] to argv [n]. Moreover, argc will automatically be assigned the value (n+1). The value of argc is not supplied explicitly on the command line.
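#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv[])
{
    int i, sum = 0;
    if (argc != 3)    /* program name plus exactly two numbers */
    {
        printf ("You forgot to type the numbers.\n");
        exit (1);
    }
    printf ("The sum is : ");
    for (i = 1; i < argc; i++)
        sum = sum + atoi (argv[i]);    /* convert each argument to int */
    printf ("%d\n", sum);
    return 0;
}
When the program is run with two numbers that add up to 30 (for example 10 and 20), the output is:
The sum is : 30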

