Files: Introduction

On the previous pages we talked about strings, but now we need to take a small detour and discuss a very different topic: reading from files. The reason for this detour is simple: you may have noticed that all the exercises related to strings used "hardcoded" inputs, that is, the strings were given right in the code. Of course, that's not how things normally work in real life, but to figure out how to read a string (other than using getchar() character by character) we need to learn about files and some new standard functions.

Programs normally read their data from files. Programs also write to files, but we won't need that at the moment. In C, just like in many other programming languages, working with a file, whether reading or writing, requires opening it, performing the read or write operations on it, and then closing it.

Opening a file

To open a file, that is, to make the content of the file available to your program, or to make it possible to write to the file, you can use the standard function fopen from stdio.h, which returns FILE * (a pointer to a value of type FILE).

Why a pointer, and what is a FILE, after all? The details are defined by the implementation (basically, by the authors of the compiler and its standard library), but you can assume it stores some information about the state of the file (maybe some buffers for temporarily storing data in memory), and the read and write functions working with the file might need to change the information stored in the FILE variable. As you know, a function can change a variable that belongs to its caller only if it receives a pointer to that variable; that's the reason why all file-related functions work with FILE *.

To open a file for reading, do this:

FILE *f;

f = fopen("input.txt", "r");

The first parameter is the name of the file to open; the second parameter is a string that can have one of several predefined values. For now, remember that "r" stands for reading; we'll talk about writing files later in the course.

If, for whatever reason, the file cannot be opened, fopen returns a special "null pointer", NULL, which compares equal to 0 and, in all the real-life cases you are likely to see, refers to an invalid "zero address" in memory. A special function, perror, exists to print an error message in this case. If this doesn't make much sense yet, bear with me; I'll show you an example soon!
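
As a small preview, here is a minimal sketch of that check wrapped into a helper. The name open_for_reading is made up for this illustration (it is not a standard function), and the exact text perror prints depends on your system:

```c
#include <stdio.h>

/* Hypothetical helper, not part of stdio.h: tries to open a file
   for reading and reports the reason if that fails. */
FILE *open_for_reading(const char *name) {
	FILE *f;

	f = fopen(name, "r");
	if (f == NULL) {
		perror(name); /* prints "<name>: <reason for the error>" */
	}
	return f; /* either a usable FILE * or NULL */
}
```

The caller still has to compare the result with NULL before reading from it; the helper only takes care of printing the message.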

Closing the file

To close the file which was previously opened, use fclose:

fclose(f);

If the file was opened for reading, forgetting to close it doesn't have any really bad consequences (especially if your program exits soon), but it's still a good idea to close every file you have opened, and it becomes really important when you open a file for writing (which, as I said, we'll discuss later).

fclose returns an int: 0 on success, or EOF if an error occurred. Since there's not much we can do if we fail to close a file that was opened for reading, we'll ignore the errors returned by fclose for now.

Reading from a file

A bunch of functions exist in stdio.h to read data from a file. For now, we'll discuss three of them:

Reading one character

The function fgetc reads one character from a file and returns it, or returns EOF if there is nothing left to read or an error occurred. It is very similar to getchar, but reads from the given file, not from the standard input:

int c;

c = fgetc(f);
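
To read a whole file this way, you would typically call fgetc in a loop until it returns EOF. Here is a minimal sketch; count_chars is a name I made up for the illustration, and it assumes the file has already been opened successfully:

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters remaining in an open file. */
long count_chars(FILE *f) {
	int c; /* int, not char: it must be able to hold EOF as well */
	long count = 0;

	while ((c = fgetc(f)) != EOF) {
		count++;
	}
	return count;
}
```

Note that c is declared as int, just like with getchar: EOF is a value outside of the normal character range, so a char variable could not reliably tell it apart from a real character.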

Reading according to a format

The file counterpart of scanf is fscanf, which accepts the file as its first parameter:

int a, b, count;

count = fscanf(f, "%d%d", &a, &b);

Just like scanf, it returns the number of values successfully read, and you need to pass pointers to the int variables so that it can change them.
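
The return value is what tells you when to stop. As a sketch (sum_ints is a hypothetical name, and the file is assumed to be open already), this loop keeps reading integers for as long as fscanf reports that it managed to read one:

```c
#include <stdio.h>

/* Hypothetical helper: adds up the integers in an open file,
   stopping at the end of the file or at the first thing
   that is not a number. */
long sum_ints(FILE *f) {
	int x;
	long sum = 0;

	/* fscanf returns 1 while it manages to read one more integer */
	while (fscanf(f, "%d", &x) == 1) {
		sum += x;
	}
	return sum;
}
```

Comparing the result with the number of values you asked for (1 here) handles both the end of the file and unexpected input with a single check.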

Reading a line

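Now, this is the real reason for this "file detour". The function fgets reads one line from the file, including the trailing \n character. It accepts a buffer (a character array of some size) together with the size of that buffer, and it won't read more than fits into it: at most one character less than the size you pass, so that there is room left for the terminating zero character: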

char buf[100];
char *result;

result = fgets(buf, 100, f); /* file as the last parameter */

fgets always puts a zero character at the end of the string it reads, making it possible to use all the string functions we discussed before.

In the majority of cases where you need to read a text file line by line, fgets is the function you will use.

fgets returns the buffer it reads into, or NULL if it was unable to read any characters. On the next page, we'll practice with fgets more to make sure you understand how it works.
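
Since the line break ends up in the buffer, a common next step is to cut it off. Here is one possible sketch; read_line is a hypothetical name, and I'm assuming strlen from string.h is among the string functions you already know:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf and removes the
   trailing line break, if there is one.
   Returns 1 if a line was read, 0 at the end of the file or on error. */
int read_line(FILE *f, char *buf, int size) {
	size_t len;

	if (fgets(buf, size, f) == NULL) {
		return 0;
	}

	len = strlen(buf);
	if (len > 0 && buf[len - 1] == '\n') {
		buf[len - 1] = '\0'; /* replace the line break with the terminator */
	}
	return 1;
}
```

The check for '\n' matters: the last line of a file may have no line break at all, and a line longer than the buffer is returned in pieces, where only the last piece ends with \n.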

A note about files in the browser

For this course, I made the code execution work right here, in your browser. With files, things get a bit weird: your browser has no access to your real files. Whenever needed, I'll provide a fake input file for your code so you can still run your program right here, but at this point you might want to start looking into using a real compiler on your laptop or desktop. We'll talk more about it later.

Example

For now, here's an example for you to play with. The file input.txt contains two lines:

first line
42 560

The example below opens this file once and reads from it twice: first using fgets to read a string, and then using fscanf to read integer variables. Feel free to play with this example to understand how things work. Try opening a nonexistent file to see how the error is printed!

#include <stdio.h>

int main() {
	FILE *f;
	int count, a = 0, b = 0; /* initialized in case fscanf reads fewer than two values */
	char buf[20];

	f = fopen("input.txt", "r");
	if (!f) {  /* if it's NULL, it's false */
		perror("fopen"); /* it will print: "fopen: <reason for the error>" */
		return 1; /* non-zero exit code to say that we have an error */
	}

	if (!fgets(buf, 20, f)) {
		printf("fgets could not read anything\n");
		return 1;
	}

	printf("fgets read the line: <%s>\n", buf); /* note: \n is included in buf */
	printf("note that the line break was also read into the string!\n");

	count = fscanf(f, "%d%d", &a, &b);
	printf("fscanf read %d values: %d, %d\n", count, a, b);

	fclose(f); /* ignoring any unlikely error here, because we can't do much about it anyway */

	return 0;
}

On the next page, we'll practice more, and we'll also see how the standard input can be accessed as a file.