CPS 346 & 444/544 Lecture notes: System Libraries and I/O
Coverage: [UPE] Chapters 3 and 6
Standard I/O vs. file I/O
- file streams and standard streams
(really one and the same)
- to what do you connect the stream? file or device?
- what streams are automatically open for you?
- stdin
- C analog of cin in C++
- connected to keyboard by default
- stdout
- C analog of cout in C++
- connected to display by default
- stderr (connected to display by default)
- can redirect stdin, stdout, and stderr to files
Standard I/O redirection
(set up for free by the shell)
- < (redirects stdin)
- << (redirects stdin to a HERE document)
- > (redirects stdout, overwrites)
- >> (redirects stdout, but appends)
Demo of cat
- cat [<file(s)>] (concatenate):
displays contents of one or more files to standard output
- capable of reading from file input or standard input
- only writes to standard output
Redirecting standard I/O
devices in UNIX are represented as files,
for instance, /dev/console.
UNIX commands receive their input from the standard input (stdin) and
send their output to the standard output (stdout); by default these files
are the console or terminal.
- shells allow I/O to be redirected to other devices;
thus UNIX commands are unaware from what device their input may originate
or to what device their output may be sent.
- the output redirection symbol, >, sends a command's output
to the specified file instead of the console/terminal screen.
command name [args] > filename
ls -l > ls.out
the file ls.out
is created if it does not exist (or emptied and overwritten
if it exists prior to command execution) and sent the directory listing.
- programs which send their output to the console/terminal may also be
redirected.
./a.out > out.txt
this allows execution results of a program to be captured in a file.
- the output redirection symbol >> appends the output to a file,
rather than overwriting an existing file.
./a.out >> out.txt
the results of several program runs can thus be saved in a single file
out.txt.
- the input redirection symbol, <, takes the command's input
from the specified file instead of the console/terminal keyboard.
command name [args] < filename
./a.out < in.txt > out.txt
this allows program execution with several data sets very easily.
- the input redirection symbol << is known as the "here is"
symbol, and provides a mechanism for reading data from the same file that
contains the command, such as a shell script (why might one want to do this?).
$ cat << HERE
hello folks
stop reading when you
see
HERE
./a.out > out.txt << HERE
first line of data
second line of data
. . .
last line of data
HERE
the string following << (in this example, the term HERE)
terminates the data input when it appears alone on a line. The
terminating HERE logically acts like EOF (<ctrl-d>).
- pipes | are the logical extension of I/O redirection.
- pipes allow the stdout of one program to become the stdin of another
program.
- specifically, a pipe
redirects the standard output of the command to the immediate left
from the screen to the standard input of the command to the immediate
right.
ls -l | more
this command allows the viewing of the long listing of a large
directory one screen at a time.
ls -l > ls.out
more < ls.out
rm ls.out
these three commands are equivalent to the pipeline above,
but require the temporary file ls.out.
- pipes support interprocess communication and introduce concurrency
- recall UNIX model of computation
- pipes are the powerful communication mechanism
- pipes are the glue
- how can you verify that the processes in a pipeline are
running concurrently?
- tee: reads
from standard input and writes to standard output and files
(e.g., cat .profile | tee profile.bak)
More on redirecting standard error
- shell time vs. UNIX time
- former writes to terminal
- latter writes to standard error;
use fully qualified path: /bin/time
- examples:
$ time cat /etc/termcap >/dev/null 2>timelog.txt
$ /bin/time cat /etc/termcap >/dev/null 2>timelog.txt
- redirecting stderr to the same place as stdout
$ /bin/time 2>&1 | wc -l
$ wc ~/.profile ~/.kshrc > output 2>output # which will happen first?
$ wc ~/.profile ~/.kshrc > output 2>&1 # order not preserved
$ cat ~/.profile doesnotexist &>output-and-error
- investigate tee (e.g.,
$ cat ~/.profile doesnotexist | tee output-and-error)
File descriptors
- 0 for stdin, 1 for stdout, 2 for stderr
- the 0 is implicit in < when redirecting stdin
- the 1 is implicit in > and >>
when redirecting stdout
- must use the file descriptor to redirect stderr
- for instance, wc -q 2> errors
- often to /dev/null
(e.g., wc -q 2> /dev/null)
I/O in C
- scanf and printf
- what do they return?
- we must develop the habit of checking return values
- opening and closing files: fopen and fclose
- file pointer: FILE*
- fscanf is the C analog of the C++ extraction operator
(>>)
- fprintf is the C analog of the C++ insertion operator
(<<)
- conversion specifiers
- %d for decimal
- %f for floating-point
- %c for single character
- %s for string
- %x for hex
- formatted output ([CPL] §7.2, pp. 153-155),
between % and conversion character,
there may be, in order
- a minus sign, - indicating left justification
- a minimum field-width
- a period which separates the
field-width from the precision
- a precision
- EOF
- a #defined constant in stdio.h
- signaled by typing <ctrl-d> at the terminal on a UNIX system
(EOF is a return value, not a character stored in the file)
- getchar
- declared in stdio.h: the C analog of iostream in C++
Effect of a Successful Open on a File
(ref. [C] 7-17)
| | "r" read | "w" write | "a" append |
| File Exists | - | Old contents discarded | - |
| File Does Not Exist | Error | File created | File created |
- "r+", "w+", "a+" Updating;
allows reading and writing
- "r+" Commonly
used to read and change an
existing file
Analogs from C++ to C
| C++ | C |
| iostream | stdio.h |
| cin | stdin |
| cout | stdout |
| >> | fscanf |
| << | fprintf |
Also, unlike C++, C89 requires all declarations in a block to appear
before any statements in that block; variables may be declared at the top of
any lexically scoped block, but not in the middle of one (C99 relaxes this rule).
Review of standard I/O functions
(ref. [C] 7-7)
| | stdin and stdout | file I/O |
| character | getchar putchar | getc putc fgetc fputc ungetc |
| line | gets puts | fgets fputs |
| formatted | scanf printf | fscanf fprintf |
| record | - - | fread fwrite |
Developing cat in C
(ref. [CPL] Chapter 7, §§7.5-7.6, pp. 160-164)
/* ref. [CPL] Chapter 7, 7.5, p. 162 with minor modification by Perugini */
#include<stdio.h>

/* cat: version 1 */
void filecopy (FILE* ifp, FILE* ofp) {
   /* int, not char, so that EOF is distinct from every valid character */
   int c;
   while ((c = getc (ifp)) != EOF)
      putc (c, ofp);
}

int main (int argc, char** argv) {
   FILE* fp = NULL;
   if (argc == 1)
      filecopy (stdin, stdout);
   else
      while (--argc > 0)
         if ((fp = fopen (*(++argv), "r")) == NULL) {
            printf ("cat: can't open %s\n", *argv);
         } else {
            filecopy (fp, stdout);
            fclose (fp);
         }
   return 0;
}
/* ref. [CPL] Chapter 7, 7.6, p. 163 with minor modifications by Perugini */
#include<stdio.h>
#include<stdlib.h>

/* cat: version 2 */
int main (int argc, char** argv) {
   void filecopy (FILE* ifs, FILE* ofs);
   int exit_status = 0;
   char* prog = *argv;
   FILE* fp = NULL;

   if (argc == 1)
      filecopy (stdin, stdout);
   else
      while (--argc > 0)
         if ((fp = fopen (*(++argv), "r")) == NULL) {
            fprintf (stderr, "%s: can't open %s\n", prog, *argv);
            exit (1);
            /* or use following line to continue processing */
            exit_status = 1;
         } else {
            filecopy (fp, stdout);
            fclose (fp);
         }

   if (ferror (stdout)) {
      fprintf (stderr, "%s: error writing stdout\n", prog);
      exit_status = 2;
   }
   exit (exit_status);
}

void filecopy (FILE* ifp, FILE* ofp) {
   int c;
   while ((c = getc (ifp)) != EOF)
      putc (c, ofp);
}
Portability (safety)
int c; /* not char c: getchar returns an int so that EOF is distinct from every valid character */
while ((c = getchar()) != EOF) { ... }
use /* C-style comments */ rather than // C++-style comments
also, do not use TABs in your code
String copy code from first day of class
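#include <stdio.h>
#include <stdlib.h>
int main (void) {
   char* q = "copy this";
   /* 9 characters plus the terminating '\0' */
   char* p = (char*) malloc (sizeof (char)*10);
   char* r = p;
   if (p == NULL)
      return 1;
   printf ("%s\n", q);
   /* copies every character, including the terminating '\0' */
   while ((*p++ = *q++) != '\0');
   /* *p = '\0'; necessary? no: the loop already copied it,
      and p now points one past the end of the buffer */
   printf ("%s\n", r);
   free (r);
   return 0;
}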
String functions
- prototyped, not defined, in <string.h>
- size_t strlen (const char*),
- int strcmp (const char*, const char*),
int strncmp (const char*, const char*, size_t)
- char* strcpy (char*, const char*),
char* strncpy (char*, const char*, size_t),
- char* strcat (char*, const char*),
char* strncat (char*, const char*, size_t)
- char* strdup (const char*)
- when copying or concatenating strings, make
sure destination string has sufficient space (memory)
`s' family of printf/scanf functions (e.g., sprintf and sscanf)
Command-line arguments
- argc (argument count; command name is included)
- argv (argument vector, terminated by a null pointer;
argv[0] is command name)
- main (int argc, char* argv[]) or main (int argc, char** argv)
- echoargs.c
#include<stdio.h>
#include<stdlib.h>

int main (int argc, char* argv[]) {
   int i;
   printf ("argc is %d\n", argc);
   for (i = 0; i < argc; i++)
      printf ("argv[%1d] is %s\n", i, argv[i]);
   exit (0);
}
- echopargs.c
#include<stdio.h>
#include<stdlib.h>

int main (int argc, char** argv) {
   printf ("argc is %d\n", argc);
   for (; *argv; argv++)
      printf ("Next argument is %s\n", *argv);
   exit (0);
}
argv array for the call ./a.out -wlc myfile
(regenerated with minor modifications from [USP] Fig. 2.2, p. 32)
Using a pointer to traverse an array
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

#ifndef MAX_CANON
/* #define LINELEN 256 */
#define MAX_CANON 8192
#endif

/* traverse.c */
int main() {
   /* char line[LINELEN+1]; */
   char line[MAX_CANON+1];
   char* p = NULL;

   /* same as p = &line[0], right? */
   p = line;

   /* notice the parentheses; this loop assumes the line, newline included,
      is at most MAX_CANON characters and that end of input is not reached first */
   while ((*p++ = getchar()) != '\n');
   *p = '\0';

   /* why can't we just print p? */
   printf ("%20s\n", line);

   exit (EXIT_SUCCESS);
}
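pointer traversal (*) was traditionally faster than array indexing ([]);
modern optimizing compilers usually generate equivalent code for both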
when a program requires limits,
use standard system-defined limits (e.g., MAX_CANON, #defined in limits.h)
rather than arbitrary constants
we cannot develop systems programs like application programmers
Demo of wc
- word, line, and byte count program
- capable of standard or file input
- always writes to standard output
- example: $ wc ~/.login ~/.tcshrc
Files
(regenerated from [USP] Fig. 4.3)
- buffering
- stdout is line buffered when connected to a terminal; this means the buffer
is flushed when full, when a newline is written to the buffer, or when a scanf is executed
- most disk files are fully buffered; means buffer is only
flushed when full
- stderr is unbuffered
- can explicitly flush a buffer by calling fflush
- can set type of buffering by calling setvbuf
- a return from main causes the buffers to be flushed
- in summary, disk I/O (fully buffered) vs. terminal I/O (line buffered) vs.
standard error (unbuffered)
(regenerated from [USP] Fig. 4.2, p. 120)
the file descriptor table contains an entry for each open file
in the process
`the system file table, which is shared by all the processes
in the system, has an entry for each active open' ([USP] p. 120)
each entry in the system file table contains the file offset which
gives the current position in the file, the
access mode, and a count of the number of file descriptor table
entries pointing to it
`the in-memory inode table contains an entry for each active
file in the system' ([USP] p. 120)
each inode table entry contains a count of the system file table
entries pointing to it
what happens when a process executes close? the operating
system
- deletes the corresponding entry in the file
descriptor table
- decrements the count of the number of file descriptor table entries
pointing to the corresponding system file table entry
- deletes the corresponding system file table entry if that count is zero
- decrements the count of the number of system file table entries
pointing to the corresponding inode table entry
- deletes the corresponding inode table entry from memory if that count is zero
References
| [C] | C Language for Experienced Programmers, Version 2.0.0, AT&T, 1988. |
| [CPL] | B.W. Kernighan and D.M. Ritchie. The C Programming Language. Prentice Hall, Upper Saddle River, NJ, Second edition, 1988. |
| [UPE] | B.W. Kernighan and R. Pike. The UNIX Programming Environment. Prentice Hall, Upper Saddle River, NJ, Second edition, 1984. |
| [USP] | K.A. Robbins and S. Robbins. UNIX Systems Programming: Concurrency, Communication, and Threads. Prentice Hall, Upper Saddle River, NJ, Second edition, 2003. |