Exercise 1-13

(source code for vertical histogram)
Write a program to print a histogram of the lengths of words in its input. It is easy to draw the histogram with the bars horizontal; a vertical orientation is more challenging.

Printing a histogram horizontally should not be too difficult. We can use the same code from the previous exercise, except this time, we print dashes in place of the actual characters in the words.

    #include <stdio.h>
    
    #define IN 1    /* inside a word */
    #define OUT 0   /* outside a word */
    
    main()
    {
        int c, state;
    
        state = OUT;
        while ((c = getchar()) != EOF) {
            if ((c == ' ' || c == '\n' || c == '\t') && state == IN) {
                putchar('\n');
                state = OUT;
            } else {
                putchar(c);
                putchar('-');
                state = IN;
            }
        }
    }

In order to print the histogram vertically, we first need to store the lengths of the words, because we cannot print in columns. We do this by creating the integer array len to store the word lengths and the index i to keep track of which word length we are counting. Whenever our state changes from OUT to IN, we increment i since that means we have come across the next word. Whilst state is equal to IN, we increment len[i]—the length of the current word—for every character in the word.

    #include <stdio.h>
    
    #define IN 1    /* inside a word */
    #define OUT 0   /* outside a word */
    #define MAX 100 /* maximum number of words */
    
    main()
    {
        int c, i, state;
        int len[MAX];   /* lengths of words */
    
        for (i = 0; i < MAX; ++i)
            len[i] = 0;
        i = -1;
        state = OUT;
        while ((c = getchar()) != EOF) {
            if (c == ' ' || c == '\n' || c == '\t')
                state = OUT;
            else if (state == OUT) {
                ++i;
                state = IN;
            }
            if (state == IN)
                ++len[i];
        }
    }

Note: never forget to initialize integer arrays to zero; otherwise, the array gets filled with garbage values. What makes this particularly scary is that oftentimes, the garbage values are reasonable such as zero, which can make it hard to spot bugs like these.

Note: we initialize state to OUT because otherwise, if the input started with whitespace, i would get incremented on the first non-whitespace character causing len[0] to equal zero, and the first length being stored in len[1]. On the other hand, if state starts as OUT, we know that i will always increment on the first word regardless of whether there is leading whitespace or not, which is why i is initialized to -1.

Now that we have the lengths of the words, we need to start thinking about how we can print the histogram. Assuming we want to print it from bottom to top, we need a way to keep track of the longest word so that we can know in advance how many lines to use for the histogram. We can do this by declaring the integer longest. Every time we go "out" of a word, we can check if our current length is greater than longest: if it is, we update longest to equal our current length.

    #include <stdio.h>
    
    #define IN 1    /* inside a word */
    #define OUT 0   /* outside a word */
    #define MAX 100 /* maximum number of words */
    
    main()
    {
        int c, i, state;
        int longest;    /* longest word length */
        int len[MAX];   /* lengths of words */
    
        for (i = 0; i < MAX; ++i)
            len[i] = 0;
        i = -1;
        state = OUT;
        longest = 0;
        while ((c = getchar()) != EOF) {
            if (c == ' ' || c == '\n' || c == '\t') {
                state = OUT;
                if (i >= 0 && len[i] > longest)
                    longest = len[i];
            } else if (state == OUT) {
                ++i;
                state = IN;
            }
            if (state == IN)
                ++len[i];
        }
    }

To display the graph, we write a while-loop that decrements longest after every iteration. Within every iteration, we use a for-loop that runs through every element of len to check if each length is greater than or equal to longest. If it is, we print a '|'; otherwise, we print a space. After the for-loop finishes executing, we print a newline to begin the next row. The while-loop runs until longest reaches zero.

    #include <stdio.h>
    
    #define IN 1    /* inside a word */
    #define OUT 0   /* outside a word */
    #define MAX 100 /* maximum number of words */
    
    main()
    {
        int c, i, state;
        int longest;    /* longest word length */
        int len[MAX];   /* lengths of words */
    
        for (i = 0; i < MAX; ++i)
            len[i] = 0;
        i = -1;
        state = OUT;
        longest = 0;
        while ((c = getchar()) != EOF) {
            if (c == ' ' || c == '\n' || c == '\t') {
                state = OUT;
                if (i >= 0 && len[i] > longest)
                    longest = len[i];
            } else if (state == OUT) {
                ++i;
                state = IN;
            }
            if (state == IN)
                ++len[i];
        }
        /* print the histogram */
        while (longest > 0) {
            for (i = 0; len[i] > 0; ++i) {
                if (len[i] >= longest)
                    putchar('|');
                else
                    putchar(' ');
            }
            putchar('\n');
            --longest;
        }
    }

Note: because all elements in len were initialized to zero and no word can have a length of zero, we know that when len[i] is zero, we have reached the end of our lengths.