7. Structures

A structure is a collection of one or more variables, possibly of different types, grouped together under a single name for convenient handling. Structures are called “records” in some languages, notably Pascal. The name record is also familiar from databases such as Oracle. Structures help to organize complicated data, particularly in large programs, because they permit a group of related variables to be treated as a unit instead of as separate entities. In fact, combining structures with arrays gives something very much like a simple database: each structure is a record, and the array is the table that holds the records.

Usage:
struct name {
int variable1;  /* member */
char variable2;  /* member */
float variable3;  /* member */
};

Let’s create a structure for graphics. The basic object is a point, which has an x coordinate and a y coordinate, both integers:

struct point {
int x;
int y;
};

The declaration

struct point pt;

defines a variable pt, which is a structure of type struct point. A structure can be initialized by following its definition with a list of initializers, each a constant expression, for the members:

struct point maxpt = { 320, 200 };

An automatic structure may also be initialized by assignment or by calling a function that returns a structure of the right type. A member of a particular structure is referred to in an expression by a construction of the form

structure-name.member

The structure member operator “.” connects the structure name and the member name. To print the coordinates of the point pt, for instance,

printf("%d,%d", pt.x, pt.y);

or to compute the distance from the origin (0,0) to pt,

double dist, sqrt(double);
dist = sqrt((double)pt.x * pt.x + (double)pt.y * pt.y);
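The cast in the expression above matters: it is applied to pt.x before the multiplication, so the product is computed in double rather than in int. A minimal sketch that isolates the part under the square root follows. The helper name sqdist is an assumption made for illustration and does not appear elsewhere in this chapter.

```c
struct point {
    int x;
    int y;
};

/* sqdist: hypothetical helper, squared distance from the origin (0,0) to p */
double sqdist(struct point p)
{
    /* each cast converts one operand to double before the multiplication,
       so a large coordinate cannot overflow int arithmetic */
    return (double)p.x * p.x + (double)p.y * p.y;
}
```

Taking sqrt(sqdist(pt)) then gives the same distance as the expression above. For a point with x = 3 and y = 4 the squared distance is 25.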

Structures can be nested. One representation of a rectangle is a pair of points that denote the diagonally opposite corners:

struct rect {
struct point pt1;
struct point pt2;
};

The rect structure contains two point structures. If we declare screen as

struct rect screen;

then

screen.pt1.x

refers to the x coordinate of the pt1 member of screen.
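Nested member access can be chained as deep as the nesting goes. The sketch below uses it to compute the area of a rectangle. The helper name rectarea is hypothetical, and the code assumes that pt1 is the lower-left corner and pt2 the upper-right corner.

```c
struct point {
    int x;
    int y;
};

struct rect {
    struct point pt1;
    struct point pt2;
};

/* rectarea: hypothetical helper, area of r;
   assumes pt1 is the lower-left and pt2 the upper-right corner */
int rectarea(struct rect r)
{
    int width  = r.pt2.x - r.pt1.x;   /* member of a member */
    int height = r.pt2.y - r.pt1.y;
    return width * height;
}
```

A nested structure can be initialized with nested braces, for example struct rect screen = { { 0, 0 }, { 320, 200 } }; for which rectarea(screen) gives 64000.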

Structures and functions

The only legal operations on a structure are copying it or assigning to it as a unit, taking its address with &, and accessing its members. Copy and assignment include passing arguments to functions and returning values from functions as well. Structures may not be compared.
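Because structures may not be compared with ==, equality is usually tested member by member. The following is a minimal sketch of that idea; the function name pointeq is an assumption introduced here, not something defined in this chapter.

```c
struct point {
    int x;
    int y;
};

/* pointeq: hypothetical helper, return 1 if a and b have equal members, 0 if not */
int pointeq(struct point a, struct point b)
{
    return a.x == b.x && a.y == b.y;
}
```

Assignment, by contrast, is legal on the whole structure: after b = a; the call pointeq(a, b) returns 1, and changing b.x afterwards leaves a untouched.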

The first example, the function makepoint, takes two integers and returns a point structure.

/* makepoint: make a point from x and y components */
struct point makepoint (int x, int y)
{
  struct point temp;
  temp.x = x;
  temp.y = y;
  return temp;
}

Notice that there is no conflict between the argument name and the member with the same name.
The function makepoint can now be used to initialize any structure dynamically, or to provide structure arguments to a function:

  struct rect screen;
  struct point middle;
  struct point makepoint (int, int);
  screen.pt1 = makepoint(0,0);
  screen.pt2 = makepoint (XMAX, YMAX);
  middle = makepoint((screen.pt1.x + screen.pt2.x)/2, (screen.pt1.y + screen.pt2.y)/2);
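The fragment above does not compile on its own because XMAX and YMAX are not defined in this chapter. The sketch below wraps it in a function so that it can be tried out. The values 320 and 200 are assumptions chosen to match maxpt, and the function name screenmiddle is hypothetical.

```c
#define XMAX 320   /* assumed screen width, chosen to match maxpt */
#define YMAX 200   /* assumed screen height, chosen to match maxpt */

struct point {
    int x;
    int y;
};

struct rect {
    struct point pt1;
    struct point pt2;
};

/* makepoint: make a point from x and y components */
struct point makepoint(int x, int y)
{
    struct point temp;
    temp.x = x;
    temp.y = y;
    return temp;
}

/* screenmiddle: hypothetical wrapper, return the middle point of the screen */
struct point screenmiddle(void)
{
    struct rect screen;
    screen.pt1 = makepoint(0, 0);
    screen.pt2 = makepoint(XMAX, YMAX);
    return makepoint((screen.pt1.x + screen.pt2.x) / 2,
                     (screen.pt1.y + screen.pt2.y) / 2);
}
```

With these assumed values the returned point has x = 160 and y = 100.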

The next step is a set of functions to do arithmetic on points. For instance,

/* addpoint: add two points */
struct point addpoint(struct point p1, struct point p2)
{
  p1.x += p2.x;
  p1.y += p2.y;
  return p1;
}
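Structure parameters are passed by value like any others, so addpoint works on its own private copy of the first argument and the caller's variable is not changed. The sketch below checks this; the driver function caller_unchanged is a hypothetical name added only for the demonstration.

```c
struct point {
    int x;
    int y;
};

/* addpoint: add two points */
struct point addpoint(struct point p1, struct point p2)
{
    p1.x += p2.x;   /* changes only the local copy of the argument */
    p1.y += p2.y;
    return p1;
}

/* caller_unchanged: hypothetical driver, return 1 if the caller's
   structure is still intact after the call and the sum is correct */
int caller_unchanged(void)
{
    struct point a = { 1, 2 };
    struct point b = { 10, 20 };
    struct point sum = addpoint(a, b);

    return a.x == 1 && a.y == 2 && sum.x == 11 && sum.y == 22;
}
```

Using the parameter p1 as the accumulator saves a temporary variable, and it is safe precisely because of the copy.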

As another example, the function ptinrect tests whether a point is inside a rectangle, where we have adopted the convention that a rectangle includes its left and bottom sides but not its top and right sides.

/* ptinrect: return 1 if p in r, 0 if not */
int ptinrect(struct point p, struct rect r)
{
  return p.x >= r.pt1.x && p.x < r.pt2.x && 
         p.y >= r.pt1.y && p.y < r.pt2.y;
}
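One consequence of this convention is that two rectangles sharing an edge never both claim a point on that edge. The sketch below illustrates this with a hypothetical helper, howmany, which counts how many of two rectangles contain a given point.

```c
struct point {
    int x;
    int y;
};

struct rect {
    struct point pt1;
    struct point pt2;
};

/* ptinrect: return 1 if p in r, 0 if not */
int ptinrect(struct point p, struct rect r)
{
    return p.x >= r.pt1.x && p.x < r.pt2.x &&
           p.y >= r.pt1.y && p.y < r.pt2.y;
}

/* howmany: hypothetical helper, number of rectangles among a and b that contain p */
int howmany(struct point p, struct rect a, struct rect b)
{
    return ptinrect(p, a) + ptinrect(p, b);
}
```

For a left rectangle from (0,0) to (10,10) and a right rectangle from (10,0) to (20,10), the point (10,5) on the shared edge belongs to the right rectangle only, so howmany returns 1.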

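This assumes that the rectangle is given in a standard form where the pt1 coordinates are less than the pt2 coordinates. The following function returns a rectangle guaranteed to be in canonical form.
It uses #define to create the macros min and max, a technique that is useful in many cases.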

#define min(a,b) ((a) < (b) ? (a) : (b))
#define max(a,b) ((a) > (b) ? (a) : (b))

/* canonrect: canonicalize coordinates of rectangle */
struct rect canonrect(struct rect r)
{
  struct rect temp;
  temp.pt1.x = min(r.pt1.x, r.pt2.x);
  temp.pt1.y = min(r.pt1.y, r.pt2.y);
  temp.pt2.x = max(r.pt1.x, r.pt2.x);
  temp.pt2.y = max(r.pt1.y, r.pt2.y);
  return temp;
}
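The two functions fit together naturally: canonicalizing first makes the inside test work even when the corners are given in the wrong order. A minimal sketch follows; the combining function ptinanyrect is a hypothetical name, not part of this chapter. Note that each macro evaluates its arguments twice, so arguments with side effects, such as i++, should be avoided.

```c
#define min(a,b) ((a) < (b) ? (a) : (b))   /* arguments are evaluated twice */
#define max(a,b) ((a) > (b) ? (a) : (b))

struct point {
    int x;
    int y;
};

struct rect {
    struct point pt1;
    struct point pt2;
};

/* ptinrect: return 1 if p in r, 0 if not; expects r in canonical form */
int ptinrect(struct point p, struct rect r)
{
    return p.x >= r.pt1.x && p.x < r.pt2.x &&
           p.y >= r.pt1.y && p.y < r.pt2.y;
}

/* canonrect: canonicalize coordinates of rectangle */
struct rect canonrect(struct rect r)
{
    struct rect temp;
    temp.pt1.x = min(r.pt1.x, r.pt2.x);
    temp.pt1.y = min(r.pt1.y, r.pt2.y);
    temp.pt2.x = max(r.pt1.x, r.pt2.x);
    temp.pt2.y = max(r.pt1.y, r.pt2.y);
    return temp;
}

/* ptinanyrect: hypothetical helper, like ptinrect but accepts corners in any order */
int ptinanyrect(struct point p, struct rect r)
{
    return ptinrect(p, canonrect(r));
}
```

For a rectangle given as (10,10) to (0,0), ptinrect rejects the point (5,5), while ptinanyrect accepts it.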

If a large structure is to be passed to a function, it is generally more efficient to pass a pointer than to copy the whole structure. Structure pointers are just like pointers to ordinary values. The declaration
struct point *pp;
says that pp is a pointer to a structure of type struct point. If pp points to a point structure, *pp is a structure, and (*pp).x and (*pp).y are the members. To use pp, we might write, for example,

struct point origin, *pp;
pp = &origin;
printf("Origin is (%d, %d)\n", (*pp).x, (*pp).y);

The parentheses are necessary in (*pp).x because the precedence of the structure member operator . is higher than that of *. The expression *pp.x means *(pp.x), which is illegal here because x is not a pointer.
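A pointer argument also lets a function change the caller's structure, which a by-value parameter cannot do. The sketch below shows this with a hypothetical function movepoint; the name and the parameters dx and dy are assumptions made for illustration.

```c
struct point {
    int x;
    int y;
};

/* movepoint: hypothetical helper, shift the point that pp points to by (dx, dy) */
void movepoint(struct point *pp, int dx, int dy)
{
    (*pp).x += dx;   /* parentheses required: *pp.x would mean *(pp.x) */
    (*pp).y += dy;
}
```

After struct point origin = { 0, 0 }; and movepoint(&origin, 3, -2); the variable origin itself holds x = 3 and y = -2. C also provides the shorthand pp->x, which means exactly the same as (*pp).x.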

Arrays of structures (a step toward databases)

In database terms, a struct describes one record. An array of such structures then plays the role of a table of records, that is, a simple database.

Consider writing a program to count the occurrences of each C keyword. We need an array of character strings to hold the names, and an array of integers for the counts. One possibility is to use two parallel arrays, keyword and keycount, as in
char *keyword[NKEYS];
int keycount[NKEYS];

But the very fact that the arrays are parallel suggests a different organization, an array of structures. Each keyword is a pair:

char *word;
int count;

and there is an array of pairs. The structure declaration

struct key {
  char *word;
  int count;
} keytab[NKEYS];

declares a structure type key, defines an array keytab of structures of this type, and sets aside storage for them. Each element of the array is a structure. This could also be written

struct key {
  char *word;
  int count;
};
struct key keytab[NKEYS];

Initialization of the array of structures is analogous to the earlier ones: the definition is followed by a list of initializers enclosed in braces:

struct key {
char *word;
int count;
} keytab[] = {
"auto", 0,
"break", 0,
"case", 0,
/* ... */
"while", 0
};

The keyword-counting program begins with the definition of keytab. The main routine reads the input by repeatedly calling a function getword that fetches one word at a time.

#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAXWORD 100
int getword(char *, int);
int binsearch(char *, struct key *, int);
/* count C keywords */
int main()
{
  int n;
  char word[MAXWORD];
  while (getword(word, MAXWORD) != EOF) /* if EOF doesn't work, try '\n' */
    if(isalpha(word[0]))
      if((n=binsearch(word, keytab, NKEYS)) >= 0)
          keytab[n].count++;
  for(n=0; n<NKEYS; n++) 
    if(keytab[n].count > 0)
      printf("%4d %s\n", keytab[n].count, keytab[n].word);
  return 0;
}
/* binsearch: find word in tab[0] ... tab[n-1] */
int binsearch(char *word, struct key tab[], int n)
{
  int cond, low, high, mid;
  low=0;
  high= n-1;
  while (low <= high)
  {
    mid = (low+high)/2;
    if((cond = strcmp(word, tab[mid].word)) <0) 
      high = mid - 1; 
    else if(cond > 0)
      low = mid + 1;
    else
      return mid;
  }
  return -1;
}
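Binary search relies on the table being sorted: for keytab this means the keywords must appear in increasing order as judged by strcmp, otherwise binsearch can miss words that are present. A small check along these lines could be run once at startup. The function name issorted is hypothetical and is not part of the program above.

```c
#include <string.h>

struct key {
    char *word;
    int count;
};

/* issorted: hypothetical helper, return 1 if tab[0] ... tab[n-1]
   is in strictly increasing strcmp order, 0 if not */
int issorted(struct key tab[], int n)
{
    int i;

    for (i = 1; i < n; i++)
        if (strcmp(tab[i-1].word, tab[i].word) >= 0)
            return 0;   /* out of order or duplicate entry */
    return 1;
}
```

For the entries "auto", "break", "case" the function returns 1; for "while" followed by "auto" it returns 0.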

The quantity NKEYS is the number of keywords in keytab. That number could be counted by hand, but it is safer to let the machine do it using an operator called sizeof, which computes the size of any object.
Usage: sizeof object or sizeof (type name)
The size of the array is the size of one entry times the number of entries, so the number of entries is
size of keytab / size of struct key
This computation is used in a #define statement to set the value of NKEYS:
#define NKEYS (sizeof keytab / sizeof(struct key)) /* add this to the program */
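The computation can be tried out on a shortened table. In the sketch below the table holds only four keywords, which is an abbreviation made for illustration, and the function name tablesize is hypothetical.

```c
#include <stddef.h>

struct key {
    char *word;
    int count;
} keytab[] = {
    { "auto", 0 },
    { "break", 0 },
    { "case", 0 },
    { "while", 0 }     /* shortened table, for illustration only */
};

/* number of entries = size of the whole array / size of one entry */
#define NKEYS (sizeof keytab / sizeof(struct key))

/* tablesize: hypothetical helper, return the number of entries in keytab */
size_t tablesize(void)
{
    return NKEYS;
}
```

Here NKEYS evaluates to 4, and it follows automatically if entries are added or removed. An equivalent form is sizeof keytab / sizeof keytab[0], which does not need to be changed if the structure type is renamed.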

Function getword fetches the next “word” from the input, where a word is either a string of letters and digits beginning with a letter, or a single non-white character.

/* getword: get next word or character from input */
int getword(char *word, int lim)
{
  int c, getch(void);
  void ungetch(int);
  char *w = word;
  while(isspace(c = getch()))
  ;
  if(c != EOF)
    *w++ =c;
  if(!isalpha(c))
  {
    *w ='\0';
    return c;
  }
  for( ; --lim > 0; w++)
    if (!isalnum(*w = getch()))
    {
      ungetch(*w);
      break;
    }
  *w = '\0';
  return word[0];
}

The function getword uses the functions getch and ungetch, which are declared and defined in the picture below. When the collection of an alphanumeric token stops, getword has gone one character too far. The call to ungetch pushes that character back on the input for the next call. The function getword also uses isspace to skip white space, isalpha to identify letters, and isalnum to identify letters and digits; all are from the standard header <ctype.h>.
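In case the picture is not at hand, one common way to write this pair is sketched here. It is not taken from the picture: the buffer size BUFSIZE and the variable names buf and bufp are assumptions, and the buffer is declared as int so that any value returned by getchar can be stored.

```c
#include <stdio.h>

#define BUFSIZE 100            /* assumed capacity of the pushback buffer */

static int buf[BUFSIZE];       /* characters pushed back by ungetch */
static int bufp = 0;           /* next free position in buf */

/* getch: get a character, taking pushed-back characters first */
int getch(void)
{
    return (bufp > 0) ? buf[--bufp] : getchar();
}

/* ungetch: push a character back on the input */
void ungetch(int c)
{
    if (bufp >= BUFSIZE)
        printf("ungetch: too many characters\n");
    else
        buf[bufp++] = c;
}
```

Pushed-back characters come out in reverse order: after ungetch('a') and ungetch('b'), the next two calls to getch return 'b' and then 'a'. Since getword pushes back at most one character per call, a much smaller buffer would also do.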

NOTE! The exercises for this chapter can be found in the quiz exercises for chapter 7. This material will be tested in quizzes 7-9! It is very important that the student works through all the examples shown in this chapter.