~ Introduction to Programming: C ~

Session 13 - Working with Text and Strings



In This Session...

In this session we will be covering the following topics:-

Review of Last Session

Last session we covered:-

Recap on String Handling Functions

As we saw in the previous session, the string.h library contains a list of most of the commonly used functions for working with strings.

A string is really an array of characters - i.e. a number of characters stored in a single variable one-after-another to form a sentence, with each character referenced by an index number, the first being 0 (zero).

And remember - to store a string in a character array variable, you have to leave at least enough room to hold the string PLUS an extra terminating NUL character (ASCII value 0 - represented by \0 in C).

So, to hold the string "Hello", we would need an array that would hold the 5 characters in "Hello" plus the terminating zero. We would set this up as follows:-

  char my_string[6];
  strcpy( my_string, "Hello" );

Note, there's nothing wrong with making an array bigger if you anticipate that it may hold quite a large value - for example, if you are getting a string from the keyboard using gets(my_string); you might decide that the user could input quite a large string, and allow 500 characters. Bear in mind that gets does not check the size of the array, so anything typed beyond that size will overwrite other memory.

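Note that you could use scanf("%s", my_string); to read a string (no & is needed, because the array name already refers to the address of the array), but this suffers from the fact that scanf will stop reading the string once it reaches any word break (e.g. a space or tab) even though it lets you type in a full sentence. This can cause big problems, because the rest of the sentence is left waiting to be picked up by the next read.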

Finally, you can use printf("My string is: %s\n", my_string); to output your string to the console. There is also an easy-to-use function called puts that you could use if you prefer - e.g. puts("My string is: "); puts(my_string); - note that puts adds a newline after the string it outputs, so the label and the string appear on separate lines. If you need to put out a single character (e.g. a newline), use putchar - e.g. putchar('\n');. This gives much the same result as printf, but using two instructions instead of one.

In this session, we will be looking at a few more useful functions from string.h for working with strings: Specifically, strlen() for finding the length of a string, strcmp() for comparing two strings, and strcat() for joining two strings together. We will also look at sprintf() for performing the same function as strcat() but in a different way.

We will also take a brief look at converting between numbers and strings - e.g. atof() for converting a string to a floating point number (alpha to floating point) and atoi() for converting a string to an integer number (alpha to integer). We will also take a quick look at using sprintf for converting numbers to strings if required.

Length of a string

We can determine the length of a string by using the strlen() function from the string.h library.

The length is the number of characters in a character array (i.e. string) up to and excluding the terminating ASCII zero character. So, for example we might define an array of 30 characters, but only put the phrase "Hello World" in it. The length of the string would be 11 (not 12 or 30).

Why might we wish to do this? Often for validation (i.e. checking) purposes. For example, just say we wanted to check that a postcode had been entered correctly. We might know that it should be in the format of (say) LLn[n] n[n]LL - where an n is a digit and an L is a letter, and an [n] is an optional digit. So the postcode should be between 7 and 9 characters long (including the space).

We could perform a check as follows to ensure that this is the case:-

  char post_code[10];
  gets (post_code);
 
  if ( (strlen(post_code) < 7) || (strlen(post_code) > 9) )
    printf( "Post code is incorrect length\n" );
  else
    printf( "Post code is correct length" );

We could check to see whether anything has been entered at all by checking to see whether the length of the string is greater than 0. Another method is to check to see if the first character in the character array is ASCII value zero, showing that the string terminates before there are any characters in the array:-

  char str[10];
  gets (str);
 
  if ( strlen(str) == 0 )
    printf( "Nothing entered\n" );
  else
    printf( "Something entered" );
 
  /* or alternatively... */
 
  if ( str[0] == '\0' )
    printf( "Nothing entered\n" );
  else
    printf( "Something entered" );

Exercise 49 - Validate a postcode

So let's try this out in a practical example.

Write a program that defines a character array variable to hold a postcode, but  which could accept up to 50 characters, just in case the user types something daft in / falls on the keyboard etc.

In the main part of the program, create a loop that asks for the postcode and  reads data into the variable from the keyboard. The loop will carry on asking for a postcode while the length of the variable is not between 7 and 9 characters long (i.e. less than 7 (<7) or (||) greater than 9 (>9) characters long).

Finally, output the string as part of a printf statement, to say:-

The postcode NN3 5XT is Okay.

for example (where you would use %s and the variable name instead of NN3 5XT - see details of printf in last session).

Compare two strings

We can compare two strings by using the strcmp() function from the string.h library.

How can we compare two strings? We would compare them alphanumerically. This means that we treat letters as being bigger than numbers, so that (for example) A comes after 9, but 0 comes before 9. Similarly, Z comes after A. Thus, every character is compared between two strings, starting from the left. If every character is the same, then the two strings are equal, and strcmp() will give us a result of 0. If when checking, the first string contains a character that comes before (is less than) the character at the same position in the second string, then it is said to be smaller (less than) the second string, and gives a result that is less than zero (<0). If the first string contains a character that comes after (is greater than) the character at the same position in the second string, then it is said to be greater than the second string, and gives a result that is greater than zero (>0).

This is a bit like comparing names in a telephone directory - for example, Adams comes before Addams in the address book, so if you compared these two as strings, then strcmp() would give you a result less than 0 to indicate this. If you swapped them around, then Addams comes after Adams, so you would get a result greater than 0 to indicate this.

Can you see why? The first two characters Ad are the same, but the third character is different - in the first string, it is a, and in the second, it is d. Thus, as a comes before d, the first string must be before (or less than) the second string.

So why would we want to use this? Typically, it is used to check whether two strings are the same rather than whether one is less than or greater than the other. Thus, if the result of strcmp() is zero (==0), we can say that the two strings are the same, and if the result is non-zero (!=0) then we can say that the two strings are different.

The less/greater than use of strcmp is often used if we wish to sort a list of strings into alphanumeric order.

As we cannot use a string in a switch..case statement to select from a number of possible values, we can use strcmp combined with an if statement instead. For example, here is a code snippet that checks a hard-coded (i.e. written into the program) password:-

  char pword[21];
 
  printf( "Password: " ); gets( pword );
 
  if ( strcmp( pword, "letmein" ) == 0 )
    printf( "Password correct - continue" );
  else
    printf( "Password incorrect. Go away, intruder!" );

As you can see, the strcmp function takes two parameters (i.e. inputs) - the two strings to be compared, separated with a comma. The result will be 0 only if the two words are exactly the same.

Note that strcmp is case-sensitive, which means that A is treated differently to a, as these are two different characters with two different ASCII values.

Exercise 50 - Which colour?

Let's create a program that allows the user to choose one of the three paint colours (either magenta, cyan, or yellow) by typing in the colour, and then another colour. The program will then display what colour is produced by mixing these two colours together - the result is green for cyan+yellow, red for magenta+yellow and blue for magenta+cyan.

So how do we go about doing this?

You will need two variables to hold the two colours to be input - maybe one called colour1 and the next called colour2. Make them long enough to hold about ten characters, in case of misspelling.

You will need a do..while loop, inside which you will ask for the first colour:-

Enter first colour (magenta, cyan or yellow):

and retrieve the user's input from the keyboard into the colour1 variable.

The loop will carry on while an incorrect input has been made - so the condition would be if the colour1 variable is not magenta and is not cyan and is not yellow. For example, (strcmp(colour1,"orange")!=0) might be the condition you use for saying that colour1 is not orange.

You will then need to repeat this for the colour2 variable.

Finally, you will need to use strcmp again to check whether colour1 is cyan and (&&) colour2 is yellow, or (||) the reverse - if colour1 is yellow and (&&) colour2 is cyan. You are best to break the condition down into two parts, put brackets around each of the two parts, and then join them up with an || operator. If the condition is true, then output the text result is green.

Do the same check with a result of red for magenta+yellow and blue for magenta+cyan.

Joining Strings

You will often need to join several strings together, and output the result in some way - e.g. to the screen.

There are two main ways of joining two strings together: using strcat() from string.h, or using sprintf() from stdio.h.

For example, just say we wish to ask for a forename and surname, and store the entire name in a third variable. Using strcat, we might do the following:-

  char first_name[21], surname[21], full_name[42];
  printf( "First name: " ); gets( first_name );
  printf( "Second name: " ); gets( surname );
  strcpy( full_name, first_name );
  strcat( full_name, " " );
  strcat( full_name, surname );
  printf( "Full name is: %s", full_name );

Notice that to build up a new string, we need first to copy the first name into the full name string, and then a space, and then the surname. Also note that if we join two strings of a maximum of 20 characters each (plus 1 each for the zero-terminator) with a space between them, then the new string can be a maximum of 41 characters (plus 1 for the zero-terminator), which is why full_name is declared with 42 elements.

Here's how we might do it using sprintf:-

  char first_name[21], surname[21], full_name[42];
  printf( "First name: " ); gets( first_name );
  printf( "Second name: " ); gets( surname );
  sprintf( full_name, "%s %s", first_name, surname );
  printf( "Full name is: %s", full_name );

sprintf works the same way as printf, except the result goes into a string variable (full_name in this case) instead of to the screen. The s in sprintf refers to the results going to a string.

So in this case, we have specified that we wish to have a string followed by a space followed by a string as the results, where the first string is first_name and the second string is surname. Although the construction of a sprintf is a little more complex, you can get the same job achieved in only one line of programming, making your program easier-to-read and quicker-to-type!

Exercise 51 - Comma-separated list

A common task in programming is to produce a list of items, separated by commas - often to be output to a disk file (which can then be used by other computers, or perhaps a spreadsheet program).

We will create a program that will read in five items of data, and output them with commas between each of the items.

To do this, we will use three variables: one to count from 1 to 5 (let's call it count and make it an integer); one called data which is a string variable that can hold up to twenty characters (used to read data in from the keyboard), and one called result which is the list of data, separated by commas, built up from keyboard input.

Use a for loop to count from 1 to 5.

Inside the for loop, show the string Enter data item %d: where %d will be replaced with the for loop counter variable (count).

Then get the data from the keyboard into the data variable.

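Next, we will need to add the data item to the list: if this is the first item (count is 1), copy data into result using strcpy; otherwise, use strcat to add a comma to the end of result, and then use strcat again to add data after it.

When the loop has completed, we will output the list that has been built up in the result variable.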

The result might look like this:-

Enter data item 1: hello

Enter data item 2: bonjour

Enter data item 3: guten tag

Enter data item 4: buenos dias

Enter data item 5: buon giorno

List is: hello,bonjour,guten tag,buenos dias,buon giorno

Try changing the program to add an extra space after the comma. This looks neater on the screen:-

List is: hello, bonjour, guten tag, buenos dias, buon giorno

... but wouldn't be a good idea if we were going to output the data to a file. Lucky we're not covering that on this course!

Exercise 52 - Converting strings and numbers

What would happen if you tried the following program? 

#include <stdio.h>
#include <conio.h>
#include <string.h>

main()
{
  int result;
  char num1[21], num2[21];

  strcpy( num1, "2" );
  strcpy( num2, "3" );
  result = num1 + num2;

  printf( "Result of %s+%s=%d", num1, num2, result );
  getch();
}

Try it - you should get an error message on the line:-

  result = num1 + num2;

saying "Invalid Pointer Addition".

This means that you are trying to add two strings together. Now, a string might contain digits, as they do in this case, but they could just as easily contain "Fred" and "blue". What would we get if we added these together?

You cannot do arithmetic on strings in C. What we really want is to convert the string to a number and do the arithmetic on the results. To convert a string to an integer value, we use the atoi function, taking the string as the input parameter (i.e. between the brackets). The output (result) is the integer version of the string. The first character that atoi does not understand in the string will stop the conversion. If it doesn't understand even the first character of the string, it will give a result of zero (0).

The atoi function is located in the stdlib.h library, so make sure you include this too. Our program now becomes:-

#include <stdio.h>
#include <conio.h>
#include <string.h>
#include <stdlib.h>

main()
{
  int result;
  char num1[21], num2[21];

  strcpy( num1, "2" );
  strcpy( num2, "3" );
  result = atoi(num1) + atoi(num2);

  printf( "Result of %s+%s=%d", num1, num2, result );
  getch();
}

There is a similar function for converting strings to floating point numbers (e.g. into float variables) - this is called atof(), which is also found in stdlib.h and works in the same way, but make sure the result goes to a floating point variable (float or double) rather than an integer variable.

So how would we convert back to a string? This is where we use sprintf again. The following is an example to convert from an integer to a string:- 

  int i;
  char result[21];
  i = 10;
  sprintf( result, "%d", i );

... would take the contents of i and convert it to the string form - so result[0] would contain '1', result[1] would contain '0' and result[2] would contain '\0' (the zero terminator character).

Similarly, to convert from a floating point value to a string:-

  float f;
  char result[21];
  f = 3.5;
  sprintf( result, "%3.1f", f );

... would take the contents of f and convert it to the string form - so result[0] would contain '3', result[1] would contain '.', result[2] would contain '5' and result[3] would contain '\0' (the zero terminator character).

Exercise 53 - Numerology: Master Number

Numerology is a method of decomposing a name down to numbers. Each of your birth (or adoptive) names that comprise your whole name are taken separately, and the numbers that represent the letters are added together. If the total of each name is >9 (unless 11 or 22 which are considered MASTER NUMBERS), they are broken down by adding the individual digits together.

The totals of each name are then added together, and decomposed into a number from 1-9 or 11 or 22 in the same way. This is then your expression number, from which a very generalised description can be derived.

In order to turn this into a C program, we will need to take individual characters from the string and turn them into numbers, and also take individual digits for the number obtained for each name (i.e. turn the number into a string) and add them up to produce a final number, repeatedly until we get a number 1-9 or 11 or 22.

Then, we will take each of the possible numbers, and give a very generalised summary of potential profession / character traits.

The numbers for each letter are given in the table below:-

  1  2  3  4  5  6  7  8  9
  A  B  C  D  E  F  G  H  I
  J  K  L  M  N  O  P  Q  R
  S  T  U  V  W  X  Y  Z

If you would like more information on this, click here for a link to a web site from which these rules were obtained.

Here's the program to do this. Try it out and execute it, and see if you can work out how it comes to its answers:-

#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#include <string.h>
#include <ctype.h>

main()
{
  int i, j, result, subtotal, num, num_in_alphabet;
  char name[41], num_str[11], ch;

  printf( "Enter your name: " );
  /* the extra space makes sure the last name is totalled too */
  gets( name ); strcat(name," ");

  subtotal = 0; result = 0;
  for ( i=0; i<strlen(name); i++ )
  {
    ch = toupper(name[i]);
    if ( ch != ' ' )
    {
      /* still inside a name: add this letter's number to the subtotal */
      num_in_alphabet = ch - 'A'; /* gives 'A' as 0 */
      num = ( num_in_alphabet % 9 ) + 1;
      subtotal += num;
    }
    else
    {
      /* end of a name: reduce its subtotal to 1-9, 11 or 22 */
      while ( (subtotal>9) && (subtotal != 11) && (subtotal != 22) )
      {
        sprintf( num_str, "%d", subtotal );
        subtotal = 0;
        for ( j=0; j<strlen(num_str); j++ )
        {
          ch = num_str[j];
          subtotal += ch - '0';
        }
      }
      result += subtotal;
      subtotal = 0;
    }
  }

  /* reduce the total of all the names in the same way */
  while ( (result>9) && (result != 11) && (result != 22) )
  {
    sprintf( num_str, "%d", result );
    result = 0;
    for ( j=0; j<strlen(num_str); j++ )
    {
      ch = num_str[j];
      result += ch - '0';
    }
  }

  printf( "Your expression number is: %d\n\n", result);
  printf( "The traits are potentials - use them as you will...\n  ");
  switch(result)
  {
    case(1): printf("Skilled Executive; Achiever; Self-Centred"); break;
    case(2): printf("Diplomat/Partnership; Modest; Easily Hurt"); break;
    case(3): printf("Writer/Speaker; Optimist; Superficial"); break;
    case(4): printf("Craftsman; Responsible; Rigid/Dogmatic"); break;
    case(5): printf("Thinker/Seller; Love Freedom; Same mistakes"); break;
    case(6): printf("Carer/Artist; Loving/Generous; Martyr/Doormat"); break;
    case(7): printf("Scientist/Occultist; Rational/Balanced; Introverted/Intolerant"); break;
    case(8): printf("Manager/Entrepreneur; Ambitious/Confident; Status-driven/Stubborn"); break;
    case(9): printf("Teacher/Creative; Tolerant/Friendly; Aloof/Insensitive"); break;
    case(11): printf("MASTER NUMBER\nInspirational/Spiritual\n");
              printf("Intuitive, Analytical, Spiritual\nHigh-minded/Temperamental/Over-Sensitive");
              break;
    case(22): printf("MASTER NUMBER\nMaster Builder/Strong Leader\n");
              printf("Practical/Inner-Strength\nDominating/Overbearing;Eccentric");
  }
  printf("\n\n...That's you, that is!");

  getch();
}

Solution to Exercise 49

The following is a solution to exercise 49:-

#include <stdio.h>
#include <conio.h>
#include <string.h>

main()
{
  char post_code[51];

  do
  {
    printf("Enter post code: ");
    gets(post_code);
  } while ( (strlen(post_code) < 7) || (strlen(post_code) > 9) );

  printf( "The postcode %s is Okay.\n", post_code );
  getch();
}

Solution to Exercise 50

The following is a solution to exercise 50:-

#include <stdio.h>
#include <conio.h>
#include <string.h>

main()
{
  char colour1[11], colour2[11];

  do
  {
    printf("Enter first colour (magenta, cyan or yellow):");
    gets(colour1);
  } while ( (strcmp( colour1, "magenta" ) != 0) &&
            (strcmp( colour1, "cyan" ) != 0) &&
            (strcmp( colour1, "yellow" ) != 0)
          );

  do
  {
    printf("Enter second colour (magenta, cyan or yellow):");
    gets(colour2);
  } while ( (strcmp( colour2, "magenta" ) != 0) &&
            (strcmp( colour2, "cyan" ) != 0) &&
            (strcmp( colour2, "yellow" ) != 0)
          );

  if ( (
         (strcmp( colour1, "cyan") == 0 ) && (strcmp ( colour2, "yellow" ) == 0 )
       ) ||
       (
         (strcmp( colour1, "yellow") == 0 ) && (strcmp ( colour2, "cyan" ) == 0 )
       )
     )
     printf("result is green\n");

  if ( (
         (strcmp( colour1, "magenta") == 0 ) && (strcmp ( colour2, "yellow" ) == 0 )
       ) ||
       (
         (strcmp( colour1, "yellow") == 0 ) && (strcmp ( colour2, "magenta" ) == 0 )
       )
     )
     printf("result is red\n");

  if ( (
         (strcmp( colour1, "magenta") == 0 ) && (strcmp ( colour2, "cyan" ) == 0 )
       ) ||
       (
         (strcmp( colour1, "cyan") == 0 ) && (strcmp ( colour2, "magenta" ) == 0 )
       )
     )
     printf("result is blue\n");

  getch();
}

Solution to Exercise 51

The following is a solution to exercise 51:-

#include <stdio.h>
#include <conio.h>
#include <string.h>

main()
{
  int count;
  /* result needs room for 5 items of 20 characters, 4 separators
     of 2 characters each, and the zero terminator */
  char data[21], result[109];

  for ( count=1; count<=5; count++ )
  {
    printf( "Enter data item %d: ", count );
    gets( data );

    if ( count==1 )
      strcpy( result, data );
    else
    {
      strcat( result, ", " );
      strcat( result, data );
    }
  }

  printf( "List is: %s\n", result );
  getch();
}

In the Next Session...

In the next session we will be covering the following topics:-

(c) Copyright 2003-4 Simon Huggins.   All Rights Reserved.

This Page was last updated: 29 January 2004 13:07