Chapter 8 Conditional statements

So far we have only talked about code that runs in a single flow: all commands are executed in the given order, no matter what. This chapter is devoted to decision-making in code, that is, running different parts of the code selectively, depending on a “condition” based on the results we have computed so far. We start with the most basic conditional statement, the if-statement (Section 8.1), then add the “otherwise” option to it (Section 8.2), and finally discuss the complications that arise when conditional statements meet vectors (Section 8.3).

8.1 if statement

The idea of conditional statements is to test a “condition”, and execute a certain part of the code only if it is true. This is the basic way of decision-making in computer programs.

We can write it in an abstract manner as

IF a condition is true THEN
  do something

The “something” will only be done if the condition is true. For instance, a store may implement a decision-making rule like

IF age >= 21 THEN
  sell liqueur

Such rules translate easily to R code as

if(age >= 21) {
   sell("liqueur")
}

Now the code will only sell liqueur to someone who is at least 21 years old. (Note that we still have to write the function sell() 😐).

You can write such an if-block as a separate script, but that is of little use. More likely, it will be a part of a larger program, as the decisions must be based on something that happens before. So you probably want to write some code before, and maybe also after, the if-block. That code, both before and after the if-block, is executed unconditionally, no matter what the condition is. So we can expand our store rules as

ask age
IF age >= 21 THEN
  sell liqueur
sell tea

The store asks age from everyone, sells liqueur only if age is at least 21, but afterward sells tea to everyone, no matter what the age. In R code we can write this as:

age <- ask("what is your age")
if(age >= 21) {
   sell("liqueur")
}
sell("tea")

This example displays all the basic features of an if-block. Here are some more details:

  • age <- ask(...) is executed unconditionally, no matter the age, as it is before the if-block. So everyone will be asked their age.
  • The if-block consists of the if() statement and the block of code that follows it. The condition is written in parentheses () and the block of code in curly braces { }.
  • The condition, here age >= 21, should be something that R can calculate (called an expression) and that results in a single TRUE/FALSE value, that is, a logical vector of length 1. It is a common mistake to use an expression that results in a longer logical vector; this gives the error “the condition has length > 1”. See Section 8.3.
  • finally, sell("tea") is outside the if-block (past the closing curly brace }). Hence it is not part of the block, and is always executed, no matter what the age.
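The store example above cannot be run as is, because sell() and ask() are not defined. Here is a minimal runnable sketch, assuming a hypothetical sell() that only prints a message, and a hypothetical wrapper serve_customer() that takes the age as an argument instead of asking for it:

```r
## Hypothetical stand-in for the store's sell() function:
## it only prints what is being sold
sell <- function(item) {
   cat("Selling ", item, "\n", sep = "")
}

## Hypothetical helper that wraps the store rules, so we can
## try different ages without asking the user
serve_customer <- function(age) {
   if(age >= 21) {
      sell("liqueur")  # conditional: only if age is at least 21
   }
   sell("tea")         # unconditional: outside the if-block
}

serve_customer(25)
## Selling liqueur
## Selling tea
serve_customer(18)
## Selling tea
```

Both customers get tea, but only the 25-year-old gets liqueur.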

Such an if-block is the most fundamental decision-making mechanism in programs. All programming languages contain comparable tools; conditional execution is present even in the extremely low-level assembly language.

The condition inside if() can be any variable or expression that results in a logical value (TRUE or FALSE). Here is an example using a logical expression, similar to how we tested age above:

porridge_temp <- 115
if(porridge_temp > 120) {  # logical expression
   cat("This porridge is too hot!\n")
}
                           # nothing printed: it is not too hot

But instead of doing the logical computation inside of if(), we can also use a logical variable:

## The same approach, but testing a logical variable
too_cold <- porridge_temp < 70
if(too_cold) {  # a logical value
   cat("This porridge is too cold!\n")
}

Obviously, you can use more complex logical expressions, either inside of if() or computed separately:

if(porridge_temp > 70 & porridge_temp < 120) {
   cat("This porridge is just right!\n")
                           # prints this message
}
## This porridge is just right!
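The text above mentions that a complex condition can also be computed separately. A minimal sketch of that variant, using the same temperature as above (the variable name just_right is our own choice):

```r
porridge_temp <- 115
## Compute the compound condition separately and store it
## in a logical variable
just_right <- porridge_temp > 70 & porridge_temp < 120
just_right
## [1] TRUE
if(just_right) {
   cat("This porridge is just right!\n")
}
## This porridge is just right!
```

Giving the condition a descriptive name may make the if() line easier to read, in particular when the expression is long.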

Exercise 8.1 Define a function compareStrings that takes two strings as arguments. It should print a sentence “The second string is longer” if the second string is longer. If not, it should not print anything.

Hint: nchar(string) tells how long the string is (how many characters it contains).

See the solution

Exercise 8.2 Write a for loop from 1 to 10. In the loop, print out the number, followed by the word “even” if the number is even.

  1. the output should look like
1
2
 - even
3
4
 - even
... 
  2. the output should look like:
1
2 - even
3
4 - even
...

Hint: think about whether you need to print the new line conditionally or unconditionally.

Remember: you can test evenness using the modulo operator %%, and the new line character is "\n".

See the solution

8.2 if-else

An easy extension of the if-statement is the if-else statement. It allows us to do something if the condition is true, and something else if it is false.

8.2.1 if-else basics

We can write it as

IF a condition is true
  do something
OTHERWISE
  do something else

In R, we write these conditional statements using the keywords if and else using the following syntax:

if(condition) {
  # do something if true
} else {
  # do something else if false
}

Note that the else needs to be on the same line as the closing } of the if-block. Otherwise R thinks that it was just an if statement and gets confused when it suddenly encounters an else.
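The placement rule for else can be checked without breaking a script. The sketch below stores two versions of the same code as text and asks R to parse them; the helper can_parse() and the variable names are our own, introduced only for this illustration:

```r
## Two versions of the same code, stored as text so that we
## can check whether R is able to parse them
else_same_line <- "if(x > 0) {\n   cat('positive')\n} else {\n   cat('not positive')\n}"
else_next_line <- "if(x > 0) {\n   cat('positive')\n}\nelse {\n   cat('not positive')\n}"

## Hypothetical helper: TRUE if the text is valid R code
can_parse <- function(code) {
   tryCatch({
      parse(text = code)
      TRUE
   },
   error = function(e) FALSE)
}
can_parse(else_same_line)
## [1] TRUE
can_parse(else_next_line)
## [1] FALSE
```

In the second version the if-block is already complete when R reaches the line break after }, so the else on the next line has nothing to attach to.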

Repeating the porridge example from above, we can write

porridge_temp <- 115
if(porridge_temp > 120) {
   cat("This porridge is too hot!\n")
} else {
   cat("This porridge is not too hot\n")
}
## This porridge is not too hot

The if-else sequence can be extended to more conditions using else if. For example, here is a function that can be used to test temperature of different bowls of porridge:

test_food_temp <- function(temp) {
   ## The function takes a single argument, the temperature
   ## of the porridge
   if(temp > 120) {
      status <- "This porridge is too hot!"
   } else if(temp < 70) {
      status <- "This porridge is too cold!"
   } else {
      status <- "This porridge is just right!"
   }
   status
}
test_food_temp(119)  # just right!
## [1] "This porridge is just right!"
test_food_temp(60)   # too cold!
## [1] "This porridge is too cold!"
test_food_temp(150)  # too hot!
## [1] "This porridge is too hot!"

See more about if, else and other related constructs in Section 16.

Exercise 8.3 Write a for loop from 1 to 10. In the loop, print out the number, followed by the word “even” or “odd”, depending on whether the number is even or odd. So the output should look like

1 odd
2 even
3 odd
4 even
...

Remember: you can test evenness using the modulo operator %%.

See the solution

Exercise 8.4 Can you afford to invite your friends to dinner?

How much money do you have? Put this into a variable. How many friends do you have? Put this into a variable. How much does a meal cost? Put this into a variable. What is the total price for all of you (including you!)? What is the total price with tip?

Print “can afford” or “cannot afford”, depending on if you have more or less money than the total price.

See the solution

8.2.2 if() has return value

Sometimes it is useful to use the fact that if() and if-else have a return value: the value of the last expression they evaluate. This feature can be used for assignments, or when returning a value from a function. Also, the ifelse() function we discuss in Section 8.3.3 is used exclusively through its return value. For instance, we can write

money <- 1000
comment <- if(money > 100) {
              "rich"
           } else {
              "poor"
           }
comment
## [1] "rich"

This piece of code first computes the if() statement, and given money is 1000, it results in value “rich”. This value is assigned to variable comment. We can re-write this statement as a function that comments on your financial situation:

financialStatus <- function(money) {
   if(money > 100) {
      "Congrats, you are rich :-)"
   } else {
      "You are poor :-("
   }
}
financialStatus(50)
## [1] "You are poor :-("

Now, depending on the amount of money, the function comments on your financial situation. As this is the last thing the function does, the comment will also be its return value, here printed on screen.

Such code tends to be easier to read than code that uses auxiliary variables and a return statement.
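For comparison, here is a sketch of how the same function might look with an auxiliary variable and an explicit return statement. The name financialStatus2 and the variable status are our own, chosen only for this comparison:

```r
## Version with an auxiliary variable and an explicit return()
financialStatus2 <- function(money) {
   if(money > 100) {
      status <- "Congrats, you are rich :-)"
   } else {
      status <- "You are poor :-("
   }
   return(status)
}
financialStatus2(50)
## [1] "You are poor :-("
```

Both versions give the same result; the difference is only in how much bookkeeping the reader has to follow.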

Exercise 8.5 Rewrite the function test_food_temp() from above using these tools: it should contain no assignments and no return statements.

What do you think, which version is easier to understand?

See the solution

Exercise 8.6 Use this approach to implement absolute value: the function should test if the number is positive. In that case it should return the number, otherwise the negative of the number.

Note: this approach does not work with vectors, see Section 8.3.

See the solution

8.3 Conditional statements and vectors

Above, we discussed the decision-making based on a single condition. But it turns out that we need a slightly different approach if we have multiple different conditions–a data vector.

8.3.1 The problem

The basic if-else decision-making above is based on a logical condition:

IF the condition is true THEN
   do something

However, R is a vectorized language, and the conditions may end up being not a single logical value, but logical vectors instead. For instance, what happens to our porridge-tester if we pass it not a single temperature, but temperatures of two bowls of porridge? Say, one bowl has temperature 115 and the other 125:

porridge_temp <- c(115, 125)
tooHot <- porridge_temp > 120
tooHot  # a vector of two logical values
## [1] FALSE  TRUE

Now let’s attempt to test the temperature with if() as in Section 8.1 above:

if(tooHot) {
   cat("Too hot!\n")
}
## Error in if (tooHot) {: the condition has length > 1

This results in an error “the condition has length > 1”. How can we fix it?

Before we can even start fixing it, let’s think about what the code should do if it encounters two bowls of porridge, that is, two logical values. In the example above, the first porridge is not too hot but the second one is too hot. What should the code do now? Print the “Too hot” message? But the other bowl is just fine… Should it print “too hot” when only one bowl is too hot? Or maybe we need to print a more complex message, something like “the first porridge is fine but the second one is too hot”? Or maybe something else?

Whatever your answer is, it is fairly obvious that we cannot just live with a single message as above. We have to do something else. Here are a few suggestions. These may or may not be good options, depending on what exactly you want to achieve:

  • Report if at least one porridge is too hot. So there will be at least one child who cannot eat their meal.
  • Report if all porridges are too hot. So even if you are alone with all these bowls, you still cannot eat.
  • Report a list of bowls and temperature, something like
  porridge 1 is just right
  porridge 2 is too hot
  ...

You cannot write any further code unless you have answered this basic question: what do you want to do?

Exercise 8.7 You want to check out the new boba shop that just opened next door. You know it tends to be a pricy place, but they have a wide variety of interesting choices.

You write code to check if the items in their online menu are affordable (you set affordability limit at $7). What should the code report so you can decide if it is worth going?

See the solution

8.3.2 any() and all()

any() and all() are logical functions that mean pretty much the same as these words do in ordinary language. They operate on logical vectors: any() tests if any of the logical values in the vector is true (and returns a single TRUE in that case), all() tests if all values are true (and returns a single TRUE in that case). In effect, they are multi-way logical OR and AND operations.

Here is an example of how to use these. First, let’s create the logical vector:

temp <- c(50, 115, 125)  # temperature of 3 bowls
temp > 120
## [1] FALSE FALSE  TRUE

So the porridge in the first two bowls is not too hot, but in the third one it is. any() returns TRUE as there is at least one too hot bowl:

any(temp > 120)
## [1] TRUE

but all() returns FALSE because not all porridge is too hot:

all(temp > 120)
## [1] FALSE

Hence we can use any() in case we want to test if any of the bowls is too hot, and all() will tell us if all bowls are too hot. The former is what we need if each child has just one bowl of porridge, the latter is what we need if you are alone with several bowls. This is an extremely important point: both any() and all() may be correct or incorrect, and it depends on what exactly you want to do!
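The claim that any() and all() are multi-way OR and AND operations can be checked by hand. A small sketch, where the vector tooHot is typed in directly and holds the same values as temp > 120 above:

```r
tooHot <- c(FALSE, FALSE, TRUE)
## any() gives the same answer as chaining OR over all elements
tooHot[1] | tooHot[2] | tooHot[3]
## [1] TRUE
any(tooHot)
## [1] TRUE
## all() gives the same answer as chaining AND over all elements
tooHot[1] & tooHot[2] & tooHot[3]
## [1] FALSE
all(tooHot)
## [1] FALSE
```

The advantage of any() and all() is that they work for vectors of any length, without spelling out each element.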

Now we can use any() and all() to do some more porridge testing. For instance, we can write code that tells if any of porridge is too hot, if all porridge is too hot, or if none is too hot:

if(all(temp > 120)) {
   cat("All porridge is too hot! 😱\n")
} else if(any(temp > 120)) {
   ## We only get here if not all porridge is too hot!
   cat("Some porridge is too hot! 🙄\n")
} else {
   ## We only get here if neither all, nor any porridge
   ## is too hot!
   cat("None of the porridge is too hot 👌\n")
}
## Some porridge is too hot! 🙄

This outputs “Some porridge is too hot” because only one bowl is too hot, the other two are fine (well, the first one is perhaps too cold…).

Note how the code works: first we test if all is too hot, and print the message if so. If this is not the case (that is what the first else does) then we test if any of it is too hot (that is what the else if does) and print the message if so. If neither of these was the case, then the code prints the third message.
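To see all three branches at work, we can wrap the same chain of tests into a function and feed it different temperature vectors. This is only a sketch: the function name porridge_report is our own, and it returns the message (using the return value of if-else, see Section 8.2.2) instead of printing it:

```r
## Hypothetical helper: the same if / else if / else chain as
## above, but returning the message instead of printing it
porridge_report <- function(temp) {
   if(all(temp > 120)) {
      "All porridge is too hot!"
   } else if(any(temp > 120)) {
      "Some porridge is too hot!"
   } else {
      "None of the porridge is too hot"
   }
}
porridge_report(c(130, 125, 140))
## [1] "All porridge is too hot!"
porridge_report(c(50, 115, 125))
## [1] "Some porridge is too hot!"
porridge_report(c(50, 115, 100))
## [1] "None of the porridge is too hot"
```

The order of the tests matters: whenever all() is true for a non-empty vector, any() is true as well, so all() has to be tested first.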

Exercise 8.8

The boba shop that just opened next door has the following menu:
  • Brown sugar boba milk ($5)
  • Creme Brulee boba milk ($6)
  • Red bean boba milk ($7)
  • Grass jelly boba milk ($8)

Put these prices in a vector. Write code that tests if any of the menu items is affordable (price no more than $7) and prints either “you can get a drink” or “this is a too expensive place”.

Show that your code works:
  • using the original prices above
  • when the place raises the price of all drinks by $3.

See the solution

8.3.3 ifelse()

Both any() and all() allow us to make a decision, depending on whether any or all of the logical conditions are true. But in both cases, it can only be a single decision: we either print this message or another message; we either compute this value or another value.

ifelse() is a way to perform multiple decisions: one for each logical value in the condition vector. In a way, it is just vectorised if-else. Let’s explain it with an example.

As an example, let’s implement the absolute value. For a single number \(x\), we can compute it with an ordinary if-else:

absv <- function(x) {
   if(x < 0) {
      -x
   } else {
      x
   }
}
absv(-3)
## [1] 3
absv(3)
## [1] 3

But this only works for single numbers. If we want to compute it for a whole vector, then we need to repeat the above for every single component, for instance by putting it in a loop inside of the absv() function.

Exercise 8.9 Use the absv() function above. Compute absv(c(-3, 3)). What happens? Why?

Note: this exercise is about using the absv() function above. R includes a built-in abs() function that is vectorized. That is not what this exercise is about.

See the solution

Exercise 8.10 Amend the absv() function above with a for loop, so that absv(c(-3, 3)) will give the correct answer.

(No solution provided)

ifelse() contains such a loop. So we can re-define absolute value as

abs <- function(x) {
   ifelse(x < 0,
          -x,
          x)
}
abs(c(-3, 3, -1, 1))
## [1] 3 3 1 1

ifelse() takes three arguments. The first is the logical condition, a logical vector, which can be of any length. The second argument is the value to use where the condition is true; this is also a vector of the same length as the condition. And the third argument is the vector of values to pick where the condition is false. So we can write it schematically as

ifelse(condition, true value, false value)

ifelse() selects either true values or false values, depending on the logical conditions. This is done independently for each element of the vector.

Here is another explanation how the ifelse() example above works. Its first argument is x < 0. Depending on the value of x, it is either TRUE or FALSE. If it is true, ifelse picks the true value \(-x\). If it is false, it picks the false value \(x\). So we end up with true values \(-x\) for 1st and 3rd component, and false values \(x\) for 2nd and 4th component. The result is absolute value.
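The explanation above can be followed step by step by printing each of the three arguments separately, and then the result. A minimal sketch with the same example vector:

```r
x <- c(-3, 3, -1, 1)
x < 0                 # the condition
## [1]  TRUE FALSE  TRUE FALSE
-x                    # the true values
## [1]  3 -3  1 -1
x                     # the false values
## [1] -3  3 -1  1
ifelse(x < 0, -x, x)  # pick element by element
## [1] 3 3 1 1
```

For the 1st and 3rd elements the condition is true, so the result is taken from -x; for the 2nd and 4th elements it is taken from x.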

Exercise 8.11 Use ifelse() to emulate a step function: \[ f(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 & \text{otherwise} \end{cases} \]

Show that it works with the example vector c(-3, 3, -1, 1) above.

See the solution

Exercise 8.12 Use ifelse() to implement leaky ReLU: \[ f(x) = \begin{cases} x & \text{if } x > 0 \\ 0.1\,x & \text{otherwise} \end{cases} \]

Show that it works with the example vector c(-3, 3, -1, 1) above.

See the solution

Exercise 8.13 Use ifelse() to implement the sign function: \[ f(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ 1 & \text{otherwise} \end{cases} \]

Show that it works with vector c(-3, 3, -1, 1, 0).

Hint: you need to use one ifelse() inside another ifelse().

See the solution

Exercise 8.14 Consider four bowls of porridge with temperatures 100, 130, 110, 140F. Use ifelse() to write code that takes the vector of temperatures, and outputs a character vector:

"Bowl 1 is all right" "Bowl 2 is too hot" ...

depending on whether the porridge is hotter than 120F.

Hint: use paste() and sequences to attach the “Bowl 1” part in front of the text.

See the solution

8.4 A few useful and useless tools

8.4.1 %in%: is the value in a set of given values?

It is possible to test if a value belongs to a set using multiple logical OR-s (|). For instance, to test if a given number is either “1”, “3”, or “7”, we can write:

if(x == 1 | x == 3 | x == 7) ...

However, in case of a long list, this will be hard to read. A very useful alternative is the operator %in% that returns “true” if the value is in the set and “false” otherwise. So instead of a sequence of logical OR-s, we can write

if(x %in% c(1, 3, 7)) ...

%in% is not directly related to if(). It is a separate operator that tests if x is either “1”, “3” or “7”, and returns “true” or “false” accordingly. For instance:

x <- 7
x %in% c(1,3,7)
## [1] TRUE

Indeed, “7” is in the given set.

But note that %in% is vectorized, like many other operators in R: it returns “true” or “false” for each component of x:

x <- c(2,3,4,5)
x %in% c(1,3,7)
## [1] FALSE  TRUE FALSE FALSE

Here only the second component, 3, is part of the set. This allows us to use all() and any() to find out if any (or all) of the elements are in the set, or to use ifelse() if we need to make a decision for every single element of the vector.
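Here is a short sketch that ties these pieces together for the same vector: it compares %in% with the chain of logical OR-s, and then combines it with any(), all() and ifelse(). The labels "in set" and "not in set" are made up for this illustration:

```r
x <- c(2, 3, 4, 5)
## %in% gives the same result as the chain of logical OR-s
x == 1 | x == 3 | x == 7
## [1] FALSE  TRUE FALSE FALSE
x %in% c(1, 3, 7)
## [1] FALSE  TRUE FALSE FALSE
## Single decision: is at least one element in the set? Are all?
any(x %in% c(1, 3, 7))
## [1] TRUE
all(x %in% c(1, 3, 7))
## [1] FALSE
## One decision per element
ifelse(x %in% c(1, 3, 7), "in set", "not in set")
## [1] "not in set" "in set"     "not in set" "not in set"
```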

Exercise 8.15

Consider a vector c("a", "b", "c") and a set c("c", "b", "d"). Use if-else to write code that
  • prints “all in” if all the elements of the vector are in the set
  • prints “some in” if some of the elements are in the set
  • otherwise prints “none in”

See the solution

Exercise 8.16

Indian states and the Southern Zone

The states of India. The Southern Zone, consisting of five states and the Puducherry territory, is circled.

By Rajeshodayanchal at Malayalam Wikipedia, CC BY-SA 3.0. Original at wikimedia commons

Consider a vector of Indian States: “Madhya Pradesh”, “Orissa”, “Andra Pradesh”, “Karnataka”, “Gujarat”, “Andra Pradesh”, “Kerala”, “West Bengal”, “Punjab” and “Karnataka”.
Which of these states belong to the Southern zone, consisting of “Telangana”, “Andra Pradesh”, “Karnataka”, “Tamil Nadu”, “Kerala” and “Puducherry”?
  • Test it using %in% and produce a vector of “trues” and “falses”
  • Instead of a logical vector, create a character vector of “South” and “Not South”, depending on whether the state is in the South.

See the solution

8.4.2 x == TRUE is not needed

Beginners sometimes write code along the lines:

if(x == TRUE) {
   ## do something...
}

It is not wrong, but x == TRUE is almost never needed. Why?

This is because x == TRUE is true only if x is true! Hence the code above is equivalent (but see the exercise below) to just

if(x) {
   ## do something...
}
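A quick check of this equivalence, assuming that x already holds logical values (the example vector is our own):

```r
x <- c(TRUE, FALSE)
x == TRUE
## [1]  TRUE FALSE
identical(x == TRUE, x)
## [1] TRUE
```

So for logical values the comparison with TRUE just reproduces x and adds nothing.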

Exercise 8.17

However, there are some devilish details why x == TRUE and x actually differ.
  • What type of variable do you expect x to be for x == TRUE to be true? (See Section 2.5). Note: it is about the data type of the variable, not about its value!
  • What happens if x is of a different data type? Explain this!
  • Now pick a different value, e.g. x <- "true" and try both constructs:
if(x == TRUE) {
   "true"
}

and

if(x) {
   "true"
}

Explain why you get different results!

See the solution

8.5 Summary

Single condition (not a vector), single decision:

  • if(): One decision

    if(x == 1) {
       ## do something
    }

    See Section 8.1

  • if() - else: One decision with the alternative

    if(x == 1) {
       ## do something
    } else {
       ## do something else
    }

    See Section 8.2

  • if() - else if() - else: Series of alternative decisions

    if(x == 1) {
       ## do something for x == 1
    } else if(x == 2) {
       ## do something for x == 2
    } else {
       ## do something for other x values
    }

    See Section 8.2

Multiple conditions, single decision (x is a vector):

  • if(any(x == 1)): at least one condition true

    if(any(x == 1)) {
       ## do something if at least one x equals '1'
    }
  • if(all(x == 1)): all conditions true

    if(all(x == 1)) {
       ## do something if all x values are '1'
    }

    Combine these with else and else if(). See Section 8.3.2

Multiple conditions, multiple decisions (x is a vector):

  • ifelse(x == 1, true value, false value): produces the true value for each component where the condition is true, and the false value for each component where the condition is false
ifelse(x == 1, "one", "not one")
## [1] "not one" "not one" "not one" "not one"

You may replace the true value and/or false value with another ifelse() for more complex selections. See Section 8.3.3
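A minimal sketch of such a nested ifelse(), with example values and labels that we made up for illustration:

```r
x <- c(1, 2, 3, 1)  # example values, chosen for illustration
ifelse(x == 1, "one",
       ifelse(x == 2, "two", "other"))
## [1] "one"   "two"   "other" "one"
```

The inner ifelse() is only consulted for those components where the outer condition x == 1 is false.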