Chapter 8 Conditional statements

So far we have only talked about code that runs in a single flow: all commands are executed in order, no matter what. This section is devoted to decision-making in code, that is, running different parts of the code selectively, depending on the results we have got so far.

8.1 if statement

The basic idea of conditional statements is to test a “condition” and execute a certain part of the code only if it is true. This is the basis of decision-making in computer programs.

We can write it in an abstract manner as

IF a condition is true THEN
  do something

The “something” will only be done if the condition is true. For instance, a store may implement a decision-making rule like

IF age >= 21 THEN
  sell liqueur

Such rules translate easily to R code as

if(age >= 21) {
   sell("liqueur")
}

Now the code will only sell liqueur to someone who is at least 21 years old. (Note that we still have to write the function sell() 😐).
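
To try the snippet before a real sell() exists, one can define a placeholder. The sketch below assumes a hypothetical sell() that only prints what was sold; it is a stand-in for experimenting, not the function the store would actually need.

```r
# Hypothetical stand-in for sell(): it only announces the sale
sell <- function(item) {
   cat("Selling", item, "\n")
   invisible(item)  # return the item name without printing it
}

age <- 25
if(age >= 21) {
   sell("liqueur")  # runs, because 25 >= 21 is TRUE
}

age <- 18
if(age >= 21) {
   sell("liqueur")  # skipped, because 18 >= 21 is FALSE
}
```

Only the first if-block prints a message; the second one does nothing.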

You can write such an if-block as a separate script, but that is of little use. More likely, it will be a part of a larger program, as decisions must be based on something that happened before. So you probably want to write some code before, and maybe after, the block. That code, both before and after the if-block, is executed unconditionally, no matter what the condition is. So we can expand our store rules as

ask age
IF age >= 21 THEN
  sell liqueur
sell tea

This should be understood as follows: the store asks everyone their age, sells liqueur only if the age is at least 21, and then sells tea to everyone, no matter what the age. In R code we can write this as:

age <- ask("what is your age")
if(age >= 21) {
   sell("liqueur")
}
sell("tea")

This example displays all the basic usage of an if-block. Here are some more details:

  • age <- ask(...) is executed unconditionally, no matter the age, as it is before the if-block. So everyone will be asked their age.
  • the if-block is made of the if() statement and the block of code thereafter. The condition is written in parentheses () and the block of code in curly braces { }.
  • the condition, here age >= 21, should be something that R can calculate (called an expression), and that results in a single TRUE/FALSE value, that is, a logical vector of length 1. It is a common mistake for the expression to result in a longer logical vector, which results in the error “the condition has length > 1”. See Section 8.3.
  • finally, sell("tea") is outside of the if-block (past the closing curly brace }). Hence it is not part of the block, and is always executed, no matter what the age.
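
The points above can be checked with a small runnable sketch. It assumes a hypothetical helper serve_customer() that takes the age as an argument (instead of asking for it interactively) and returns the items sold, so that the conditional and unconditional parts are easy to tell apart.

```r
# Hypothetical helper: returns the items a customer of the given age gets.
# The age is passed in directly instead of being asked interactively.
serve_customer <- function(age) {
   sold <- character(0)           # nothing sold yet
   if(age >= 21) {
      sold <- c(sold, "liqueur")  # only for customers aged 21 or more
   }
   sold <- c(sold, "tea")         # outside the if-block: always executed
   sold
}

serve_customer(30)  # both liqueur and tea
serve_customer(16)  # tea only
```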

Such an if-block is the most fundamental decision-making mechanism in programs. All coding languages contain comparable tools; conditional execution is present even in the extremely low-level assembly language.

The condition inside if() can be any variable or expression that results in a logical value (TRUE or FALSE). Here is an example using a logical expression, similar to how we tested age above:

porridge_temp <- 115
if(porridge_temp > 120) {  # logical expression
   cat("This porridge is too hot!\n")
}
                           # nothing printed: it is not too hot

But instead of doing the logical computation inside of if(), we can also use a logical variable:

too_cold <- porridge_temp < 70
if(too_cold) {  # a logical value
   cat("This porridge is too cold!\n")
}
                           # nothing printed: it is not too cold

Obviously, you can use more complex logical expressions, either inside of if() or separately:

if(porridge_temp > 70 & porridge_temp < 120) {
   cat("This porridge is just right!\n")
                           # prints this message
}
## This porridge is just right!
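
The same complex condition can also be computed separately and stored in a logical variable first. A minimal sketch, where the variable name just_right is only an illustration:

```r
porridge_temp <- 115
# compute the combined condition first and store it in a logical variable
just_right <- porridge_temp > 70 & porridge_temp < 120
just_right  # a single logical value
if(just_right) {
   cat("This porridge is just right!\n")
}
```

Giving the condition a name can make the if() line easier to read when the expression gets long.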

Exercise 8.1 Define a function compareStrings that takes two strings as arguments. It should print a sentence “The second string is longer” if the second string is longer. If not, it should not print anything.

Hint: nchar(string) tells how long the string is (how many characters it contains).

See the solution

Exercise 8.2 Write a for loop from 1 to 10. In the loop, print out the number, followed by the word “even” if the number is even.

  1. the output should look like
1
2
 - even
3
4
 - even
... 
  2. the output should look like:
1
2 - even
3
4 - even
...

Hint: think about whether you need to print the new line conditionally or unconditionally.

Remember: you can test evenness using the modulo operator %%, and the new line character is "\n".
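
As a quick reminder of how these two tools behave (a small sketch, not a solution to the exercise):

```r
7 %% 2        # remainder of 7 divided by 2
8 %% 2        # remainder of 8 divided by 2
8 %% 2 == 0   # TRUE: 8 is even
7 %% 2 == 0   # FALSE: 7 is odd
cat("a")      # no new line: the next output continues on the same line
cat("b\n")    # "\n" ends the line
```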

See the solution

8.2 if-else

An easy extension of the if statement is the if-else statement. It allows us to do something if the condition is true, and something else if it is false.

8.2.1 if-else basics

We can write it as

IF a condition is true
  do something
OTHERWISE
  do something else

In R, we write these conditional statements using the keywords if and else and the following syntax:

if(condition) {
  # do something if true
} else {
  # do something else if false
}

Note that the else needs to be on the same line as the closing } of the if block. Otherwise R thinks that it was just an if statement and gets confused when it suddenly encounters an else.
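
This rule can be checked without crashing a script by handing the code to parse() as text. The sketch below assumes top-level code, as in a script or at the console (inside a function body or another { } block, R reads the whole block first and accepts an else on a new line). The helper parses() is hypothetical and only reports whether R accepts the code.

```r
# else on the same line as the closing }
good <- "if(TRUE) {\n  1\n} else {\n  2\n}"
# else on a new line after the closing }
bad  <- "if(TRUE) {\n  1\n}\nelse {\n  2\n}"

# Hypothetical helper: try to parse the code, report whether R accepts it
parses <- function(code) {
   tryCatch({
      parse(text = code)
      TRUE
   }, error = function(e) FALSE)
}

parses(good)  # TRUE
parses(bad)   # FALSE: R complains about an unexpected 'else'
```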

Repeating the porridge example from above, we can write

porridge_temp <- 115
if(porridge_temp > 120) {
   cat("This porridge is too hot!\n")
} else {
   cat("This porridge is not too hot\n")
}
## This porridge is not too hot

The if-else sequence can be extended to more conditions using else if. For example:

test_food_temp <- function(temp) {
  if(temp > 120) {
    status <- "This porridge is too hot!"
  } else if(temp < 70) {
    status <- "This porridge is too cold!"
  } else {
    status <- "This porridge is just right!"
  }
  status
}
test_food_temp(119)  # "This porridge is just right!"
## [1] "This porridge is just right!"
test_food_temp(60)   # "This porridge is too cold!"
## [1] "This porridge is too cold!"
test_food_temp(150)  # "This porridge is too hot!"
## [1] "This porridge is too hot!"

See more about if, else and other related constructs in Section 16.

Exercise 8.3 Write a for loop from 1 to 10. In the loop, print out the number, followed by the word “even” or “odd”, depending on whether the number is even or odd. So the output should look like

1 odd
2 even
3 odd
4 even
...

Remember: you can test evenness using the modulo operator %%.

See the solution

Exercise 8.4 Can you afford to invite your friends for dinner?

How much money do you have? Put this into a variable. How many friends do you have? Put this into a variable. How much does a meal cost? Put this into a variable. What is the total price for all of you (including you!)? What is the total price with tip?

Print “can afford” or “cannot afford”, depending on whether you have more or less money than the total price.

See the solution

8.2.2 if() has return value

Sometimes it is useful to use the fact that if() and if-else have a return value: the value of the last expression they evaluate. This feature can be used for assignments, or when returning a value in a function. For instance, we can write

money <- 1000
comment <- if(money > 100) {
              "rich"
           } else {
              "poor"
           }
comment
## [1] "rich"

This piece of code first evaluates the if() statement, and given that money is 1000, it results in the value “rich”. This value is assigned to the variable comment. We can rewrite this statement as a function that comments on your financial situation:

financialStatus <- function(money) {
   if(money > 100) {
      "Congrats, you are rich :-)"
   } else {
      "You are poor :-("
   }
}
financialStatus(50)
## [1] "You are poor :-("

Now depending on the amount of money, the function makes a certain kind of comment, and as it is the last thing the function does, it will also be returned.

Such code tends to be easier to read than when using auxiliary variables and a return statement.
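
For comparison, here is a sketch of the same logic written with an auxiliary variable and an explicit return(). The name financialStatusVerbose is hypothetical and only used to keep the two versions apart.

```r
# The same logic with an auxiliary variable and an explicit return()
financialStatusVerbose <- function(money) {
   if(money > 100) {
      comment <- "Congrats, you are rich :-)"
   } else {
      comment <- "You are poor :-("
   }
   return(comment)
}
financialStatusVerbose(50)
```

Both versions give the same answer; the verbose one just needs an extra name and an extra statement to get there.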

Exercise 8.5 Rewrite the function test_food_temp() from above using these tools: it should contain no assignments and no return statements.

What do you think, which version is easier to understand?

See the solution

Exercise 8.6 Use this approach to implement absolute value: the function should test if the number is positive. In that case it should return the number itself, otherwise the negated number (the number with a minus sign in front).

Note: this approach does not work with vectors, see Section 8.3.

See the solution

8.3 Conditional statements and vectors

8.3.1 The problem

Above, we discussed the basic decision-making in code that is based on a logical condition:

IF the condition is true THEN
   do something

However, R is a vectorized language, and the condition may end up being not a single logical value, but a logical vector instead. For instance, what happens to our porridge-tester if we give it not a single temperature, but the temperatures of two bowls of porridge? Here is the result:

porridge_temp <- c(115, 125)
tooHot <- porridge_temp > 120
tooHot  # check what the value is
## [1] FALSE  TRUE
if(tooHot) {  # logical expression
   cat("This porridge is too hot!\n")
}
## Error in if (tooHot) {: the condition has length > 1

As you can see, this results in an error “the condition has length > 1”. How can we fix it?

Before we can even start to fix it, let’s first think about what the code should do if we feed in two logical values. Above, before it crashes due to the error, the program prints out the value of tooHot: it is FALSE and TRUE, so the first porridge is not too hot but the second one is. What should the code do now? Print the message? But the other one is just fine… Or should it print “too hot” when at least one of the bowls contains too hot porridge? Or maybe print a more complex message, something like “the first porridge is fine but the second one is too hot”? Or something else?

Whatever your answer is, it is fairly obvious that we cannot just live with a single message as above. We have to do something else. Here are a few suggestions. These may all be good options, depending on what you want to do:

  • Report if at least one porridge is too hot. So out of two children, at least one cannot eat their meal.
  • Report if all porridges are too hot. So even if you are alone with all these bowls, you still cannot eat.
  • Report a list of bowls and temperature, something like
  porridge 1 is just right
  porridge 2 is too hot
  ...
  • There are other options.

You cannot write any further code unless you have answered this basic question: what do you want to do?
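
Suppose the answer is the third option, a report for each bowl. One possible sketch, assuming a hypothetical helper report_bowls() and only the “too hot” test used so far, loops over the bowls so that each if() sees a single temperature:

```r
# Hypothetical helper: print one line per bowl
report_bowls <- function(temps) {
   # go through the bowls one by one, so each if() gets a single value
   for(i in seq_along(temps)) {
      if(temps[i] > 120) {
         cat("porridge", i, "is too hot\n")
      } else {
         cat("porridge", i, "is not too hot\n")
      }
   }
}
report_bowls(c(115, 125))
```

Here the condition temps[i] > 120 always has length 1, so the error does not occur.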

Exercise 8.7 You want to check out the new boba shop that just opened next door. You know it tends to be a pricy place, but they have a wide variety of interesting choices.

You write code to check if the items in their online menu are affordable (you set affordability limit at $7). What should the code report so you can decide if it is worth going?

See the solution

8.3.2 any() and all()

any() and all() are logical functions that mean pretty much the same as these words in ordinary language. They operate on logical vectors: any() tests if any of the logical values in the vector is true (and returns a single TRUE in that case), while all() tests if all values are true (and returns a single TRUE in that case). In some sense they are multi-way logical OR and AND.

Here is how they work. First, let’s create the logical vector:

temp <- c(50, 115, 125)  # temperature in 3 bowls
temp > 120
## [1] FALSE FALSE  TRUE

So the porridge in the first two bowls is not too hot, but in the third it is. any() returns TRUE as there is at least one too hot bowl:

any(temp > 120)
## [1] TRUE

but all() returns FALSE because not all porridge is too hot:

all(temp > 120)
## [1] FALSE

Now we can use these functions to do some testing. For instance, we can write code that tells if any of the porridge is too hot, if all porridge is too hot, or if none is too hot:

if(all(temp > 120)) {
   cat("All porridge is too hot! 😱\n")
} else if(any(temp > 120)) {
   cat("Some porridge is too hot! 🙄\n")
} else {
   cat("None of the porridge is too hot 👌\n")
}
## Some porridge is too hot! 🙄

This outputs “Some porridge is too hot” because only one bowl is too hot; the other two are fine (well, the first one is perhaps too cold…).

Note how the code works: first we test if all is too hot, and print the message if so. If this is not the case (that is what else does) we test if any of it is too hot (that is what else if does) and print the message if so. If none of this was the case, we print the third message.
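
The “multi-way OR and AND” view of any() and all() can be verified by hand on the three bowls. This is only an illustrative check; the variable name too_hot is made up for the example.

```r
temp <- c(50, 115, 125)
too_hot <- temp > 120

# any() gives the same answer as chaining OR over all elements
any(too_hot) == (too_hot[1] | too_hot[2] | too_hot[3])
# all() gives the same answer as chaining AND over all elements
all(too_hot) == (too_hot[1] & too_hot[2] & too_hot[3])
# "none is too hot" can also be tested directly by negating any()
!any(too_hot)
```

The first two comparisons are TRUE for these temperatures, and the last line is FALSE because one bowl is too hot.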

Exercise 8.8

The boba shop that just opened next door has the following menu:
  • Brown sugar boba milk ($5)
  • Creme Brulee boba milk ($6)
  • Red bean boba milk ($7)
  • Grass jelly boba milk ($8)

Put these prices in a vector. Write code that tests if any of the menu items is affordable (price no more than $7) and prints either “you can get a drink” or “this is a too expensive place”.

Show that your code works both with the prices above, and when the place raises the price of all drinks by $3.

See the solution

8.3.3 ifelse()

Both any() and all() allow us to make a decision, depending on whether any or all of the logical conditions are true. But in both cases, it can only be a single decision: we either print this message or another message; we either compute this value or another value.

ifelse() is a way to perform multiple decisions: one for each logical value in the condition vector. In a way, it is just vectorised if-else. Let’s explain it with an example.

We can implement absolute value using ifelse(). But first, for a single number \(x\), we can compute absolute value (call it \(y\)) with an ordinary if statement as:

abs <- function(x) {
   y <- x
   if(y < 0) {
      y <- -x
   }
   y
}
abs(-3)
## [1] 3
abs(3)
## [1] 3

But this only works for single numbers. If we want to compute it for a whole vector, then we need to repeat the above for every single component, perhaps by putting it in a loop inside of the abs() function.

Exercise 8.9 Use the abs function above. Compute abs(c(-3, 3)). What happens? Why?

Note: this exercise is about using the abs function above. R includes a built-in abs() function that is vectorized. That is not what this exercise is about.

ifelse() includes such a loop. So we can re-define absolute value as

abs <- function(x) {
   y <- x
   y <- ifelse(y < 0, -y, y)
   y
}
abs(c(-3, 3, -1, 1))
## [1] 3 3 1 1

ifelse() takes three arguments: the first is the logical condition, a logical vector. This can be of any length. The second argument contains the values to pick where the condition is true. This is also a vector of the same length as the condition. And the last one is the vector of values to pick where the condition is false. So we can write it as

ifelse(condition, true value, false value)

ifelse() selects either true values or false values, depending on the logical conditions, element by element.

Here is an explanation of how the ifelse() example above works. Its first argument is y < 0. Depending on the value of each component of y, it is either TRUE or FALSE. If it is true, ifelse picks the true value \(-y\). If it is false, it picks the false value \(y\). So we end up with true values \(-y\) for the 1st and 3rd component, and false values \(y\) for the 2nd and 4th component. We have implemented absolute value.
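
The same element-by-element selection works for non-numeric results too. A small sketch with the porridge temperatures from Section 8.3.2; here the true and false values are single strings, which ifelse() recycles to the length of the condition:

```r
temp <- c(50, 115, 125)  # temperature in 3 bowls
# one label per bowl, picked according to the condition for that bowl
labels <- ifelse(temp > 120, "too hot", "not too hot")
labels
```

The result has one label for each bowl: the first two are "not too hot" and the third is "too hot".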

Exercise 8.10 Implement a similar vectorized absolute value using if, else, and a for loop.

Some notes about code style: the example abs function is unnecessarily complex. To begin with, we can simplify it as

abs <- function(x) {
   y <- x
   ifelse(y < 0, -y, y)
}
abs(c(-3, 3, -1, 1))
## [1] 3 3 1 1

because ifelse() returns the result vector, and as this is the last thing that the function does, it will be automatically returned. Moreover, we do not need the y <- x assignment either: we can just work on x:

abs <- function(x) {
   ifelse(x < 0, -x, x)
}
abs(c(-3, 3, -1, 1))
## [1] 3 3 1 1

This is perhaps the cleanest and simplest way to implement absolute value.
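
As a sanity check, the ifelse() version can be compared with R's built-in vectorized abs(). The sketch below uses the hypothetical name my_abs so that it does not mask the built-in function, which is then reached as base::abs().

```r
# Same definition under a hypothetical name, so the built-in abs() stays visible
my_abs <- function(x) {
   ifelse(x < 0, -x, x)
}
x <- c(-3, 3, -1, 1, 0, -2.5)
my_abs(x)
all(my_abs(x) == base::abs(x))  # do both give the same values?
```

For these inputs the two functions agree on every element.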

8.3.4 TBD:

  • x == TRUE never needed
  • %in%
  • for loop over vectors
  • exiting loop