B Command line

B.1 What is command line and why it is useful

The command-line is an interface to a computer—a way for you (the human) to communicate with the machine. But unlike common graphical interfaces that use windows, icons, menus, and pointers, the command-line is text-based: you type commands instead of clicking on icons. The command-line lets you do everything you’d normally do by clicking with a mouse, but by typing in a manner similar to programming!

The command line originates from computer terminals, from a time when there were no graphical displays. Instead, computers in the 1960s and 1970s often had “terminals”, essentially typewriters, connected to them. One had to type the commands on the keyboard, and the reply came in the form of printed text.

An example of the command-line in action (from Wikipedia).

Nowadays, it is most common to use the command line inside a window on the graphical desktop. You open the appropriate program, still commonly called a “terminal”, and enter commands there; the computer then executes them. Typical tasks done through the command line are often standard computer tasks: moving and renaming files, viewing images, and opening apps. Outside of the personal computer sphere, it is very common to set up servers with no graphical interface whatsoever (frequently without any screens or keyboards either), so one can access those only through the command line over the internet.

The command-line is not as friendly or intuitive for beginners as a graphical interface: it’s harder to learn and even harder to figure out without learning. However, it has the advantage of being both more powerful and more efficient in the hands of expert users. (It’s faster to type than to move a mouse, and you can do lots of “clicks” with a single command). The command-line is also used when working on remote servers or other computers that for some reason do not have a graphical interface enabled. Thus, command line is an essential tool for all professional developers, particularly when working with large amounts of data or files.

Command line has a number of advantages over graphical interfaces:

Commands tend to be more powerful and faster to use. While simple things are easy to do using the mouse, complex tasks tend to be easier using dedicated commands.
Commands can be easily recorded, copied, edited, and repeated. You can write commands down as a list and then execute all of them in the list. If you notice that something went wrong in the process, then you can edit your list and repeat again, until all goes well. (This is called “shell scripting”.) In contrast, such complex tasks may need dozens of mouse clicks and movements, and if something goes wrong then you cannot just edit an individual click: you have to start all over again.
Commands are also easier to explain in a text-based medium. Explaining menus and dialogs often needs images or video tutorials.

But for beginners, command line tends to be much harder:

Commands are not as “discoverable” as menus and icons you can just click on. While you often can figure out how to perform basic tasks in a new app just by navigating around in the menus and dialog windows, this is hardly possible with commands. Usually, you have to start with reading the manuals and tutorials.
Basic computer education these days does not include any command line experience. So encountering the command line for the first time may feel quite intimidating, in particular if you are thrown into it with no training.
Command line often provides less visual feedback. For instance, you typically do not have access to the list of files, unless you explicitly ask for it. This contrasts with graphical file managers, where navigation automatically shows file and folder icons.

But the advantages of commands outweigh the downsides in various situations. You may be surprised that a large number of common apps contain one or another version of a command line. Examples include both computer games and browser search bars.

This chapter will give you a brief introduction to basic tasks using the command-line: enough to get you comfortable navigating the interface and able to interpret commands.

B.2 Accessing the Command-Line

Normally you use the command-line tools through a command shell (a.k.a. a terminal). This is a program that provides the interface to type commands into. You should have installed a command shell (hereafter “the terminal”) as part of setting up your machine. Besides a dedicated terminal, there are other options, e.g. RStudio can run a terminal in one of its “tabs” (see Section 1.3 for more).

Once you open up the shell (Terminal or Git Bash), you should see something like this (red notes are added for reference):

This is the textual equivalent of having opened up Finder or File Explorer and having it show you the user’s “Home” folder. The text you see (typically called command prompt or shell prompt) tells you:

What machine you’re currently interfacing with (you can use the command-line to control different computers across a network or the internet).
What directory (folder) you are currently looking at (~ is a shorthand for the “home directory”).
What user you are logged in as.

After that you’ll see the prompt (typically denoted as the $ symbol), which is where you will type in your commands. Note that prompt may mean two different things: either only the $-symbol, or the whole line of response you see in your shell.

B.3 Navigating the Command Line

Now that we have discussed how the computer file system is laid out, we can start using the command line. The first task will be to navigate the file system and to access files and folders there.

B.3.1 Current working directory

Above, in Section 2.3, we discussed what a working directory is. To recap, all programs, including the command shell, “think” that they are “located” somewhere in the file system tree, in one of the folders there. That folder is the current working directory, or just the working directory. So all commands we run through the shell (command line) think that they are located there. Importantly, this is also true for other computer programs: programs, including R programs, have a very similar idea of where they are located. All their operations are focused on a particular folder, the current working directory, unless explicitly told to work somewhere else.

Although the command-prompt gives you the name of the folder you’re in, you might like more detail about where that folder is. Time to send your first command! At the prompt, type (and hit Enter):

pwd

This stands for print working directory (shell commands are highly abbreviated to make them faster to type), and will tell the computer to print the folder you are currently “in”. It may reply

/Users/yucun/Documents/

i.e. you are in the folder Documents, in folder yucun, in folder Users in the root folder. (Above, we saw that Yucun kept his applications and UW-stuff in there.)

If Yucun were using Windows, he might see

/c/Users/yucun/Documents/

This means that you are in the folder Documents, in folder yucun, in folder Users on the C: drive on “This PC”. You can navigate to the same place using the file explorer as well, see the picture in Section 2.1.1.

B.3.2 Changing Directories

Next, we explain how to navigate your files and folders using command line. While this is a useful skill in itself, our main motivation for discussing it is the fact that programming languages use a very similar approach. Playing with command line is a quick and simple way to “see” the file system in a similar manner as computer programs “see” it.

What if you are not happy with the current working directory and want to work in a different folder instead? You need to change the working directory. In a graphical system like Finder, you can just double-click on the folder to open it. But there’s no clicking on the command-line.

This includes clicking to move the cursor to an earlier part of the command you typed. You’ll need to use the left and right arrow keys to move the cursor instead!

Since you can’t click on a folder, you’ll need to use a command:

cd folder_name

The first word here is the command, or what you want the computer to do. In this case, you’re issuing the command that means change directory.

The second word is an example of an argument, which is a programming term for “more details about what to do”. In this case, you’re telling the shell which folder you want to change to! Note that while we are talking about “changing directories”, what we actually do is change the current working directory. The shell will still stay on your screen and will not move anywhere 😂. Obviously, you have to replace folder_name with the name of the actual folder that you want to change to. For instance, Yucun may want to consult his UW stuff and issue (followed by Enter):

cd UW

Now Yucun has changed the folder to “UW”.³⁵ Or to be more specific, his current working directory is now /Users/yucun/Documents/UW (on Mac) or /c/Users/yucun/Documents/UW (on Windows). This can be easily queried as

pwd

Now that Yucun is done with his things in the UW folder, he may want to “go back” to Documents. This can be achieved with

cd ..

Remember–two dots .. is the way to denote moving “one level up” to parent folder (or “one level left”) in the file system navigation (see Section B.3.4). So this command is essentially very similar to cd UW as the double dots is the name of the parent folder. It is the same cd folder_name form.

It may be instructive to compare cd .. with graphical navigation. The command is similar to the ↑ symbol on Windows Explorer (see the figure).

Note that despite the similarity, their functionalities are somewhat different: cd .. makes the current working directory of the command shell change one folder up, while the navigation arrow in the explorer causes it to display the files in the parent folder.

Besides .., there is another special folder name: the tilde ~, which means “home folder” (see Section 2.4.1). So

cd ~

will change the working directory to home folder. Obviously, it has no effect if we are in the home folder already.
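The navigation steps above can be tried out safely in a throw-away folder. The sketch below is only an illustration: the practice folder and its Documents/UW subfolders are made up, and mktemp and mkdir -p (which create them) are helper commands not covered in this section.

```shell
# Build a small practice tree in a temporary location (hypothetical names).
practice=$(mktemp -d)
mkdir -p "$practice/Documents/UW"

cd "$practice/Documents"   # change the working directory to Documents
pwd                        # prints a path ending in /Documents

cd UW                      # one level down, into UW
pwd                        # prints a path ending in /Documents/UW

cd ..                      # one level up, back to Documents
pwd                        # prints a path ending in /Documents

cd ~                       # jump to the home folder
pwd                        # prints your home folder
```

Running pwd after every cd is a cheap way to confirm that you ended up where you intended.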

Exercise B.1 Start a new shell (run Terminal or git bash).

Which is the starting folder? Use pwd to find it.
Change to the folder Documents. Use cd to do this.
Are you inside of Documents? Use pwd to check it.
Move back to the home folder using cd .. and use pwd to make sure you got back.
Now go to Downloads and make sure you are there.
Finally, use cd ~ to get back to the home folder and make sure you got it right.

See the solution

A note about Desktop on Windows: If you’re on Windows and the contents of your Desktop in the terminal doesn’t match the contents of your actual Desktop, your computer may be configured to have your Desktop in a different directory. This usually happens if you’ve set up software to back up your Desktop to an online service. Instead of ~/Desktop, check if your Desktop folder is really in ~/OneDrive/Desktop or ~/Dropbox/Desktop.

While commands like cd and pwd are easy to type, the long ones may be quite tedious. Fortunately, shell offers two extremely handy help tools:

Command history: the up and down arrow keys will let you cycle through your previous commands so you don’t need to re-type them. You can even edit them to fix typos or change them otherwise.
Tab completion: when changing to a directory, you do not have to type the complete directory name. Instead of writing cd Documents, you can type cd Doc and hit the [tab] key. The incomplete Doc will automatically turn into Documents. This is extremely handy: it not only makes typing simpler, but also avoids possible typos in folder and file names.

B.3.3 Listing Files

In a graphical system, once you’ve double-clicked on a folder, Finder will show you the contents of that folder. The command-line doesn’t do this automatically; instead you need another command:

ls [folder_name]

This command says to list the folder contents. Note that the argument here is written in brackets ([]) to indicate that it is optional. If you just issue the ls command without an argument, it will list the contents of the current folder. If you include the optional argument (leaving off the brackets), you can “peek” at the contents of a folder you are not currently in.

Warning: The command-line can be not great about giving feedback for your actions. For example, if there are no files in the folder, then ls will simply show nothing, potentially looking like it “didn’t work”. Or when typing a password, the letters you type won’t show (not even as *) as a security measure.

Just because you don’t see any results from your command/typing, doesn’t mean it didn’t work! Trust in yourself, and use basic commands like ls and pwd to confirm any changes if you’re unsure. Take it slow, one step at a time.

Example B.1 Imagine you have files taxes.doc, dolphins.jpg and resume.doc in your Documents folder, and Documents is located in your home folder ~. Now when you open the terminal, you will be in your home folder:

$ pwd
/home/otoomet

If you type ls, you see files and folders that are in your home directory:

$ ls
Desktop   Documents   Downloads   ...

But if you just want to “peek” into the Documents, you can do

$ ls Documents
dolphins.jpg   resume.doc  taxes.doc

ls can also be used to show only files of a certain type using shell patterns. Patterns are a way to describe a whole group of file names at once. For instance, *.jpg means all files that end with .jpg. This may include image.jpg, CANON123456.jpg and a very long picture file with spaces.jpg. In a similar fashion, CV*.doc means all files that start with CV and end with .doc. This is a very handy way to list only files of a certain type. For instance, in order to see only files with the .doc extension inside the Documents folder, we can do

ls Documents/*.doc
resume.doc  taxes.doc

Indeed, the doc-files are here but the jpg is not.

Note that patterns (and file names) are case sensitive. For instance, ls *.DOC in the example above will fail to find any doc files. This is also true for Git Bash on Windows, despite the Windows file system normally using case-insensitive file names.
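Here is a small sketch of patterns in action. It assumes a throw-away folder with a few empty files; the file names are made up for illustration, and touch (which creates empty files) and mktemp are helper commands not discussed in this chapter.

```shell
# Hypothetical practice folder with a few empty files.
demo=$(mktemp -d)
cd "$demo"
touch taxes.doc resume.doc CV_2024.doc dolphins.jpg "a long name.jpg"

ls *.doc      # matches CV_2024.doc, resume.doc and taxes.doc
ls *.jpg      # matches dolphins.jpg and the file with spaces in its name
ls CV*.doc    # matches only CV_2024.doc
```

How exactly ls lays out the names (columns, quotes around names with spaces) differs between systems, so your output may look slightly different.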

B.3.4 Folders and paths in command line

Relative and absolute paths are the standard way to access files and folders in command line, and the standard way for programs to access files. They are typically written in a way that is fairly similar to the compact navigation rules we did above, with just a few differences:

The root folder is marked by a single slash: “/”
Going “up” (or left) is marked by double dots: “..”
We need an additional slash “/” between the navigation steps.

So the relative path example in Section 2.3, “up-up-up-up-Pictures-fractal.png”, is written as

../../../../Pictures/fractal.png

and the absolute path example in Section 2.4.1, “/-Users-yucun-Documents-info201-cheatsheet.pdf”, will be

/Users/yucun/Documents/info201/cheatsheet.pdf

Note the differences:

Instead of writing four times “up”, we have four pairs of dots “..” as the pair of dots means “up”.
Instead of dash to separate navigation steps, we use slashes “/”. Also, the four pairs of dots are separated by slashes.
The absolute path starts with slash “/”.

A Note about Windows: Mac consistently uses the slash “/” to denote the root of the file system. However, Windows normally uses “This PC”, followed by a drive letter. Also, Windows normally uses the backslash “\” instead of the forward slash “/” to separate navigation steps (folders). All the software we use here (Git Bash and R) allows one to use the unix convention (the convention used by Mac). Hence we do not discuss the Windows-specific files and paths and use the unix way throughout the book. So were Yucun using Windows, the absolute path above would be

/c/Users/yucun/Documents/info201/cheatsheet.pdf

We still use the slash “/” for the root folder, but now we also have to specify the drive letter “c” as a folder name.

A Note about spaces in file/folder names: spaces in file/folder names can cause some hassle when working with command line. Normally, all command line programs accept file names and (absolute/relative) paths without quotes. So one can just write /Users/yucun/Documents/info201/cheatsheet.pdf. However, if the file name contains a space, you may put it in quotes, e.g. "Ross Lake.jpg". Otherwise the command interpreter considers it as two separate files, Ross and Lake.jpg. In this course, it is easy to avoid spaces and to use underscores (Ross_lake.jpg) or dashes (Ross-lake.jpg) instead of space.

B.3.5 Paths

Note that both the cd and ls commands work even for folders that are not “immediately inside” the current directory! You can refer to any file or folder on the computer by specifying its path. A file’s path is “how you get to that file”: the list of folders you’d need to click through to get to the file, with each folder separated by a /:

cd /Users/iguest/Desktop/

This says to start at the root directory (that initial /), then go to Users, then go to iguest, then to Desktop.

Because this path starts with a specific directory (the root directory), it is referred to as an absolute path. No matter what folder you currently happen to be in, that path will refer to the correct file because it always starts on its journey from the root.

Contrast that with:

cd iguest/Desktop/

Because this path doesn’t have the leading slash, it just says to “go to the iguest/Desktop folder from the current location”. It is known as a relative path: it gives you directions to a file relative to the current folder. As such, the relative path iguest/Desktop/ will only refer to the correct location if you happen to be in the /Users folder; if you start somewhere else, who knows where you’ll end up!
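The difference can be seen in a small experiment. This is only a sketch: it builds a made-up Users/iguest/Desktop tree inside a temporary folder (stored in the hypothetical variable root) instead of touching your real file system. The || echo part simply prints a message when the cd before it fails.

```shell
# Hypothetical tree standing in for /Users/iguest/Desktop.
root=$(mktemp -d)
mkdir -p "$root/Users/iguest/Desktop"

# Absolute path: works no matter what the working directory is.
cd /
cd "$root/Users/iguest/Desktop"
pwd

# Relative path: works because the working directory is .../Users.
cd "$root/Users"
cd iguest/Desktop/
pwd                      # the same folder as above

# The same relative path from a different starting point fails.
cd "$root"
cd iguest/Desktop/ || echo "cd failed: no iguest folder here"
pwd                      # still the folder stored in root
```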

You should always use relative paths, particularly when programming! Because you’ll almost always be managing multiple files in a project, you should refer to the files relatively within your project. That way, your program can easily work across computers. For example, if your code refers to /Users/your-user-name/project-name/data, it can only run on the your-user-name account. However, if you use a relative path within your code (i.e., project-name/data), the program will run on multiple computers (crucial for collaborative projects).

The special name .. refers to the parent folder. In a similar fashion, a single dot . refers to the “current folder”. So the command

ls .

means “list the contents of the current folder” (the same thing you get if you leave off the argument).

Note that . and .. act just like folder names, so you can include them anywhere in paths: ../../my_folder says to go up two directories, and then into my_folder.
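As an illustration, the following sketch uses . and .. inside a made-up project tree (the folder names project, code, scripts and my_folder are assumptions for this example only, as is the use of mktemp and mkdir -p to create them).

```shell
# Hypothetical project tree in a temporary folder.
base=$(mktemp -d)
mkdir -p "$base/project/code/scripts" "$base/project/my_folder"
cd "$base/project/code/scripts"

ls .                  # current folder (empty here, so nothing is printed)
ls ..                 # parent folder "code": shows scripts
ls ../..              # two levels up, "project": shows code and my_folder
cd ../../my_folder    # up two levels, then into my_folder
pwd                   # prints a path ending in /project/my_folder
```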

Protip: Most command shells like Terminal and Git Bash support tab-completion. If you type out just the first few letters of a file or folder name and then hit the tab key, it will automatically fill in the rest of the name! If the name is ambiguous (e.g., you type Do and there is both a Documents and a Downloads folder), you can hit tab twice to see the list of matching folders. Then add enough letters to distinguish them and tab to complete! This will make your life a lot easier.

Additionally, you can use a tilde ~ as shorthand for the home directory of the current user. Just like . refers to “current folder”, ~ refers to the user’s home directory (usually /Users/USERNAME on Mac or /c/Users/USERNAME on Windows). And of course, you can use the tilde as part of a path as well (e.g., ~/Desktop is an absolute path to the desktop for the current user).

A note about home directory on Windows: unlike unix, Windows does not have a consistent concept of home directory, and different programs may interpret it in a different way. For instance, Git Bash assumes the home directory is /c/Users/USERNAME while R typically assumes it is /c/Users/USERNAME/Documents.

As you perhaps noticed above, command line uses space as a separator between the command and additional arguments. This means it is more complicated to work with paths and file names that contain a space. For instance, if you want to change into a folder called my folder, then issuing cd my folder results in an error (“cd: too many arguments”). This is because cd thinks that my is folder name, and folder after space is an additional argument. But there are two ways to handle spaces: first, you can put your pathname in quotes like cd "my folder", and second, you can escape the space with a backslash: cd my\ folder. Both of these options work reasonably well but in general we recommend to avoid spaces in file names whenever you work with command line.
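Both ways of handling spaces can be tried in a throw-away folder. This is a minimal sketch: the temporary folder (variable work) and the name my folder are just for practice, and the || echo part prints a message when the cd before it fails.

```shell
work=$(mktemp -d)
cd "$work"
mkdir "my folder"     # the quotes keep the name together as one argument

cd "my folder"        # option 1: quotes
pwd
cd ..

cd my\ folder         # option 2: escape the space with a backslash
pwd
cd ..

# Without quotes or a backslash, cd receives two arguments (my and folder) and fails.
cd my folder || echo "cd failed"
```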

B.4 File Commands

Once you’re comfortable navigating folders in the command-line, you can start to use it to do all the same things you would do with Finder or File Explorer, simply by using the correct command. Here is a short list of commands to get you started using the command prompt, though there are many more:

Command	Behavior
`mkdir`	make a directory
`rm`	remove a file or folder
`cp`	copy a file from one location to another
`open`	opens a file or folder (Mac only)
`start`	opens a file or folder (Windows only)
`cat`	concatenate (combine) file contents and display the results
`history`	show previous commands executed

Warning: The command-line makes it dangerously easy to permanently delete multiple files or folders and will not ask you to confirm that you want to delete them (or move them to the “recycling bin”). Be very careful when using the terminal to manage your files, as it is very powerful.

Be aware that many of these commands won’t print anything when you run them. This often means that they worked; they just did so quietly. If it doesn’t work, you’ll know because you’ll see a message telling you so (and why, if you read the message). So just because you didn’t get any output doesn’t mean you did something wrong—you can use another command (such as ls) to confirm that the files or folders changed the way you wanted!
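A short practice session with these commands might look as follows. Treat it as a sketch: everything happens in a temporary folder (variable play), the file names are made up, and the > symbol (which sends the output of echo into a file) is not covered in this chapter.

```shell
play=$(mktemp -d)
cd "$play"

mkdir notes                          # make a directory; prints nothing
echo "buy milk" > notes/todo.txt     # create a small text file
cp notes/todo.txt notes/backup.txt   # copy it; again prints nothing
ls notes                             # confirm: backup.txt and todo.txt are there
cat notes/todo.txt                   # prints: buy milk
rm notes/backup.txt                  # delete the copy; no confirmation is asked
ls notes                             # only todo.txt is left
```

Note how ls is used after the quiet commands to check that they did what was intended.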

B.4.1 Learning New Commands

How can you figure out what kind of arguments these commands take? You can look it up! This information is available online, but many command shells (though not Git Bash, unfortunately) also include their own manual that you can use to look up commands. If you are using a shell where the built-in manual is not available, there is always the option to enter the manual command in a web search. The results are probably fairly similar, although you may end up reading the manual of a slightly different version.

man mkdir

Will show the manual for the mkdir program/command.

Because manuals are often long, they are opened up in a command-line viewer called less. You can “scroll” up and down by using the arrow keys and space bar. Hit the q key to quit and return to the command-prompt.

The mkdir man page.

If you look under “Synopsis” you can see a summary of all the different arguments this command understands. A few notes about reading this syntax:

Recall that anything in brackets [] is optional. Arguments that are not in brackets (e.g., directory_name) are required.
“Options” (or “flags”) for command-line programs are often marked with a leading dash - to make them distinct from file or folder names. Options may change the way a command-line program behaves—like how you might set “easy” or “hard” mode in a game. You can either write out each option individually, or combine them: mkdir -p -v and mkdir -pv are equivalent.
Some options may require an additional argument beyond just indicating a particular operation style. In this case, you can see that the -m option requires you to specify an additional mode parameter; see the details below for what this looks like.
Underlined arguments are ones you choose: you don’t actually type the word directory_name, but instead your own directory name! Contrast this with the options: if you want to use the -p option, you need to type -p exactly.
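To see how options from the synopsis translate into actual commands, here is a small sketch. The folder names are made up, everything happens in a temporary folder, and the mode value 700 (access for the owner only) is just one possible choice.

```shell
opt=$(mktemp -d)
cd "$opt"

mkdir -p -v one/two      # -p: create parent folders as needed; -v: report each folder made
mkdir -pv three/four     # the same two options, combined into one
mkdir -m 700 private     # -m needs an extra argument, the mode
ls                       # shows one, private and three
```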

Command-line manuals (“man pages”) are primarily designed for those who are well familiar with the commands and are looking for just some additional details, e.g. how exactly do certain options work. These are not good introduction texts for beginners. If you consult the manual then start by looking at just the required arguments (which are usually straightforward), and then search for and use a particular option if you’re looking to change a command’s behavior.

For practice, try to read the man page for rm and figure out how to delete a folder and not just a single file. Note that you’ll want to be careful, as this is a good way to break things.

B.5 Dealing With Errors

Note that the syntax of these commands (how you write them out) is very important. Computers aren’t good at figuring out what you meant if you aren’t really specific; forgetting a space may result in an entirely different action.

Try another command: echo lets you “echo” (print out) some text. Try echoing "Hello World" (which is the traditional first computer program):

echo "Hello world"

What happens if you forget the closing quote? You keep hitting “enter” but you just get that > over and over again! What’s going on?

Because you didn’t “close” the quote, the shell thinks you are still typing the message you want to echo! When you hit “enter” it adds a line break instead of ending the command, and the > marks that you’re still going. If you finally close the quote, you’ll see your multi-line message printed!
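For illustration, this is what such a multi-line message looks like when the quote is closed only on the second line. When you type it at the prompt, the second line appears after the > marker.

```shell
echo "Hello
world"
```

The shell prints Hello and world on two separate lines, because the line break is part of the quoted text.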

IMPORTANT TIP If you ever get stuck in the command-line, hit ctrl-c (The control and c keys together). This almost always means “cancel”, and will “stop” whatever program or command is currently running in the shell so that you can try again. Just remember: “ctrl-c to flee”.

(If that doesn’t work, try hitting the esc key, or typing exit, q, or quit. Those commands will cover most command-line programs).

Throughout this book, we’ll discuss a variety of approaches to handling errors in computer programs. While it’s tempting to disregard dense error messages, many programs do provide error messages that explain what went wrong. If you enter an unrecognized command, the terminal will inform you of your mistake:

lx
> -bash: lx: command not found

However, forgetting arguments yields different results. In some cases, there will be a default behavior (see what happens if you enter cd without any arguments). If more information is required to run a command, your terminal will provide you with a brief summary of the command’s usage:

mkdir
> usage: mkdir [-pv] [-m mode] directory ...

Take the time to read the error message and think about what the problem might be before you try again.

B.6 Running R from command line

It is possible to issue R instructions (run lines of code) one-by-one at the command-line by starting an interactive R session within your terminal. This will allow you to type R code directly into the terminal, and your computer will interpret and execute each line of code (if you just typed R syntax directly into the terminal, your computer wouldn’t understand it). This is very similar to how we work with the shell. For working with data, this is usually not the best approach, but it may occasionally come in handy for installing packages or performing a quick calculation.

If you have R installed, you can start an interactive R session on a Mac (or Linux) by typing R into the terminal (to run the R program). On Windows you need to run the “R” desktop app (running R in the terminal on Windows needs a little configuration). This will start the session and provide you with lots of information about the R language:

An interactive R session running in the terminal.

Notice that this description also includes suggestions on what to do next, most importantly "Type 'q()' to quit R.".

Always read the output when working on the command-line!

Once you’ve started running an interactive R session, you can begin entering one line of code at a time at the prompt (>). This is a nice way to experiment with the R language or to quickly run some code. For example, try doing some math at the command prompt (i.e., enter 1 + 1 and see the output).

RStudio also includes an interactive console that provides the exact same functionality.

It is also possible to run entire scripts from the command-line by using the Rscript program, specifying the .R file you wish to execute:

Using Rscript from the terminal

Entering this command in the terminal would execute each line of R code written in the analysis.R file, performing all of the instructions that you had saved there. This is helpful if your data has changed, and you want to reproduce the results of your analysis using the same instructions.
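As a rough sketch of what this can look like, the lines below create a made-up one-line analysis.R in a temporary folder and run it. The script content is an assumption for illustration only, and the if ... fi part merely checks that the Rscript program can be found before calling it.

```shell
rdemo=$(mktemp -d)
cd "$rdemo"
printf '%s\n' 'cat(1 + 1, "\n")' > analysis.R   # hypothetical one-line R script

# Run the whole script, but only if Rscript is available on this machine.
if command -v Rscript > /dev/null; then
    Rscript analysis.R                          # prints 2
else
    echo "Rscript not found"
fi
```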

On Windows, you need to tell the computer where to find the R.exe and Rscript.exe programs to execute, that is, what the path to these programs is. You can do this by specifying the absolute path to the R program when you execute it, for example:

Using Rscript from a Windows shell

If you plan to run R from the command-line regularly (not a requirement for this course), a better solution is to add the folder containing these programs to your computer’s PATH variable. This is a system-level variable that contains a list of folders that the computer searches when looking for programs. The reason the computer knows where to find the git.exe program when you type git in the command-line is because that program is “on the PATH”.
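You can inspect the PATH from the shell itself. The following is a minimal sketch; command -v is a standard shell tool (not discussed elsewhere in this chapter) that reports where a program would be found, and the || echo part prints a message if nothing is found.

```shell
echo "$PATH"       # the folders searched for programs, separated by colons
command -v ls      # where the shell finds ls, for instance /bin/ls or /usr/bin/ls
command -v git || echo "git is not on the PATH"
```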

In Windows, you can add the R.exe and Rscript.exe programs to your computer’s PATH by editing your machine’s Environment Variables through the Control Panel:

Open up the “Advanced” tab of the “System Properties”. In Windows 10, you can find this by searching for “environment”. Click on the “Environment Variables…” button to open the settings for the Environment Variables.
In the window that pops up, select the “PATH” variable (either per user or for the whole system) and click the “Edit” button below it.
In the next window that pops up, click the “Browse” button to select the folder that contains the R.exe and Rscript.exe files. See the above screenshot for one possible path.
You will need to close and re-open your command-line (Git Bash) for the PATH changes to take effect.

However, in this course we run R through RStudio.

B.7 Cheatsheet

Here is a quick summary of the most important resources in this section.

B.7.1 Most important commands

pwd: print working directory
cd: change directory
ls: list files. Check out how to list in long form (option -l) and how to list in temporal order (option -t)
mkdir: make a directory
cp: copy a file from one location to another
mv: move (rename) a file from one location to another
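The ls options and mv mentioned above can be tried out like this. It is only a sketch with made-up file names in a temporary folder; sleep 1 (wait one second) and > (send output into a file) are not part of this chapter and are used only to create two files with different time stamps.

```shell
cheat=$(mktemp -d)
cd "$cheat"
echo "first" > old.txt
sleep 1                   # make sure the two files get different time stamps
echo "second" > new.txt

ls -l                     # long form: permissions, owner, size, date and name
ls -t                     # newest first: new.txt, then old.txt
mv old.txt renamed.txt    # rename old.txt to renamed.txt
ls                        # shows new.txt and renamed.txt
```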

B.7.2 Shell keyboard shortcuts

Tab-completion: write the first few letters of the file name or command and hit the tab key. This will complete the file name (without any typos!). If the letters are ambiguous, e.g. Do may mean both Documents and Downloads, then hit tab twice, and it displays the available options.
Browsing history: the arrow-up key walks back in the shell command history and allows you to select and edit previous commands.
Stopping a command: Ctrl-c (hold Ctrl and press c) stops the command that takes too long. (Does not always work).

Resources

³⁵ A note about file systems and case sensitivity. Neither Mac nor Windows file systems are case sensitive by default, so Yucun might also use cd uw or cd Uw or cd uW, all of which will result in the same working directory. However, on Linux uw, UW and Uw denote three different folders, and many web servers and Docker containers run Linux and hence use case-sensitive file systems. We recommend consistently using the correct case in file names, otherwise it may happen that code that works perfectly on your computer refuses to run when you upload it to the web, e.g. as a Shiny app (see Section E).