Linux#

https://upload.wikimedia.org/wikipedia/commons/3/35/Tux.svg

Fig. 18 Tux the penguin. Source: Wikimedia Commons.#

In scientific computing, the Linux operating system (OS) is commonly used. Linux is an open-source OS, which was first released in 1991 as a UNIX-like system.

As in other parts of these lecture notes, the following sections cover only the essentials of using Linux and are far from complete.

https://upload.wikimedia.org/wikipedia/commons/7/77/Unix_history-simple.svg

Fig. 19 Unix and Unix-like operating systems. Source: Wikimedia Commons.#

Philosophy#

UNIX and UNIX-like operating systems follow this philosophy:

  1. write computer programs that solve a single task, but do so very efficiently

  2. write computer programs such that they can work together

  3. write computer programs that process simple text streams

Therefore, the whole OS is a collection of many small, specialised applications. The combination of all of them creates a complex and powerful system. The base system of a UNIX-like OS can be used with text-based terminals. However, most desktop systems have a graphical system on top.

Unlike desktop systems, most high performance computing (HPC) systems are operated in a command line based way. Therefore, the main content of this section focusses on the usage of command line tools.

Directories#

Linux uses a tree directory structure, which originates at the so called root /. All subdirectories are located in the same directory tree.

A path, i.e. the location of a file or directory, is a sequence of directories, starting at the root and separated by /. Fig. 20 illustrates the differences and common aspects of the three most popular types of operating systems for desktop computers.

https://upload.wikimedia.org/wikipedia/de/1/1f/Filesystem.svg

Fig. 20 (In German) Overview of commonly used file system approaches. Source: Wikimedia Commons.#

Linux has a file system hierarchy, which prescribes the usage of special directories, see Fig. 21.

https://upload.wikimedia.org/wikipedia/commons/f/f3/Standard-unix-filesystem-hierarchy.svg

Fig. 21 Illustration of the Linux file system hierarchy. Source: Wikimedia Commons.#

There are two types of paths:

  • absolute: the path starts at the root, i.e. starts with /

  • relative: the path is stated from a given (or current) position in the file system

To express paths that lead towards the root, each level above the current directory can be addressed with .. (two dots).
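As a small illustration of absolute and relative paths, the following sketch builds a scratch directory tree and reaches the same directory in both ways. The directory names project and data are made up for this example, and the commands cd and pwd are explained further below.

```shell
# Create a scratch tree <tmp>/project/data (hypothetical names).
base=$(mktemp -d)
mkdir -p "$base/project/data"

# Absolute path: starts at the root /.
cd "$base/project/data"
abs_location=$(pwd -P)

# Relative path: .. addresses the level above the current directory.
cd ..
parent_location=$(pwd -P)

# Relative path: stated from the current position, no leading /.
cd data
rel_location=$(pwd -P)

echo "reached via absolute path: $abs_location"
echo "one level above:           $parent_location"
echo "reached via relative path: $rel_location"
```

Both the absolute and the relative path end up in the same directory; only the starting point of the description differs.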

Terminal & Shell#

A terminal is a device or (nowadays) a computer program that allows the user to interact with a computer. Terminal emulators remove the need to be physically present at a terminal. There are terminal emulator applications that run locally, like Xterm or Konsole, but it is also possible to connect terminals remotely, e.g. with ssh. A command-line shell, or just shell, is a user interface to the terminal and thus the operating system.

The most popular Linux shells are:

Bash Basics#

Note

In the following, some examples within a directory structure are used. The directory contents are stored in the archive dir_structure_01.tar, which can be extracted after download by tar -xf dir_structure_01.tar.

The shell keeps a history of the commands that have been used. Individual entries can be accessed with the up/down arrow keys, and the history command prints an overview of the whole history. By pressing the “Ctrl” and “R” keys together, the user can search for commands in the history. Furthermore, the shell can auto-complete commands when the “Tab” key is pressed while typing. The command echo $SHELL shows which shell is configured as the user’s default. The shell behaviour can be customised by setting up configuration files like .bashrc, .bash_profile or .bash_aliases. Note that on Pleiades the file .profile is used instead.

Environment variables#

Most, probably all, shells offer environment variables. They store information that can be used during the execution of the issued commands. In Bash, the content of a variable named AVAR is read with $AVAR, i.e. the $ character is prepended. To define the variable or change its value, the bare variable name is used, without the $.

There exist many environment variables, depending on the operating system and the used shell. Some common ones are:

  • USER: the name of the user

  • HOME: the path to the user’s home directory

  • SHELL: the name of the shell used by the user

  • PATH: a list of directories where the shell looks for executables

  • OMP_NUM_THREADS: sets the number of OpenMP threads to be used
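The following sketch shows how a variable is defined, read and changed. The variable name AVAR is taken from the text above; the values are arbitrary. The export command is an addition not covered above: it makes a variable visible to programs started from the shell.

```shell
# Define a variable: bare name on the left, no spaces around =.
AVAR="hello"

# Read the variable: prepend the $ character.
echo "AVAR contains: $AVAR"

# Change its value, again using the bare name.
AVAR="goodbye"

# export passes the variable on to programs started from this shell.
export AVAR
child_view=$(sh -c 'echo "$AVAR"')
echo "A child process sees: $child_view"

# Two of the predefined variables listed above.
echo "Home directory: $HOME"
echo "Search path:    $PATH"

# Arbitrary example value: ask OpenMP programs to use 4 threads.
export OMP_NUM_THREADS=4
```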

Command execution#

A command or program is executed by typing its name:

> env

In this case, the output of the env command is a list of all set environment variables.

Example: Print current date with the date command:

> date
Mon Apr 26 08:09:42 CEST 2021

Example: Print the name of the current user:

> whoami
larnold

An issued command must be either:

  • a built-in function of the shell,

  • an executable file found in the directory list defined by the PATH environment variable, or

  • an explicit reference to an executable file.

To execute a custom program in the current directory (which is not necessarily in the PATH directory list), its location needs to be stated explicitly. Here, a . indicates the current directory:

> ./a.out
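To see which of the three cases applies to a given command name, the shell can be asked how it would resolve that name. The sketch below uses command -v, a standard shell facility that is not discussed in the text above. For a built-in it prints just the name; for an executable found via PATH it prints the full path (in an interactive session where ls is an alias, the alias definition is shown instead).

```shell
# A shell built-in: only the name is reported.
builtin_info=$(command -v cd)

# An executable found through the PATH directory list: the full path is reported.
ls_location=$(command -v ls)

echo "cd resolves to: $builtin_info"
echo "ls resolves to: $ls_location"

# An explicit reference: calling the executable by its full path
# bypasses the search through PATH.
explicit_result=$("$ls_location" -d /)
echo "explicit call printed: $explicit_result"
```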

Basic interaction with the file system#

Changing directories

The current directory is changed via the cd command. Without arguments, it changes to the user’s home directory; otherwise, it changes to the specified target directory. There exist a couple of special directories, which can also be used with other commands:

  • ~: the user’s home directory

  • ~username: the home directory of user username

  • .: this directory

  • ..: the parent directory, i.e. one level above

  • -: last directory (cd command only)

The current working directory can be shown with the pwd (print working directory) command.
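A short sketch of how these special directories behave, using a scratch directory with a made-up subdirectory sub. Note that cd - prints the directory it changes to, which is discarded here by sending it to /dev/null (redirections are covered further below).

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
mkdir "$scratch/sub"

cd "$scratch/sub"    # change to an explicitly named directory
cd ..                # one level up
up_dir=$(pwd -P)

cd sub               # down again, relative to the current directory
cd - > /dev/null     # back to the last directory
back_dir=$(pwd -P)

cd                   # no argument: the user's home directory
home_dir=$(pwd)

cd "$start_dir"      # return to where we started
echo "after 'cd ..': $up_dir"
echo "after 'cd -':  $back_dir"
echo "after 'cd':    $home_dir"
```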

Copy

Files are copied with cp. After cp, source and destination must be specified:

> cp folder1/file1.txt folder2/file1.txt

To copy to the current file path, specify ./ as the destination:

> cp folder/file1.txt ./

To copy entire folders including their files and subfolders, the -r (recursively) option must be set:

> cp -r path/folder1 ./

Move

The command to move files works similarly to copying. With mv files can be moved:

> mv folder1/file1.txt  ./

The command can also be used to rename files:

> mv file1.txt file2.txt

or folders:

> mv folder1 folder2

To move multiple files, list them; the last argument is the destination:

> mv file1.txt file2.txt file3.txt data/

Delete

With rm files can be deleted:

> rm myfile.txt

There are multiple ways to delete multiple files. You can list them:

> rm file1 file2 file3

Or delete all files whose names contain a certain string:

> rm *name* 

To delete folders including files and subfolders, the -r option must be set:

> rm -r folder1

To remove all files at a given location, one can also use the wildcard * on its own:

> rm *
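The copy, move and delete commands can be tried out safely in a scratch directory. The following sketch uses made-up file and folder names; touch, which creates an empty file, is an addition not covered above.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir folder1 folder2
touch folder1/file1.txt

cp folder1/file1.txt folder2/file1.txt     # copy a single file
cp -r folder1 folder3                      # copy a folder recursively
mv folder2/file1.txt folder2/renamed.txt   # rename a file
mv folder3 archive                         # rename a folder
rm folder1/file1.txt                       # delete a single file
rm -r folder1                              # delete a folder and its content

# What is left: the folders archive and folder2.
ls
```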

Configure your terminal#

The user can adjust their shell by setting up a configuration file called .bashrc. This is a plain text file, which is stored in the user’s home directory as a hidden file (indicated by the leading period in the file name). This file may not exist initially. The command vim ~/.bashrc, for example, opens it for editing in the text editor Vim and creates it on saving if it does not exist yet. Default settings can be stored in the .bashrc. For instance, the command prompt can be changed from the generic default to show user_name@host, or frequently used modules can be pre-loaded, see section “Environment modules (Lmod)” below.

Let’s try it out. First, we create or open the .bashrc file in Vim, assuming we are in the user’s home directory.

> vim ~/.bashrc

Text can be inserted (written) into the file by hitting the i key (“-- INSERT --” is visible at the bottom left of the window). Then enter the following lines.

# Set prompt outline.
PS1="\u@\h:\w\n\[\033[1;34m\]> \[\033[0m\]"

Afterwards, hit Esc to go back into the normal mode. Type :w to save the changes and :q to quit Vim, or :wq to combine both commands. With :q! changes to the file are discarded before closing Vim.

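This changes the command prompt to show the user name, the host name and the current working directory. The change takes effect once the .bashrc is loaded again, e.g. in a new terminal.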

Arguments#

In general, issued commands accept arguments. These can be either named arguments, which need to be explicitly indicated, or positional arguments, which are identified by their position.

Arguments are separated by a space character.

Further information about the accepted arguments is available via the man command, which prints the manual for a command. The name of the command is passed as an argument. In the following example, the manual page for grep is shown. This application searches files or given input for a given search pattern.

> man grep

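A similar, yet more condensed, overview with a focus on the usage of arguments is often available through a help argument such as -h or --help. For grep, -h has a different meaning (it suppresses file names in the output), but since no search pattern is given in the call below, grep prints its usage summary anyway.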

> grep -h
usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
	[-e pattern] [-f file] [--binary-files=value] [--color=when]
	[--context[=num]] [--directories=action] [--label] [--line-buffered]
	[--null] [pattern] [file ...]

In the above example, there are many named arguments, like the one that sets whether the search is case sensitive. Its description in the manual reads:

[...]
     -i, --ignore-case
             Perform case insensitive matching.  By default, grep is case sensi-
             tive.
[...]

Often, named arguments are available in a short form with a single - prepended to a single character, here -i. In addition, there is a long and more verbose version, which starts with --, here --ignore-case. Rarely used arguments may have only a long version.

Besides the named arguments, there are two positional arguments: pattern and file .... The first one defines the search pattern and the second one defines one or multiple files to be searched in.
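The difference between named and positional arguments can be tried out with a small sample file. The sketch below writes three lines, modelled on the log files used in these notes, into a temporary file and searches it with and without the -i argument. Since grep reports “nothing found” through its exit status, || true is appended to that call so that the sketch also runs in shells that stop on errors.

```shell
sample=$(mktemp)
printf 'Simulation started\nResult value: 1761\nRun Time: 13836\n' > "$sample"

# Two positional arguments: the pattern, then the file.
# The search is case sensitive, so "run time" does not match "Run Time".
sensitive=$(grep "run time" "$sample" || true)

# Named argument, short form.
short_form=$(grep -i "run time" "$sample")

# The same named argument, long form.
long_form=$(grep --ignore-case "run time" "$sample")

echo "case sensitive: '$sensitive'"
echo "with -i:        '$short_form'"
echo "long form:      '$long_form'"
```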

Listing directories#

The content of a directory is displayed with ls. Without arguments it shows the current directory, otherwise the target directory.

Some common options:

  • --color=always, color output, may be already set by default

  • -l, long output, i.e. list permissions, date, size

  • -a, list all files, i.e. also hidden files starting with a .

  • -h, show file sizes in a human readable form

  • -r, reverse the output

  • -t, sort by modification time

Example: List all files in the current directory:

> ls
combined_logs
doc01.md
doc02.txt
doc03.md
doc04.txt
doc05.txt
for-loop.sh
if-statement.sh
info.txt
rundir_01
rundir_02
rundir_03
rundir_04
rundir_05
rundir_06
rundir_07
rundir_08
rundir_09
rundir_10
rundir_11
rundir_12
rundir_13
rundir_14
rundir_15

Example: List all files in the directory rundir_12:

> ls rundir_12
logfile

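Example: List all files in the current directory, including hidden ones, in long format with human-readable sizes:

> ls -lah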
total 88
drwxr-xr-x  27 larnold  staff   864B Apr 26 08:09 .
drwxr-xr-x  11 larnold  staff   352B Apr 26 08:09 ..
-rw-r--r--   1 larnold  staff    20B Apr 25 19:55 .hidden_file_do_not_look_here
-rw-r--r--   1 larnold  staff   816B Apr 26 08:02 combined_logs
-rw-r--r--   1 larnold  staff   591B Dec 12  2000 doc01.md
-rw-r--r--   1 larnold  staff   5.9K Dec 12  2000 doc02.txt
-rw-r--r--   1 larnold  staff    29B Dec 14  2000 doc03.md
-rw-r--r--   1 larnold  staff   889B Nov  8  2000 doc04.txt
-rw-r--r--   1 larnold  staff   1.3K Apr 25 17:19 doc05.txt
-rw-r--r--   1 larnold  staff    60B Apr 26 08:08 for-loop.sh
-rw-r--r--   1 larnold  staff   113B Apr 25 21:35 if-statement.sh
-rw-r--r--   1 larnold  staff    62B Apr 26 08:08 info.txt
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_01
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_02
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_03
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_04
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_05
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_06
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_07
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_08
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_09
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_10
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_11
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_12
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_13
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_14
drwxr-xr-x   3 larnold  staff    96B Apr 25 17:11 rundir_15

Example: List current directory’s files and sort them according to their modification time in reverse order:

> ls -lrt
total 80
-rw-r--r--  1 larnold  staff   889 Nov  8  2000 doc04.txt
-rw-r--r--  1 larnold  staff  6060 Dec 12  2000 doc02.txt
-rw-r--r--  1 larnold  staff   591 Dec 12  2000 doc01.md
-rw-r--r--  1 larnold  staff    29 Dec 14  2000 doc03.md
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_15
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_14
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_13
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_12
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_11
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_10
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_09
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_08
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_07
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_06
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_05
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_04
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_03
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_02
drwxr-xr-x  3 larnold  staff    96 Apr 25 17:11 rundir_01
-rw-r--r--  1 larnold  staff  1307 Apr 25 17:19 doc05.txt
-rw-r--r--  1 larnold  staff   113 Apr 25 21:35 if-statement.sh
-rw-r--r--  1 larnold  staff   816 Apr 26 08:02 combined_logs
-rw-r--r--  1 larnold  staff    62 Apr 26 08:08 info.txt
-rw-r--r--  1 larnold  staff    60 Apr 26 08:08 for-loop.sh

Aliases#

The user can personalise their environment in different ways. For example, it is possible to define so-called “aliases”. An alias allows the user to set up individual short-form commands. For example, a command that calls a program with a frequently used set of parameters could be attached to an alias. Assume we often use the ls command together with the parameters for the long output (-l) of all files in a directory (-a) in human-readable form (-h), i.e. ls -lah. We could give it a short-form command, for example lsa, by typing alias lsa='ls -lah' in the command line.

However, this definition only exists as long as the session is active. To make it permanent, it needs to be stored in a script that is loaded when a new session is started. Typically, these definitions are stored in files within the user’s home directory, specifically the .bashrc or, better, the .bash_aliases file. These are simple text files; if they do not exist, they can easily be created, for example using vim ~/.bash_aliases. A command then needs to be added to the .bashrc to read the .bash_aliases.

Let’s try it out. First, we navigate to the user’s home directory.

> cd ~

We set up the temporary alias “lsa”.

> alias lsa='ls -lah'

Using the alias should yield the same result as the command itself. Try it out.

Now open a new terminal and try to use the alias again. You should see an error message, something like “-bash: lsa: command not found”.

To set a permanent alias, we use a file named .bash_aliases, which we create with Vim again:

> vim ~/.bash_aliases

In the insert mode (i) write the following lines into the file:

# Set aliases.
alias lsa='ls -lah'

Afterwards, hit Esc to go back into the normal mode. Type :w to save the changes and :q to quit Vim, or :wq to combine both commands. With :q! changes to the file are discarded before closing Vim.

Now we need to set up a way to load the .bash_aliases. For this we open the .bashrc and add the following lines:

# Load aliases.
if [ -f ~/.bash_aliases ]; then
    source ~/.bash_aliases
fi

Instead of source ~/.bash_aliases, the shorter form . ~/.bash_aliases could also be used. When a new terminal is started, it will now load the .bash_aliases. To load it in the active terminal, one can run the command directly:

> source ~/.bash_aliases

Another useful alias is the following:

# Check your jobs
alias myjobs='squeue -u $USER -o "%.9i %.22j %.2t %.20S %.12l %.5D %.5C"'

On a computer that uses the SLURM scheduler (see subsection “SLURM” in “Parallel Execution of FDS”), it shows all queued jobs of the user ($USER).

Wildcards#

Wildcards provide matching patterns for the shell; these are generally called glob patterns. First, the shell evaluates the pattern and searches for matching paths. Then it replaces the wildcard pattern with the explicit list of matches, which is typically passed to an application.

Some common wildcards are:

  • *: match any sequence of characters, including none

  • ?: match any single character

  • [list]: match any single character from list, here: l, i, s, t

Wildcards can be combined with constant strings and with each other.
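Because the shell expands a pattern before the command runs, echo is a convenient way to inspect what a pattern turns into. The following sketch creates a few empty files and directories, with names modelled on the example directory used in these notes, in a scratch location and prints the expansion of several patterns.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
touch doc01.md doc02.txt doc03.md info.txt
mkdir rundir_01 rundir_02 rundir_10

txt_files=$(echo *.txt)        # * combined with a constant string
single_char=$(echo doc0?.md)   # ? stands for exactly one character
from_list=$(echo doc0[12].*)   # [12] matches 1 or 2, combined with *
run_dirs=$(echo rundir_0*)     # only the directories starting with rundir_0

echo "*.txt      -> $txt_files"
echo "doc0?.md   -> $single_char"
echo "doc0[12].* -> $from_list"
echo "rundir_0*  -> $run_dirs"
```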

Example: List all files ending with .txt:

> ls *.txt
doc02.txt
doc04.txt
doc05.txt
info.txt

Example: List all files with their attributes in the directories rundir_10 to rundir_15:

> ls -l rundir_1?
rundir_10:
total 8
-rw-r--r--  1 larnold  staff  54 Apr 25 17:11 logfile

rundir_11:
total 8
-rw-r--r--  1 larnold  staff  54 Apr 25 17:11 logfile

rundir_12:
total 8
-rw-r--r--  1 larnold  staff  55 Apr 25 17:11 logfile

rundir_13:
total 8
-rw-r--r--  1 larnold  staff  54 Apr 25 17:11 logfile

rundir_14:
total 8
-rw-r--r--  1 larnold  staff  55 Apr 25 17:11 logfile

rundir_15:
total 8
-rw-r--r--  1 larnold  staff  55 Apr 25 17:11 logfile

Example: Print the content of the files logfile located in the directories rundir_10 to rundir_15. The content of a file can be printed with the cat command:

> cat rundir_1?/logfile
Simulation started
Result value: 1761
Run Time: 13836
Simulation started
Result value: 2375
Run Time: 11901
Simulation started
Result value: 15343
Run Time: 21131
Simulation started
Result value: 11280
Run Time: 3640
Simulation started
Result value: 13081
Run Time: 19866
Simulation started
Result value: 27146
Run Time: 20714

Output Redirection#

Programs write their output to one of two streams:

  • stdout: the normal program output

  • stderr: error messages

In many cases it is useful to write the output into a file. This can be done via redirections.

  • >: creates a new output file, overwrites old one

  • >>: appends the command’s output to the output file

The redirections are used directly after a command, e.g.

> command >> logfile
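The difference between > and >> becomes visible when both are applied to the same file several times, as in the sketch below. The last part goes slightly beyond the text above: 2> redirects stderr instead of stdout. The file names are arbitrary, and the ls call fails on purpose to produce an error message.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

echo "first run"    > logfile    # creates the file
echo "second run"   >> logfile   # appends a line
echo "fresh start"  > logfile    # overwrites the old content
echo "another line" >> logfile   # appends again

log_content=$(cat logfile)
echo "$log_content"

# stderr is a separate stream: 2> sends the error message to a file.
ls no_such_file 2> errors.txt || true
```

After these commands, logfile holds only the lines “fresh start” and “another line”, and errors.txt holds the error message of ls.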

Example: Redirect the contents of all logfile-files located in the rundir_* directories to a file:

> cat rundir_*/logfile > combined_logs


> cat combined_logs
Simulation started
Result value: 2776
Run Time: 7789
Simulation started
Result value: 13856
Run Time: 7696
Simulation started
Result value: 22344
Run Time: 20401
Simulation started
Result value: 4155
Run Time: 15711
Simulation started
Result value: 14887
Run Time: 30786
Simulation started
Result value: 17783
Run Time: 9987
Simulation started
Result value: 19607
Run Time: 22428
Simulation started
Result value: 25874
Run Time: 16521
Simulation started
Result value: 28931
Run Time: 2517
Simulation started
Result value: 1761
Run Time: 13836
Simulation started
Result value: 2375
Run Time: 11901
Simulation started
Result value: 15343
Run Time: 21131
Simulation started
Result value: 11280
Run Time: 3640
Simulation started
Result value: 13081
Run Time: 19866
Simulation started
Result value: 27146
Run Time: 20714

Bash Programming#

Just as a starting point, a few examples of looping and branching in a Bash script are shown here.

A shell script can be executed either

  • with an explicit call of sh or bash, or

  • as an executable file with a hash-bang line as a first line in the file.
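Both ways of running a script can be sketched as follows. The script name hello.sh and its content are made up; chmod +x, which marks a file as executable, is an addition not covered in the text. The sketch uses sh as interpreter; bash works the same way.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Write a two-line script. Its first line is the hash-bang line,
# which names the interpreter that should run the file.
printf '#!/bin/sh\necho "Hello from the script"\n' > hello.sh

# Variant 1: explicit call of the shell with the script as argument.
explicit_call=$(sh hello.sh)

# Variant 2: mark the file as executable and call it via its path.
chmod +x hello.sh
direct_call=$(./hello.sh)

echo "sh hello.sh -> $explicit_call"
echo "./hello.sh  -> $direct_call"
```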

A shell script is a simple text file that contains a sequence of commands. Scripts are mostly used to avoid retyping chains of commands and to automate processing. A script may access all environment variables and define new ones. Some special variables are:

  • $#: number of arguments passed

  • $@: all passed arguments

  • $1, $2, …: the first, second argument
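These special variables can be tried out without writing a script file, because shell functions receive their arguments in the same way. In the sketch below, the function show_arguments is a hypothetical stand-in for a script, and the arguments are arbitrary.

```shell
# Inside a function, $#, $@, $1, $2 refer to the function's arguments,
# just as they refer to the script's arguments inside a script.
show_arguments() {
    echo "Number of arguments: $#"
    echo "All arguments: $@"
    echo "First argument: $1"
    echo "Second argument: $2"
}

report=$(show_arguments info.txt rundir_01 rundir_02)
echo "$report"
```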

Example: Simple loop over all files and directories.

Script listing:
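The listing of for-loop.sh is not reproduced in this text. A minimal sketch that produces output of the same form as in the execution example could look as follows; the variable name element is an assumption, and the original script may differ.

```shell
#!/bin/bash
# Sketch of for-loop.sh: iterate over every entry of the current directory.
# The wildcard * expands to all non-hidden files and directories.
for element in *; do
    echo "Found directory element:  $element"
done
```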

Execution example:

> bash for-loop.sh
Found directory element:  combined_logs
Found directory element:  doc01.md
Found directory element:  doc02.txt
Found directory element:  doc03.md
Found directory element:  doc04.txt
Found directory element:  doc05.txt
Found directory element:  for-loop.sh
Found directory element:  if-statement.sh
Found directory element:  info.txt
Found directory element:  rundir_01
Found directory element:  rundir_02
Found directory element:  rundir_03
Found directory element:  rundir_04
Found directory element:  rundir_05
Found directory element:  rundir_06
Found directory element:  rundir_07
Found directory element:  rundir_08
Found directory element:  rundir_09
Found directory element:  rundir_10
Found directory element:  rundir_11
Found directory element:  rundir_12
Found directory element:  rundir_13
Found directory element:  rundir_14
Found directory element:  rundir_15

Example: Simple if-statement to check for the existence of a file.

Script listing:
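The listing of if-statement.sh is not reproduced in this text either. The following sketch behaves like the execution examples; the function name check_file is an assumption, the original script may differ, and the messages are copied verbatim from the sample output.

```shell
#!/bin/bash
# Sketch of if-statement.sh: report whether the given file exists.
check_file() {
    echo "Checking for file: $1"
    # [ -e path ] is true if the path exists.
    if [ -e "$1" ]; then
        echo "File exists"
    else
        echo "File does not exists"
    fi
}

# $1 is the first argument passed to the script.
check_file "$1"
```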

Execution examples:

> bash if-statement.sh nice-file
Checking for file: nice-file
File does not exists
> bash if-statement.sh info.txt
Checking for file: info.txt
File exists

Advanced#

Git and GitHub#

GitHub is a web hosting service for distributed version control using Git. Git allows collaborative software development while keeping track of changes to the code. This code can be hosted in open-access repositories via GitHub, like the FDS repository, for example. Operating systems like Linux or macOS often come with Git pre-installed, so it can be used directly from the terminal. Windows users might need to install an application, for example the graphical user interface GitHub Desktop or the lightweight Git for Windows.

Version control systems allow the creation of branches of the code base, as well as keeping track of individual changes. Branches can be used to implement new features, fix bugs or advance the development of new software versions without affecting the stable version. Commits are incremental changes to the code itself. At any given time, the user can switch between branches and commits to work with specific versions of the software. On the web page of the repository, there are a couple of tabs at the top of the page. On the right-hand side is the tab “Insights”. Clicking on it and then on “Network” on the left-hand side gives a brief graphical overview of commits and branches. Typically, the user would clone the repository to their computer, work on the software locally, collect changes into commits and push these changes to the online repository later on. One can also download the repository as a zip archive.

To clone a repository, navigate in the terminal to the directory where you would like it to be stored. With Git installed, the command git clone link/to/repo fetches the data from the online repository and clones it to the desired location. The address of the repository can easily be taken from the respective web page: click on the green button labelled “Code” at the top right, and a dropdown menu opens from which the link to the repository can be copied. With git checkout branch_label, one can change to a different branch. When only git checkout is typed and the Tab key is pressed, all available branch labels are shown. Furthermore, individual commits can be loaded by passing the commit ID instead of the branch label: git checkout commit_id.
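The clone and checkout steps can be tried without network access. The sketch below first creates a small local repository as a stand-in for the online one and then clones it. All repository, branch and file names are made up, and the user name and e-mail passed with -c are placeholders needed only to create the example commits.

```shell
work=$(mktemp -d)

# Stand-in for the online repository: one commit, then a branch with a second commit.
git init -q "$work/origin_repo"
cd "$work/origin_repo"
echo "version 1" > code.txt
git add code.txt
git -c user.name="Example User" -c user.email="user@example.org" commit -q -m "First version"
first_commit=$(git rev-parse HEAD)
git checkout -q -b new_feature
echo "version 2" > code.txt
git -c user.name="Example User" -c user.email="user@example.org" commit -q -a -m "Work on new feature"

# Clone it, as one would with the address copied from the web page.
cd "$work"
git clone -q origin_repo local_copy
cd local_copy
current_branch=$(git rev-parse --abbrev-ref HEAD)

# Load an individual commit by its ID, then return to the branch.
git checkout -q "$first_commit"
old_content=$(cat code.txt)
git checkout -q new_feature
new_content=$(cat code.txt)

echo "branch after cloning:    $current_branch"
echo "file at first commit:    $old_content"
echo "file on the branch head: $new_content"
```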

Compile software from Git repo#

Users may need to compile required software themselves, for example to account for the pre-installed libraries and dependencies on the target system. It can also happen that one needs a specific software version that is not available as a package or with an installer. Consider the ongoing development of FDS: new functionality could be added that you might want to try out, or a change could be implemented that might be beneficial for your work. Or maybe you want to compare simulation results to older versions.

The source code of FDS can be downloaded from the FDS GitHub repository, following the green button at the top right on the main page of the git repo. One can download it as a compressed archive or clone the repo directly to disk, using git clone address/of/repo in the command line.

The directory Build contains a README.md file (*.md denotes a plain text file to be interpreted as Markdown). It contains some information on the compilation process of FDS. Detailed instructions for different operating systems and MPI versions are given in the makefile.