The Missing Semester of CS Education, MIT - the Shell
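"The Missing Semester of Your CS Education" is an inventive MIT course that introduces knowledge and tools that are very important yet rarely covered in university CS curricula, such as shell scripting, Vim, the command-line environment, Git, and ssh. The course motivation puts it this way (slightly embarrassing, because this describes me exactly):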
... Yet many of us utilize only a small fraction of those tools; we only know enough magical incantations by rote to get by, and blindly copy-paste commands from the Internet when we get stuck.
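Strictly speaking, ENGG1340 did introduce some of this material, but nowhere near enough. So working through this course is well worth it; besides, its workload is light, which makes it a nice little snack during the summer semester.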
This article is a self-administered course note.
It will NOT cover any exam or assignment related content.
What is the Shell?
Computers today offer a variety of interfaces for giving them commands:
- Graphical user interfaces (GUIs).
- Voice interfaces.
- AR/VR.
These are great for 80% of use cases, but they are often restricted in what they allow you to do.
To take full advantage of the tools your computer provides, we have to go old-school and drop down to a textual interface: the shell. In this lecture, we will focus on the Bourne Again SHell ("bash").
```bash
missing:~$ echo hello
```
The line above, as shown in the terminal, has two parts:
- Prompt. The main textual interface to the shell. It tells you that you are on the machine `missing` and that your current working directory is `~` (short for "home"). The `$` tells you that you are NOT the root user.
- Command. Executes the program `echo` with the argument `hello`. The shell parses the command by splitting it by whitespace, and then runs the program indicated by the first word, supplying each subsequent word as an argument that the program can access.
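To see the whitespace splitting in action, here is a minimal sketch. The helper `count_args` is a hypothetical name introduced only for illustration; it reports how many arguments it received. Quoting keeps a string that contains spaces together as a single argument.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

# Three whitespace-separated words become three arguments.
unquoted=$(count_args hello big world)

# Quotes make the whole string a single argument.
quoted=$(count_args "hello big world")

echo "unquoted: $unquoted, quoted: $quoted"
```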
How does the shell know where to find programs like `date` and `echo`?
- The shell is asked to execute a command.
- If the command doesn't match one of its programming keywords, it consults an environment variable called `$PATH` that lists which directories the shell should search for programs when it is given a command.
```bash
missing:~$ echo $PATH
```
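The lookup can be inspected directly. The sketch below assumes `ls` is installed as an ordinary program somewhere on `$PATH` (true on typical Unix-like systems) and uses `command -v` to ask the shell which file it would run.

```shell
# Print each directory listed in $PATH on its own line.
echo "$PATH" | tr ':' '\n'

# Ask the shell which file it would run for the name "ls".
ls_path=$(command -v ls)
echo "ls resolves to: $ls_path"

# A program can also be run by its full path, bypassing the $PATH search.
"$ls_path" / > /dev/null && echo "ran ls via its full path"
```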
Navigating in the Shell
The path `/` is the root of the file system.
- A path that starts with `/` is called an absolute path.
- Any other path is a relative path, resolved relative to the current working directory.

The `cd` command accepts both absolute and relative paths as arguments. Some special paths:
- `/`: root directory.
- `~`: home directory.
- `.`: current directory.
- `..`: parent directory.
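A short sketch of these paths in use. It works inside a throwaway directory created with `mktemp -d`; the directory names `projects` and `demo` are made up for the example.

```shell
# Work inside a throwaway directory so nothing real is touched.
base=$(mktemp -d)
mkdir -p "$base/projects/demo"

cd "$base/projects/demo"   # absolute path: starts with /
pwd                        # print the current working directory

cd ..                      # relative path: move to the parent directory
parent=$(pwd)

cd ./demo                  # relative path: . is the current directory
child=$(pwd)

cd ~                       # ~ expands to the home directory
echo "parent: $parent"
echo "child:  $child"

rm -rf "$base"             # clean up the scratch directory
```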
Connecting Programs
In the shell, programs have two primary "streams" associated with them: their input stream and output stream. When the program tries to read input, it reads from the input stream, and when it prints something, it prints to its output stream.
Normally, a program's standard input and output are both your terminal: your keyboard as input and your screen as output. However, we can also rewire those streams.
- `command < file`, `command > file`: rewire the input/output stream of a program to a file.
- `command >> file`: rewire the output stream of a program so that it appends to a file.
- `command1 | command2`: pipes let you "chain" programs such that the output of one is the input of another.
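The operators above can be tried on a scratch file. This is a minimal sketch; the file comes from `mktemp` and its contents are arbitrary.

```shell
tmp=$(mktemp)            # scratch file used for illustration

echo hello > "$tmp"      # > overwrites the file with the program's output
echo world >> "$tmp"     # >> appends to the file instead

cat < "$tmp"             # < feeds the file to cat's input stream

# | sends the output of one program to the input of the next:
# sort the lines in reverse order, then keep only the first one.
first=$(sort -r "$tmp" | head -n 1)
echo "first line after reverse sort: $first"
```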
Root User
On most Unix-like systems, one user is special: the "root" user. It is above (almost) all access restrictions, and can create, read, update, and delete any file in the system.
You will not usually log into your system as the root user though, since it's too easy to accidentally break something. Instead, you will be using the `sudo` command. As its name implies, it lets you "do" something as "su" (short for "super user", or "root").
One thing you need to be root to do is write to the `sysfs` file system mounted under `/sys`. `sysfs` exposes a number of kernel parameters as files, so that you can easily reconfigure the kernel on the fly without specialized tools.
For example, by writing a value into a file under the directory `/sys/class/backlight`, we can change the screen brightness.
The first instinct might be to do something like:
```bash
$ sudo find -L /sys/class/backlight -maxdepth 2 -name '*brightness*'
```
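The command above only locates the brightness file. Writing a value to that file with `sudo echo 3 > brightness` then fails with a permission-denied error. This error may come as a surprise. After all, we ran the command with `sudo`!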
This is an important thing to know about the shell: operations like `|`, `>`, and `<` are done by the shell, not by the individual program.
In the case above, the shell (which is authenticated just as your user) tries to open the brightness file for writing, before setting that as `sudo echo`'s output, but is prevented from doing so since the shell does not run as root.
There are two solutions for this problem.
- Use the `sudo su` command to effectively get a shell as the super user. You will find that the `$` in the prompt changes to `#` (super user). In this mode, simply run `echo 3 > brightness`.
- Run `echo 3 | sudo tee brightness`. Since the `tee` program is the one to open the `/sys` file for writing, and it is running as root, the permissions will work out.
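The `tee` pattern can be tried without root. In this sketch a scratch file from `mktemp` stands in for the brightness file, and `sudo` is left out because the file is owned by the current user; the mechanics of who opens the file are the same.

```shell
# A scratch file stands in for the brightness file, so no root is needed.
target=$(mktemp)

# tee copies its standard input to the named file and to standard output.
# The file is opened by tee itself, not by the shell that runs the pipeline,
# which is why prefixing tee with sudo is enough for a root-only file.
echo 3 | tee "$target" > /dev/null

value=$(cat "$target")
echo "file now contains: $value"
```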
Shell Scripting
So far we've seen how to execute commands in the shell and pipe them together. However, in many scenarios you will want to perform a series of commands and make use of control flow expressions like conditionals or loops.
Most shells have their own scripting language with variables, control flow and its own syntax. What makes shell scripting different from other scripting languages is that it is optimized for performing shell-related tasks.
Variables and Functions
```bash
foo=bar
```
Note that `foo = bar` will not work, since it is interpreted as calling the `foo` program with arguments `=` and `bar`. In general, in shell scripts the space character performs argument splitting.
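A small sketch of the difference. It assumes no program called `foo` exists on `$PATH`, in which case the shell reports "command not found" with exit status 127.

```shell
# Assignment: no spaces around the equals sign.
foo=bar
echo "foo is: $foo"

# With spaces, the shell tries to run a program called foo with the
# arguments "=" and "bar". Assuming no such program exists, the call
# fails and the exit status is 127 ("command not found").
status=0
foo = bar 2> /dev/null || status=$?
echo "exit status of 'foo = bar': $status"
```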
```bash
# filename: mcd
```
Bash has functions that take arguments and can operate on them. The `mcd` example is a function that creates a directory and `cd`s into it. Here `$1` is the first argument to the script/function.
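Based on that description, `mcd` might look like the sketch below. This is a reconstruction under the stated behaviour (create a directory, then enter it), not necessarily the exact code from the lecture; the usage lines run it against a scratch directory.

```shell
# A sketch of mcd: create a directory (and any missing parents), then enter it.
mcd() {
    mkdir -p "$1"
    cd "$1"
}

# Usage: create and enter a nested directory under a scratch location.
base=$(mktemp -d)
mcd "$base/new/dir"
pwd
```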
Bash uses a variety of special variables to refer to arguments, error
codes, and other relevant variables.
- `$0`: Name of the script (command).
- `$1` to `$9`: Arguments to the script. `$1` is the first argument and so on.
- `$@`: All the arguments.
- `$#`: Number of arguments.
- `$?`: Return code of the previous command.
- `$$`: Process identification number (PID) for the current script.
- `$_`: Last argument from the last command.
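A few of these variables in action. `show_args` is a hypothetical helper written for this note; inside a function, `$1`, `$@` and `$#` refer to the function's own arguments.

```shell
# show_args is a hypothetical helper that reports what it was called with.
show_args() {
    echo "number of arguments: $#"
    echo "first argument: $1"
    echo "all arguments:" "$@"
}

show_args one two three

# $? holds the return code of the previous command (here: false, which fails).
false || echo "false returned: $?"

# $$ is the PID of the current shell or script.
echo "current PID: $$"
```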
Once a bash function is defined in a file, you can load it into the shell environment with the `source` command, so that you can later run it like any other program.
```bash
XXZ:~$ source mcd
```
Return/Exit Code
Commands will often return output using `STDOUT`, errors through `STDERR`, and a return code to report errors in a more script-friendly manner. A value of 0 usually means everything went OK; anything different from 0 means an error occurred.
Exit codes can be used to conditionally execute commands using `&&` (and operator) and `||` (or operator), both of which are short-circuiting operators. Commands can also be separated within the same line using a semicolon `;`.
```bash
false || echo "Oops, fail"
```
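The short-circuit behaviour can be sketched with `true` and `false`, two standard programs that always succeed and always fail respectively. The messages are arbitrary.

```shell
# || runs the right-hand side only if the left-hand side failed.
false || echo "Oops, fail"
true  || echo "Will not be printed"

# && runs the right-hand side only if the left-hand side succeeded.
true  && echo "Things went well"
false && echo "Will not be printed"

# ; simply separates commands; the second runs regardless of the first.
true ; echo "This will always run"
```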
Substitution
- Variable substitution. Whenever you place `"$var"`, the shell expands the variable `var` and substitutes its value in place as a string.
- Command substitution. Whenever you place `$(CMD)`, the shell executes `CMD`, gets the output of the command and substitutes it in place.
- Process substitution. `<(CMD)` executes `CMD`, places the output in a temporary file and substitutes the `<()` with that file's name. This is useful when commands expect values to be passed by file instead of by STDIN.
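The three kinds of substitution side by side, as a minimal sketch. The variable names and sample strings are made up; process substitution is a bash feature, so this assumes the block is run with bash.

```shell
# Variable substitution: "$name" is replaced by the variable's value.
name=world
greeting="hello $name"
echo "$greeting"

# Command substitution: $(CMD) is replaced by the output of CMD.
word_count=$(echo "one two three" | wc -w | tr -d ' ')
echo "words: $word_count"

# Process substitution: <(CMD) is replaced by a file name from which the
# output of CMD can be read. diff expects files, so this lets it compare
# the output of two commands directly.
if diff <(printf 'a\nb\n') <(printf 'a\nb\n') > /dev/null; then
    same=yes
else
    same=no
fi
echo "outputs identical: $same"
```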
The lecture's example script iterates through the files we pass as arguments, `grep`s each one for the string `foobar`, and appends it to the file as a comment if it is not found.
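A sketch of such a loop, reconstructed from the description rather than copied from the lecture. It is wrapped in a function named `add_foobar` (a hypothetical name) so it can be tried on scratch files.

```shell
# add_foobar is a hypothetical name for the loop described above.
add_foobar() {
    for file in "$@"; do
        # grep exits with a non-zero status when the pattern is not found.
        # Its output and errors are discarded; only the exit status matters.
        if ! grep foobar "$file" > /dev/null 2> /dev/null; then
            echo "File $file does not have any foobar, adding one"
            echo "# foobar" >> "$file"
        fi
    done
}

# Usage on two scratch files: one already contains foobar, the other does not.
has=$(mktemp);   echo "foobar" > "$has"
lacks=$(mktemp); echo "something else" > "$lacks"
add_foobar "$has" "$lacks"
```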
Shell Globbing
When launching scripts with similar arguments, we can use shell globbing to expand expressions through filename expansion.
- Wildcards. Use `?` and `*` to match one or any number of characters respectively.
- Curly braces `{}`. Whenever you have a common substring in a series of commands, you can use curly braces for bash to expand this automatically.
```bash
convert image.{png,jpg}
# expands to: convert image.png image.jpg
```
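Both forms of expansion can be observed with `echo`, which simply prints whatever arguments the shell hands it. The file names below are made up, and brace expansion assumes bash. Note that there must be no space after the comma inside the braces.

```shell
dir=$(mktemp -d)
cd "$dir"
touch report1.txt report2.txt report10.txt notes.md

# Curly braces expand to every listed alternative, whether or not files exist.
braces=$(echo image.{png,jpg})
echo "image.{png,jpg} -> $braces"

# * matches any number of characters; ? matches exactly one character.
star=$(echo report*.txt)       # all three report files
question=$(echo report?.txt)   # report1.txt and report2.txt only

echo "report*.txt -> $star"
echo "report?.txt -> $question"
```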
Shebang
Scripts need not necessarily be written in bash to be called from the terminal. For instance, here's a simple Python script that outputs its arguments in reversed order.
```python
#!/usr/local/bin/python
```
The shebang line at the top of the script tells the kernel to execute this script with a Python interpreter instead of treating it as a shell script. It is good practice to write shebang lines using the `env` command, which resolves to wherever the interpreter lives on the system and so increases the portability of the script. To resolve the location, `env` makes use of the `PATH` environment variable. For this example, the shebang line would look like `#!/usr/bin/env python`.
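The `env` form of the shebang can be tried with any interpreter. To stay in one language, the sketch below writes a small `sh` script (a stand-in for the Python script, not the lecture's code) that prints its arguments in reversed order, marks it executable, and runs it. It assumes `/usr/bin/env` exists and that the temporary directory allows executing files.

```shell
# Write a tiny script whose shebang uses env to locate the interpreter.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env sh
# Print the arguments in reversed order, one per line.
reversed=""
for arg in "$@"; do
    reversed="$arg
$reversed"
done
printf '%s' "$reversed"
EOF

chmod +x "$script"

# The kernel reads the shebang, runs env, and env finds sh via PATH.
result=$("$script" a b c)
echo "$result"
```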
Differences between shell functions and scripts:
- Functions have to be in the same language as the shell, while scripts can be written in any language. This is why including a shebang for scripts is important.
- Functions are executed in the current shell environment whereas scripts execute in their own process. Therefore functions can modify environment variables, e.g. change your current directory, whereas scripts can't.
- Functions are loaded once when their definition is read. Scripts are loaded every time they are executed.
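The second difference is easy to observe. In this sketch a function and a script both run `cd /`; only the function affects the calling shell. The name `go_root_fn` is hypothetical, and the script is run through `sh` as a separate process.

```shell
start=$(pwd)

# A function runs in the current shell, so its cd persists.
go_root_fn() {
    cd /
}

# A script runs in its own process; its cd ends with that process.
script=$(mktemp)
echo 'cd /' > "$script"

sh "$script"
after_script=$(pwd)     # same directory as before

go_root_fn
after_function=$(pwd)   # now the root directory

echo "after script:   $after_script"
echo "after function: $after_function"
```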
Shell Tools
- ShellCheck.
- Finding how to use commands.
  - Run the command with `-h` or `--help`.
  - Manual pages via the `man` command.
  - TLDR pages are a nifty solution that focuses on giving example use cases.
- Finding files using the `find` command.
  - Recursively search for files matching some criteria.
  - Perform actions over files that match your query.
  - A user-friendlier alternative: `fd`.
- Finding code.
  - The `grep` command with flags `-C`, `-v`, `-R`.
  - Alternatives: `ack`, `ag` and `rg`.
- Finding shell commands.
  - The `history` command lets you access your shell history. For example, `history | grep find` prints commands that contain the substring "find".
  - General-purpose fuzzy finder `fzf`.
  - History-based autosuggestions.
- Directory navigation.
  - `fasd` and `autojump` rank directories and files by frecency, that is, by both frequency and recency.
  - Directory structure: `tree`, `broot`.
Reference
References in the article are from corresponding course materials if not specified.
Course info:
MIT Open Learning. The Missing Semester of Your CS Education.
Course resource: