Shell Tutorial

Introduction

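This tutorial covers basic shell usage. Its main goal is to use practical examples to help students who have just joined the lab get up to speed quickly, so that they can run experiments efficiently.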

  • Brief Intro: all you need to know to start using a CLI

  • Basic but useful command line tools

  • How to write bash scripts, and what can those scripts do?

  • Real-world examples

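We start with a very brief intro to the CLI; the second part introduces some useful command line tools; the third part covers bash scripts and what we can do with them; the last part goes through examples commonly used in the lab.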

  1. Using shell to get the work done (doing experiment, coding, etc.) efficiently
  2. Simple Result (data) Processing using bash script
  3. Automating experiments, data collection, and figure plotting

The goal for today is for everyone to know how to use the shell for running experiments, coding, and automating workflows.

❗This is not a detailed tutorial; find out more details (e.g., how to use each tool, more useful commands and techniques) yourself.

❗It is okay if you have never had this lesson (I never did myself). This is only a quick start and some experience sharing.

Part 1: CLI? That’s cool!

GUI vs CLI: which is better?

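A GUI matches our intuition as users better, and in practice many programs offer both a GUI and a CLI. VSCode and Vim in the figure above provide the same code-editing functionality; the main difference is the scenarios each one suits.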

GUI | CLI
Graph (e.g., code analysis tools) | Data analysis
More intuitive user interface, especially in complex software | Home-brew app

When both are available (e.g., editors), use the one that suits you best!

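Graphics-related work tends to favor a GUI. A CLI is what you will use most often when connected to a server, and it places low demands on the network connection.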

Shell: The system user-interface in CLI

  • Capability: launch apps, execute commands, manage foreground/background tasks

  • Many shells are available: zsh, bash, sh, etc.

    • Mostly similar
    • Differences: built-in commands, script syntax, extensions
    • Choose the one you like
  • Useful Oh My Zsh extensions: history, autosuggestions, vim-like editing

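A GUI has a desktop environment as its basic layer of interaction. The shell plays a similar role in the CLI: it is the intermediary for interacting with the system, e.g., launching applications, executing commands, and managing foreground/background tasks.

Commonly used shells include zsh, bash, and so on. Their implementations are mostly the same and their syntax is similar. They differ mainly in built-in commands, script syntax, and supported extensions.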
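
As a minimal sketch of what managing foreground/background tasks looks like in bash, the snippet below launches a job in the background, keeps working in the foreground, and then waits for the job. The `sleep` command is only a stand-in for a long-running experiment.

```shell
#!/usr/bin/env bash
# Sketch: foreground/background task management.
# `sleep 1` stands in for a long-running experiment (an assumption).

sleep 1 &              # '&' launches the command in the background
bg_pid=$!              # $! holds the PID of the most recent background job
echo "started background job"

echo "foreground work continues meanwhile"

wait "$bg_pid"         # block until the background job finishes
status=$?              # exit status of the job we waited for
echo "background job exited with status $status"
```

In an interactive shell, `jobs`, `fg`, and `bg` cover the same ground by hand.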

Basic Setup

  • Terminal (emulator): emulates a (text-based) terminal inside the GUI environment
  • SSH to server
    • The server must run sshd, the SSH server daemon
    • Use a strong password or an ssh key to log in
    • Keep the session alive: tmux, screen, etc.
  • Keyboard shortcuts
    • ctrl + r (to search history), tab (to autocomplete)
    • ctrl + c (to interrupt the foreground process with SIGINT)

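For a local CLI, we need a terminal (emulator), which emulates a terminal inside the GUI system and lets us drive the CLI.

More often we connect to a server. Note that the server password must be strong; better still, log in with an ssh key.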
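
To make the ctrl + c shortcut from the list above more concrete, here is a small sketch of how a script can react to SIGINT. It assumes bash, and it simulates the key press by sending the signal to itself; the handler name `on_sigint` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: ctrl + c delivers SIGINT to the foreground process, and a script
# may install a handler for it with `trap` (e.g., to clean up temp files).

interrupted=0
on_sigint() {                 # hypothetical handler name
    interrupted=1
    echo "caught SIGINT, cleaning up"
}
trap on_sigint INT

# Simulate pressing ctrl + c by sending SIGINT to this very script.
kill -INT $$

echo "interrupted=$interrupted"
```

Without a handler, the default action for SIGINT is to terminate the process, which is why ctrl + c normally kills a running command.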

Install Software in CLI

  • Package manager: apt (Ubuntu, Debian), brew (macOS), dnf (Fedora)

  • Build from source (no suitable version, or need to modify their code)

    • README/INSTALL doc
    • configure and make install

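In a GUI environment we download an installer and click Next. The CLI is similar but more convenient, because the system provides a package manager (for example, Ubuntu provides apt). A package manager is like an App Store: you can use it to download software directly.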
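
The sketch below shows one way to pick the install command for whichever package manager is present. It only prints the command rather than running it; `install_hint` is a hypothetical helper name, and `tmux` is just an example package.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) the install command for the package manager
# found on this machine. `install_hint` is a hypothetical helper.

install_hint() {
    local pkg=$1
    if command -v apt >/dev/null 2>&1; then
        echo "sudo apt install $pkg"       # Ubuntu, Debian
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install $pkg"       # Fedora
    elif command -v brew >/dev/null 2>&1; then
        echo "brew install $pkg"           # macOS (Homebrew)
    else
        # Fallback: follow the project's README/INSTALL; many projects
        # use ./configure && make && make install, but not all do.
        echo "no known package manager found; build $pkg from source"
    fi
}

install_hint tmux
```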

Communication: Pipe & Redirect

  • There are many CLI tools, and they need to communicate to do complex jobs
  • Pipe: | uses the stdout of the previous command as the stdin of the next

A pipe sends the output of the previous command to the input of the next command.

  • Redirect: > and <, send stdout to a file or feed a file to stdin (normally)

We can also redirect output to a file, or take input from a file.
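
A small sketch of pipes and redirection working together, using a scratch directory so nothing of yours is touched. The file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pipes and redirection in a scratch directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# '>' redirects stdout into a file (overwriting it); '>>' appends.
printf 'pear\napple\npear\n' > "$workdir/fruits.txt"
echo 'apple' >> "$workdir/fruits.txt"

# '<' feeds a file to stdin; '|' passes stdout to the next command's stdin.
sort < "$workdir/fruits.txt" | uniq -c > "$workdir/counts.txt"
cat "$workdir/counts.txt"

# '2>' redirects stderr (file descriptor 2) separately from stdout.
ls "$workdir/no-such-file" 2> "$workdir/errors.txt"
echo "stderr was captured in errors.txt"
```

The `sort | uniq -c` pair is a common idiom for counting how often each line occurs.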

Part 2: Handy tools make things easier

Basic Tools (Commands)

Find out yourself:

e.g.,
https://www.geeksforgeeks.org/basic-shell-commands-in-linux/
https://swcarpentry.github.io/shell-novice/reference.html

Common commands can be learned from the documentation; make good use of --help and man.

ag

Usage Scenario: Find keyword

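The command-not-found website explains what a command is for and then gives a series of examples of what you can do with it, showing very directly everything you need in order to use the command.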

Usage Scenario: Find keyword in code, doc, stdout, etc.

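ag is used to quickly locate a specific keyword. The figure shows a search for cpufreq_driver_fast_switch in the Linux source tree: every file that contains or calls it is listed.

Moreover, ag also accepts regular expressions and stdin.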
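
The following sketch walks through the same kinds of search on a tiny scratch "source tree". It uses grep rather than ag, because grep is available almost everywhere; the comments give what should be the equivalent ag invocation if ag is installed. The file and function names are invented for the example.

```shell
#!/usr/bin/env bash
# Sketch: keyword search with grep as a stand-in for ag.
src=$(mktemp -d)
trap 'rm -rf "$src"' EXIT
mkdir -p "$src/drivers"
printf 'int fast_switch(void);\n' > "$src/drivers/cpufreq.h"
printf 'int x = fast_switch();\nint y = slow_switch();\n' > "$src/drivers/governor.c"

# Recursive search with line numbers (ag: ag fast_switch "$src")
grep -rn 'fast_switch' "$src"

# List only the names of matching files (ag: ag -l fast_switch "$src")
grep -rl 'fast_switch' "$src"

# Extended regular expression (ag treats patterns as regexes by default)
grep -rnE '(fast|slow)_switch\(\)' "$src"

# Search stdin instead of files (ag also reads stdin when piped into)
printf 'ok\nerror: disk full\n' | grep 'error'
```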

awk

Usage Scenario: Result (data) Processing

  • Domain-specific language designed for text processing (C-like syntax)
  • Typically used as a data extraction and reporting tool

Normal Use Cases:

  • Average, max, min
  • Get data in a certain column
  • Simple conditional logic

Example: Grab Data from a certain column

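# Print the second column of every line
cat tmp | awk '{print $2}'

# Average of the first column
cat tmp | awk '
BEGIN {cnt=0}
{sum+=$1;cnt+=1}
END {if (cnt > 0) print (sum/cnt)}'

# Print the second column of lines whose first column is greater than 3
cat tmp | awk '{if($1>3) print $2}'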

As you can see, awk spares us from writing lengthy C code just to process data.
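
The use cases listed earlier also mention max and min. A minimal sketch that computes min, max, and average of the first column in a single pass might look like this; `column_stats` is a hypothetical helper name and the sample values are made up.

```shell
#!/usr/bin/env bash
# Sketch: min, max and average of column 1 in one awk pass.
# `column_stats` is a hypothetical helper that reads from stdin.
column_stats() {
    awk '
    NR == 1 { min = $1 + 0; max = $1 + 0 }   # seed from the first record
    {
        sum += $1
        if ($1 + 0 < min) min = $1 + 0
        if ($1 + 0 > max) max = $1 + 0
    }
    END { if (NR > 0) printf "min=%s max=%s avg=%.2f\n", min, max, sum / NR }'
}

printf '4 a\n9 b\n2 c\n5 d\n' | column_stats
# prints: min=2 max=9 avg=5.00
```

Adding `+ 0` forces a numeric comparison, so that values are never compared as strings.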

sed

sed is used to extract data, for example to pick specific lines out of a command's output.
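
A few common sed idioms, sketched on made-up experiment output (the text in `output` is invented for the example):

```shell
#!/usr/bin/env bash
# Sketch: common sed idioms on made-up experiment output.
output=$(printf 'Dummy Experiment Output\nThroughput 1234 ops/s\nLatency 56 us\n')

# -n suppresses automatic printing; '2p' prints only line 2.
echo "$output" | sed -n '2p'

# Print a range of lines (2 through 3).
echo "$output" | sed -n '2,3p'

# Print only the lines matching a pattern.
echo "$output" | sed -n '/Latency/p'

# Substitute text: s/old/new/ replaces the first match on each line.
echo "$output" | sed 's/ops\/s/requests per second/'
```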

Part 3: Let's write some shell scripts!

Shell (Bash) Batch Script

Basically, batch scripts are simple text files containing lines of commands.

echo "Hello IPADSer!"
echo "result of ls:"
ls
echo "OOOPS, here is the end!"

With local variables.

year=2021
echo "This year is $year!"
echo "Last year is $((year - 1))!"

Passing in values as arguments

command=$0
prompt=$1
year=$2

echo "$prompt Running $command"
echo "$prompt This year is $year"
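
To see how the positional parameters get filled in, the sketch below saves a script like the one above to a scratch file and runs it with two arguments. The file name `show_args.sh` and the argument values are made up; the extra `$#` line is an addition that prints the argument count.

```shell
#!/usr/bin/env bash
# Sketch: write a small script to a scratch file and call it with arguments.
script_dir=$(mktemp -d)
trap 'rm -rf "$script_dir"' EXIT
script=$script_dir/show_args.sh      # hypothetical file name

cat > "$script" <<'EOF'
command=$0    # $0 is the script path as invoked
prompt=$1     # first argument
year=$2       # second argument
echo "$prompt Running $command"
echo "$prompt This year is $year"
echo "$prompt Received $# arguments"
EOF

# Quote arguments that contain spaces or shell metacharacters.
bash "$script" '[exp]' 2021
```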

Values can come from arguments, or from the results of commands

str=$(echo 'Hello world')
echo "str is $str"
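
A slightly longer sketch of command substitution, assuming bash. It captures the output of a few standard commands into variables; the variable names are arbitrary.

```shell
#!/usr/bin/env bash
# Sketch: capture a command's stdout into a variable with $(...).
# Backticks (`cmd`) do the same thing but are harder to nest.
kernel=$(uname -s)
echo "kernel name is $kernel"

# Trailing newlines are stripped from the captured output.
line_count=$(printf 'a\nb\nc\n' | wc -l | tr -d ' ')
echo "line count is $line_count"

# Substitutions nest without extra escaping.
dir_name=$(basename "$(pwd)")
echo "current directory name is $dir_name"

# The result can feed arithmetic expansion.
echo "twice the line count is $((line_count * 2))"
```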

Support loop and conditions

files=$(ls)
for file in $files
do
    if test "$file" = "run.sh"
    then
        echo 'find myself'
    else
        echo "find file $file"
    fi
done
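
One caveat with looping over the output of ls is that file names containing spaces get split into several words. A sketch of the same idea using a glob instead, assuming bash; the helper name `classify` and the file names are made up:

```shell
#!/usr/bin/env bash
# Sketch: loop over files with a glob so names with spaces stay intact.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
touch "$dir/run.sh" "$dir/result 1.txt" "$dir/notes.md"

classify() {                             # hypothetical helper
    local path name
    for path in "$1"/*; do
        name=${path##*/}                 # strip the directory part
        if [[ $name == "run.sh" ]]; then
            echo "found myself"
        elif [[ $name == *.txt ]]; then
            echo "found result file: $name"
        else
            echo "found file: $name"
        fi
    done
}

classify "$dir"
```

Note that if the directory is empty, the unexpanded pattern itself is passed to the loop body once, so real scripts often add a `[ -e "$path" ]` check.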

Functions

func() {
    echo "First arg $1"
    echo "Second arg $2"
    echo "All args $*"
    echo "Arg count $#"
}
func 1 2 3 4 5
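
Functions can also keep their variables local, hand data back on stdout, and report failure through their exit status. A minimal sketch, assuming bash and integer inputs; `average` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: local variables, output on stdout, and an exit status.
average() {                          # hypothetical helper, integers only
    if [ "$#" -eq 0 ]; then
        echo "average: need at least one number" >&2
        return 1                     # non-zero status signals failure
    fi
    local sum=0 n
    for n in "$@"; do
        sum=$((sum + n))
    done
    echo $((sum / $#))               # integer division
}

avg=$(average 10 20 36)              # capture the function's stdout
echo "average is $avg"               # prints: average is 22

if ! average 2>/dev/null; then
    echo "calling with no arguments fails, as intended"
fi
```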

  • Run one after another
  • Can call other scripts in a script
    • Decoupling

Part 4: Talk is cheap. Show me some examples!

Example #1

Running Experiments Multiple Times and Getting the Average Result

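# run_exp.sh: a dummy experiment that prints a random throughput
echo "Dummy Experiment Output"
echo "Throughput $RANDOM ops/s"

# Driver script: run the experiment repeatedly and average the throughput
exp_times=100
result_file=tmp_result
: > "$result_file"    # truncate; echo "" would add a blank line that skews the count
for i in $(seq 1 "$exp_times")
do
    # Line 2 of the output holds the throughput; field 2 is the number
    bash ./run_exp.sh | sed -n '2 p' | awk '{print $2}' >> "$result_file"
done
awk 'BEGIN{cnt=0} {sum+=$1;cnt++;} END{print sum/cnt}' "$result_file"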
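
The same pattern extends to sweeping a parameter and collecting one averaged row per setting, which gives a file that is easy to plot. This is only a sketch: `run_exp` here is a hypothetical stand-in that fakes a deterministic throughput from a made-up "threads" parameter, and `sweep` is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: sweep one parameter, repeat each setting, emit CSV rows.
# `run_exp` is a hypothetical stand-in for the real experiment command.
run_exp() {
    local threads=$1 run=$2
    echo "Dummy Experiment Output"
    echo "Throughput $((threads * 100 + run)) ops/s"
}

# Usage: sweep REPEATS SETTING...
sweep() {
    local repeats=$1; shift
    local threads run
    echo "threads,avg_throughput"
    for threads in "$@"; do
        for run in $(seq 1 "$repeats"); do
            # Line 2 holds the throughput; field 2 is the number.
            run_exp "$threads" "$run" | sed -n '2p' | awk '{print $2}'
        done | awk -v t="$threads" \
            '{sum += $1} END {if (NR > 0) printf "%d,%.1f\n", t, sum / NR}'
    done
}

sweep 3 1 2 4
```

With the fake numbers above, this prints a header followed by `1,102.0`, `2,202.0`, and `4,402.0`. Piping the inner loop straight into awk avoids the temporary result file.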

Example #2

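# run_request.sh: a dummy request that prints a random latency
echo "$RANDOM us"

# Issue 1000 requests; save the output to a file for the analysis below
for i in $(seq 1 1000)
do
    bash ./run_request.sh
done

# Analysis script: takes the latency file as its first argument
file=$1
# Sort the file
sort -g "$file" > "$file-sorted"

# Get the result count
cnt=$(wc -l "$file" | awk '{print $1}')
echo "Result Count $cnt"

# p99 p999
p99_line=$((cnt * 99 / 100))
p999_line=$((cnt * 999 / 1000))
p99_lat=$(sed -n "${p99_line}p" "$file-sorted")
echo "P99 Latency $p99_lat"
p999_lat=$(sed -n "${p999_line}p" "$file-sorted")
echo "P999 Latency $p999_lat"

# cdf: emit "cumulative fraction, latency" for every second line
awk -v tot_cnt="$cnt" 'BEGIN{line=0} {line++;if (line%2 == 0) print (line/tot_cnt" "$1)}' "$file-sorted" > "$file-cdf"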
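
The percentile steps can be folded into a reusable function. The sketch below uses the nearest-rank definition (rounding the line number up), which is a deliberate swap from the rounding-down arithmetic above: with few samples, rounding down can yield line 0, which sed rejects. `percentile` is a hypothetical helper, and the data is simply the numbers 1 to 1000 in reverse order.

```shell
#!/usr/bin/env bash
# Sketch: nearest-rank percentile of a file with one number per line.
# The percentile is passed as a fraction num/den, so p99.9 is 999/1000
# and only integer arithmetic is needed.
percentile() {                                  # hypothetical helper
    local file=$1 num=$2 den=$3
    local cnt rank
    cnt=$(wc -l < "$file" | tr -d ' ')
    [ "$cnt" -gt 0 ] || return 1                # no data, no percentile
    rank=$(( (cnt * num + den - 1) / den ))     # ceil(cnt * num / den) >= 1
    sort -g "$file" | sed -n "${rank}p"
}

data=$(mktemp)
trap 'rm -f "$data"' EXIT
awk 'BEGIN { for (i = 1000; i >= 1; i--) print i }' > "$data"

echo "P50  $(percentile "$data" 50 100)"        # prints: P50  500
echo "P99  $(percentile "$data" 99 100)"        # prints: P99  990
echo "P999 $(percentile "$data" 999 1000)"      # prints: P999 999
```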

Part 5: What's next?