From 320691f312287575140bde777a11035166fc53a0 Mon Sep 17 00:00:00 2001
From: Einhard Leichtfuß
Date: Sun, 22 Dec 2024 04:08:55 +0100
Subject: Initial commit

The basic.bash script is based on the one used in
`github.com:lawandorga/laworga-mail-server.git`, and other versions used by me
(which the `lawandorga-mail-server.git` one was based upon).

The notes are generally new, but many of them are just a consolidation and
refinement of existing knowledge (of mine).
---
 README.md                                         |  24 +++++
 docs/error-handling.md                            | 112 ++++++++++++++++++++++
 docs/error-handling/subshell/repeated-printing.md |  84 ++++++++++++++++
 docs/subshell.md                                  |   8 ++
 src/basic.bash                                    | 112 ++++++++++++++++++++++
 5 files changed, 340 insertions(+)
 create mode 100644 README.md
 create mode 100644 docs/error-handling.md
 create mode 100644 docs/error-handling/subshell/repeated-printing.md
 create mode 100644 docs/subshell.md
 create mode 100644 src/basic.bash

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..f495b5b
--- /dev/null
+++ b/README.md
@@ -0,0 +1,24 @@
+# Basic bash "library"
+
+* This library provides basic convenience settings for Bash scripts.
+
+
+## How to use
+
+* Install `src/basic.bash` to `/path/to/basic.bash`.
+* At the very beginning of a script (after the shebang):
+  `source /path/to/basic.bash || exit 1`
+* If a script includes other libraries, these should be included after
+  `basic.bash`, whose settings then also apply to them.
+* The `basic.bash` library should not be included in other libraries, except
+  when these are meant as wrappers of `basic.bash`, providing additional
+  functionality.
+    * I.e., `basic.bash` should not be included more than once (although it
+      should be safe to do so).
+* Alternatively, one could also copy the content of `basic.bash` to the top
+  (after the shebang) of a script.
+
+
+## Semantics
+
+* See [basic.bash](src/basic.bash) and [Error handling](docs/error-handling.md).
diff --git a/docs/error-handling.md b/docs/error-handling.md
new file mode 100644
index 0000000..483123f
--- /dev/null
+++ b/docs/error-handling.md
@@ -0,0 +1,112 @@
+# Error handling
+
+* A central feature of `basic.bash` is to exit (non-zero) on error and provide
+  a useful error message, namely a stack trace.
+    * In many cases, the failing command will additionally print an error
+      message on its own, which should be printed above the stack trace.
+* Manual error handling is still best, but often too time consuming when
+  calling many external commands and bash builtins that may fail, as is common
+  with shell scripts (unless they are very small).
+
+
+## Limitations
+
+### Backgrounded process
+
+* An error in a backgrounded process (e.g., `false &`) is ignored by the
+  calling shell.
+* Failures within, e.g., `{ false; } &` will cause a stack trace to be
+  printed, but the calling shell will not terminate.
+* It may be a good idea to use `wait` to check on the return status.
+* It may further be a good idea to redefine the trap on `ERR` within a
+  backgrounded process to not print a stack trace.
+
+
+### Subshell
+
+* Some errors in [subshells](subshell.md) are not caught (properly):
+    * `declare var=$(false)`
+        * Instead, write: `declare var; var=$(false)`
+    * `true $(false)`
+        * Workaround: Use an intermediate variable.
+    * `true <(false)`, `true >(false)`
+        * Workaround: Use a pipe where possible.
+            * See also: [Pipes](#Pipes)
+        * Workaround: Use a temporary file.
+        * See also: [Backgrounded process](#Backgrounded_process)
+
+
+### Unset variables
+
+* Unset arrays (when accessed with subscript `@` or `*`) are never treated as
+  an error.
+    * To manually check for a *non-empty* array: `test -v var[@]`
+    * To manually check for an unset array, `declare -p var` and/or `${var@a}`
+      may be helpful.
+    * To manually check whether a variable is declared as an array,
+      `declare -p var` and/or `${var@a}` may be helpful.
+        * The latter will fail (due to `set -o nounset`) if `$var` is empty
+          or unset.
+    * `${var[@]@a}` and `${var[*]@a}` behave weirdly:
+        * If `$var` is undeclared, evaluates to the empty array/string.
+        * If `$var` is declared as an array, but unset, evaluates to `a`.
+        * If `$var` is set to the empty array, evaluates to the empty
+          array/string.
+        * If `$var` is set to a non-empty array, evaluates to the array
+          (or space-separated list in case of `${var[*]@a}`) of as many
+          `a` as the array is long.
+* An error on an unset variable (due to `set -o nounset`) does not cause a
+  stack trace to be printed in the current [(sub-)](subshell.md)shell, but
+  only a simple message mentioning the file and line number.
+    * If there is no subshell involved, no stack trace is printed at all.
+    * Note that unset variable errors are also caught in
+      `declare var=$undef_var` (compare [Subshell section](#Subshell)).
+    * This is arguably not a big problem; such errors should mostly be quickly
+      spotted by simple testing (or some static analysis tool).
+        * Exceptions: Usage of `declare -n` and `eval` with dynamic variable
+          names.
+
+
+### Pipes
+
+* A pipeline is considered an error iff the last command returns non-zero.
+* A different behaviour can be achieved by `set -o pipefail`.
+    * `basic.bash` deliberately does not set this; see its comments for details.
+* To catch errors in a non-last command of a pipeline, one should either
+  consult the `${PIPESTATUS[@]}` array variable, or `set -o pipefail` in a
+  subshell.
+* Oftentimes, avoiding pipes may be the best option.
+
+
+### Conditionals
+
+* A command that is evaluated as a condition (e.g., `if cmd; then ...; fi`)
+  is never considered an error, as the return code is instead used as a boolean
+  condition.
+* In some cases it may be necessary for proper error reporting to do something
+  like the following:
+  ```
+  if cmd
+  then
+    ...
+  else
+    if [[ $? -ne 1 ]]
+    then
+      return 1
+    fi
+    ...
+  fi
+  ```
+* See also: [Pipes](#Pipes)
+
+
+### Other use of `stderr`
+
+* The `basic.bash` library assumes that `stderr` is never redirected, except
+  directly from external commands or shell builtins (e.g., `var=$(cmd 2>&1)`).
+
+
+## Annoyances
+
+* With [subshells](subshell.md), there may be
+  [multiple stack traces printed](error-handling/subshell/repeated-printing.md).
diff --git a/docs/error-handling/subshell/repeated-printing.md b/docs/error-handling/subshell/repeated-printing.md
new file mode 100644
index 0000000..fa94a1b
--- /dev/null
+++ b/docs/error-handling/subshell/repeated-printing.md
@@ -0,0 +1,84 @@
+# Problem: Repeated printing of (parts of) the stack trace
+
+## Gist
+
+* With subshells, several stack traces may be printed.
+* All but the first stack trace can be ignored.
+
+
+## Problematic behaviour
+
+* For each subshell in the current stack of subshells (including the root
+  shell), we get a stack trace.
+    * The `ERR` trap is triggered in each subshell.
+    * Reason: The trap on `ERR` returns non-zero, and so does the subshell.
+        * If it returned zero, the error would be ignored on the upper level,
+          and program execution would continue, which is undesired.
+* More precisely, each subshell gives a stack trace on the stack from itself
+  up to the root shell.
+    * Reason: A subshell inherits the knowledge of its ancestors, but does not
+      know of its descendant subshells.
+    * Thus, the closer to the root of the stack, the more often we get the
+      (same) information printed; while the information on where the error
+      originally occurred is printed only once.
+
+
+## Desired behaviour
+
+* The whole stack trace is printed once, and nothing more.
+    * This would be achieved if only the subshell where the error originally
+      occurred were to print a stack trace.
+
+
+## Considerations on fixes / improvements
+
+### Only print the stack trace for the root shell
+
+* This is easy: just check for `[[ $BASH_SUBSHELL -eq 0 ]]`.
+* This would mean that on error within a subshell, we do not get
+  information below where the first subshell was invoked.
+* This may be deemed acceptable if subshells are rarely used and/or only
+  with short local code within.
+    * Short local code would be, e.g., `$(head -n 1 FILE)`.
+    * Short local code would not be, e.g., `$(local_nontrivial_function)`.
+
+
+### Reserve a special exit code
+
+* Let `$SPECIAL_EXIT_CODE` be some exit code distinct from `0` and `1`.
+* Change the `ERR` trap to `trap 'basic::on_error 0' ERR`, and define
+  `on_error()` as follows:
+  ```
+  function basic::on_error()
+  {
+    [[ $? -eq $SPECIAL_EXIT_CODE ]] && exit 1
+    local -ri offset="$1"
+    basic::print_stacktrace $((offset + 1))
+    exit $SPECIAL_EXIT_CODE
+  }
+  ```
+* The idea is that `on_error()` only prints the stack trace in the lowest
+  subshell, which has the full stack trace.
+* Naturally, this does not work properly if the original actual error had
+  `$SPECIAL_EXIT_CODE` as exit code.
+* That is, we'd need an exit code that cannot occur anywhere else.
+    * This should be impossible in the general case.
+
+
+### Use a temporary file
+
+* Use a temporary file to indicate whether `print_stacktrace()` was already
+  called in a subshell.
+* This feels somewhat evil.
+
+
+### Inspect `$BASH_COMMAND`
+
+* The `$BASH_COMMAND` variable contains the command whose failure triggered
+  the trap.
+* We'd have to identify whether the command spawned a subshell (and the error
+  came from there).
+* Note that `(` is a valid command name, but `$BASH_COMMAND` maintains any
+  necessary quoting.
+* It might suffice to check for `(.*` and `${varname_regex}=\$(.*`, given the
+  [other self-imposed restrictions](../../error-handling.md#Subshell).
diff --git a/docs/subshell.md b/docs/subshell.md
new file mode 100644
index 0000000..319e629
--- /dev/null
+++ b/docs/subshell.md
@@ -0,0 +1,8 @@
+# Subshell
+
+* A subshell may be created by any of the following:
+    * `(.)`
+    * command substitution: `$(.)`, `` `.` ``
+    * process substitution: `<(.)`, `>(.)`
+        * `bash(1)` does not talk of a subshell here, but this seems to work
+          similarly to `$(.)`.
diff --git a/src/basic.bash b/src/basic.bash
new file mode 100644
index 0000000..a951e0c
--- /dev/null
+++ b/src/basic.bash
@@ -0,0 +1,112 @@
+# Basic bash configuration for scripts.
+
+# Copyright 2020-2024 Einhard Leichtfuß
+
+
+###################################
+## Error and termination handling
+
+# See `/docs/error-handling.md` on limitations and annoyances.
+
+
+## What to consider as an error.
+
+# Accessing unset variables is an error.
+set -o nounset
+
+# No `pipefail` (default).
+# - Enabling this might, in particular, cause unexpected issues in `if`
+#   statements.
+# - Enable in a subshell wherever necessary.
+
+
+## How to act on error.
+
+function basic::print_stacktrace()
+{
+  local -ri offset="$1"
+
+  local -i i
+  local -a args=()
+  for (( i = offset + 1; i < ${#BASH_SOURCE[@]} - 1; i++ ))
+  do
+    # See also the `caller` bash builtin.
+    args+=( "${BASH_SOURCE[i]} ${BASH_LINENO[i-1]} ${FUNCNAME[i]}" )
+  done
+  args+=( "${BASH_SOURCE[i]} ${BASH_LINENO[i-1]} " )
+
+  basic::stacktrace_prettyprint "$BASH_SUBSHELL" "${args[@]}"
+
+  return 0
+}
+
+# Default function to pretty-print stack trace.
+# - To be redefined if desired.
+function basic::stacktrace_prettyprint()
+{
+  local -ri subshell_id="$1"
+  shift
+
+  local -i i=$#
+  local msg
+  for msg
+  do
+    printf 'FAILED (%d, %d): %s\n' "$subshell_id" "$((--i))" "$msg" >&2
+  done
+
+  return 0
+}
+
+# On ERR, print stack trace and exit non-zero.
+trap 'basic::print_stacktrace 0; exit 1' ERR
+
+# Inherit traps on ERR.
+set -o errtrace
+
+# If we didn't trap ERR (and exit on ERR), we'd likely want the below.
+# - Note: `inherit_errexit` is similar to `errtrace`.
+#set -o errexit
+#shopt -s inherit_errexit
+
+
+## How to act on exit.
+
+# Default on_exit() function.
+# - To be redefined if desired.
+function basic::on_exit()
+{
+  return 0
+}
+
+trap 'basic::on_exit' EXIT
+
+
+
+########################
+## Other shell options
+
+# Disable globbing; set good defaults for when temporarily enabled.
+# - `extglob` has an effect regardless.
+set -o noglob
+shopt -s nullglob dotglob globasciiranges globstar extglob globskipdots
+
+# Print an error message upon `shift`-ing "too far" (causes ERR regardless).
+shopt -s shift_verbose
+
+# Expand associative array subscripts only once.
+shopt -s assoc_expand_once
+
+# Do not let `source` use PATH.
+shopt -u sourcepath
+
+
+
+##################
+## Other options
+
+# Explicitly set the expected default umask.
+umask 022
+
+
+
+# vi: ft=bash ts=2 sw=0 et
-- 
cgit v1.2.3