Type checking in bash and correct argument handling
I just want to write a quick tutorial, after stumbled with yet another stone in the path, so you don’t have to (although chances are that if you have landed in this article you may have probably looked for an answer already).
Bash is an sripting language, but its full interoperability outside the box with Linux, makes it an incredibly powerfull tool as it can be runned in almost every server in this planet.
This language has been envisaged for scripting purposes, but when writting a tool, you may cross the thin line of scripting to programming as your script grows in complexity and at that point and for not wanting to change all the infrastructure of it you will stick to it regardless of certain ineficiencies.
In order to write good code we will end up with the necesity of implementing certain code blocks repeteadly , a.k.a boilerplate . We will see below some useful snippets.
Type checking
When repeating patterns in your program you may want to encapsulate its functionalities for easy reuse in functions.
Functions in bash work like executables themselves (meaning executables), they take arguments as a concatenation of strings, there is no control on how many strings will they take (even if any), and what their form is gonna be. Calling a function in bash it is the same as invoking a program from the command line.
A well writen executable will be implemented in a way that you pass the arguments the following way.
./executable.sh -a value_a -b value_b
-a
and -b
are flags and value_a
, value_b
their respective values.
Excutables can be written in many different ways, though for the clarity of the user this pattern is desirable, and can even be refined way more, we wont get into that in this article.
The point is that functions in a bash
program work the same way as executables
What for an executable is
# vim ./executable.sh
#!/bin/bash
echo ${1}
echo ${3}
$ ./executable.sh -a value_a -b value_b
-a
-b
For a function is
$ vim ./myscript.sh
#!/bin/bash
my_function -a value_a -b value_b
my_function() {
echo ${1}
echo ${3}
}
$ ./myscript.sh
-a
-b
The issue
Scripts above are really simple code, they won’t withstand many execution environments though.
In regards to functions if you write them expecting the parameters to be there you may run into big problems (I did) , for instance executing rsync
with shifted arguments can lead you to a big trouble.
use_rsync "etc"
# use_rsync "home" "etc"
use_rsync () {
SUBDIR_ONE=${1}
SUBDIR_TWO=${2}
rsync --remove-source-files "/${SUBDIR_ONE}/${SUBDIR_TWO}" 192.168.43.42:/home/otheruser/
}
In the silly example above (in which you have a user called etc
) use_rsync
if you miss the first argument as in the first invocation use_rsync "myuser"
you are no longer checking on /home/etc/
but in /etc/
and will be deleting that directory after the transfer (not to say that you modify the same paths on the receiver too).
You can destroy hosts in a fraction of a second, and althogh this scenario is unlikely, sometimes functions don’t parse arguments at all for little issues in your code, or a missuse by the user. Then we can be at risking directories like the root directory.
Doing some boiler plate
It is good then to apply this snippet even in your funtions
functionname() {
## PARSING ARGUMENTS ---------------------------------
## ---------------------------------------------------
# Initialize variables
ARG_ONE=""
ARG_TWO=""
while getopts "a:b:" opt; do
case "${opt}" in
a) ARG_ONE="${OPTARG}" ;;
b) ARG_TWO="${OPTARG}" ;;
*)
echo "Usage: functionname -a ARG_ONE -b ARG_TWO" >&2
echo "Example: functionname -a ARG_ONE -b ARG_TWO" >&2
return 1
;;
esac
done
# Variable check
if [[ -z "${ARG_ONE}" || -z "${ARG_TWO}" ]]; then
echo "Error: functionname requires -a ARG_ONE -b ARG_TWO" >&2
return 1
fi
## PARSING ARGUMENTS ---------------------------------
## EOF EOF EOF EOF------------------------------------
# CODE GOES IN here
}
Even for the simplest of the functions, if you save this snippet, this will implement a basic type checking , we can re-write the above as
use_rsync "etc"
# use_rsync -a "home" -b "etc"
user_rsync() {
## PARSING ARGUMENTS ---------------------------------
## ---------------------------------------------------
# Initialize variables
SUBDIR_ONE=""
SUBDIR_TWO=""
while getopts "a:b:" opt; do
case "${opt}" in
a) SUBDIR_ONE="${OPTARG}" ;;
b) SUBDIR_TWO="${OPTARG}" ;;
*)
echo "Usage: user_rsync -a SUBDIR_ONE -b SUBDIR_TWO" >&2
echo "Example: user_rsync -a SUBDIR_ONE -b SUBDIR_TWO" >&2
return 1
;;
esac
done
# Variable check
if [[ -z "${SUBDIR_ONE}" || -z "${SUBDIR_TWO}" ]]; then
echo "Error: user_rsync requires -a SUBDIR_ONE -b SUBDIR_TWO" >&2
return 1
fi
## PARSING ARGUMENTS ---------------------------------
## EOF EOF EOF EOF------------------------------------
rsync --remove-source-files "/${SUBDIR_ONE}/${SUBDIR_TWO}" 192.168.43.42:/home/otheruser/
}
Now the program will exit with error, specifying it this way with getopts
function can save you from a lot of hassle.
More info on getopts
on man bash
getopts optstring name [arg ...] is used by shell procedures to parse positional parameters.
(...)
But hold on dont copy that snippet, we will add something else as it may still malfunction (although not messing things up anymore).
The problem with getopts
Following this previous programming practice I have carried on programming, and when it came to debugging it was easier, yet I was having issues, with functions that were not parsing its arguments (despite them been passed)
This situation was detected when a function was calling another function within it, following two function calls (ergo two argument passes), getopts
will malfunction if written this way as for its functioning is using and setting a bash
built in variable, OPTIND
. So the second invocation will read this already altered variable and malfunction.
Follow my example below
./myscript.sh
#!/bin/bash
main() {
wrapper -a "Luis" -b "Pepe"
}
wrapper() {
## PARSING ARGUMENTS ---------------------------------
## ---------------------------------------------------
# Initialize variables
ARG_ONE=""
ARG_TWO=""
while getopts "a:b:" opt; do
case "${opt}" in
a) ARG_ONE="${OPTARG}" ;;
b) ARG_TWO="${OPTARG}" ;;
*)
echo "Usage: wrapper -a ARG_ONE -b ARG_TWO" >&2
echo "Example: wrapper -a ARG_ONE -b ARG_TWO" >&2
return 1
;;
esac
done
# Variable check
if [[ -z "${ARG_ONE}" || -z "${ARG_TWO}" ]]; then
echo "Error: wrapper requires -a ARG_ONE -b ARG_TWO" >&2
return 1
fi
## PARSING ARGUMENTS ---------------------------------
## EOF EOF EOF EOF------------------------------------
### echo say_hi -a ${ARG_ONE} -b ${ARG_TWO} ## FOR DEBUGGING
say_hi -a ${ARG_ONE} -b ${ARG_TWO}
}
say_hi() {
## PARSING ARGUMENTS ---------------------------------
## ---------------------------------------------------
# Initialize variables
ARG_ONE=""
ARG_TWO=""
while getopts "a:z:b:" opt; do
case "${opt}" in
a) ARG_ONE="${OPTARG}" ;;
b) ARG_TWO="${OPTARG}" ;;
*)
echo "Usage: say_hi -a ARG_ONE -b ARG_TWO" >&2
echo "Example: say_hi -a ARG_ONE -b ARG_TWO" >&2
return 1
;;
esac
done
# Variable check
if [[ -z "${ARG_ONE}" || -z "${ARG_TWO}" ]]; then
echo "Error: say_hi requires -a ARG_ONE -b ARG_TWO" >&2
return 1
fi
## PARSING ARGUMENTS ---------------------------------
## EOF EOF EOF EOF------------------------------------
echo "Mr ${ARG_ONE} is saying hi to ${ARG_TWO}"
}
main
When executed it throws you out with Error: say_hi requires -a ARG_ONE -b ARG_TWO
If we uncomment the line for debugging echoing the function invocation we get
say_hi -a Luis -b Pepe
, so the arguments are been passed correctly, you may get stuck.
The solution is to reset and assign the variable local OPTIND=1
making it locally for each function definition.
Rewritting the code as
#!/bin/bash
main() {
wrapper -a "Luis" -b "Pepe"
}
wrapper() {
## PARSING ARGUMENTS ---------------------------------
## ---------------------------------------------------
local OPTIND=1
# Initialize variables
ARG_ONE=""
ARG_TWO=""
while getopts "a:b:" opt; do
case "${opt}" in
a) ARG_ONE="${OPTARG}" ;;
b) ARG_TWO="${OPTARG}" ;;
*)
echo "Usage: wrapper -a ARG_ONE -b ARG_TWO" >&2
echo "Example: wrapper -a ARG_ONE -b ARG_TWO" >&2
return 1
;;
esac
done
# Variable check
if [[ -z "${ARG_ONE}" || -z "${ARG_TWO}" ]]; then
echo "Error: wrapper requires -a ARG_ONE -b ARG_TWO" >&2
return 1
fi
## PARSING ARGUMENTS ---------------------------------
## EOF EOF EOF EOF------------------------------------
say_hi -a ${ARG_ONE} -b ${ARG_TWO}
}
say_hi() {
## PARSING ARGUMENTS ---------------------------------
## ---------------------------------------------------
local OPTIND=1
# Initialize variables
ARG_ONE=""
ARG_TWO=""
while getopts "a:z:b:" opt; do
case "${opt}" in
a) ARG_ONE="${OPTARG}" ;;
b) ARG_TWO="${OPTARG}" ;;
*)
echo "Usage: say_hi -a ARG_ONE -b ARG_TWO" >&2
echo "Example: say_hi -a ARG_ONE -b ARG_TWO" >&2
return 1
;;
esac
done
# Variable check
if [[ -z "${ARG_ONE}" || -z "${ARG_TWO}" ]]; then
echo "Error: say_hi requires -a ARG_ONE -b ARG_TWO" >&2
return 1
fi
## PARSING ARGUMENTS ---------------------------------
## EOF EOF EOF EOF------------------------------------
echo "Mr ${ARG_ONE} is saying hi to ${ARG_TWO}"
}
main
Now we get
Mr Luis is saying hi to Pepe
Final snippet
We have got the resulting snippet, which you may want to copy.
functionname() {
## PARSING ARGUMENTS ---------------------------------
## ---------------------------------------------------
local OPTIND=1
# Initialize variables
ARG_ONE=""
ARG_TWO=""
while getopts "a:b:" opt; do
case "${opt}" in
a) ARG_ONE="${OPTARG}" ;;
b) ARG_TWO="${OPTARG}" ;;
*)
echo "Usage: functionname -a ARG_ONE -b ARG_TWO" >&2
echo "Example: functionname -a ARG_ONE -b ARG_TWO" >&2
return 1
;;
esac
done
# Variable check
if [[ -z "${ARG_ONE}" || -z "${ARG_TWO}" ]]; then
echo "Error: functionname requires -a ARG_ONE -b ARG_TWO" >&2
return 1
fi
## PARSING ARGUMENTS ---------------------------------
## EOF EOF EOF EOF------------------------------------
# CODE GOES IN here
}