| author | Paul Buetow <git@mx.buetow.org> | 2021-05-16 15:27:51 +0100 |
|---|---|---|
| committer | Paul Buetow <git@mx.buetow.org> | 2021-05-21 05:11:05 +0100 |
| commit | b4f44105569f0683e1d8d405a07645905b632962 (patch) | |
| tree | 8c44e118d05dfd1c064674b79899fffb30b05d10 | |
| parent | c83586f2c6cd8b046e7384a01c2e7c52e1fbb703 (diff) | |
add bash coding style article
| Mode | File | Lines changed |
|---|---|---|
| -rw-r--r-- | content/gemtext/gemfeed/2021-05-16-personal-bash-coding-style-guide.gmi (renamed from content/gemtext/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.gmi) | 40 |
| -rw-r--r-- | content/gemtext/gemfeed/atom.xml | 310 |
| -rw-r--r-- | content/gemtext/gemfeed/index.gmi | 1 |
| -rw-r--r-- | content/gemtext/index.gmi | 1 |
| -rw-r--r-- | content/html/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.html | 43 |
| -rw-r--r-- | content/html/gemfeed/2021-05-16-personal-bash-coding-style-guide.html | 350 |
| -rw-r--r-- | content/html/gemfeed/atom.xml | 310 |
| -rw-r--r-- | content/html/gemfeed/index.html | 1 |
| -rw-r--r-- | content/html/index.html | 1 |
| -rw-r--r-- | content/md/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.md | 49 |
| -rw-r--r-- | content/md/gemfeed/2021-05-16-personal-bash-coding-style-guide.md | 385 |
| -rw-r--r-- | content/md/gemfeed/index.md | 1 |
| -rw-r--r-- | content/md/index.md | 1 |
| -rw-r--r-- | content/meta/gemfeed/2021-05-16-personal-bash-coding-style-guide.meta | 5 |
14 files changed, 1445 insertions, 53 deletions
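One technique described in the article added by this commit is "assign-then-shift": every function argument is read from "$1" and then shifted away, so adding or removing a parameter never forces renumbering. A minimal sketch of that idea follows; the function name and parameter names are hypothetical and do not appear in the commit:

```shell
#!/usr/bin/env bash

# Hypothetical example function, named for illustration only.
describe_drink () {
    # Each argument is taken from "$1"; shift then moves the next one into place.
    local -r size="$1"; shift
    local -r flavour="$1"; shift
    local -r sugar_free="$1"; shift

    echo "size=$size flavour=$flavour sugar_free=$sugar_free"
}

describe_drink large cola yes
```

Inserting a new parameter between size and flavour would only need one extra "local -r ...; shift" line, plus an update at each call site.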
diff --git a/content/gemtext/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.gmi b/content/gemtext/gemfeed/2021-05-16-personal-bash-coding-style-guide.gmi
index fea6c302..8adb5b6b 100644
--- a/content/gemtext/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.gmi
+++ b/content/gemtext/gemfeed/2021-05-16-personal-bash-coding-style-guide.gmi
@@ -23,9 +23,23 @@ Lately, I have been polishing and writing a lot of Bash code. Not that I never w
 
 These are my personal modifications of the Google Guide.
 
+### Shebang
+
+Google recommends using always
+
+```
+#!/bin/bash
+```
+
+as the shebang line. But that does not really work on all Unix and Unix like operating systems (e.g. the *BSDs don't have Bash installed to /bin/bash). Better is:
+
+```
+#!/usr/bin/env bash
+```
+
 ### 2 space soft-tabs indentation
 
-I know there have been many tab- and soft-tab wars on this planet. Google recommends to use 2 space soft-tabs for Bash scripts.
+I know there have been many tab- and soft-tab wars on this planet. Google recommends using 2 space soft-tabs for Bash scripts.
 
 I personally don't really care if I use 2 or 4 space indentations. I agree however that tabs should not be used. I personally tend to use 4 space soft-tabs as that's currently how my Vim is configured for any programming language. What matters most though is consistency within the same script/project.
 
@@ -35,7 +49,7 @@ I hit the 80 character line length quicker with the 4 spaces than with 2 spaces,
 
 ### Breaking long pipes
 
-Google recommends to break up long pipes like this:
+Google recommends breaking up long pipes like this:
 
 ```
 # All fits on one line
@@ -60,7 +74,7 @@ command1 |
 
 ### Quoting your variables
 
-Google recommends to always quote your variables. I think you should do that only for variables where you aren't sure what the content is (e.g. content is from an external input source). In my opinion, the code will become quite noisy when you always quote your variables like this:
+Google recommends to always quote your variables. I think generally you should do that only for variables where you are unsure about the content/values of the variables (e.g. content is from an external input source and may contains whitespace or other special characters). In my opinion, the code will become quite noisy when you always quote your variables like this:
 
 ```
 greet () {
@@ -110,7 +124,7 @@ I prefer to do light text processing with the Bash builtins and more complicated
 
 Also, you would like to use an external command for floating-point calculation (e.g. bc) instead using the Bash builtins (worth noticing that ZSH supports builtin floating-points).
 
-I even didn't get started what you can do with Awk (especially GNU Awk), a fully fledged programming language. Tiny Awk snippets tend to be used quite often in Shell scripts without respecting the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, wouldn't it? ;-)
+I even didn't get started what you can do with Awk (especially GNU Awk), a fully fledged programming language. Tiny Awk snippets tend to be used quite often in Shell scripts without honouring the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, wouldn't it? ;-)
 
 ## My additions
 
@@ -137,7 +151,7 @@ buy_soda $I_NEED_THE_BUZZ
 
 ### Non-evil alternative to variable assignments via eval
 
-Google is in the opinion that eval should be avoided. I think so too. They list this example in their guide:
+Google is in the opinion that eval should be avoided. I think so too. They list these examples in their guide:
 
 ```
 # What does this set?
@@ -161,11 +175,11 @@ declare bay=foo
 bar baz foo
 ```
 
-And if I want to assign variables dynamically then I could just run an external script and source its output (This is how you could do metaprogramming in Bash - write code which produces code for immediate execution):
+And if I want to assign variables dynamically then I could just run an external script and source its output (This is how you could do metaprogramming in Bash without the use of eval - write code which produces code for immediate execution):
 
 ```
 % cat vars.sh
-#!/usr/bin/bash
+#!/usr/bin/env bash
 cat <<END
 declare date="$(date)"
 declare user=$USER
@@ -179,7 +193,7 @@ The downside is that ShellCheck won't be able to follow the dynamic sourcing any
 
 ### Prefer pipes over arrays for list processing
 
-When I do list processing in Bash, I prefer to use pipes. You can chain then through Bash functions as well which is pretty neat. Usually my list processing scripts are of a structure similar to the following example:
+When I do list processing in Bash, I prefer to use pipes. You can chain then through Bash functions as well which is pretty neat. Usually my list processing scripts are of a structure like this:
 
 ```
 filter_lines () {
@@ -223,7 +237,7 @@ The stdout is always passed as a pipe to the next following stage. The stderr is
 
 I often refactor existing Bash code. That leads me to adding and removing function arguments quite often. It's quite repetitive work changing the $1, $2.... function argument numbers every time you change the order or add/remove possible arguments.
 
-The solution is to use of the "assign-then-shift"-pattern which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (better readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. It's something what I picked up from a colleague (a pure Bash wizard) some time ago:
+The solution is to use of the "assign-then-shift"-method, which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (better readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. It's something what I picked up from a colleague (a pure Bash wizard) some time ago:
 
 ```
 some_function () {
@@ -257,7 +271,7 @@ some_function () {
 }
 ```
 
-As you can see I didn't need to change any other assignments within the function.
+As you can see I didn't need to change any other assignments within the function. Of course you would also need to change the function argument lists at every occasion where the function is invoked - you would do that within the same refactoring session.
 
 ### Paranoid mode
 
@@ -272,7 +286,7 @@ echo Jo
 Here 'Jo' will never be printed out as the grep didn't find any match. It's unrealistic for most scripts to purely run in paranoid mode so there must be a way to add exceptions. Critical Bash scripts of mine tend to look like this:
 
 ```
-#!/bin/bash
+#!/usr/bin/env bash
 
 set -e
 
@@ -309,7 +323,7 @@ if [[ "${my_var}" > 3 ]]; then
 fi
 ```
 
-... but is Probably unintended lexicographical comparison. A correct way would be:
+... but is probably unintended lexicographical comparison. A correct way would be:
 
 ```
 if (( my_var > 3 )); then
@@ -327,7 +341,7 @@ fi
 
 ### PIPESTATUS
 
-To be honest, I have never used the PIPESTATUS variable before. I knew that it's there, but I never bothered to fully understand it until now.
+To be honest, I have never used the PIPESTATUS variable before. I knew that it's there, but I never bothered to fully understand it how it works until now.
 
 The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe.
If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable: diff --git a/content/gemtext/gemfeed/atom.xml b/content/gemtext/gemfeed/atom.xml index 5f738cc1..eb80f8d5 100644 --- a/content/gemtext/gemfeed/atom.xml +++ b/content/gemtext/gemfeed/atom.xml @@ -1,12 +1,320 @@ <?xml version="1.0" encoding="utf-8"?> <feed xmlns="http://www.w3.org/2005/Atom"> - <updated>2021-05-15T18:38:00+01:00</updated> + <updated>2021-05-16T15:27:41+01:00</updated> <title>buetow.org feed</title> <subtitle>Having fun with computers!</subtitle> <link href="gemini://buetow.org/gemfeed/atom.xml" rel="self" /> <link href="gemini://buetow.org/" /> <id>gemini://buetow.org/</id> <entry> + <title>Personal Bash coding style guide</title> + <link href="gemini://buetow.org/gemfeed/2021-05-16-personal-bash-coding-style-guide.gmi" /> + <id>gemini://buetow.org/gemfeed/2021-05-16-personal-bash-coding-style-guide.gmi</id> + <updated>2021-05-16T14:51:57+01:00</updated> + <author> + <name>Paul Buetow</name> + <email>comments@mx.buetow.org</email> + </author> + <summary>Lately, I have been polishing and writing a lot of Bash code. Not that I never wrote a lot of Bash, but now as I also looked through the 'Google Shell Style Guide' I thought it is time to also write my own thoughts on that. I agree to that guide in most, but not in all points. . .....to read on please visit my site.</summary> + <content type="xhtml"> + <div xmlns="http://www.w3.org/1999/xhtml"> + <h1>Personal Bash coding style guide</h1> +<pre> + .---------------------------. + /,--..---..---..---..---..--. `. + //___||___||___||___||___||___\_| + [j__ ######################## [_| + \============================| + .==| |"""||"""||"""||"""| |"""|| +/======"---""---""---""---"=| =|| +|____ []* ____ | ==|| +// \\ // \\ |===|| hjw +"\__/"---------------"\__/"-+---+' +</pre> +<p class="quote"><i>Written by Paul Buetow 2021-05-16</i></p> +<p>Lately, I have been polishing and writing a lot of Bash code. 
Not that I never wrote a lot of Bash, but now as I also looked through the "Google Shell Style Guide" I thought it is time to also write my own thoughts on that. I agree to that guide in most, but not in all points. </p> +<a class="textlink" href="https://google.github.io/styleguide/shellguide.html">Google Shell Style Guide</a><br /> +<h2>My modifications</h2> +<p>These are my personal modifications of the Google Guide.</p> +<h3>Shebang</h3> +<p>Google recommends using always</p> +<pre> +#!/bin/bash +</pre> +<p>as the shebang line. But that does not really work on all Unix and Unix like operating systems (e.g. the *BSDs don't have Bash installed to /bin/bash). Better is:</p> +<pre> +#!/usr/bin/env bash +</pre> +<h3>2 space soft-tabs indentation</h3> +<p>I know there have been many tab- and soft-tab wars on this planet. Google recommends using 2 space soft-tabs for Bash scripts. </p> +<p>I personally don't really care if I use 2 or 4 space indentations. I agree however that tabs should not be used. I personally tend to use 4 space soft-tabs as that's currently how my Vim is configured for any programming language. What matters most though is consistency within the same script/project.</p> +<p>Google also recommends limiting the line length to 80 characters. For some people that seem's to be an ancient habit from the 80's, where all computer terminals couldn't display longer lines. But I think that the 80 character mark is still a good practice at least for shell scripts. For example, I am often writing code on a Microsoft Go Tablet PC (running Linux of course) and it comes in very handy if the lines are not too long due to the relatively small display on the device.</p> +<p>I hit the 80 character line length quicker with the 4 spaces than with 2 spaces, but that makes me refactor the Bash code more aggressively which is actually a good thing. 
</p> +<h3>Breaking long pipes</h3> +<p>Google recommends breaking up long pipes like this:</p> +<pre> +# All fits on one line +command1 | command2 + +# Long commands +command1 \ + | command2 \ + | command3 \ + | command4 +</pre> +<p>I think there is a better way like the following, which is less noisy. The pipe | already indicates the Bash that another command is expected, thus making the explicit line breaks with \ obsolete:</p> +<pre> +# Long commands +command1 | + command2 | + command3 | + command4 +</pre> +<h3>Quoting your variables</h3> +<p>Google recommends to always quote your variables. I think generally you should do that only for variables where you are unsure about the content/values of the variables (e.g. content is from an external input source and may contains whitespace or other special characters). In my opinion, the code will become quite noisy when you always quote your variables like this:</p> +<pre> +greet () { + local -r greeting="${1}" + local -r name="${2}" + echo "${greeting} ${name}!" +} +</pre> +<p>In this particular example I agree that you should quote them as you don't really know what is the input (are there for example whitespace characters?). But if you are sure that you are only using simple bare words then I think that the code looks much cleaner when you do this instead:</p> +<pre> +say_hello_to_paul () { + local -r greeting=Hello + local -r name=Paul + echo "$greeting $name!" +} +</pre> +<p>You see I also omitted the curly braces { } around the variables. I only use the curly braces around variables when it makes the code either easier/clearer to read or if it is necessary to use them:</p> +<pre> +declare FOO=bar +# Curly braces around FOO are necessary +echo "foo${FOO}baz" +</pre> +<p>A few more words on always quoting the variables: For the sake of consistency (and for the sake of making ShellCheck happy) I am not against quoting everything I encounter. 
I personally also think that the larger the Bash script becomes, the more important it becomes to always quote variables. That's because it will be more likely that you might not remember that some of the functions don't work on values with spaces in it for example. It's just that I won't quote everything in every small script I write. </p> +<h3>Prefer builtin commands over external commands</h3> +<p>Google recommends using the builtin commands over external available commands where possible:</p> +<pre> +# Prefer this: +addition=$(( X + Y )) +substitution="${string/#foo/bar}" + +# Instead of this: +addition="$(expr "${X}" + "${Y}")" +substitution="$(echo "${string}" | sed -e 's/^foo/bar/')" +</pre> +<p>I don't agree fully here. The external commands (especially sed) are much more sophisticated and powerful than the Bash builtin versions. Sed can do much more than the Bash can ever do natively when it comes to text manipulation (the name "sed" stands for streaming editor after all).</p> +<p>I prefer to do light text processing with the Bash builtins and more complicated text processing with external programs such as sed, grep, awk, cut and tr. There is however also the case of medium-light text processing where I would want to use external programs too. That is so because I remember using them better than the Bash builtins. The Bash can get quite obscure here (even Perl will be more readable then - Side note: I love Perl).</p> +<p>Also, you would like to use an external command for floating-point calculation (e.g. bc) instead using the Bash builtins (worth noticing that ZSH supports builtin floating-points).</p> +<p>I even didn't get started what you can do with Awk (especially GNU Awk), a fully fledged programming language. Tiny Awk snippets tend to be used quite often in Shell scripts without honouring the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, wouldn't it? 
;-)</p> +<h2>My additions</h2> +<h3>Use of 'yes' and 'no'</h3> +<p>Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. For some time I used 0 for false and 1 for true, but I think that the yes/no strings are easier to read. Yes, the Bash script would need to perform string comparisons on every check, but if performance is important to you, you wouldn't want to use a Bash script anyway, correct?</p> +<pre> +declare -r SUGAR_FREE=yes +declare -r I_NEED_THE_BUZZ=no + +buy_soda () { + local -r sugar_free=$1 + + if [[ $sugar_free == yes ]]; then + echo 'Diet Dr. Pepper' + else + echo 'Pepsi Coke' + fi +} + +buy_soda $I_NEED_THE_BUZZ +</pre> +<h3>Non-evil alternative to variable assignments via eval</h3> +<p>Google is in the opinion that eval should be avoided. I think so too. They list these examples in their guide:</p> +<pre> +# What does this set? +# Did it succeed? In part or whole? +eval $(set_my_variables) + +# What happens if one of the returned values has a space in it? +variable="$(eval some_function)" + +</pre> +<p>However, if I want to read variables from another file I don't have to use eval here. 
I just source the file:</p> +<pre> +% cat vars.source.sh +declare foo=bar +declare bar=baz +declare bay=foo + +% bash -c 'source vars.source.sh; echo $foo $bar $baz' +bar baz foo +</pre> +<p>And if I want to assign variables dynamically then I could just run an external script and source its output (This is how you could do metaprogramming in Bash without the use of eval - write code which produces code for immediate execution):</p> +<pre> +% cat vars.sh +#!/usr/bin/env bash +cat <<END +declare date="$(date)" +declare user=$USER +END + +% bash -c 'source <(./vars.sh); echo "Hello $user, it is $date"' +Hello paul, it is Sat 15 May 19:21:12 BST 2021 +</pre> +<p>The downside is that ShellCheck won't be able to follow the dynamic sourcing anymore.</p> +<h3>Prefer pipes over arrays for list processing</h3> +<p>When I do list processing in Bash, I prefer to use pipes. You can chain then through Bash functions as well which is pretty neat. Usually my list processing scripts are of a structure like this:</p> +<pre> +filter_lines () { + echo 'Start filtering lines in a fancy way!' >&2 + grep ... | sed .... +} + +process_lines () { + echo 'Start processing line by line!' >&2 + while read -r line; do + ... do something and produce a result... + echo "$result" + done +} + +# Do some post processing of the data +postprocess_lines () { + echo 'Start removing duplicates!' >&2 + sort -u +} + +genreate_report () { + echo 'My boss wants to have a report!' >&2 + tee outfile.txt + wc -l outfile.txt +} + +main () { + filter_lines | + process_lines | + postprocess_lines | + generate_report +} + +main +</pre> +<p>The stdout is always passed as a pipe to the next following stage. The stderr is used for info logging.</p> +<h3>Assign-then-shift</h3> +<p>I often refactor existing Bash code. That leads me to adding and removing function arguments quite often. It's quite repetitive work changing the $1, $2.... 
function argument numbers every time you change the order or add/remove possible arguments.</p> +<p>The solution is to use of the "assign-then-shift"-method, which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (better readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. It's something what I picked up from a colleague (a pure Bash wizard) some time ago:</p> +<pre> +some_function () { + local -r param_foo="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>Want to add a param_baz? Just do this:</p> +<pre> +some_function () { + local -r param_foo="$1"; shift + local -r param_bar="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>Want to remove param_foo? Nothing easier than that:</p> +<pre> +some_function () { + local -r param_bar="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>As you can see I didn't need to change any other assignments within the function. Of course you would also need to change the function argument lists at every occasion where the function is invoked - you would do that within the same refactoring session.</p> +<h3>Paranoid mode</h3> +<p>I call this the paranoid mode. The Bash will stop executing when a command exists with a status not equal to 0:</p> +<pre> +set -e +grep -q foo <<< bar +echo Jo +</pre> +<p>Here 'Jo' will never be printed out as the grep didn't find any match. It's unrealistic for most scripts to purely run in paranoid mode so there must be a way to add exceptions. Critical Bash scripts of mine tend to look like this:</p> +<pre> +#!/usr/bin/env bash + +set -e + +some_function () { + .. some critical code + ... 
+ + set +e + # Grep might fail, but that's OK now + grep .... + local -i ec=$? + set -e + + .. critical code continues ... + if [[ $ec -ne 0 ]]; then + ... + fi + ... +} +</pre> +<h2>Learned</h2> +<p>There are also a couple of things I've learned from Googles guide.</p> +<h3>Unintended lexicographical comparison.</h3> +<p>The following looks like valid Bash code:</p> +<pre> +if [[ "${my_var}" > 3 ]]; then + # True for 4, false for 22. + do_something +fi +</pre> +<p>... but is probably unintended lexicographical comparison. A correct way would be:</p> +<pre> +if (( my_var > 3 )); then + do_something +fi +</pre> +<p>or</p> +<pre> +if [[ "${my_var}" -gt 3 ]]; then + do_something +fi +</pre> +<h3>PIPESTATUS</h3> +<p>To be honest, I have never used the PIPESTATUS variable before. I knew that it's there, but I never bothered to fully understand it how it works until now.</p> +<p>The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable:</p> +<pre> +tar -cf - ./* | ( cd "${dir}" && tar -xf - ) +if (( PIPESTATUS[0] != 0 || PIPESTATUS[1] != 0 )); then + echo "Unable to tar files to ${dir}" >&2 +fi +</pre> +<p>However, as PIPESTATUS will be overwritten as soon as you do any other command, if you need to act differently on errors based on where it happened in the pipe, you’ll need to assign PIPESTATUS to another variable immediately after running the command (don’t forget that [ is a command and will wipe out PIPESTATUS).</p> +<pre> +tar -cf - ./* | ( cd "${DIR}" && tar -xf - ) +return_codes=( "${PIPESTATUS[@]}" ) +if (( return_codes[0] != 0 )); then + do_something +fi +if (( return_codes[1] != 0 )); then + do_something_else +fi +</pre> +<h2>Use common sense and BE CONSISTENT.</h2> +<p>The following 2 paragraphs are completely quoted from the Google guidelines. 
But they hit the hammer on the head:</p> +<p class="quote"><i>If you are editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around their if clauses, you should, too. If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too.</i></p> +<p class="quote"><i>The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important. If code you add to a file looks drastically different from the existing code around it, the discontinuity throws readers out of their rhythm when they go to read it. Try to avoid this.</i></p> +<h2>Advanced Bash learning pro tip</h2> +<p>I also highly recommend having a read through the "Advanced Bash-Scripting Guide" (which is not from Google). I use it as the universal Bash reference and learn something new every time I have a look at it.</p> +<a class="textlink" href="https://tldp.org/LDP/abs/html/">Advanced Bash-Scripting Guide</a><br /> +<p>E-Mail me your thoughts at comments@mx.buetow.org!</p> + </div> + </content> + </entry> + <entry> <title>Welcome to the Geminispace</title> <link href="gemini://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace.gmi" /> <id>gemini://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace.gmi</id> diff --git a/content/gemtext/gemfeed/index.gmi b/content/gemtext/gemfeed/index.gmi index b08827d8..a61bfb58 100644 --- a/content/gemtext/gemfeed/index.gmi +++ b/content/gemtext/gemfeed/index.gmi @@ -2,6 +2,7 @@ ## Having fun with computers! 
+=> ./2021-05-16-personal-bash-coding-style-guide.gmi 2021-05-16 - Personal Bash coding style guide => ./2021-04-24-welcome-to-the-geminispace.gmi 2021-04-24 - Welcome to the Geminispace => ./2021-04-22-dtail-the-distributed-log-tail-program.gmi 2021-04-22 - DTail - The distributed log tail program => ./2018-06-01-realistic-load-testing-with-ioriot-for-linux.gmi 2018-06-01 - Realistic load testing with I/O Riot for Linux diff --git a/content/gemtext/index.gmi b/content/gemtext/index.gmi index ad6a89d9..0531fa49 100644 --- a/content/gemtext/index.gmi +++ b/content/gemtext/index.gmi @@ -52,6 +52,7 @@ English is not my mother tongue. So please ignore any errors you might encounter I have switched blog software multiple times. I might be back filling some of the older articles here. So please don't wonder when suddenly very old posts appear here. +=> ./gemfeed/2021-05-16-personal-bash-coding-style-guide.gmi 2021-05-16 - Personal Bash coding style guide => ./gemfeed/2021-04-24-welcome-to-the-geminispace.gmi 2021-04-24 - Welcome to the Geminispace => ./gemfeed/2021-04-22-dtail-the-distributed-log-tail-program.gmi 2021-04-22 - DTail - The distributed log tail program => ./gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux.gmi 2018-06-01 - Realistic load testing with I/O Riot for Linux diff --git a/content/html/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.html b/content/html/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.html index 08084175..f99560f7 100644 --- a/content/html/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.html +++ b/content/html/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.html @@ -71,9 +71,9 @@ h2, h3 { <h2>My modifications</h2> <p>These are my personal modifications of the Google Guide.</p> <h3>2 space soft-tabs indentation</h3> -<p>I know there have been many tab and soft-tab wars on this planet. Google recommends to use 2 space soft-tabs for Bash scripts. 
</p> +<p>I know there have been many tab- and soft-tab wars on this planet. Google recommends to use 2 space soft-tabs for Bash scripts. </p> <p>I personally don't really care if I use 2 or 4 space indentations. I agree however that tabs should not be used. I personally tend to use 4 space soft-tabs as that's currently how my Vim is configured for any programming language. What matters most though is consistency within the same script/project.</p> -<p>Google also recommends to limit line length to 80 characters. For some people that seem's to be an ancient habit from the 80's, where all computer terminals couldn't display longer lines. But I think that the 80 character mark is still a good practise at least for shell scripts. For example I am often writing code on a Microsoft Go Tablet PC (running Linux of course) and it comes in very handy if the lines are not too long due to the relatively small display on the device.</p> +<p>Google also recommends limiting the line length to 80 characters. For some people that seem's to be an ancient habit from the 80's, where all computer terminals couldn't display longer lines. But I think that the 80 character mark is still a good practice at least for shell scripts. For example, I am often writing code on a Microsoft Go Tablet PC (running Linux of course) and it comes in very handy if the lines are not too long due to the relatively small display on the device.</p> <p>I hit the 80 character line length quicker with the 4 spaces than with 2 spaces, but that makes me refactor the Bash code more aggressively which is actually a good thing. </p> <h3>Breaking long pipes</h3> <p>Google recommends to break up long pipes like this:</p> @@ -96,7 +96,7 @@ command1 | command4 </pre> <h3>Quoting your variables</h3> -<p>Google recommends to always quote your variables. I think you should do that only for variables where you aren't sure what the content is (e.g. content is from an external input source). 
In my opinion, the code will become quite noisy when you always quote your variables like this:</p> +<p>Google recommends to always quote your variables. I think generally you should do that only for variables where you are unsure about the content/values of the variables (e.g. content is from an external input source and may contains whitespace or other special characters). In my opinion, the code will become quite noisy when you always quote your variables like this:</p> <pre> greet () { local -r greeting="${1}" @@ -104,7 +104,7 @@ greet () { echo "${greeting} ${name}!" } </pre> -<p>In this particular example I agree that you should quote them as you don't really know what is the input (are there for example whitespace characters?). But if you are sure that you are only using simple bare words then I think that the code looks much cleaner when you do:</p> +<p>In this particular example I agree that you should quote them as you don't really know what is the input (are there for example whitespace characters?). But if you are sure that you are only using simple bare words then I think that the code looks much cleaner when you do this instead:</p> <pre> say_hello_to_paul () { local -r greeting=Hello @@ -118,7 +118,7 @@ declare FOO=bar # Curly braces around FOO are necessary echo "foo${FOO}baz" </pre> -<p>One word more about always quoting the variables: For the sake of consistency (and for the sake of making ShellCheck happy) I am not against to always quote everything I encounter. It's just that I won't do that for every small script I write.</p> +<p>A few more words on always quoting the variables: For the sake of consistency (and for the sake of making ShellCheck happy) I am not against quoting everything I encounter. I personally also think that the larger the Bash script becomes, the more important it becomes to always quote variables. 
That's because it will be more likely that you might not remember that some of the functions don't work on values with spaces in it for example. It's just that I won't quote everything in every small script I write. </p> <h3>Prefer builtin commands over external commands</h3> <p>Google recommends using the builtin commands over external available commands where possible:</p> <pre> @@ -130,13 +130,13 @@ substitution="${string/#foo/bar}" addition="$(expr "${X}" + "${Y}")" substitution="$(echo "${string}" | sed -e 's/^foo/bar/')" </pre> -<p>I don't agree fully here. The external commands (especially sed) are much more sophisticated and powerful than the Bash builtin versions. Sed can do much more than the Bash can ever do with native capabilities when it comes to text editing (the name "sed" stands for streaming editor after all).</p> +<p>I don't agree fully here. The external commands (especially sed) are much more sophisticated and powerful than the Bash builtin versions. Sed can do much more than the Bash can ever do natively when it comes to text manipulation (the name "sed" stands for streaming editor after all).</p> <p>I prefer to do light text processing with the Bash builtins and more complicated text processing with external programs such as sed, grep, awk, cut and tr. There is however also the case of medium-light text processing where I would want to use external programs too. That is so because I remember using them better than the Bash builtins. The Bash can get quite obscure here (even Perl will be more readable then - Side note: I love Perl).</p> -<p>Also, you would like to use an external command for floating point calculation (e.g. bc) instead using the Bash builtins (worth noticing that ZSH supports builtin floating points).</p> +<p>Also, you would like to use an external command for floating-point calculation (e.g. 
bc) instead using the Bash builtins (worth noticing that ZSH supports builtin floating-points).</p> <p>I even didn't get started what you can do with Awk (especially GNU Awk), a fully fledged programming language. Tiny Awk snippets tend to be used quite often in Shell scripts without respecting the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, wouldn't it? ;-)</p> <h2>My additions</h2> <h3>Use of 'yes' and 'no'</h3> -<p>Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. For some time I used 0 for false and 1 for true, but I think that the yes/no strings are better readable. Yes, you would need to do string comparisons on every check, but if performance is important to you, you wouldn't want to use a Bash script anyway, correct?</p> +<p>Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. For some time I used 0 for false and 1 for true, but I think that the yes/no strings are easier to read. 
Yes, the Bash script would need to perform string comparisons on every check, but if performance is important to you, you wouldn't want to use a Bash script anyway, correct?</p> <pre> declare -r SUGAR_FREE=yes declare -r I_NEED_THE_BUZZ=no @@ -174,7 +174,7 @@ declare bay=foo % bash -c 'source vars.source.sh; echo $foo $bar $baz' bar baz foo </pre> -<p>And if I want to assign variables dynamically then I could just run an external script and source it's output (This is how you could do metaprogramming in Bash - write code which produces code for immediate execution):</p> +<p>And if I want to assign variables dynamically then I could just run an external script and source its output (This is how you could do metaprogramming in Bash - write code which produces code for immediate execution):</p> <pre> % cat vars.sh #!/usr/bin/bash @@ -227,7 +227,7 @@ main <p>The stdout is always passed as a pipe to the next following stage. The stderr is used for info logging.</p> <h3>Assign-then-shift</h3> <p>I often refactor existing Bash code. That leads me to adding and removing function arguments quite often. It's quite repetitive work changing the $1, $2.... function argument numbers every time you change the order or add/remove possible arguments.</p> -<p>The solution is to use of the "assign-then-shift"-pattern which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (better readable) local function variables. You will never have to bother about "$2" or above. That's is very useful when you constantly refactor your code and remove or add function arguments. It's something what I picked up from a colleague (a purely Bash wizard) some time ago:</p> +<p>The solution is to use of the "assign-then-shift"-pattern which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". 
The idea is that you only use "$1" to assign function arguments to named (better readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. It's something what I picked up from a colleague (a pure Bash wizard) some time ago:</p> <pre> some_function () { local -r param_foo="$1"; shift @@ -292,28 +292,29 @@ some_function () { <p>The following looks like valid Bash code:</p> <pre> if [[ "${my_var}" > 3 ]]; then - # True for 4, false for 22. - do_something + # True for 4, false for 22. + do_something fi </pre> <p>... but is Probably unintended lexicographical comparison. A correct way would be:</p> <pre> if (( my_var > 3 )); then - do_something + do_something fi </pre> <p>or</p> <pre> if [[ "${my_var}" -gt 3 ]]; then - do_something + do_something fi </pre> <h3>PIPESTATUS</h3> -<p>What I have never used is the PIPESTATUS variable. The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable:</p> +<p>To be honest, I have never used the PIPESTATUS variable before. I knew that it's there, but I never bothered to fully understand it until now.</p> +<p>The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. 
If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable:</p> <pre> tar -cf - ./* | ( cd "${dir}" && tar -xf - ) if (( PIPESTATUS[0] != 0 || PIPESTATUS[1] != 0 )); then - echo "Unable to tar files to ${dir}" >&2 + echo "Unable to tar files to ${dir}" >&2 fi </pre> <p>However, as PIPESTATUS will be overwritten as soon as you do any other command, if you need to act differently on errors based on where it happened in the pipe, you’ll need to assign PIPESTATUS to another variable immediately after running the command (don’t forget that [ is a command and will wipe out PIPESTATUS).</p> @@ -321,14 +322,18 @@ fi tar -cf - ./* | ( cd "${DIR}" && tar -xf - ) return_codes=( "${PIPESTATUS[@]}" ) if (( return_codes[0] != 0 )); then - do_something + do_something fi if (( return_codes[1] != 0 )); then - do_something_else + do_something_else fi </pre> +<h2>Use common sense and BE CONSISTENT.</h2> +<p>The following 2 paragraphs are completely quoted from the Google guidelines. But they hit the hammer on the head:</p> +<p class="quote"><i>If you are editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around their if clauses, you should, too. If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too.</i></p> +<p class="quote"><i>The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important. If code you add to a file looks drastically different from the existing code around it, the discontinuity throws readers out of their rhythm when they go to read it. 
Try to avoid this.</i></p> <h2>Advanced Bash learning pro tip</h2> -<p>I also highly recommend to have a read through the "Advanced Bash-Scripting Guide" (which is not from Google). I use it as the universal Bash reference and learn something new every time I have a look at it.</p> +<p>I also highly recommend having a read through the "Advanced Bash-Scripting Guide" (which is not from Google). I use it as the universal Bash reference and learn something new every time I have a look at it.</p> <a class="textlink" href="https://tldp.org/LDP/abs/html/">Advanced Bash-Scripting Guide</a><br /> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> <a class="textlink" href="../">Go back to the main site</a><br /> diff --git a/content/html/gemfeed/2021-05-16-personal-bash-coding-style-guide.html b/content/html/gemfeed/2021-05-16-personal-bash-coding-style-guide.html new file mode 100644 index 00000000..28cbca3d --- /dev/null +++ b/content/html/gemfeed/2021-05-16-personal-bash-coding-style-guide.html @@ -0,0 +1,350 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> +<head> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<title>Having fun with computers!</title> +<link rel="shortcut icon" type="image/gif" href="/favicon.ico" /> +<style type="text/css"> +body { + margin: auto; + max-width: 900px; + background-color: #FFFFEF; + border: 1px dashed #880000; + border-radius: 8px; + padding: 5px; +} +img { + display:block; + max-width: 80%; +} +p.quote:before { + content: " | "; + padding-left: 2px; +} +a.textlink:before { + content: " ⇒ "; + padding-left: 2px; +} +a.textlink { + text-decoration: none; + color: #FF0000; +} +a.textlink:hover { + text-decoration: underline; +} +i { + color: #FFA500; +} +pre { + background-color: #F1F8E9; + border: 1px dashed #BB0000; + border-radius: 8px; + padding: 5px; + 
font-family: "Lucida Console", "Courier New", monospace; +} +h1 { + text-align: center; + color: #880000; +} +h2, h3 { + color: #BB0000; +} +</style> +</head> +<body> +<h1>Personal Bash coding style guide</h1> +<pre> + .---------------------------. + /,--..---..---..---..---..--. `. + //___||___||___||___||___||___\_| + [j__ ######################## [_| + \============================| + .==| |"""||"""||"""||"""| |"""|| +/======"---""---""---""---"=| =|| +|____ []* ____ | ==|| +// \\ // \\ |===|| hjw +"\__/"---------------"\__/"-+---+' +</pre> +<p class="quote"><i>Written by Paul Buetow 2021-05-16</i></p> +<p>Lately, I have been polishing and writing a lot of Bash code. Not that I never wrote a lot of Bash, but now as I also looked through the "Google Shell Style Guide" I thought it is time to also write my own thoughts on that. I agree to that guide in most, but not in all points. </p> +<a class="textlink" href="https://google.github.io/styleguide/shellguide.html">Google Shell Style Guide</a><br /> +<h2>My modifications</h2> +<p>These are my personal modifications of the Google Guide.</p> +<h3>Shebang</h3> +<p>Google recommends using always</p> +<pre> +#!/bin/bash +</pre> +<p>as the shebang line. But that does not really work on all Unix and Unix like operating systems (e.g. the *BSDs don't have Bash installed to /bin/bash). Better is:</p> +<pre> +#!/usr/bin/env bash +</pre> +<h3>2 space soft-tabs indentation</h3> +<p>I know there have been many tab- and soft-tab wars on this planet. Google recommends using 2 space soft-tabs for Bash scripts. </p> +<p>I personally don't really care if I use 2 or 4 space indentations. I agree however that tabs should not be used. I personally tend to use 4 space soft-tabs as that's currently how my Vim is configured for any programming language. What matters most though is consistency within the same script/project.</p> +<p>Google also recommends limiting the line length to 80 characters. 
For some people that seems to be an ancient habit from the 1980s, when computer terminals couldn't display longer lines. But I think that the 80 character mark is still a good practice at least for shell scripts. For example, I often write code on a Microsoft Go Tablet PC (running Linux of course) and it comes in very handy if the lines are not too long due to the relatively small display on the device.</p> +<p>I hit the 80 character line length quicker with the 4 spaces than with 2 spaces, but that makes me refactor the Bash code more aggressively which is actually a good thing. </p> +<h3>Breaking long pipes</h3> +<p>Google recommends breaking up long pipes like this:</p> +<pre> +# All fits on one line +command1 | command2 + +# Long commands +command1 \ + | command2 \ + | command3 \ + | command4 +</pre> +<p>I think there is a better way like the following, which is less noisy. The pipe | already tells Bash that another command is expected, which makes the explicit line continuations with \ unnecessary:</p> +<pre> +# Long commands +command1 | + command2 | + command3 | + command4 +</pre> +<h3>Quoting your variables</h3> +<p>Google recommends always quoting your variables. I think generally you should do that only for variables where you are unsure about the content/values of the variables (e.g. content is from an external input source and may contain whitespace or other special characters). In my opinion, the code will become quite noisy when you always quote your variables like this:</p> +<pre> +greet () { + local -r greeting="${1}" + local -r name="${2}" + echo "${greeting} ${name}!" +} +</pre> +<p>In this particular example I agree that you should quote them as you don't really know what the input is (are there for example whitespace characters?). 
But if you are sure that you are only using simple bare words then I think that the code looks much cleaner when you do this instead:</p> +<pre> +say_hello_to_paul () { + local -r greeting=Hello + local -r name=Paul + echo "$greeting $name!" +} +</pre> +<p>You see I also omitted the curly braces { } around the variables. I only use the curly braces around variables when it makes the code either easier/clearer to read or if it is necessary to use them:</p> +<pre> +declare FOO=bar +# Curly braces around FOO are necessary +echo "foo${FOO}baz" +</pre> +<p>A few more words on always quoting the variables: For the sake of consistency (and for the sake of making ShellCheck happy) I am not against quoting everything I encounter. I personally also think that the larger the Bash script becomes, the more important it becomes to always quote variables. That's because it will be more likely that you might not remember that some of the functions don't work on values with spaces in them, for example. It's just that I won't quote everything in every small script I write. </p> +<h3>Prefer builtin commands over external commands</h3> +<p>Google recommends using the builtin commands over externally available commands where possible:</p> +<pre> +# Prefer this: +addition=$(( X + Y )) +substitution="${string/#foo/bar}" + +# Instead of this: +addition="$(expr "${X}" + "${Y}")" +substitution="$(echo "${string}" | sed -e 's/^foo/bar/')" +</pre> +<p>I don't agree fully here. The external commands (especially sed) are much more sophisticated and powerful than the Bash builtin versions. Sed can do much more than Bash can ever do natively when it comes to text manipulation (the name "sed" stands for stream editor after all).</p> +<p>I prefer to do light text processing with the Bash builtins and more complicated text processing with external programs such as sed, grep, awk, cut and tr. 
There is however also the case of medium-light text processing where I would want to use external programs too. That is because I remember how to use them better than the Bash builtins. Bash can get quite obscure here (even Perl will be more readable then - Side note: I love Perl).</p> +<p>Also, you will want to use an external command for floating-point calculations (e.g. bc), since the Bash builtins only do integer arithmetic (it is worth noting that Zsh supports floating-point arithmetic natively).</p> +<p>I haven't even started on what you can do with Awk (especially GNU Awk), a fully fledged programming language. Tiny Awk snippets tend to be used quite often in Shell scripts without honouring the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, would it? ;-)</p> +<h2>My additions</h2> +<h3>Use of 'yes' and 'no'</h3> +<p>Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. For some time I used 0 for false and 1 for true, but I think that the yes/no strings are easier to read. Yes, the Bash script would need to perform string comparisons on every check, but if performance is important to you, you wouldn't want to use a Bash script anyway, correct?</p> +<pre> +declare -r SUGAR_FREE=yes +declare -r I_NEED_THE_BUZZ=no + +buy_soda () { + local -r sugar_free=$1 + + if [[ $sugar_free == yes ]]; then + echo 'Diet Dr. Pepper' + else + echo 'Pepsi Coke' + fi +} + +buy_soda $I_NEED_THE_BUZZ +</pre> +<h3>Non-evil alternative to variable assignments via eval</h3> +<p>Google is of the opinion that eval should be avoided. I think so too. They list these examples in their guide:</p> +<pre> +# What does this set? +# Did it succeed? In part or whole? +eval $(set_my_variables) + +# What happens if one of the returned values has a space in it? +variable="$(eval some_function)" + +</pre> +<p>However, if I want to read variables from another file I don't have to use eval here. 
I just source the file:</p> +<pre> +% cat vars.source.sh +declare foo=bar +declare bar=baz +declare baz=foo + +% bash -c 'source vars.source.sh; echo $foo $bar $baz' +bar baz foo +</pre> +<p>And if I want to assign variables dynamically then I could just run an external script and source its output (This is how you could do metaprogramming in Bash without the use of eval - write code which produces code for immediate execution):</p> +<pre> +% cat vars.sh +#!/usr/bin/env bash +cat <<END +declare date="$(date)" +declare user=$USER +END + +% bash -c 'source <(./vars.sh); echo "Hello $user, it is $date"' +Hello paul, it is Sat 15 May 19:21:12 BST 2021 +</pre> +<p>The downside is that ShellCheck won't be able to follow the dynamic sourcing anymore.</p> +<h3>Prefer pipes over arrays for list processing</h3> +<p>When I do list processing in Bash, I prefer to use pipes. You can chain them through Bash functions as well which is pretty neat. Usually my list processing scripts are of a structure like this:</p> +<pre> +filter_lines () { + echo 'Start filtering lines in a fancy way!' >&2 + grep ... | sed .... +} + +process_lines () { + echo 'Start processing line by line!' >&2 + while read -r line; do + ... do something and produce a result... + echo "$result" + done +} + +# Do some post processing of the data +postprocess_lines () { + echo 'Start removing duplicates!' >&2 + sort -u +} + +generate_report () { + echo 'My boss wants to have a report!' >&2 + tee outfile.txt + wc -l outfile.txt +} + +main () { + filter_lines | + process_lines | + postprocess_lines | + generate_report +} + +main +</pre> +<p>The stdout is always passed through a pipe to the next stage. The stderr is used for info logging.</p> +<h3>Assign-then-shift</h3> +<p>I often refactor existing Bash code. That leads me to adding and removing function arguments quite often. It's quite repetitive work changing the $1, $2.... 
function argument numbers every time you change the order or add/remove possible arguments.</p> +<p>The solution is to use the "assign-then-shift" method, which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (more readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. It's something I picked up from a colleague (a pure Bash wizard) some time ago:</p> +<pre> +some_function () { + local -r param_foo="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>Want to add a param_bar? Just do this:</p> +<pre> +some_function () { + local -r param_foo="$1"; shift + local -r param_bar="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>Want to remove param_foo? Nothing easier than that:</p> +<pre> +some_function () { + local -r param_bar="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>As you can see I didn't need to change any other assignments within the function. Of course you would also need to change the argument list at every place where the function is invoked - you would do that within the same refactoring session.</p> +<h3>Paranoid mode</h3> +<p>I call this the paranoid mode. Bash will stop executing when a command exits with a status not equal to 0:</p> +<pre> +set -e +grep -q foo <<< bar +echo Jo +</pre> +<p>Here 'Jo' will never be printed out as the grep didn't find any match. It's unrealistic for most scripts to purely run in paranoid mode so there must be a way to add exceptions. Critical Bash scripts of mine tend to look like this:</p> +<pre> +#!/usr/bin/env bash + +set -e + +some_function () { + .. some critical code + ... 
+ + set +e + # Grep might fail, but that's OK now + grep .... + local -i ec=$? + set -e + + .. critical code continues ... + if [[ $ec -ne 0 ]]; then + ... + fi + ... +} +</pre> +<h2>Learned</h2> +<p>There are also a couple of things I've learned from Google's guide.</p> +<h3>Unintended lexicographical comparison.</h3> +<p>The following looks like valid Bash code:</p> +<pre> +if [[ "${my_var}" > 3 ]]; then + # True for 4, false for 22. + do_something +fi +</pre> +<p>... but it is probably an unintended lexicographical comparison. A correct way would be:</p> +<pre> +if (( my_var > 3 )); then + do_something +fi +</pre> +<p>or</p> +<pre> +if [[ "${my_var}" -gt 3 ]]; then + do_something +fi +</pre> +<h3>PIPESTATUS</h3> +<p>To be honest, I have never used the PIPESTATUS variable before. I knew that it was there, but I never bothered to fully understand how it works until now.</p> +<p>The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable:</p> +<pre> +tar -cf - ./* | ( cd "${dir}" && tar -xf - ) +if (( PIPESTATUS[0] != 0 || PIPESTATUS[1] != 0 )); then + echo "Unable to tar files to ${dir}" >&2 +fi +</pre> +<p>However, as PIPESTATUS will be overwritten as soon as you do any other command, if you need to act differently on errors based on where it happened in the pipe, you’ll need to assign PIPESTATUS to another variable immediately after running the command (don’t forget that [ is a command and will wipe out PIPESTATUS).</p> +<pre> +tar -cf - ./* | ( cd "${DIR}" && tar -xf - ) +return_codes=( "${PIPESTATUS[@]}" ) +if (( return_codes[0] != 0 )); then + do_something +fi +if (( return_codes[1] != 0 )); then + do_something_else +fi +</pre> +<h2>Use common sense and BE CONSISTENT.</h2> +<p>The following 2 paragraphs are completely quoted from the Google guidelines. 
But they hit the hammer on the head:</p> +<p class="quote"><i>If you are editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around their if clauses, you should, too. If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too.</i></p> +<p class="quote"><i>The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important. If code you add to a file looks drastically different from the existing code around it, the discontinuity throws readers out of their rhythm when they go to read it. Try to avoid this.</i></p> +<h2>Advanced Bash learning pro tip</h2> +<p>I also highly recommend having a read through the "Advanced Bash-Scripting Guide" (which is not from Google). I use it as the universal Bash reference and learn something new every time I have a look at it.</p> +<a class="textlink" href="https://tldp.org/LDP/abs/html/">Advanced Bash-Scripting Guide</a><br /> +<p>E-Mail me your thoughts at comments@mx.buetow.org!</p> +<a class="textlink" href="../">Go back to the main site</a><br /> +</body> +</html> diff --git a/content/html/gemfeed/atom.xml b/content/html/gemfeed/atom.xml index 35a58e5a..f4683a69 100644 --- a/content/html/gemfeed/atom.xml +++ b/content/html/gemfeed/atom.xml @@ -1,12 +1,320 @@ <?xml version="1.0" encoding="utf-8"?> <feed xmlns="http://www.w3.org/2005/Atom"> - <updated>2021-05-15T18:38:00+01:00</updated> + <updated>2021-05-16T15:27:41+01:00</updated> <title>buetow.org feed</title> <subtitle>Having fun with computers!</subtitle> <link href="https://buetow.org/gemfeed/atom.xml" rel="self" /> <link href="https://buetow.org/" /> <id>https://buetow.org/</id> <entry> + <title>Personal Bash coding style guide</title> + <link 
href="https://buetow.org/gemfeed/2021-05-16-personal-bash-coding-style-guide.html" /> + <id>https://buetow.org/gemfeed/2021-05-16-personal-bash-coding-style-guide.html</id> + <updated>2021-05-16T14:51:57+01:00</updated> + <author> + <name>Paul Buetow</name> + <email>comments@mx.buetow.org</email> + </author> + <summary>Lately, I have been polishing and writing a lot of Bash code. Not that I never wrote a lot of Bash, but now as I also looked through the 'Google Shell Style Guide' I thought it is time to also write my own thoughts on that. I agree to that guide in most, but not in all points. . .....to read on please visit my site.</summary> + <content type="xhtml"> + <div xmlns="http://www.w3.org/1999/xhtml"> + <h1>Personal Bash coding style guide</h1> +<pre> + .---------------------------. + /,--..---..---..---..---..--. `. + //___||___||___||___||___||___\_| + [j__ ######################## [_| + \============================| + .==| |"""||"""||"""||"""| |"""|| +/======"---""---""---""---"=| =|| +|____ []* ____ | ==|| +// \\ // \\ |===|| hjw +"\__/"---------------"\__/"-+---+' +</pre> +<p class="quote"><i>Written by Paul Buetow 2021-05-16</i></p> +<p>Lately, I have been polishing and writing a lot of Bash code. Not that I never wrote a lot of Bash, but now as I also looked through the "Google Shell Style Guide" I thought it is time to also write my own thoughts on that. I agree to that guide in most, but not in all points. </p> +<a class="textlink" href="https://google.github.io/styleguide/shellguide.html">Google Shell Style Guide</a><br /> +<h2>My modifications</h2> +<p>These are my personal modifications of the Google Guide.</p> +<h3>Shebang</h3> +<p>Google recommends using always</p> +<pre> +#!/bin/bash +</pre> +<p>as the shebang line. But that does not really work on all Unix and Unix like operating systems (e.g. the *BSDs don't have Bash installed to /bin/bash). 
Better is:</p> +<pre> +#!/usr/bin/env bash +</pre> +<h3>2 space soft-tabs indentation</h3> +<p>I know there have been many tab- and soft-tab wars on this planet. Google recommends using 2 space soft-tabs for Bash scripts. </p> +<p>I personally don't really care if I use 2 or 4 space indentations. I agree however that tabs should not be used. I personally tend to use 4 space soft-tabs as that's currently how my Vim is configured for any programming language. What matters most though is consistency within the same script/project.</p> +<p>Google also recommends limiting the line length to 80 characters. For some people that seems to be an ancient habit from the 1980s, when computer terminals couldn't display longer lines. But I think that the 80 character mark is still a good practice at least for shell scripts. For example, I often write code on a Microsoft Go Tablet PC (running Linux of course) and it comes in very handy if the lines are not too long due to the relatively small display on the device.</p> +<p>I hit the 80 character line length quicker with the 4 spaces than with 2 spaces, but that makes me refactor the Bash code more aggressively which is actually a good thing. </p> +<h3>Breaking long pipes</h3> +<p>Google recommends breaking up long pipes like this:</p> +<pre> +# All fits on one line +command1 | command2 + +# Long commands +command1 \ + | command2 \ + | command3 \ + | command4 +</pre> +<p>I think there is a better way like the following, which is less noisy. The pipe | already tells Bash that another command is expected, which makes the explicit line continuations with \ unnecessary:</p> +<pre> +# Long commands +command1 | + command2 | + command3 | + command4 +</pre> +<h3>Quoting your variables</h3> +<p>Google recommends always quoting your variables. I think generally you should do that only for variables where you are unsure about the content/values of the variables (e.g. 
content is from an external input source and may contain whitespace or other special characters). In my opinion, the code will become quite noisy when you always quote your variables like this:</p> +<pre> +greet () { + local -r greeting="${1}" + local -r name="${2}" + echo "${greeting} ${name}!" +} +</pre> +<p>In this particular example I agree that you should quote them as you don't really know what the input is (are there for example whitespace characters?). But if you are sure that you are only using simple bare words then I think that the code looks much cleaner when you do this instead:</p> +<pre> +say_hello_to_paul () { + local -r greeting=Hello + local -r name=Paul + echo "$greeting $name!" +} +</pre> +<p>You see I also omitted the curly braces { } around the variables. I only use the curly braces around variables when it makes the code either easier/clearer to read or if it is necessary to use them:</p> +<pre> +declare FOO=bar +# Curly braces around FOO are necessary +echo "foo${FOO}baz" +</pre> +<p>A few more words on always quoting the variables: For the sake of consistency (and for the sake of making ShellCheck happy) I am not against quoting everything I encounter. I personally also think that the larger the Bash script becomes, the more important it becomes to always quote variables. That's because it will be more likely that you might not remember that some of the functions don't work on values with spaces in them, for example. It's just that I won't quote everything in every small script I write. </p> +<h3>Prefer builtin commands over external commands</h3> +<p>Google recommends using the builtin commands over externally available commands where possible:</p> +<pre> +# Prefer this: +addition=$(( X + Y )) +substitution="${string/#foo/bar}" + +# Instead of this: +addition="$(expr "${X}" + "${Y}")" +substitution="$(echo "${string}" | sed -e 's/^foo/bar/')" +</pre> +<p>I don't agree fully here. 
The external commands (especially sed) are much more sophisticated and powerful than the Bash builtin versions. Sed can do much more than Bash can ever do natively when it comes to text manipulation (the name "sed" stands for stream editor after all).</p> +<p>I prefer to do light text processing with the Bash builtins and more complicated text processing with external programs such as sed, grep, awk, cut and tr. There is however also the case of medium-light text processing where I would want to use external programs too. That is because I remember how to use them better than the Bash builtins. Bash can get quite obscure here (even Perl will be more readable then - Side note: I love Perl).</p> +<p>Also, you will want to use an external command for floating-point calculations (e.g. bc), since the Bash builtins only do integer arithmetic (it is worth noting that Zsh supports floating-point arithmetic natively).</p> +<p>I haven't even started on what you can do with Awk (especially GNU Awk), a fully fledged programming language. Tiny Awk snippets tend to be used quite often in Shell scripts without honouring the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, would it? ;-)</p> +<h2>My additions</h2> +<h3>Use of 'yes' and 'no'</h3> +<p>Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. For some time I used 0 for false and 1 for true, but I think that the yes/no strings are easier to read. Yes, the Bash script would need to perform string comparisons on every check, but if performance is important to you, you wouldn't want to use a Bash script anyway, correct?</p> +<pre> +declare -r SUGAR_FREE=yes +declare -r I_NEED_THE_BUZZ=no + +buy_soda () { + local -r sugar_free=$1 + + if [[ $sugar_free == yes ]]; then + echo 'Diet Dr. 
Pepper' + else + echo 'Pepsi Coke' + fi +} + +buy_soda $I_NEED_THE_BUZZ +</pre> +<h3>Non-evil alternative to variable assignments via eval</h3> +<p>Google is of the opinion that eval should be avoided. I think so too. They list these examples in their guide:</p> +<pre> +# What does this set? +# Did it succeed? In part or whole? +eval $(set_my_variables) + +# What happens if one of the returned values has a space in it? +variable="$(eval some_function)" + +</pre> +<p>However, if I want to read variables from another file I don't have to use eval here. I just source the file:</p> +<pre> +% cat vars.source.sh +declare foo=bar +declare bar=baz +declare baz=foo + +% bash -c 'source vars.source.sh; echo $foo $bar $baz' +bar baz foo +</pre> +<p>And if I want to assign variables dynamically then I could just run an external script and source its output (This is how you could do metaprogramming in Bash without the use of eval - write code which produces code for immediate execution):</p> +<pre> +% cat vars.sh +#!/usr/bin/env bash +cat <<END +declare date="$(date)" +declare user=$USER +END + +% bash -c 'source <(./vars.sh); echo "Hello $user, it is $date"' +Hello paul, it is Sat 15 May 19:21:12 BST 2021 +</pre> +<p>The downside is that ShellCheck won't be able to follow the dynamic sourcing anymore.</p> +<h3>Prefer pipes over arrays for list processing</h3> +<p>When I do list processing in Bash, I prefer to use pipes. You can chain them through Bash functions as well which is pretty neat. Usually my list processing scripts are of a structure like this:</p> +<pre> +filter_lines () { + echo 'Start filtering lines in a fancy way!' >&2 + grep ... | sed .... +} + +process_lines () { + echo 'Start processing line by line!' >&2 + while read -r line; do + ... do something and produce a result... + echo "$result" + done +} + +# Do some post processing of the data +postprocess_lines () { + echo 'Start removing duplicates!' 
>&2 + sort -u +} + +generate_report () { + echo 'My boss wants to have a report!' >&2 + tee outfile.txt + wc -l outfile.txt +} + +main () { + filter_lines | + process_lines | + postprocess_lines | + generate_report +} + +main +</pre> +<p>The stdout is always passed as a pipe to the next stage. The stderr is used for info logging.</p> +<h3>Assign-then-shift</h3> +<p>I often refactor existing Bash code. That leads me to adding and removing function arguments quite often. It's quite repetitive work changing the $1, $2.... function argument numbers every time you change the order or add/remove possible arguments.</p> +<p>The solution is to use the "assign-then-shift"-method, which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (more readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. It's something I picked up from a colleague (a pure Bash wizard) some time ago:</p> +<pre> +some_function () { + local -r param_foo="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>Want to add a param_bar? Just do this:</p> +<pre> +some_function () { + local -r param_foo="$1"; shift + local -r param_bar="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>Want to remove param_foo? Nothing easier than that:</p> +<pre> +some_function () { + local -r param_bar="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +</pre> +<p>As you can see I didn't need to change any other assignments within the function. 
Of course you would also need to change the function argument lists at every place where the function is invoked - you would do that within the same refactoring session.</p> +<h3>Paranoid mode</h3> +<p>I call this the paranoid mode. Bash will stop executing when a command exits with a status not equal to 0:</p> +<pre> +set -e +grep -q foo <<< bar +echo Jo +</pre> +<p>Here 'Jo' will never be printed out as the grep didn't find any match and therefore exited with a non-zero status. It's unrealistic for most scripts to purely run in paranoid mode so there must be a way to add exceptions. Critical Bash scripts of mine tend to look like this:</p> +<pre> +#!/usr/bin/env bash + +set -e + +some_function () { + .. some critical code + ... + + set +e + # Grep might fail, but that's OK now + grep .... + local -i ec=$? + set -e + + .. critical code continues ... + if [[ $ec -ne 0 ]]; then + ... + fi + ... +} +</pre> +<h2>Learned</h2> +<p>There are also a couple of things I've learned from Google's guide.</p> +<h3>Unintended lexicographical comparison.</h3> +<p>The following looks like valid Bash code:</p> +<pre> +if [[ "${my_var}" > 3 ]]; then + # True for 4, false for 22. + do_something +fi +</pre> +<p>... but is probably an unintended lexicographical comparison. A correct way would be:</p> +<pre> +if (( my_var > 3 )); then + do_something +fi +</pre> +<p>or</p> +<pre> +if [[ "${my_var}" -gt 3 ]]; then + do_something +fi +</pre> +<h3>PIPESTATUS</h3> +<p>To be honest, I have never used the PIPESTATUS variable before. I knew that it's there, but I never bothered to fully understand how it works until now.</p> +<p>The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. 
If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable:</p> +<pre> +tar -cf - ./* | ( cd "${dir}" && tar -xf - ) +if (( PIPESTATUS[0] != 0 || PIPESTATUS[1] != 0 )); then + echo "Unable to tar files to ${dir}" >&2 +fi +</pre> +<p>However, as PIPESTATUS will be overwritten as soon as you do any other command, if you need to act differently on errors based on where it happened in the pipe, you’ll need to assign PIPESTATUS to another variable immediately after running the command (don’t forget that [ is a command and will wipe out PIPESTATUS).</p> +<pre> +tar -cf - ./* | ( cd "${DIR}" && tar -xf - ) +return_codes=( "${PIPESTATUS[@]}" ) +if (( return_codes[0] != 0 )); then + do_something +fi +if (( return_codes[1] != 0 )); then + do_something_else +fi +</pre> +<h2>Use common sense and BE CONSISTENT.</h2> +<p>The following 2 paragraphs are quoted verbatim from the Google guidelines. But they hit the nail on the head:</p> +<p class="quote"><i>If you are editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around their if clauses, you should, too. If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too.</i></p> +<p class="quote"><i>The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important. If code you add to a file looks drastically different from the existing code around it, the discontinuity throws readers out of their rhythm when they go to read it. Try to avoid this.</i></p> +<h2>Advanced Bash learning pro tip</h2> +<p>I also highly recommend having a read through the "Advanced Bash-Scripting Guide" (which is not from Google). 
I use it as the universal Bash reference and learn something new every time I have a look at it.</p> +<a class="textlink" href="https://tldp.org/LDP/abs/html/">Advanced Bash-Scripting Guide</a><br /> +<p>E-Mail me your thoughts at comments@mx.buetow.org!</p> + </div> + </content> + </entry> + <entry> <title>Welcome to the Geminispace</title> <link href="https://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace.html" /> <id>https://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace.html</id> diff --git a/content/html/gemfeed/index.html b/content/html/gemfeed/index.html index fb53be13..31d762fa 100644 --- a/content/html/gemfeed/index.html +++ b/content/html/gemfeed/index.html @@ -54,6 +54,7 @@ h2, h3 { <body> <h1>buetow.org's Gemfeed</h1> <h2>Having fun with computers!</h2> +<a class="textlink" href="./2021-05-16-personal-bash-coding-style-guide.html">2021-05-16 - Personal Bash coding style guide</a><br /> <a class="textlink" href="./2021-04-24-welcome-to-the-geminispace.html">2021-04-24 - Welcome to the Geminispace</a><br /> <a class="textlink" href="./2021-04-22-dtail-the-distributed-log-tail-program.html">2021-04-22 - DTail - The distributed log tail program</a><br /> <a class="textlink" href="./2018-06-01-realistic-load-testing-with-ioriot-for-linux.html">2018-06-01 - Realistic load testing with I/O Riot for Linux</a><br /> diff --git a/content/html/index.html b/content/html/index.html index b935ba59..838789f4 100644 --- a/content/html/index.html +++ b/content/html/index.html @@ -90,6 +90,7 @@ h2, h3 { <a class="textlink" href="./gemfeed/index.html">Subscribe to this blog's Gemfeed</a><br /> <h3>Posts</h3> <p>I have switched blog software multiple times. I might be back filling some of the older articles here. 
So please don't wonder when suddenly very old posts appear here.</p> +<a class="textlink" href="./gemfeed/2021-05-16-personal-bash-coding-style-guide.html">2021-05-16 - Personal Bash coding style guide</a><br /> <a class="textlink" href="./gemfeed/2021-04-24-welcome-to-the-geminispace.html">2021-04-24 - Welcome to the Geminispace</a><br /> <a class="textlink" href="./gemfeed/2021-04-22-dtail-the-distributed-log-tail-program.html">2021-04-22 - DTail - The distributed log tail program</a><br /> <a class="textlink" href="./gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux.html">2018-06-01 - Realistic load testing with I/O Riot for Linux</a><br /> diff --git a/content/md/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.md b/content/md/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.md index 10d266fb..2e6a1ed1 100644 --- a/content/md/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.md +++ b/content/md/gemfeed/2021-05-15-personal-bash-coding-style-guide.draft.md @@ -25,11 +25,11 @@ These are my personal modifications of the Google Guide. ### 2 space soft-tabs indentation -I know there have been many tab and soft-tab wars on this planet. Google recommends to use 2 space soft-tabs for Bash scripts. +I know there have been many tab- and soft-tab wars on this planet. Google recommends to use 2 space soft-tabs for Bash scripts. I personally don't really care if I use 2 or 4 space indentations. I agree however that tabs should not be used. I personally tend to use 4 space soft-tabs as that's currently how my Vim is configured for any programming language. What matters most though is consistency within the same script/project. -Google also recommends to limit line length to 80 characters. For some people that seem's to be an ancient habit from the 80's, where all computer terminals couldn't display longer lines. But I think that the 80 character mark is still a good practise at least for shell scripts. 
For example I am often writing code on a Microsoft Go Tablet PC (running Linux of course) and it comes in very handy if the lines are not too long due to the relatively small display on the device. +Google also recommends limiting the line length to 80 characters. For some people that seem's to be an ancient habit from the 80's, where all computer terminals couldn't display longer lines. But I think that the 80 character mark is still a good practice at least for shell scripts. For example, I am often writing code on a Microsoft Go Tablet PC (running Linux of course) and it comes in very handy if the lines are not too long due to the relatively small display on the device. I hit the 80 character line length quicker with the 4 spaces than with 2 spaces, but that makes me refactor the Bash code more aggressively which is actually a good thing. @@ -60,7 +60,7 @@ command1 | ### Quoting your variables -Google recommends to always quote your variables. I think you should do that only for variables where you aren't sure what the content is (e.g. content is from an external input source). In my opinion, the code will become quite noisy when you always quote your variables like this: +Google recommends to always quote your variables. I think generally you should do that only for variables where you are unsure about the content/values of the variables (e.g. content is from an external input source and may contains whitespace or other special characters). In my opinion, the code will become quite noisy when you always quote your variables like this: ``` greet () { @@ -70,7 +70,7 @@ greet () { } ``` -In this particular example I agree that you should quote them as you don't really know what is the input (are there for example whitespace characters?). 
But if you are sure that you are only using simple bare words then I think that the code looks much cleaner when you do: +In this particular example I agree that you should quote them as you don't really know what is the input (are there for example whitespace characters?). But if you are sure that you are only using simple bare words then I think that the code looks much cleaner when you do this instead: ``` say_hello_to_paul () { @@ -88,7 +88,7 @@ declare FOO=bar echo "foo${FOO}baz" ``` -One word more about always quoting the variables: For the sake of consistency (and for the sake of making ShellCheck happy) I am not against to always quote everything I encounter. It's just that I won't do that for every small script I write. +A few more words on always quoting the variables: For the sake of consistency (and for the sake of making ShellCheck happy) I am not against quoting everything I encounter. I personally also think that the larger the Bash script becomes, the more important it becomes to always quote variables. That's because it will be more likely that you might not remember that some of the functions don't work on values with spaces in it for example. It's just that I won't quote everything in every small script I write. ### Prefer builtin commands over external commands @@ -104,11 +104,11 @@ addition="$(expr "${X}" + "${Y}")" substitution="$(echo "${string}" | sed -e 's/^foo/bar/')" ``` -I don't agree fully here. The external commands (especially sed) are much more sophisticated and powerful than the Bash builtin versions. Sed can do much more than the Bash can ever do with native capabilities when it comes to text editing (the name "sed" stands for streaming editor after all). +I don't agree fully here. The external commands (especially sed) are much more sophisticated and powerful than the Bash builtin versions. 
Sed can do much more than the Bash can ever do natively when it comes to text manipulation (the name "sed" stands for streaming editor after all). I prefer to do light text processing with the Bash builtins and more complicated text processing with external programs such as sed, grep, awk, cut and tr. There is however also the case of medium-light text processing where I would want to use external programs too. That is so because I remember using them better than the Bash builtins. The Bash can get quite obscure here (even Perl will be more readable then - Side note: I love Perl). -Also, you would like to use an external command for floating point calculation (e.g. bc) instead using the Bash builtins (worth noticing that ZSH supports builtin floating points). +Also, you would like to use an external command for floating-point calculation (e.g. bc) instead using the Bash builtins (worth noticing that ZSH supports builtin floating-points). I even didn't get started what you can do with Awk (especially GNU Awk), a fully fledged programming language. Tiny Awk snippets tend to be used quite often in Shell scripts without respecting the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, wouldn't it? ;-) @@ -116,7 +116,7 @@ I even didn't get started what you can do with Awk (especially GNU Awk), a fully ### Use of 'yes' and 'no' -Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. For some time I used 0 for false and 1 for true, but I think that the yes/no strings are better readable. Yes, you would need to do string comparisons on every check, but if performance is important to you, you wouldn't want to use a Bash script anyway, correct? +Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. For some time I used 0 for false and 1 for true, but I think that the yes/no strings are easier to read. 
Yes, the Bash script would need to perform string comparisons on every check, but if performance is important to you, you wouldn't want to use a Bash script anyway, correct? ``` declare -r SUGAR_FREE=yes @@ -161,7 +161,7 @@ declare bay=foo bar baz foo ``` -And if I want to assign variables dynamically then I could just run an external script and source it's output (This is how you could do metaprogramming in Bash - write code which produces code for immediate execution): +And if I want to assign variables dynamically then I could just run an external script and source its output (This is how you could do metaprogramming in Bash - write code which produces code for immediate execution): ``` % cat vars.sh @@ -223,7 +223,7 @@ The stdout is always passed as a pipe to the next following stage. The stderr is I often refactor existing Bash code. That leads me to adding and removing function arguments quite often. It's quite repetitive work changing the $1, $2.... function argument numbers every time you change the order or add/remove possible arguments. -The solution is to use of the "assign-then-shift"-pattern which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (better readable) local function variables. You will never have to bother about "$2" or above. That's is very useful when you constantly refactor your code and remove or add function arguments. It's something what I picked up from a colleague (a purely Bash wizard) some time ago: +The solution is to use of the "assign-then-shift"-pattern which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (better readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. 
It's something what I picked up from a colleague (a pure Bash wizard) some time ago: ``` some_function () { @@ -304,8 +304,8 @@ The following looks like valid Bash code: ``` if [[ "${my_var}" > 3 ]]; then - # True for 4, false for 22. - do_something + # True for 4, false for 22. + do_something fi ``` @@ -313,7 +313,7 @@ fi ``` if (( my_var > 3 )); then - do_something + do_something fi ``` @@ -321,18 +321,20 @@ or ``` if [[ "${my_var}" -gt 3 ]]; then - do_something + do_something fi ``` ### PIPESTATUS -What I have never used is the PIPESTATUS variable. The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable: +To be honest, I have never used the PIPESTATUS variable before. I knew that it's there, but I never bothered to fully understand it until now. + +The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable: ``` tar -cf - ./* | ( cd "${dir}" && tar -xf - ) if (( PIPESTATUS[0] != 0 || PIPESTATUS[1] != 0 )); then - echo "Unable to tar files to ${dir}" >&2 + echo "Unable to tar files to ${dir}" >&2 fi ``` @@ -342,16 +344,25 @@ However, as PIPESTATUS will be overwritten as soon as you do any other command, tar -cf - ./* | ( cd "${DIR}" && tar -xf - ) return_codes=( "${PIPESTATUS[@]}" ) if (( return_codes[0] != 0 )); then - do_something + do_something fi if (( return_codes[1] != 0 )); then - do_something_else + do_something_else fi ``` +## Use common sense and BE CONSISTENT. + +The following 2 paragraphs are completely quoted from the Google guidelines. But they hit the hammer on the head: + +> If you are editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around their if clauses, you should, too. 
If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too. + +> The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important. If code you add to a file looks drastically different from the existing code around it, the discontinuity throws readers out of their rhythm when they go to read it. Try to avoid this. + + ## Advanced Bash learning pro tip -I also highly recommend to have a read through the "Advanced Bash-Scripting Guide" (which is not from Google). I use it as the universal Bash reference and learn something new every time I have a look at it. +I also highly recommend having a read through the "Advanced Bash-Scripting Guide" (which is not from Google). I use it as the universal Bash reference and learn something new every time I have a look at it. [Advanced Bash-Scripting Guide](https://tldp.org/LDP/abs/html/) diff --git a/content/md/gemfeed/2021-05-16-personal-bash-coding-style-guide.md b/content/md/gemfeed/2021-05-16-personal-bash-coding-style-guide.md new file mode 100644 index 00000000..4152d937 --- /dev/null +++ b/content/md/gemfeed/2021-05-16-personal-bash-coding-style-guide.md @@ -0,0 +1,385 @@ +# Personal Bash coding style guide + +``` + .---------------------------. + /,--..---..---..---..---..--. `. + //___||___||___||___||___||___\_| + [j__ ######################## [_| + \============================| + .==| |"""||"""||"""||"""| |"""|| +/======"---""---""---""---"=| =|| +|____ []* ____ | ==|| +// \\ // \\ |===|| hjw +"\__/"---------------"\__/"-+---+' +``` + +> Written by Paul Buetow 2021-05-16 + +Lately, I have been polishing and writing a lot of Bash code. 
Not that I never wrote a lot of Bash, but now as I also looked through the "Google Shell Style Guide" I thought it is time to also write my own thoughts on that. I agree with that guide on most, but not all, points. + +[Google Shell Style Guide](https://google.github.io/styleguide/shellguide.html) + +## My modifications + +These are my personal modifications of the Google Guide. + +### Shebang + +Google recommends always using + +``` +#!/bin/bash +``` + +as the shebang line. But that does not really work on all Unix and Unix-like operating systems (e.g. the *BSDs don't have Bash installed at /bin/bash). Better is: + +``` +#!/usr/bin/env bash +``` + +### 2 space soft-tabs indentation + +I know there have been many tab- and soft-tab wars on this planet. Google recommends using 2 space soft-tabs for Bash scripts. + +I personally don't really care if I use 2 or 4 space indentations. I agree however that tabs should not be used. I personally tend to use 4 space soft-tabs as that's currently how my Vim is configured for any programming language. What matters most though is consistency within the same script/project. + +Google also recommends limiting the line length to 80 characters. For some people that seems to be an ancient habit from the 1980s, when computer terminals couldn't display longer lines. But I think that the 80 character mark is still a good practice at least for shell scripts. For example, I am often writing code on a Microsoft Go Tablet PC (running Linux of course) and it comes in very handy if the lines are not too long due to the relatively small display on the device. + +I hit the 80 character line length quicker with the 4 spaces than with 2 spaces, but that makes me refactor the Bash code more aggressively which is actually a good thing. 
+ +### Breaking long pipes + +Google recommends breaking up long pipes like this: + +``` +# All fits on one line +command1 | command2 + +# Long commands +command1 \ + | command2 \ + | command3 \ + | command4 +``` + +I think there is a better way like the following, which is less noisy. A trailing pipe | already tells Bash that another command is expected, thus making the explicit line continuations with \ unnecessary: + +``` +# Long commands +command1 | + command2 | + command3 | + command4 +``` + +### Quoting your variables + +Google recommends always quoting your variables. I think generally you should do that only for variables where you are unsure about the content/values of the variables (e.g. content is from an external input source and may contain whitespace or other special characters). In my opinion, the code will become quite noisy when you always quote your variables like this: + +``` +greet () { + local -r greeting="${1}" + local -r name="${2}" + echo "${greeting} ${name}!" +} +``` + +In this particular example I agree that you should quote them as you don't really know what the input is (are there for example whitespace characters?). But if you are sure that you are only using simple bare words then I think that the code looks much cleaner when you do this instead: + +``` +say_hello_to_paul () { + local -r greeting=Hello + local -r name=Paul + echo "$greeting $name!" +} +``` + +You see I also omitted the curly braces { } around the variables. I only use the curly braces around variables when it makes the code either easier/clearer to read or if it is necessary to use them: + +``` +declare FOO=bar +# Curly braces around FOO are necessary +echo "foo${FOO}baz" +``` + +A few more words on always quoting the variables: For the sake of consistency (and for the sake of making ShellCheck happy) I am not against quoting everything I encounter. I personally also think that the larger the Bash script becomes, the more important it becomes to always quote variables. 
That's because it will be more likely that you might not remember that some of the functions don't work on values with spaces in them for example. It's just that I won't quote everything in every small script I write. + +### Prefer builtin commands over external commands + +Google recommends using the builtin commands over externally available commands where possible: + +``` +# Prefer this: +addition=$(( X + Y )) +substitution="${string/#foo/bar}" + +# Instead of this: +addition="$(expr "${X}" + "${Y}")" +substitution="$(echo "${string}" | sed -e 's/^foo/bar/')" +``` + +I don't agree fully here. The external commands (especially sed) are much more sophisticated and powerful than the Bash builtin versions. Sed can do much more than Bash can ever do natively when it comes to text manipulation (the name "sed" stands for stream editor after all). + +I prefer to do light text processing with the Bash builtins and more complicated text processing with external programs such as sed, grep, awk, cut and tr. There is however also the case of medium-light text processing where I would want to use external programs too. That is because I remember how to use them better than the Bash builtins. Bash can get quite obscure here (even Perl will be more readable then - Side note: I love Perl). + +Also, you will want to use an external command (e.g. bc) for floating-point calculations, because Bash's builtin arithmetic is integer-only (worth noting that ZSH supports floating-point arithmetic natively). + +I haven't even started on what you can do with Awk (especially GNU Awk), a fully fledged programming language. Tiny Awk snippets tend to be used quite often in shell scripts without honouring the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, would it? ;-) + +## My additions + +### Use of 'yes' and 'no' + +Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. 
For some time I used 0 for false and 1 for true, but I think that the yes/no strings are easier to read. Yes, the Bash script would need to perform string comparisons on every check, but if performance is important to you, you wouldn't want to use a Bash script anyway, correct? + +``` +declare -r SUGAR_FREE=yes +declare -r I_NEED_THE_BUZZ=no + +buy_soda () { + local -r sugar_free=$1 + + if [[ $sugar_free == yes ]]; then + echo 'Diet Dr. Pepper' + else + echo 'Pepsi Coke' + fi +} + +buy_soda $I_NEED_THE_BUZZ +``` + +### Non-evil alternative to variable assignments via eval + +Google is of the opinion that eval should be avoided. I think so too. They list these examples in their guide: + +``` +# What does this set? +# Did it succeed? In part or whole? +eval $(set_my_variables) + +# What happens if one of the returned values has a space in it? +variable="$(eval some_function)" + +``` + +However, if I want to read variables from another file I don't have to use eval here. I just source the file: + +``` +% cat vars.source.sh +declare foo=bar +declare bar=baz +declare bay=foo + +% bash -c 'source vars.source.sh; echo $foo $bar $bay' +bar baz foo +``` + +And if I want to assign variables dynamically then I could just run an external script and source its output (This is how you could do metaprogramming in Bash without the use of eval - write code which produces code for immediate execution): + +``` +% cat vars.sh +#!/usr/bin/env bash +cat <<END +declare date="$(date)" +declare user=$USER +END + +% bash -c 'source <(./vars.sh); echo "Hello $user, it is $date"' +Hello paul, it is Sat 15 May 19:21:12 BST 2021 +``` + +The downside is that ShellCheck won't be able to follow the dynamic sourcing anymore. + +### Prefer pipes over arrays for list processing + +When I do list processing in Bash, I prefer to use pipes. You can chain them through Bash functions as well, which is pretty neat. 
Usually my list processing scripts are of a structure like this: + +``` +filter_lines () { + echo 'Start filtering lines in a fancy way!' >&2 + grep ... | sed .... +} + +process_lines () { + echo 'Start processing line by line!' >&2 + while read -r line; do + ... do something and produce a result... + echo "$result" + done +} + +# Do some post processing of the data +postprocess_lines () { + echo 'Start removing duplicates!' >&2 + sort -u +} + +generate_report () { + echo 'My boss wants to have a report!' >&2 + tee outfile.txt + wc -l outfile.txt +} + +main () { + filter_lines | + process_lines | + postprocess_lines | + generate_report +} + +main +``` + +The stdout is always passed as a pipe to the next stage. The stderr is used for info logging. + +### Assign-then-shift + +I often refactor existing Bash code. That leads me to adding and removing function arguments quite often. It's quite repetitive work changing the $1, $2.... function argument numbers every time you change the order or add/remove possible arguments. + +The solution is to use the "assign-then-shift"-method, which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (more readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. It's something I picked up from a colleague (a pure Bash wizard) some time ago: + +``` +some_function () { + local -r param_foo="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +``` + +Want to add a param_bar? Just do this: + +``` +some_function () { + local -r param_foo="$1"; shift + local -r param_bar="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +``` + +Want to remove param_foo? 
Nothing easier than that: + +``` +some_function () { + local -r param_bar="$1"; shift + local -r param_baz="$1"; shift + local -r param_bay="$1"; shift + ... +} +``` + +As you can see I didn't need to change any other assignments within the function. Of course you would also need to change the function argument lists at every place where the function is invoked - you would do that within the same refactoring session. + +### Paranoid mode + +I call this the paranoid mode. Bash will stop executing when a command exits with a status not equal to 0: + +``` +set -e +grep -q foo <<< bar +echo Jo +``` + +Here 'Jo' will never be printed out as the grep didn't find any match and therefore exited with a non-zero status. It's unrealistic for most scripts to purely run in paranoid mode so there must be a way to add exceptions. Critical Bash scripts of mine tend to look like this: + +``` +#!/usr/bin/env bash + +set -e + +some_function () { + .. some critical code + ... + + set +e + # Grep might fail, but that's OK now + grep .... + local -i ec=$? + set -e + + .. critical code continues ... + if [[ $ec -ne 0 ]]; then + ... + fi + ... +} +``` + +## Learned + +There are also a couple of things I've learned from Google's guide. + +### Unintended lexicographical comparison. + +The following looks like valid Bash code: + +``` +if [[ "${my_var}" > 3 ]]; then + # True for 4, false for 22. + do_something +fi +``` + +... but is probably an unintended lexicographical comparison. A correct way would be: + +``` +if (( my_var > 3 )); then + do_something +fi +``` + +or + +``` +if [[ "${my_var}" -gt 3 ]]; then + do_something +fi +``` + +### PIPESTATUS + +To be honest, I have never used the PIPESTATUS variable before. I knew that it's there, but I never bothered to fully understand how it works until now. + +The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. 
If it’s only necessary to check success or failure of the whole pipe, then the following is acceptable: + +``` +tar -cf - ./* | ( cd "${dir}" && tar -xf - ) +if (( PIPESTATUS[0] != 0 || PIPESTATUS[1] != 0 )); then + echo "Unable to tar files to ${dir}" >&2 +fi +``` + +However, as PIPESTATUS will be overwritten as soon as you do any other command, if you need to act differently on errors based on where it happened in the pipe, you’ll need to assign PIPESTATUS to another variable immediately after running the command (don’t forget that [ is a command and will wipe out PIPESTATUS). + +``` +tar -cf - ./* | ( cd "${DIR}" && tar -xf - ) +return_codes=( "${PIPESTATUS[@]}" ) +if (( return_codes[0] != 0 )); then + do_something +fi +if (( return_codes[1] != 0 )); then + do_something_else +fi +``` + +## Use common sense and BE CONSISTENT. + +The following 2 paragraphs are quoted verbatim from the Google guidelines. But they hit the nail on the head: + +> If you are editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around their if clauses, you should, too. If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too. + +> The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important. If code you add to a file looks drastically different from the existing code around it, the discontinuity throws readers out of their rhythm when they go to read it. Try to avoid this. + + +## Advanced Bash learning pro tip + +I also highly recommend having a read through the "Advanced Bash-Scripting Guide" (which is not from Google). I use it as the universal Bash reference and learn something new every time I have a look at it. 
+
+[Advanced Bash-Scripting Guide](https://tldp.org/LDP/abs/html/)
+
+E-Mail me your thoughts at comments@mx.buetow.org!
+
+[Go back to the main site](../)
diff --git a/content/md/gemfeed/index.md b/content/md/gemfeed/index.md
index 98922fef..ebbecc74 100644
--- a/content/md/gemfeed/index.md
+++ b/content/md/gemfeed/index.md
@@ -2,6 +2,7 @@
 
 ## Having fun with computers!
 
+[2021-05-16 - Personal Bash coding style guide](./2021-05-16-personal-bash-coding-style-guide.md)
 [2021-04-24 - Welcome to the Geminispace](./2021-04-24-welcome-to-the-geminispace.md)
 [2021-04-22 - DTail - The distributed log tail program](./2021-04-22-dtail-the-distributed-log-tail-program.md)
 [2018-06-01 - Realistic load testing with I/O Riot for Linux](./2018-06-01-realistic-load-testing-with-ioriot-for-linux.md)
diff --git a/content/md/index.md b/content/md/index.md
index 8658bd60..3f0775aa 100644
--- a/content/md/index.md
+++ b/content/md/index.md
@@ -52,6 +52,7 @@ English is not my mother tongue. So please ignore any errors you might encounter
 
 I have switched blog software multiple times. I might be back filling some of the older articles here. So please don't wonder when suddenly very old posts appear here.
 
+[2021-05-16 - Personal Bash coding style guide](./gemfeed/2021-05-16-personal-bash-coding-style-guide.md)
 [2021-04-24 - Welcome to the Geminispace](./gemfeed/2021-04-24-welcome-to-the-geminispace.md)
 [2021-04-22 - DTail - The distributed log tail program](./gemfeed/2021-04-22-dtail-the-distributed-log-tail-program.md)
 [2018-06-01 - Realistic load testing with I/O Riot for Linux](./gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux.md)
diff --git a/content/meta/gemfeed/2021-05-16-personal-bash-coding-style-guide.meta b/content/meta/gemfeed/2021-05-16-personal-bash-coding-style-guide.meta
new file mode 100644
index 00000000..cad0ad7a
--- /dev/null
+++ b/content/meta/gemfeed/2021-05-16-personal-bash-coding-style-guide.meta
@@ -0,0 +1,5 @@
+local meta_date="2021-05-16T14:51:57+01:00"
+local meta_author="Paul Buetow"
+local meta_email="comments@mx.buetow.org"
+local meta_title="Personal Bash coding style guide"
+local meta_summary="Lately, I have been polishing and writing a lot of Bash code. Not that I never wrote a lot of Bash, but now as I also looked through the 'Google Shell Style Guide' I thought it is time to also write my own thoughts on that. I agree to that guide in most, but not in all points. . .....to read on please visit my site."
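
As a minimal sketch of two points from the article added in this commit (the lexicographical comparison pitfall and saving PIPESTATUS before it is overwritten), the following Bash snippet may help. The function names `greater_lexical`, `greater_numeric` and `pipe_codes` are hypothetical and do not appear in the article, and `false | true` merely stands in for a real pipe such as the tar example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: [[ a > b ]] compares strings, not numbers.
greater_lexical () {
    local -r value="$1"; shift
    local -r limit="$1"; shift
    [[ "$value" > "$limit" ]]
}

# Hypothetical helper: (( a > b )) performs integer arithmetic.
greater_numeric () {
    local -ri value="$1"; shift
    local -ri limit="$1"; shift
    (( value > limit ))
}

# Hypothetical helper: prints the exit status of every stage of a pipe.
pipe_codes () {
    # Declare the array first, because 'local' is itself a command
    # and would overwrite PIPESTATUS if it ran after the pipe.
    local -a codes
    # Stand-in pipe: the first stage fails, the second succeeds.
    false | true
    # Copy PIPESTATUS immediately; the next command resets it.
    codes=( "${PIPESTATUS[@]}" )
    echo "${codes[*]}"
}
```

Called as `greater_lexical 22 3`, the first helper returns a non-zero status because the string "22" sorts before "3", whereas `greater_numeric 22 3` succeeds. `pipe_codes` prints `1 0`, one status per pipe stage.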
