author: Luke Shumaker <LukeShu@sbcglobal.net> 2013-10-12 13:47:42 -0400
committer: Luke Shumaker <LukeShu@sbcglobal.net> 2013-10-12 13:47:42 -0400
commit: 6a42c8de66e3b2dc7293ddeadaa3ee396db2624d
tree: 67a027b892d3122662526504dd6d11e8dea02ca1 /public/bash-arrays.md
initial commit
Diffstat (limited to 'public/bash-arrays.md')
-rw-r--r-- public/bash-arrays.md 201
1 files changed, 201 insertions, 0 deletions
diff --git a/public/bash-arrays.md b/public/bash-arrays.md
new file mode 100644
index 0000000..697d018
--- /dev/null
+++ b/public/bash-arrays.md
@@ -0,0 +1,201 @@
+Bash arrays
+===========
+:copyright 2013 Luke Shumaker
+
+Way too many people don't understand Bash arrays. Many of them argue
+that if you need arrays, you shouldn't be using Bash. If we reject
+the notion that one should never use Bash for scripting, then thinking
+you don't need Bash arrays is what I like to call "wrong".
+
+The simple explanation of why everybody who programs in Bash needs to
+understand arrays is this: command line arguments are exposed as an
+array. Does your script take any arguments on the command line?
+Great, you need to work with an array!
+
+General array syntax
+--------------------
+
+The most important things to understand about arrays are to quote
+them, and to know the difference between `@` and `*`.
+
+<table>
+ <caption>
+ <h1>Getting the entire array</h1>
+ <p>There is <em>no</em> valid reason to not wrap these in double
+ quotes.</p>
+ </caption>
+ <tbody>
+ <tr>
+ <td><code>"${array[@]}"</code></td>
+ <td>Returns every element of the array as a separate token.</td>
+ </tr><tr>
+ <td><code>"${array[*]}"</code></td>
+ <td>Returns every element of the array in a single
+ whitespace-separated string (strictly, joined by the first
+ character of <code>IFS</code>, which is a space by default).</td>
+ </tr>
+ </tbody>
+</table>
+
+It's really that simple—that covers most usages of arrays, and most of
+the mistakes made with them.
+
+To help you understand the difference between `@` and `*`, here is a
+sample.
+
+<pre><code>#!/bin/bash
+array=(foo bar baz)
+for item in "${array[@]}"; do
+ echo " - &lt;${item}&gt;"
+done<hr> - &lt;foo&gt;
+ - &lt;bar&gt;
+ - &lt;baz&gt;</code></pre>
+
+<pre><code>#!/bin/bash
+array=(foo bar baz)
+for item in "${array[*]}"; do
+ echo " - &lt;${item}&gt;"
+done<hr> - &lt;foo bar baz&gt;</code></pre>
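The same difference shows up when an element contains a space, which is also where forgetting the quotes starts to hurt. Here is a minimal sketch; the file names and the `count_args` helper are made up for illustration and are not part of any real script.

```shell
#!/bin/bash
# Hypothetical file names; the second one contains a space on purpose.
files=("notes.txt" "my report.txt")

# count_args is a helper invented for this sketch: it prints how many
# arguments it received.
count_args() { echo "$#"; }

quoted_at=$(count_args "${files[@]}")    # one argument per element
quoted_star=$(count_args "${files[*]}")  # all elements joined into one argument
unquoted=$(count_args ${files[@]})       # word splitting breaks "my report.txt" in two

echo "quoted @: ${quoted_at}, quoted *: ${quoted_star}, unquoted: ${unquoted}"
```

With the default `IFS`, this should print `quoted @: 2, quoted *: 1, unquoted: 3`. Only the quoted `@` form hands each element over intact.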
+
+To get individual entries, the syntax is
+<code>${array[<var>n</var>]}</code>, where <var>n</var> starts at 0.
+
+<table>
+ <caption>
+ <h1>Getting a single entry from the array</h1>
+ </caption>
+ <tbody>
+ <tr>
+ <td><code>"${array[<var>n</var>]}"</code></td>
+ <td>Returns the <var>n</var>th entry of the array, where the
+ first entry is at <var>n</var>=0.</td>
+ </tr>
+ </tbody>
+</table>
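As a quick sketch of indexing (the variable names `first`, `third`, and `bare` are only for illustration), note what happens when the index is left off entirely:

```shell
#!/bin/bash
array=(foo bar baz)

first="${array[0]}"   # indices start at 0
third="${array[2]}"
# A bare "$array" (no index) is the same as "${array[0]}", not the whole array.
bare="$array"

echo "first=${first} third=${third} bare=${bare}"
```

This should print `first=foo third=baz bare=foo`. A bare `$array` quietly giving you only the first entry is a common source of confusion.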
+
+To get a subset of the array, there are a few options (as usual, use
+`@` to get the subset as separate items, or `*` to get it as a single
+whitespace-separated string):
+
+<table>
+ <caption>
+ <h1>Getting subsets of an array</h1>
+ <p>Substitute <code>*</code> for <code>@</code> to get the subset
+ as a whitespace-separated string instead of separate tokens, as
+ described above.</p>
+ <p>Again, there is no valid reason to not wrap each of these in
+ double quotes.</p>
+ </caption>
+ <tbody>
+ <tr>
+ <td><code>"${array[@]:<var>start</var>}"</code></td>
+ <td>Returns from <var>n</var>=<var>start</var> to the end of the array.</td>
+ </tr><tr>
+ <td><code>"${array[@]:<var>start</var>:<var>count</var>}"</code></td>
+ <td>Returns <var>count</var> entries, starting at <var>n</var>=<var>start</var>.</td>
+ </tr><tr>
+ <td><code>"${array[@]::<var>count</var>}"</code></td>
+ <td>Returns <var>count</var> entries from the beginning of the array.</td>
+ </tr>
+ </tbody>
+</table>
+
+Notice that `"${array[@]}"` is equivalent to `"${array[@]:0}"`.
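The three subset forms can be tried out with a small sketch like the following. The array contents and the variable names are arbitrary choices for the example.

```shell
#!/bin/bash
array=(a b c d e)

from_two=("${array[@]:2}")      # from n=2 to the end: c d e
middle=("${array[@]:1:3}")      # 3 entries starting at n=1: b c d
first_two=("${array[@]::2}")    # first 2 entries: a b
joined="${array[*]:1:3}"        # same subset as "middle", as one string

echo "${#from_two[@]} ${#middle[@]} ${#first_two[@]} <${joined}>"
```

This should print `3 3 2 <b c d>`. Wrapping the `@` forms in `( )` stores each subset as a new array.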
+
+<table>
+ <caption>
+ <h1>Getting the length of an array</h1>
+ <p>This is the only situation where there is no difference
+ between <code>@</code> and <code>*</code>.</p>
+ </caption>
+ <tbody>
+ <tr>
+ <td>
+ <code>${#array[@]}</code>
+ <br>or<br>
+ <code>${#array[*]}</code>
+ </td>
+ <td>
+ Returns the length of the array
+ </td>
+ </tr>
+ </tbody>
+</table>
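A short sketch of the length syntax, with made-up array contents. It also shows a nearby trap: leaving out the `[@]` gives you something else entirely.

```shell
#!/bin/bash
array=(hello "bar baz" qux)

count_at=${#array[@]}      # number of elements
count_star=${#array[*]}    # same value
# Without [@] or [*], ${#array} is the string length of element 0 ("hello").
len_first=${#array}

echo "${count_at} ${count_star} ${len_first}"
```

This should print `3 3 5`. The element `"bar baz"` counts once, because the length is counted in elements, not words.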
+
+Accessing the arguments array
+-----------------------------
+
+Accessing the arguments is mostly that simple, but that array doesn't
+actually have a variable name. It's special. Instead, it is exposed
+through a series of special variables (normal variable names can only
+start with a letter or an underscore) that *mostly* match up with the normal
+array syntax.
+
+<table>
+ <caption>
+ <h1>Accessing the arguments array</h1>
+ <aside>Note that for values of <var>n</var> with more than 1
+ digit, you need to wrap it in <code>{}</code>.
+ Otherwise, <code>"$10"</code> would be parsed
+ as <code>"${1}0"</code>.</aside>
+ </caption>
+ <tbody>
+ <tr><th colspan=2>Individual entries</th></tr>
+ <tr><td><code>${array[0]}</code></td><td><code>$0</code></td></tr>
+ <tr><td><code>${array[1]}</code></td><td><code>$1</code></td></tr>
+ <tr><td colspan=2 style="text-align:center">...</td></tr>
+ <tr><td><code>${array[9]}</code></td><td><code>$9</code></td></tr>
+ <tr><td><code>${array[10]}</code></td><td><code>${10}</code></td></tr>
+ <tr><td colspan=2 style="text-align:center">...</td></tr>
+ <tr><td><code>${array[<var>n</var>]}</code></td><td><code>${<var>n</var>}</code></td></tr>
+ <tr><th colspan=2>Subset arrays (array)</th></tr>
+ <tr><td><code>"${array[@]}"</code></td><td><code>"${@:0}"</code></td></tr>
+ <tr><td><code>"${array[@]:1}"</code></td><td><code>"$@"</code></td></tr>
+ <tr><td><code>"${array[@]:<var>pos</var>}"</code></td><td><code>"${@:<var>pos</var>}"</code></td></tr>
+ <tr><td><code>"${array[@]:<var>pos</var>:<var>len</var>}"</code></td><td><code>"${@:<var>pos</var>:<var>len</var>}"</code></td></tr>
+ <tr><td><code>"${array[@]::<var>len</var>}"</code></td><td><code>"${@::<var>len</var>}"</code></td></tr>
+ <tr><th colspan=2>Subset arrays (string)</th></tr>
+ <tr><td><code>"${array[*]}"</code></td><td><code>"${*:0}"</code></td></tr>
+ <tr><td><code>"${array[*]:1}"</code></td><td><code>"$*"</code></td></tr>
+ <tr><td><code>"${array[*]:<var>pos</var>}"</code></td><td><code>"${*:<var>pos</var>}"</code></td></tr>
+ <tr><td><code>"${array[*]:<var>pos</var>:<var>len</var>}"</code></td><td><code>"${*:<var>pos</var>:<var>len</var>}"</code></td></tr>
+ <tr><td><code>"${array[*]::<var>len</var>}"</code></td><td><code>"${*::<var>len</var>}"</code></td></tr>
+ <tr><th colspan=2>Array length</th></tr>
+ <tr><td><code>${#array[@]}</code></td><td><code>$#</code> + 1</td></tr>
+ </tbody>
+</table>
+
+Did you notice what was inconsistent? The variables `$*`, `$@`, and `$#`
+behave like the <var>n</var>=0 entry doesn't exist.
+
+<table>
+ <caption>
+ <h1>Inconsistencies</h1>
+ </caption>
+ <tbody>
+ <tr>
+ <th colspan=3><code>@</code> or <code>*</code></th>
+ </tr><tr>
+ <td><code>"${array[@]}"</code></td>
+ <td>→</td>
+ <td><code>"${array[@]:0}"</code></td>
+ </tr><tr>
+ <td><code>"${@}"</code></td>
+ <td>→</td>
+ <td><code>"${@:1}"</code></td>
+ </tr><tr>
+ <th colspan=3><code>#</code></th>
+ </tr><tr>
+ <td><code>"${#array[@]}"</code></td>
+ <td>→</td>
+ <td>length</td>
+ </tr><tr>
+ <td><code>"${#}"</code></td>
+ <td>→</td>
+ <td>length-1</td>
+ </tr>
+ </tbody>
+</table>
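The off-by-one can be observed directly. The sketch below uses a hypothetical `show_args` function. Inside a function, `"$@"` and `$#` refer to the function's own arguments, while `$0` is still the name of the script, so the same inconsistency applies.

```shell
#!/bin/bash
# show_args is a helper invented for this sketch.
show_args() {
	local with_zero=("${@:0}")   # includes $0
	local without_zero=("$@")    # starts at $1
	echo "\$#=$# with_zero=${#with_zero[@]} without_zero=${#without_zero[@]}"
}

result=$(show_args one two three)
echo "${result}"
```

This should print `$#=3 with_zero=4 without_zero=3`. Only the explicit `:0` offset brings entry 0 back.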
+
+These make sense because argument 0 is the name of the script, and we
+almost never want that when parsing arguments. If `$@` and `$#` did
+include it, you'd spend more code getting the values that they
+currently give you.
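
Here is a minimal sketch of that convenience in practice. The `print_args` helper and the argument values are assumptions for illustration. `set --` replaces the positional parameters, standing in for real command line arguments.

```shell
#!/bin/bash
# print_args is a hypothetical helper that brackets each argument so
# that embedded spaces stay visible.
print_args() {
	local arg
	for arg in "$@"; do
		echo " - <${arg}>"
	done
}

# Stand-in for real command line arguments.
set -- --verbose "file name.txt" out.txt
print_args "$@"
```

The loop never has to skip over the script name, and `"file name.txt"` arrives as a single argument because `"$@"` is quoted.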