Preface
Goal: Using the Bash IFS for string processing.
Shell scripting is definitely not the place for data structures. For complex data structures, you should use a real programming language rather than the shell.
For the impatient: you can extract JSON using jq in bash, as covered in this shell article series. That is a better approach than using pure bash.
However, if you wish for a pure shell script, you can utilize IFS in bash. IFS itself stands for Internal Field Separator.
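As a minimal sketch of what IFS does (the variable names `parts` and `joined` are only illustrative and not part of the article's scripts), the same variable controls both how `read` splits a string and how `"${array[*]}"` joins one:

```shell
#!/usr/bin/env bash
# Split: read breaks the input on the characters held in IFS.
IFS=',' read -r -a parts <<< "60s,jazz"
echo "number of parts: ${#parts[@]}"

# Join: "${parts[*]}" glues the elements with the first character of IFS.
# The subshell keeps the IFS change from leaking into the rest of the script.
joined=$(IFS=':'; echo "${parts[*]}")
echo "joined: ${joined}"
```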
Shell Article Series
This shell article series consists of six parts.
- Bash, with help from IFS.
- Fish Shell.
- Ion Shell (for Redox OS).
- JQ in Bash.
- AWK.
- Sed.
Personal Motivation
Raison d’être
I learned Bash, awk, and sed, along with Python and Perl, in 2005 when I was young.
I really loved those golden times, and I miss that stuff.
But I have never had a chance to read over the documentation again.
I don’t know if I will use sed or awk, or even any regular expression, ever again.
So I decided to give each one an article, a short article.
This won’t be about the long scripts I wrote when I was young.
This article is just a self-reminder.
A monument to the fact that I was good with this stuff at an early age.
Reference Reading
Source Examples
You can obtain source examples here:
Common Use Case
Task: Get the unique tag string
Please read the overview for more detail.
Data Structure Support
bash does not support multidimensional arrays. It doesn’t support arrays within arrays either.
So how are we supposed to handle data structures?
The answer is: avoid bash for data structures.
However, for learning purposes, we can explore what bash can do with CSV-like data.
Prepopulated Data
Songs and Poetry
The data is simply an array of lines. Each line holds a title and a tag list separated by a semicolon, and the tags themselves are separated by commas. A record may have no tags at all.
declare -a songs=(
"Cantaloupe Island;60s,jazz"
"Let It Be;60s,rock"
"Knockin' on Heaven's Door;70s,rock"
"Emotion;70s,pop"
"The River"
)
BASH Solution
The Answer
We are going to utilize IFS (Internal Field Separator) to extract variables in bash.
Note that the fish and ion shells take a different approach.
source ./MySongs.sh
source ./MyHelperFlatten.sh
# Extract flatten output to array
# (quote the expansion so a title with spaces stays in one argument)
tags_fl=$(flatten "${songs[@]}")
IFS=' '; read -a tags_flatten <<< "${tags_fl[@]}"
declare -A tags_unique
for tag_item in "${tags_flatten[@]}"; do
let tags_unique[$tag_item]++
done
tags=(${!tags_unique[@]})
IFS=':'
echo "${tags[*]}"
Enough introduction. At this point we should go straight to coding.
Environment
No special setup is needed. Just run and voilà!
1: Array in BASH
We are going to check how far an associative array in bash can handle data structures.
Simple Array
Let's begin with a simple array.
declare -a tags=(
"rock" "jazz" "rock" "pop" "pop")
echo ${tags[@]}
With the output below:
❯ ./01-tags.sh
rock jazz rock pop pop
No problem so far.
Associative Array
Now consider this form.
declare -A song
song['title']='Cantaloupe Island'
song['tags', 0]='60s'
song['tags', 1]='jazz'
echo ${song[@]}
echo ${song['title']}
echo ${song['tags', 0]}
With the output below:
❯ ./02-declare.sh
60s jazz Cantaloupe Island
Cantaloupe Island
60s
We still have no problem with the code above. But unfortunately we cannot go further than this: the subscript 'tags', 0 is stored as one flat string key, not as a nested index.
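To illustrate how far flat keys can be pushed, here is a small sketch that emulates a nested list with composite keys. The `tags,N` key convention and the `tag_count` variable are assumptions made for this example, not something bash provides.

```shell
#!/usr/bin/env bash
# Emulate song.tags[0] and song.tags[1] with flat composite keys.
declare -A song
song[title]='Cantaloupe Island'
song[tags,0]='60s'
song[tags,1]='jazz'

# Walk every key and pick the ones that start with "tags,".
# Note: the iteration order of associative array keys is not guaranteed.
tag_count=0
for key in "${!song[@]}"; do
  if [[ $key == tags,* ]]; then
    tag_count=$((tag_count + 1))
    echo "${key} => ${song[$key]}"
  fi
done
echo "number of tags: ${tag_count}"
```

This works, but every lookup has to build and parse key strings by hand, which is why the article moves on to plain string records.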
Using IFS
With simple data like the above, we can test IFS (Internal Field Separator), see it in action, and examine how it works.
declare -A song=(
['title']='Cantaloupe Island'
['tags']='60s|jazz'
)
echo ${#song[@]}
echo ${song['title']}
echo ${song['tags']}
# Internal Field Separator
IFS='|'
# -a stores the split fields in an array
read -a tagss <<< "${song['tags']}"
echo ${tagss[@]}
With a result similar to the record below:
❯ ./02-split.sh
2
Cantaloupe Island
60s|jazz
60s jazz
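Setting IFS on its own line changes it for the rest of the script. As a hedged alternative, the sketch below puts the assignment in front of `read`, so it applies to that one command only. The names `tags_field`, `tag_list`, and `ifs_before` are made up for this example.

```shell
#!/usr/bin/env bash
tags_field='60s|jazz'

# Remember the current IFS so we can confirm it is left untouched.
ifs_before=$IFS

# The assignment prefix applies to this read command only.
IFS='|' read -r -a tag_list <<< "$tags_field"

echo "count:  ${#tag_list[@]}"
echo "first:  ${tag_list[0]}"
echo "second: ${tag_list[1]}"

if [[ $IFS == "$ifs_before" ]]; then
  echo "IFS is unchanged"
fi
```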
Not Using Array
Since bash's capability is rather limited, we should change our strategy. The data structure must not rely on complex arrays.
declare -a songs=(
"Cantaloupe Island;60s,jazz"
"Let It Be;60s,rock"
"Knockin' on Heaven's Door;70s,rock"
"Emotion;70s,pop"
"The River"
)
# "${songs[*]}" joins the elements with the first character of IFS
IFS=$'\n'
echo "${songs[*]}"
With a result similar to the record below:
❯ ./03-songs.sh
Cantaloupe Island;60s,jazz
Let It Be;60s,rock
Knockin' on Heaven's Door;70s,rock
Emotion;70s,pop
The River
No complex structure is needed.
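If the only goal is to print one element per line, a sketch without touching IFS at all is also possible, using `printf`. The `listing` variable is introduced here just to hold the output, and the array is shortened for brevity.

```shell
#!/usr/bin/env bash
declare -a songs=(
  "Cantaloupe Island;60s,jazz"
  "Let It Be;60s,rock"
  "The River"
)

# printf reuses its format for every argument,
# so each array element ends up on its own line.
listing=$(printf '%s\n' "${songs[@]}")
echo "$listing"
```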
2: Separating Module
Since we need to reuse the songs record multiple times, it is a good idea to separate the record structure from logic.
Songs Module
As I already said, we are going to use string processing. The data structure is simple: just lines of strings, with fields separated by a character such as a semicolon or a comma.
The code can be shown as below:
declare -a songs=(
"Cantaloupe Island;60s,jazz"
"Let It Be;60s,rock"
"Knockin' on Heaven's Door;70s,rock"
"Emotion;70s,pop"
"The River"
)
Using Songs Module
Now the code can be very short, with exactly the same result as the code above:
source ./MySongs.sh
# "${songs[*]}" joins the elements with the first character of IFS
IFS=$'\n'
echo "${songs[*]}"
Or alternatively, we can alter the output, using IFS and loop.
source ./MySongs.sh
# Internal Field Separator
IFS=';'
for song in "${songs[@]}"
do
read title tags <<< "${song}"
echo "${title} is [${tags}]"
done
unset IFS
With the result as the lines of strings below.
❯ ./04-module-b.sh
Cantaloupe Island is [60s,jazz]
Let It Be is [60s,rock]
Knockin' on Heaven's Door is [70s,rock]
Emotion is [70s,pop]
The River is []
Notice how the variables are assigned through the read command.
read title tags <<< "${song}"
This read command splits the line into variables, based on the IFS value.
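One detail worth seeing in isolation: when a line has more fields than there are variables, `read` puts the whole remainder, separators included, into the last variable. The sketch below uses a hypothetical record with an extra third field; the variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical record with one more field than we have variables.
line="Emotion;70s,pop;extra"

IFS=';' read -r title rest <<< "$line"
echo "title: ${title}"
echo "rest:  ${rest}"     # the remainder keeps its inner semicolon

# A record without any separator leaves the second variable empty.
IFS=';' read -r title2 rest2 <<< "The River"
echo "title2: ${title2}"
echo "rest2 is empty: ${rest2:-yes}"
```

This is why "The River" prints an empty tag list in the output above.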
3: Finishing The Task
Extract, Flatten, Unique
Extracting Data Structure
Based on the result above, we can go further and extract the tags data.
source ./MySongs.sh
declare -a tagss
# Split song record
IFS=';'
for song_item in "${songs[@]}"; do
read title tags_temp <<< "${song_item}"
tagss+=($tags_temp)
done
# Join tags list
IFS=':'
echo "${tagss[*]}"
unset IFS
With the result, an array of tag lists (our stand-in for an array of arrays), as shown below.
❯ ./05-extract.sh
60s,jazz:60s,rock:70s,rock:70s,pop
Notice how we populate the tagss array variable, by appending a value in each loop iteration.
tagss+=($tags_temp)
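The unquoted append keeps each tag list in one piece only because IFS is a semicolon at that point, so the commas are not treated as separators. As a small sketch of what happens under the default IFS (the `quoted` and `unquoted` arrays are illustrative names):

```shell
#!/usr/bin/env bash
unset IFS   # make sure the default separators (space, tab, newline) apply
declare -a quoted=() unquoted=()
value="Let It Be"

# Unquoted: the expansion is word-split, giving three elements.
unquoted+=($value)

# Quoted: the value is appended as a single element.
quoted+=("$value")

echo "unquoted: ${#unquoted[@]} elements"
echo "quoted:   ${#quoted[@]} element"
```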
Flatten
Now we can normalize the separated values by flattening them all into a single array, using two loops, one after another.
source ./MySongs.sh
declare -a tagss
declare -a tags
# Split song record
IFS=';'
for song_item in "${songs[@]}"; do
read title tags_temp <<< "${song_item}"
tagss+=($tags_temp)
done
# Split tags list
IFS=','
for tags_item in "${tagss[@]}"; do
read -a tag_temp <<< "${tags_item}"
tags+=(${tag_temp[@]})
done
# Join tags list
IFS=':'
echo "${tags[*]}"
unset IFS
With the result, a flattened array, shown below.
❯ ./06-flatten-a.sh
60s:jazz:60s:rock:70s:rock:70s:pop
I know it is a little bit long, and this long code looks inefficient. I have to do something about it.
Flatten Rewrite
We can rewrite the statements above, using a loop in a loop, to make the code shorter.
source ./MySongs.sh
# Split song record
IFS=';'
for song_item in "${songs[@]}"; do
read title tags_temp <<< "${song_item}"
# Split tags list
for tags_item in "${tags_temp[@]}"; do
IFS=',' read -a tag_temp <<< "${tags_item}"
tags+=(${tag_temp[@]})
done
done
# Join tags list
IFS=':'
echo "${tags[*]}"
With exactly the same result as the previous flattened array.
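As a further variation, not the approach used in this article, a single `read` can split on both separators at once by putting them both in IFS. This sketch assumes that titles never contain a comma or a semicolon; the `fields` and `joined` names are illustrative.

```shell
#!/usr/bin/env bash
declare -a songs=(
  "Cantaloupe Island;60s,jazz"
  "Let It Be;60s,rock"
  "Knockin' on Heaven's Door;70s,rock"
  "Emotion;70s,pop"
  "The River"
)
declare -a tags=()

for song_item in "${songs[@]}"; do
  # Split on ';' and ',' in one go: field 0 is the title, the rest are tags.
  IFS=';,' read -r -a fields <<< "$song_item"
  # The slice is empty for a record without tags, so nothing is appended.
  tags+=("${fields[@]:1}")
done

# Join with ':' inside a subshell so the global IFS stays untouched.
joined=$(IFS=':'; echo "${tags[*]}")
echo "$joined"
```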
Flatten Function
Since we are going to reuse the flatten approach above in other code, it is better to bundle the script in its own module.
function flatten() {
local l_songs=("$@")
# Split song record
IFS=';'
for song_item in "${l_songs[@]}"; do
read title tags_temp <<< "${song_item}"
# Split tags list
for tags_item in "${tags_temp[@]}"; do
IFS=',' read -a tag_temp <<< "${tags_item}"
tags_flatten+=(${tag_temp[@]})
done
done
unset IFS
echo "${tags_flatten[@]}"
}
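Beware of the return values. Returning an array in bash is a little bit tricky: a function can only print text, so here the elements are echoed as one space-separated string that the caller has to split again.
Now we can use the code above in other scripts.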
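A space-separated string breaks down as soon as an element contains a space. One common workaround, sketched here under the assumption of bash 4.0 or later, is to print one element per line and collect the lines with `mapfile`. The `list_titles` helper is hypothetical and not part of the article's modules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the title of each record, one per line.
list_titles() {
  local record
  for record in "$@"; do
    # Strip everything from the first ';' onward; a record
    # without ';' is printed unchanged.
    printf '%s\n' "${record%%;*}"
  done
}

# mapfile -t reads each output line into one array element,
# so titles containing spaces survive intact.
mapfile -t titles < <(list_titles "Let It Be;60s,rock" "The River")

echo "got ${#titles[@]} titles"
echo "first: ${titles[0]}"
```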
Unique
To get the unique values of an array, we can just use the sort -u command, or sort piped into uniq (uniq on its own only removes adjacent duplicates).
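For the external-command route, a minimal sketch could look like this. The tag list is typed in by hand for the example, and `LC_ALL=C` is set only to make the sort order predictable.

```shell
#!/usr/bin/env bash
declare -a tags=(60s jazz 60s rock 70s rock 70s pop)

# Feed one tag per line into sort -u, which sorts and drops duplicates,
# then read the surviving lines back into an array (bash 4.0+).
mapfile -t unique_tags < <(printf '%s\n' "${tags[@]}" | LC_ALL=C sort -u)

joined=$(IFS=':'; echo "${unique_tags[*]}")
echo "$joined"   # 60s:70s:jazz:pop:rock
```

Note that the result comes back sorted, so the original order of the tags is lost.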
But if you want the code to be pure bash, then we can examine the link from Bash Wizard below:
Then we can adapt the example code above into our code below:
source ./MySongs.sh
source ./MyHelperFlatten.sh
# Extract flatten output to array
# (quote the expansion so a title with spaces stays in one argument)
tags_fl=$(flatten "${songs[@]}")
IFS=' '; read -a tags_flatten <<< "${tags_fl[@]}"
# https://bashwizard.com/displaying-unique-values/
declare -A tags_unique
for tag_item in "${tags_flatten[@]}"; do
let tags_unique[$tag_item]++
done
tags=(${!tags_unique[@]})
IFS=':'
echo "${tags[*]}"
With a result similar to the array below (the order of associative array keys is not guaranteed):
❯ ./07-unique.sh
rock:70s:pop:jazz:60s
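If the first-seen order of the tags matters, a hedged variation is to use the associative array only as a lookup table and keep a second, ordinary array for the result. The `seen` and `ordered` names are introduced for this sketch.

```shell
#!/usr/bin/env bash
declare -a tags=(60s jazz 60s rock 70s rock 70s pop)
declare -A seen          # associative arrays need bash 4.0+
declare -a ordered=()

for tag in "${tags[@]}"; do
  # Append a tag only the first time it shows up.
  if [[ -z ${seen[$tag]:-} ]]; then
    seen[$tag]=1
    ordered+=("$tag")
  fi
done

joined=$(IFS=':'; echo "${ordered[*]}")
echo "$joined"   # 60s:jazz:rock:70s:pop
```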
It is not that hard, is it 🤔?
What is Next 🤔?
Consider continuing with [ Bash IFS - Playing with Records - Part Two ].