Article Series

This article series discusses more than 30 different programming languages. Please read the overview before you read any of the details.

Preface

Goal: Utilizing BASH IFS for string processing.

Shell scripting is definitely not the place for data structures. For complex data structures, you should use a real programming language rather than a shell.

For the impatient: you can extract JSON using jq in bash, as covered in this shell article series. That is a better approach than using pure bash.

However, if you wish for a pure shell script, you can utilize IFS in bash. IFS stands for Internal Field Separator.

Shell Article Series

This shell article series consists of six parts.

  1. Bash, with help from IFS.
  2. Fish Shell.
  3. Ion Shell (for Redox OS).
  4. JQ in Bash.
  5. AWK.
  6. Sed.

Personal Motivation

Raison d’être

I learned Bash, awk, and sed, along with Python and Perl, in 2005 when I was young. I really love those golden times, and I miss that stuff. But I never had a chance to read over the documentation again.

I don’t know if I will ever use sed or awk, or even any regular expression, again. So I decided to give each an article, a short article. This won’t be about the long scripts I wrote when I was young. This article is just a self reminder.

A monument that I was good with that stuff at an early age.

Reference Reading

Source Examples

You can obtain source examples here:


Common Use Case

Task: Get the unique tag string

Please read the overview for more detail.

Data Structure Support

Bash does not support multidimensional arrays, and it does not support arrays inside arrays. So how are we supposed to handle data structures?

The answer is: avoid bash for data structures.

However, for learning purposes, we can explore what bash is capable of with CSV-like data.

Prepopulated Data

Songs and Poetry

The data is simply an array of strings, one record per element. Within each record, the title is separated from the tags by a semicolon, and the tags are separated from each other by commas.

declare -a songs=(
  "Cantaloupe Island;60s,jazz"
  "Let It Be;60s,rock"
  "Knockin' on Heaven's Door;70s,rock"
  "Emotion;70s,pop"
  "The River"
)

BASH Solution

The Answer

We are going to utilize IFS (Internal Field Separator) to extract variables in bash. Note that the fish and ion shells take a different approach.

source ./MySongs.sh
source ./MyHelperFlatten.sh

# Extract flatten output to array
# (quote the expansion so each record is passed as one argument)
tags_fl=$(flatten "${songs[@]}")
IFS=' ' read -r -a tags_flatten <<< "${tags_fl}"

declare -A tags_unique

for tag_item in "${tags_flatten[@]}"; do
  let tags_unique[$tag_item]++
done

tags=("${!tags_unique[@]}")

IFS=':'
echo "${tags[*]}"

Enough with the introduction. At this point we should go straight to coding.

Environment

No special setup is needed. Just run and voila..!


1: Array in BASH

We are going to check how far an associative array in bash can handle a data structure.

Simple Array

Consider beginning with a simple array.

declare -a tags=(
  "rock" "jazz" "rock" "pop" "pop")

echo ${tags[@]}

With the output as below:

❯ ./01-tags.sh
rock jazz rock pop pop

BASH: Simple Array

No problem so far.

Associative Array

Now consider this form.

declare -A song
song['title']='Cantaloupe Island'
song['tags', 0]='60s'
song['tags', 1]='jazz'

echo ${song[@]}
echo ${song['title']}
echo ${song['tags', 0]}

With the output as below:

❯ ./02-declare.sh
60s jazz Cantaloupe Island
Cantaloupe Island
60s

BASH: Output associative array

We still have no problem with the code above. But unfortunately we cannot go further than this: every key is a flat string, so there is no real nesting.
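To see why this is a dead end, it may help to list the keys. The following is a minimal sketch, assuming bash 4 or later. The key spelling `tags,0` is chosen here only for illustration and is not taken from the article's source files.

```bash
#!/usr/bin/env bash
declare -A song
song['title']='Cantaloupe Island'
song['tags,0']='60s'
song['tags,1']='jazz'

# List every key: each one is a plain string,
# there is no nested "tags" array to loop over.
for key in "${!song[@]}"; do
  printf '%s => %s\n' "$key" "${song[$key]}"
done

# Three flat entries, not one title plus one tags list.
key_count=${#song[@]}
echo "$key_count"
```

The order in which the keys are printed is not guaranteed, which is one more reason this form is awkward for records.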

Using IFS

With simple data like the above, we can test IFS (Internal Field Separator), see it in action, and examine how it works.

declare -A song=(
  ['title']='Cantaloupe Island'
  ['tags']='60s|jazz'
)

echo ${#song[@]}
echo ${song['title']}
echo ${song['tags']}

# Internal Field Separator
IFS='|'
# -a makes read split the line into an array
read -r -a tagss <<< "${song['tags']}"

echo "${tagss[@]}"

With a result similar to the record below:

❯ ./02-split.sh
2
Cantaloupe Island
60s|jazz
60s jazz

BASH: Using IFS
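A common variation is to set IFS only for the single read command, so the rest of the script keeps its default separator. This is a minimal sketch of that idea. The variable names `tags_raw`, `parts` and `saved_ifs` are made up for this example.

```bash
#!/usr/bin/env bash
tags_raw='60s|jazz'
saved_ifs=$IFS

# IFS is set only for this one read command.
# The -a flag fills an array, -r keeps backslashes literal.
IFS='|' read -r -a parts <<< "$tags_raw"

echo "${#parts[@]}"
echo "${parts[0]}"
echo "${parts[1]}"
```

After the read command, IFS still holds whatever value it had before, so there is nothing to unset.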

Not Using Array

Since bash's capability is rather limited, we should change our strategy. The data structure must not use a complex array; plain delimited strings will do.

declare -a songs=(
  "Cantaloupe Island;60s,jazz"
  "Let It Be;60s,rock"
  "Knockin' on Heaven's Door;70s,rock"
  "Emotion;70s,pop"
  "The River"
)

# "${songs[*]}" joins the elements with the first character of IFS
IFS=$'\n'
echo "${songs[*]}"

With a result similar to the record below:

❯ ./03-songs.sh
Cantaloupe Island;60s,jazz
Let It Be;60s,rock
Knockin' on Heaven's Door;70s,rock
Emotion;70s,pop
The River

BASH: Using no Array

No complex structure is needed.
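The listing above relies on the fact that "${array[*]}" joins the elements with the first character of IFS. A minimal sketch of that rule as a reusable function follows. Note that `join_by` is a hypothetical helper written for this example, not part of the article's source files.

```bash
#!/usr/bin/env bash
# join_by is a hypothetical helper, not part of the article's sources.
join_by() {
  local IFS="$1"   # only the first character of IFS is used for joining
  shift
  printf '%s\n' "$*"
}

declare -a tags=("60s" "jazz" "rock")
joined=$(join_by ':' "${tags[@]}")
echo "$joined"
```

Because IFS is declared local, the change disappears when the function returns.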


2: Separating Module

Since we need to reuse the songs record multiple times, it is a good idea to separate the record structure from the logic.

Songs Module

As I already said, we are going to use string processing. The data structure is simple: just lines of strings, with each field separated by a character such as a semicolon or a comma.

The code can be shown as below:

declare -a songs=(
  "Cantaloupe Island;60s,jazz"
  "Let It Be;60s,rock"
  "Knockin' on Heaven's Door;70s,rock"
  "Emotion;70s,pop"
  "The River"
)

BASH: The Songs Module Containing List of Record

Using Songs Module

Now the code can be very short, with exactly the same result as the code above:

source ./MySongs.sh

# "${songs[*]}" joins the elements with the first character of IFS
IFS=$'\n'
echo "${songs[*]}"

Alternatively, we can alter the output using IFS and a loop.

source ./MySongs.sh

# Internal Field Separator
IFS=';'

for song in "${songs[@]}"
do
  read title tags <<< "${song}"
  echo "${title} is [${tags}]"
done

unset IFS

With the result as the lines below.

BASH: Using Songs Module

❯ ./04-module-b.sh
Cantaloupe Island is [60s,jazz]
Let It Be is [60s,rock]
Knockin' on Heaven's Door is [70s,rock]
Emotion is [70s,pop]
The River is []

Notice how the variables are assigned through the read command.

read title tags <<< "${song}"

This read command splits the line into variables based on the IFS value.
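Two details of this splitting are worth a small experiment. When there are more fields than variables, the last variable receives the remainder of the line. When there are fewer, the extra variables are left empty. This is a minimal sketch with made-up input strings.

```bash
#!/usr/bin/env bash
# Three fields, two variables: the last variable gets the remainder.
IFS=';' read -r first rest <<< "a;b;c"
echo "first=${first} rest=${rest}"

# One field, two variables: the second variable is left empty.
IFS=';' read -r title tags <<< "The River"
echo "title=${title} tags=[${tags}]"
```

The second case is what produces the empty brackets for The River in the output above.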


3: Finishing The Task

Extract, Flatten, Unique

Extracting Data Structure

Based on the result above, we can go further by extracting the tags data.

source ./MySongs.sh

declare -a tagss

# Split song record
IFS=';'
for song_item in "${songs[@]}"; do
  read title tags_temp <<< "${song_item}"
  tagss+=($tags_temp)
done

# Join tags list
IFS=':'
echo "${tagss[*]}"

unset IFS

The result is an array of comma-separated tag strings, joined here with colons, as shown below.

❯ ./05-extract.sh
60s,jazz:60s,rock:70s,rock:70s,pop

BASH: Extracting Comma Separated String

Notice how we populate the tagss array variable by appending a value on each iteration.

tagss+=($tags_temp)
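The parentheses in the append statement matter. The following minimal sketch, using a made-up `items` array, contrasts appending a new element with appending text to an existing one.

```bash
#!/usr/bin/env bash
declare -a items=("60s,jazz")

# With parentheses: a new element is appended to the array.
items+=("60s,rock")

# Without parentheses: the text is concatenated onto element 0.
items+="!"

echo "${#items[@]}"
echo "${items[0]}"
echo "${items[1]}"
```

Quoting the value inside the parentheses, as done here, also keeps an element with spaces in one piece.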

Flatten

Now we can normalize the separated values, flattening all of them into a single array.

We use two loops, one after another.

source ./MySongs.sh

declare -a tagss
declare -a tags

# Split song record
IFS=';'
for song_item in "${songs[@]}"; do
  read title tags_temp <<< "${song_item}"
  tagss+=($tags_temp)
done

# Split tags list
IFS=','
for tags_item in "${tagss[@]}"; do
  # -a makes read split the comma-separated string into an array
  read -r -a tag_temp <<< "${tags_item}"
  tags+=("${tag_temp[@]}")
done

# Join tags list
IFS=':'
echo "${tags[*]}"

unset IFS

With the result of a flattened array shown below.

❯ ./06-flatten-a.sh
60s:jazz:60s:rock:70s:rock:70s:pop

I know it is a little bit long. This long code looks inefficient. I have to do something about it.

Flatten Rewrite

We can rewrite the statements above, using a loop in a loop, to make the code shorter.

source ./MySongs.sh

# Split song record
IFS=';'
for song_item in "${songs[@]}"; do
  read title tags_temp <<< "${song_item}"

  # Split tags list
  for tags_item in "${tags_temp[@]}"; do
    IFS=',' read -r -a tag_temp <<< "${tags_item}"
    tags+=("${tag_temp[@]}")
  done
done

# Join tags list
IFS=':'
echo "${tags[*]}"

The result is exactly the same as the previous flattened array.

BASH: Flattening Array
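As a variation on the flattening above, IFS may hold more than one character, in which case read splits on any of them. The sketch below is one possible shortcut, not the approach used in the article's scripts. It assumes that no title contains a comma or a semicolon. The names `fields` and `joined` are made up for this example.

```bash
#!/usr/bin/env bash
declare -a songs=(
  "Cantaloupe Island;60s,jazz"
  "Let It Be;60s,rock"
  "Knockin' on Heaven's Door;70s,rock"
  "Emotion;70s,pop"
  "The River"
)

declare -a tags=()
for song_item in "${songs[@]}"; do
  # Split on either separator: field 0 is the title, the rest are tags.
  IFS=';,' read -r -a fields <<< "${song_item}"
  tags+=("${fields[@]:1}")
done

# Join in a subshell so the IFS of the main script is untouched.
joined=$(IFS=':'; echo "${tags[*]}")
echo "$joined"
```

A record without tags, such as The River, simply contributes nothing to the array.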

Flatten Function

Since we are going to reuse the flatten approach above in other code, it is better to bundle it in its own module.

function flatten() {
  local l_songs=("$@")
  local song_item tags_item title tags_temp tag_temp
  local tags_flatten=()

  # Split song record (local IFS does not leak to the caller)
  local IFS=';'
  for song_item in "${l_songs[@]}"; do
    read -r title tags_temp <<< "${song_item}"

    # Split tags list
    for tags_item in "${tags_temp[@]}"; do
      IFS=',' read -r -a tag_temp <<< "${tags_item}"
      tags_flatten+=("${tag_temp[@]}")
    done
  done

  # Print the tags as one space-separated line
  echo "${tags_flatten[@]}"
}

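Beware of the return value. Returning an array in bash is a little bit tricky: here the function prints the tags as text, and the caller has to split that output back into an array.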

BASH: The Flatten Function

Now we can use the code above in other scripts.
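On the point about return values: printing space-separated words works for tags, but it would break elements that contain spaces. One alternative is to print one element per line and read the lines back with mapfile. This is a minimal sketch, assuming bash 4 or later. The function `list_titles` is hypothetical and exists only for this illustration.

```bash
#!/usr/bin/env bash
# list_titles is a hypothetical function used only for this illustration.
list_titles() {
  # One element per line, so embedded spaces survive.
  printf '%s\n' "Let It Be" "The River"
}

# mapfile reads each line of the output into one array element.
mapfile -t titles < <(list_titles)

echo "${#titles[@]}"
echo "${titles[0]}"
echo "${titles[1]}"
```

This still assumes that no element contains a newline.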

Unique

To get the unique values of an array, we can just use sort -u, or the uniq command on sorted input.
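A minimal sketch of the external command route follows, assuming bash 4 or later for mapfile. The tag list is typed in by hand here instead of coming from the flatten function. Note that uniq on its own only removes adjacent duplicates, which is why the input has to be sorted first.

```bash
#!/usr/bin/env bash
declare -a tags=("60s" "jazz" "60s" "rock" "70s" "rock" "70s" "pop")

# Print one tag per line, let sort -u drop the duplicates,
# then read the remaining lines back into an array.
mapfile -t tags_sorted < <(printf '%s\n' "${tags[@]}" | LC_ALL=C sort -u)

unique_joined=$(IFS=':'; echo "${tags_sorted[*]}")
echo "$unique_joined"
```

The result comes back sorted, so the original order of the tags is lost.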

But if you want the code to be pure bash, we can examine the Bash Wizard article linked in the code comment below, and adapt its example into our code:

source ./MySongs.sh
source ./MyHelperFlatten.sh

# Extract flatten output to array
# (quote the expansion so each record is passed as one argument)
tags_fl=$(flatten "${songs[@]}")
IFS=' ' read -r -a tags_flatten <<< "${tags_fl}"

# https://bashwizard.com/displaying-unique-values/
declare -A tags_unique

for tag_item in "${tags_flatten[@]}"; do
  let tags_unique[$tag_item]++
done

tags=("${!tags_unique[@]}")

IFS=':'
echo "${tags[*]}"

With a result similar to the array below (the order of keys in an associative array is not guaranteed):

❯ ./07-unique.sh
rock:70s:pop:jazz:60s

BASH: Solving Unique Song
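If the order of first appearance matters, the associative array can be used as a set of already seen values next to a regular array that keeps the order. This is a minimal sketch, assuming bash 4 or later. The names `seen` and `tags_ordered` are made up for this example, and the tag list is typed in by hand.

```bash
#!/usr/bin/env bash
declare -a tags=("60s" "jazz" "60s" "rock" "70s" "rock" "70s" "pop")
declare -A seen=()
declare -a tags_ordered=()

for tag in "${tags[@]}"; do
  # Keep a tag only the first time it appears.
  if [[ -z "${seen[$tag]:-}" ]]; then
    seen[$tag]=1
    tags_ordered+=("$tag")
  fi
done

ordered_joined=$(IFS=':'; echo "${tags_ordered[*]}")
echo "$ordered_joined"
```

The regular array is what preserves the order; the associative array only answers whether a tag has been seen before.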

It is not that hard. Or is it 🤔?


What is Next 🤔?

Consider continuing with [ Bash IFS - Playing with Records - Part Two ].