Preface
Goal: A practical case to collect unique record fields using Python.
What I appreciate about Python is its large community.
Reference Reading
The last time I read the Python documentation thoroughly was two decades ago. There have been some changes since then, and some interesting things.
Source Examples
You can obtain source examples here:
Common Use Case
Task: Get the unique tag strings.
Please read the overview for more detail.
Prepopulated Data
Songs and Poetry
songs = [
    dict( title = 'Cantaloupe Island',
          tags  = ['60s', 'jazz'] ),
    dict( title = 'Let It Be',
          tags  = ['60s', 'rock'] ),
    dict( title = 'Knockin\' on Heaven\'s Door',
          tags  = ['70s', 'rock'] ),
    dict( title = 'Emotion',
          tags  = ['70s', 'pop'] ),
    dict( title = 'The River')
]
Python Solution
The Answer
I use list comprehensions a lot. One of them is the one-liner below:
from MySongs import songs
tags = [
    tag for song in songs
    if 'tags' in song
    for tag in song['tags']
]
print(list(set(tags)))
Enough with the introduction. At this point we should go straight to coding.
Environment
No special setup is needed. Just run and voilà!
1: Data Structure Using Dictionary
We are going to use lists and dictionaries
throughout this article.
Simple List
Let's begin with a simple list.
tags = ["rock", "jazz", "rock", "pop", "pop"]
print(tags)
It is easy to dump a variable in Python using print.
The result is similar to the list below:
❯ python 01-tags.py
['rock', 'jazz', 'rock', 'pop', 'pop']
Dictionary
There are different ways to write a dictionary.
The choice is a matter of preference.
I mostly use the curly bracket form in my projects,
which is the simpler one.
But for this record project,
I prefer the dict form.
import pprint
song1 = { 'title': 'Cantaloupe Island',
          'tags' : ['60s', 'jazz'] }
song2 = dict( title = 'Cantaloupe Island',
              tags  = ['60s', 'jazz'] )
pprint.pprint(song1)
pprint.pprint(song2)
The Python standard library provides the pprint module.
With it we can output structured data in a tidier form.
This makes it easier to examine the data
when looking for bugs.
❯ python 02-record.py
{'tags': ['60s', 'jazz'], 'title': 'Cantaloupe Island'}
{'tags': ['60s', 'jazz'], 'title': 'Cantaloupe Island'}
As we can see in the output above, both dictionaries represent the very same structure.
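The claim that both notations build the same structure can also be checked directly instead of comparing printed output by eye. The following is a minimal sketch; the names same_content and same_object are only illustrative and do not come from the original examples.

```python
# Two notations for the same record, copied from the example above.
song1 = {'title': 'Cantaloupe Island', 'tags': ['60s', 'jazz']}
song2 = dict(title='Cantaloupe Island', tags=['60s', 'jazz'])

# Dictionaries compare by content, not by the notation used to build them.
same_content = (song1 == song2)

# They are still two separate objects in memory.
same_object = (song1 is song2)

print(same_content)  # True
print(same_object)   # False
```

Equality with == is usually what matters for records like these, since it only looks at keys and values.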

The Songs Structure
We can continue our journey to records using just dictionaries.
No complex structure is needed.
from pprint import pprint
songs = [
    dict( title = 'Cantaloupe Island',
          tags  = ['60s', 'jazz'] ),
    dict( title = 'Let It Be',
          tags  = ['60s', 'rock'] ),
    dict( title = 'Knockin\' on Heaven\'s Door',
          tags  = ['70s', 'rock'] ),
    dict( title = 'Emotion',
          tags  = ['70s', 'pop'] ),
    dict( title = 'The River')
]
pprint(songs)
The result is similar to the records below:
❯ python 03-songs.py
[{'tags': ['60s', 'jazz'], 'title': 'Cantaloupe Island'},
 {'tags': ['60s', 'rock'], 'title': 'Let It Be'},
 {'tags': ['70s', 'rock'], 'title': "Knockin' on Heaven's Door"},
 {'tags': ['70s', 'pop'], 'title': 'Emotion'},
 {'title': 'The River'}]
2: Separating Module
Since we need to reuse the songs record multiple times, it is a good idea to separate the record structure from logic.
Songs Module
The code is shown below:
songs = [
    dict( title = 'Cantaloupe Island',
          tags  = ['60s', 'jazz'] ),
    dict( title = 'Let It Be',
          tags  = ['60s', 'rock'] ),
    dict( title = 'Knockin\' on Heaven\'s Door',
          tags  = ['70s', 'rock'] ),
    dict( title = 'Emotion',
          tags  = ['70s', 'pop'] ),
    dict( title = 'The River')
]
Using Songs Module
Now the code becomes very short.
from pprint import pprint
from MySongs import songs
pprint(songs)
The result is exactly the same as the output above.

❯ python 04-module.py
[{'tags': ['60s', 'jazz'], 'title': 'Cantaloupe Island'},
 {'tags': ['60s', 'rock'], 'title': 'Let It Be'},
 {'tags': ['70s', 'rock'], 'title': "Knockin' on Heaven's Door"},
 {'tags': ['70s', 'pop'], 'title': 'Emotion'},
 {'title': 'The River'}]
3: Finishing The Task
Extract, Flatten, Unique
Extracting Dictionary
A list comprehension, and nothing else.
The only addition is a filter.
from pprint import pprint
from MySongs import songs
tagss = [
    song['tags'] for song in songs
    if 'tags' in song
]
pprint(tagss)
The result is a list of lists, as shown below.
❯ python 05-extract.py
[['60s', 'jazz'], ['60s', 'rock'], ['70s', 'rock'], ['70s', 'pop']]
You can go further with map and filter,
but I am going to skip the map and filter part here.
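For readers curious about the skipped route, here is a minimal sketch of the same extraction using the built-in filter and map functions. It is one possible way to write it, not necessarily how the author would. The songs list is repeated inline only so the snippet runs without the MySongs module.

```python
# Inline copy of the songs records, so the sketch is self-contained.
songs = [
    dict(title='Cantaloupe Island', tags=['60s', 'jazz']),
    dict(title='Let It Be', tags=['60s', 'rock']),
    dict(title="Knockin' on Heaven's Door", tags=['70s', 'rock']),
    dict(title='Emotion', tags=['70s', 'pop']),
    dict(title='The River'),
]

# filter keeps only the songs that actually have a 'tags' key.
tagged = filter(lambda song: 'tags' in song, songs)

# map pulls the tags list out of each remaining song.
tagss = list(map(lambda song: song['tags'], tagged))

print(tagss)
# [['60s', 'jazz'], ['60s', 'rock'], ['70s', 'rock'], ['70s', 'pop']]
```

Both filter and map return lazy iterators in Python 3, which is why the result is wrapped in list() before printing.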
Flatten
Advanced List Comprehension
Again, all we need is a list comprehension. But a two-level list comprehension is a little bit tricky.
Let's begin by separating it into two list comprehensions.
from pprint import pprint
from MySongs import songs
tagss = [
    song['tags'] for song in songs
    if 'tags' in song
]
tags = [
    tag for tags in tagss
    for tag in tags
]
pprint(tags)
The result is a flattened list, as shown below.
❯ python 06-flatten-a.py
['60s', 'jazz', '60s', 'rock', '70s', 'rock', '70s', 'pop']
We can rewrite the second comprehension above as a two-level loop:
a = []
for tags in tagss:
    for tag in tags:
        a.append(tag)
And finally we can unify both steps into a single list comprehension.
tags = [
    tag for song in songs
    if 'tags' in song
    for tag in song['tags']
]
The result is exactly the same as the previous flattened list.
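As an aside, the standard library offers another way to flatten a list of lists: itertools.chain.from_iterable. The sketch below is an alternative to the nested comprehension, not the approach used in this article. It also swaps the 'tags' in song test for song.get('tags', []), which is a choice made here for illustration. The songs list is repeated inline so the snippet runs on its own.

```python
from itertools import chain

# Inline copy of the songs records, so the sketch is self-contained.
songs = [
    dict(title='Cantaloupe Island', tags=['60s', 'jazz']),
    dict(title='Let It Be', tags=['60s', 'rock']),
    dict(title="Knockin' on Heaven's Door", tags=['70s', 'rock']),
    dict(title='Emotion', tags=['70s', 'pop']),
    dict(title='The River'),
]

# song.get('tags', []) yields an empty list for songs without tags,
# and chain.from_iterable joins all the small lists into one sequence.
tags = list(chain.from_iterable(song.get('tags', []) for song in songs))

print(tags)
# ['60s', 'jazz', '60s', 'rock', '70s', 'rock', '70s', 'pop']
```

Whether this reads better than the two-level comprehension is a matter of taste; the output is the same flattened list.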

Unique
To get a list of unique items,
we can convert the list to a set,
and then convert the set back to a list.
from MySongs import songs
tags = [
    tag for song in songs
    if 'tags' in song
    for tag in song['tags']
]
print(list(set(tags)))
The result is similar to the list below:
❯ python 07-unique.py
['jazz', 'rock', '70s', '60s', 'pop']
Very short, right?
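One caveat: a set has no defined order, so the order of the printed tags may differ from run to run. If a stable order matters, two common options are sorting the set, or using dict.fromkeys, which keeps the first occurrence of each item (dictionaries preserve insertion order since Python 3.7). The sketch below starts from the flattened list shown earlier; the names unique_sorted and unique_in_order are only illustrative.

```python
# The flattened tags list from the previous step.
tags = ['60s', 'jazz', '60s', 'rock', '70s', 'rock', '70s', 'pop']

# Option 1: unique items in sorted order.
unique_sorted = sorted(set(tags))

# Option 2: unique items in order of first appearance.
unique_in_order = list(dict.fromkeys(tags))

print(unique_sorted)    # ['60s', '70s', 'jazz', 'pop', 'rock']
print(unique_in_order)  # ['60s', 'jazz', 'rock', '70s', 'pop']
```

Both remain one-liners, so the solution stays as short as the set-based version.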
What is Next 🤔?
There is an alternative way to build the record structure.
Consider continuing with [ Python - Playing with Records - Part Two ].