Preface
Goal: Merging multiple CSVs using Python Macro with UNO library.
Before we study the code in formatter class, we need to merge the example data above into a workbook. We are going to merge these example CSVs into one workbook using macro.
CSV Source
How do we choose the CSV source? I have two approaches here:
-
Using Directory: The simpler case would be using directory as example.
-
Using Config: Real world case require more control, such as using TOML config.
Reference
We can load CSV manually into a Calc sheet using LibreOffice, but to automate this we need this one trick.
Just one from this URL:
You can read the detail on that link above.
Manual Loading
Normally, we can load CSV from file manager, and get this libreoffice dialog.
So how exactly we can load CSV using macro? Sure we should avoid dialog, especially when you have dozens of CSVs
1: CSV Importer: By Directory
The simpler case would be using directory as example.
Macro Skeleton
You can find the macro here.
There are only a few methods.
import os, uno
class CSVImporterDir:
def __init__(self, input_dir) -> None:
...
def load_csv(self,
csv_path: str, sheet_name_dst: str) -> None:
...
def process(self) -> None:
...
def importDir():
...
csv_importer = CSVImporterDir(input_dir)
csv_importer.process()
The core part is the filter options. This part is in the loading CSV function.
APSO Dialog
You can load the macro using APSO
Initialization
The initialization part is simple, and almost common in all macro.
def __init__(self, input_dir) -> None:
# config, all parameter arguments from tomli
self.input_dir= input_dir
# Get the current LibreOffice context
self.desktop = XSCRIPTCONTEXT.getDesktop()
self.model_dst = self.desktop.loadComponentFromURL(
"private:factory/scalc", "_blank", 0, ())
Just remember, whenever you want to refactor, the XSCRIPTCONTEXT should be in the mainscript.
Loading CSV
Filter Options
This is the trick that we are talking about. With this trick we can avoid dialog.
First define the filter itself.
def load_csv(self,
csv_path: str, sheet_name_dst: str) -> None:
# Prepare filter options for CSV import\
filter_options = "44,34,0,1,"
file_url = uno.systemPathToFileUrl(csv_path)
Then some ritual to set properties using official guidance.
# Set properties for loading the CSV file
properties = (
uno.createUnoStruct("com.sun.star.beans.PropertyValue"),
uno.createUnoStruct("com.sun.star.beans.PropertyValue"),
)
properties[0].Name = "FilterName"
properties[0].Value = "Text - txt - csv (StarCalc)"
properties[1].Name = "FilterOptions"
properties[1].Value = filter_options
Then create this sheet from thin air.
# Load the CSV file as a spreadsheet
model_src = self.desktop.loadComponentFromURL(
file_url, "_blank", 0, properties)
model_src.Sheets[0].Name = sheet_name_dst
self.model_dst.Sheets.importSheet(
model_src, sheet_name_dst, 0)
model_src.close(True)
This is basically based on practical magic. So I cannot really explain behind the scene.
Process
Multiple CSV
Instead of just loading one sheet, we can utilize this for mass CSV import.
The main process is here. This read multiple CSV files, and load them all into single workbook.
We need to continue to retain one non empty sheet. And delete this sheet once we are finished.
def process(self) -> None:
# write something, do not delete this line
first_sheet = self.model_dst.Sheets[0]
first_sheet.getCellRangeByName(
"A1").String = "Hello World!"
Then manage, the content of directory, as a feed in the loop.
# Get a list of CSV files in the input directory
all_files = sorted(os.listdir(self.input_dir))
csv_files = [f for f in all_files if f.endswith('.csv')]
# Iterate over the files in reverse order
for idx, filename in enumerate(reversed(csv_files), start=1):
...
self.model_dst.Sheets.removeByName(
first_sheet.Name)
This is the actual loading process.
# Iterate over the files in reverse order
# with an index starting from the largest number
for idx, filename in enumerate(reversed(csv_files), start=1):
input_csv = os.path.join(self.input_dir, filename)
# Format the sheet name with
# a two-digit index followed by year
output_sheet = f"{len(csv_files) - idx + 1:02d}-" + \
f"{filename[7:11]}"
print(f"Loading: {input_csv} into sheet: {output_sheet}")
self.load_csv(input_csv, output_sheet)
Class Instance
Path
The macro function is simply just creating class instance with path arguments. Just be aware that path in linux and windows is different.
def importDir():
# Configure paths or parameters as needed
# Linux based
input_dir = '/home/epsi/movies-by-year/'
# Windows based
# input_dir = 'D:/movies-by-year/'
csv_importer = CSVImporterDir(input_dir)
csv_importer.process()
APSO Console
APSO python console [LibreOffice]
If you need to debug, you can run APSO Console.
Just type the function name importDir()
.
>>> importDir()
Loading: /home/epsi/movies-by-year/movies_2019.csv into sheet: 40-2019
Loading: /home/epsi/movies-by-year/movies_2018.csv into sheet: 39-2018
Loading: /home/epsi/movies-by-year/movies_2017.csv into sheet: 38-2017
Loading: /home/epsi/movies-by-year/movies_2016.csv into sheet: 37-2016
Loading: /home/epsi/movies-by-year/movies_2015.csv into sheet: 36-2015
Loading: /home/epsi/movies-by-year/movies_2014.csv into sheet: 35-2014
Loading: /home/epsi/movies-by-year/movies_2013.csv into sheet: 34-2013
Loading: /home/epsi/movies-by-year/movies_2012.csv into sheet: 33-2012
Loading: /home/epsi/movies-by-year/movies_2011.csv into sheet: 32-2011
Loading: /home/epsi/movies-by-year/movies_2010.csv into sheet: 31-2010
Sheet Result
The result of the macro is displayed in the screenshot below:
Now you can save to ODS, for example to movies.ods
.
You can download example ODS result here:
We are done here.
2: CSV Importer: By Config
Real world case require more control, such as using TOML config.
TOML Configuration
The TOML Configuration would looks like below:
# Linux Based
path = "/home/epsi/movies-by-year/"
# Windows Based
path = "D:/movies-by-year"
# Included Files
csv_s = [
"movies_1980",
"movies_1981",
"movies_1982",
...
...
"movies_2018",
"movies_2019"
]
Just be aware that path in linux and windows is different.
Macro Skeleton
You can find the macro here.
There is only one additional method.
import tomli, uno
class CSVImporter:
def __init__(self, config_path) -> None:
...
def set_config(self) -> None:
...
def load_csv(self, csv_path, sheet_name_dst):
...
def process(self, sheetType) -> None:
...
def importCSVs():
config_path = '/home/epsi/Dev/config.toml'
csv_importer = CSVImporter(config_path)
csv_importer.process('Tabular')
Initialization
This is very similar as before,
but we use config.toml
as a source instead of directory.
def __init__(self, config_path) -> None:
# config, all parameter arguments from tomli
self.config_path = config_path
self.set_config()
# Get the current LibreOffice context
self.desktop = XSCRIPTCONTEXT.getDesktop()
self.model_dst = self.desktop.loadComponentFromURL(
"private:factory/scalc", "_blank", 0, ())
Configuration
We need to build list with dictionary:
- filename (input-expand)
- sheetname (sheet-expand)
The word expand here, is my internal naming.
I have a habit to name my data CSV files variously depend on the process.
This is my naming step by step: tabular
, expand
, pivot
, and stat
for statistics.
The list result would be stored in self.filenames
.
def set_config(self) -> None:
# read toml configuration
file_obj: TextIO = open(self.config_path, 'rb')
config_root = tomli.load(file_obj)
file_obj.close()
# Get config content.
path_root = config_root.get('path', '.')
csv_s = config_root.get('csv_s', [])
# Define input and output file paths
filenames = []
for index, csv in enumerate(csv_s):
filenames.append({
'input-expand' : f'{path_root}/{csv}.csv',
'sheet-expand' : f'{index+1:02d}-{csv[7:11]}',
})
self.filenames = filenames
The worksheet names would looks like below:
Loading CSV
Filter Options
This is exactly the same with previous method.
So no need to give any source code.
Process
This is where we utilize the previously mentioned self.filenames
.
def process(self, sheetType) -> None:
# write something, do not delete this line
first_sheet = self.model_dst.Sheets[0]
first_sheet.getCellRangeByName(
"A1").String = "Hello World!"
# Main function to load Pivot CSV
for filename in reversed(self.filenames):
input_csv = filename['input-expand']
output_sheet = filename['sheet-expand']
print(f"Loading: {input_csv}")
self.load_csv(input_csv, output_sheet)
self.model_dst.Sheets.removeByName(
first_sheet.Name)
Class Instance
Config Path
The main method is basically just instatiating class, along with prepared parameters.
Instead of folder path, we use config path. This way we can store more variable, than just directory.
def importCSVs():
# Configure paths or parameters as needed
config_path = '/home/epsi/Dev/config.toml'
csv_importer = CSVImporter(config_path)
csv_importer.process('Tabular')
Sheet Result
The result of the macro is similar with previous macro. But this time you have more control.
I’m using this script heavily, but of course my CSV data is not about movies.
3: Tab Colorizer
For a lot of worksheets in a workbook, it would be nice if we can distinct them by pretty color. We can do this by defining new python macro in libreoffice.
This is actually, just a nice to have feature. But I always use this feature, so I would like to share this code for you.
Macro Skeleton
You can find the macro here.
# Color scale
blueScale = { ... }
tealScale = { ... }
amberScale = { ... }
brownScale = { ... }
class SheetTabColorizer:
def __init__(self):
...
# Apply Color To Sheets
def run(self):
...
def main():
# Create the colorizer instance and run it
colorizer = SheetTabColorizer()
colorizer.run()
Color Scale
I’m using Google Material Color as usual.
# Google Material color scale
blueScale = {
0: 0xE3F2FD, 1: 0xBBDEFB, 2: 0x90CAF9,
3: 0x64B5F6, 4: 0x42A5F5, 5: 0x2196F3,
6: 0x1E88E5, 7: 0x1976D2, 8: 0x1565C0,
9: 0x0D47A1
}
tealScale = {
0: 0xE0F2F1, 1: 0xB2DFDB, 2: 0x80CBC4,
3: 0x4DB6AC, 4: 0x26A69A, 5: 0x009688,
6: 0x00897B, 7: 0x00796B, 8: 0x00695C,
9: 0x004D40
}
amberScale = {
0: 0xFFF8E1, 1: 0xFFECB3, 2: 0xFFE082,
3: 0xFFD54F, 4: 0xFFCA28, 5: 0xFFC107,
6: 0xFFB300, 7: 0xFFA000, 8: 0xFF8F00,
9: 0xFF6F00
}
brownScale = {
0: 0xEFEBE9, 1: 0xD7CCC8, 2: 0xBCAAA4,
3: 0xA1887F, 4: 0x8D6E63, 5: 0x795548,
6: 0x6D4C41, 7: 0x5D4037, 8: 0x4E342E,
9: 0x3E2723
}
You may change the color scale to suit your need.
Initialization
Just preparing variables.
class SheetTabColorizer:
def __init__(self):
# Get the current LibreOffice context and document
self.document = XSCRIPTCONTEXT.getDocument()
# List of palettes to cycle through
self.palettes = [blueScale, tealScale, amberScale, brownScale]
Applying Color
You may change the logic to suit your need. This is just an example.
# Apply Color To Sheets
def run(self):
# Loop through each sheet and apply color based on index
sheets = self.document.Sheets
for i in range(sheets.Count):
sheet = sheets[i]
# Determine which palette to use based on sheet index
palette_index = (i // 10) % 4 # Cycle through 4 palettes
color_index = i % 10 # Cycle through 0 to 9 for each palette
# Get the color from the selected palette
tab_color = self.palettes[palette_index][color_index]
sheet.TabColor = tab_color # Set the tab color
Class Instance
The main method is basically just instatiating class, along with prepared parameters.
def main():
# Create the colorizer instance and run it
colorizer = SheetTabColorizer()
colorizer.run()
Sheet Result
The worksheet names would looks like below:
You can compare with previous preview:
What is Next 🤔?
Now we can automate the sheet formatting. Let’s start with a simple-config class, rearangging the column. The basic flow is just formatting single sheet and multiple sheet.
Consider continue reading [ Formatter - Simple Config - One ].