Where to Discuss?

Local Group

Preface

Goal: Merging multiple CSVs using Python Macro with UNO library.

Before we study the code in formatter class, we need to merge the example data above into a workbook. We are going to merge these example CSVs into one workbook using macro.

CSV Source

How do we choose the CSV source? I have two approaches here:

  1. Using Directory: The simpler case would be using directory as example.

  2. Using Config: Real world case require more control, such as using TOML config.

Reference

We can load CSV manually into a Calc sheet using LibreOffice, but to automate this we need this one trick.

Just one from this URL:

You can read the detail on that link above.

Manual Loading

Normally, we can load CSV from file manager, and get this libreoffice dialog.

Merge CSV: Manual Loading: Dialog

So how exactly we can load CSV using macro? Sure we should avoid dialog, especially when you have dozens of CSVs


1: CSV Importer: By Directory

The simpler case would be using directory as example.

Macro Skeleton

You can find the macro here.

There are only a few methods.

Merge CSV: Calc Macro: Class Skeleton

import os, uno

class CSVImporterDir:
  def __init__(self, input_dir) -> None:
    ...

  def load_csv(self,
        csv_path: str, sheet_name_dst: str) -> None:
    ...

  def process(self) -> None:
    ...

def importDir():
  ...

  csv_importer = CSVImporterDir(input_dir)
  csv_importer.process()

The core part is the filter options. This part is in the loading CSV function.

APSO Dialog

You can load the macro using APSO

Merge CSV: Calc Macro: APSO Dialog

Initialization

The initialization part is simple, and almost common in all macro.

Merge CSV: Calc Macro: Initialization

  def __init__(self, input_dir) -> None:
    # config, all parameter arguments from tomli
    self.input_dir= input_dir

    # Get the current LibreOffice context
    self.desktop   = XSCRIPTCONTEXT.getDesktop()
    self.model_dst = self.desktop.loadComponentFromURL(
      "private:factory/scalc", "_blank", 0, ())

Just remember, whenever you want to refactor, the XSCRIPTCONTEXT should be in the mainscript.

Loading CSV

Filter Options

This is the trick that we are talking about. With this trick we can avoid dialog.

Merge CSV: Calc Macro: Loading CSV: Filter Options

First define the filter itself.

  def load_csv(self,
        csv_path: str, sheet_name_dst: str) -> None:

    # Prepare filter options for CSV import\
    filter_options = "44,34,0,1,"
    file_url = uno.systemPathToFileUrl(csv_path)

Then some ritual to set properties using official guidance.

    # Set properties for loading the CSV file
    properties = (
        uno.createUnoStruct("com.sun.star.beans.PropertyValue"),
        uno.createUnoStruct("com.sun.star.beans.PropertyValue"),
    )
    properties[0].Name  = "FilterName"
    properties[0].Value = "Text - txt - csv (StarCalc)"
    properties[1].Name  = "FilterOptions"
    properties[1].Value = filter_options

Then create this sheet from thin air.

    # Load the CSV file as a spreadsheet
    model_src = self.desktop.loadComponentFromURL(
      file_url, "_blank", 0, properties)
    model_src.Sheets[0].Name = sheet_name_dst
    self.model_dst.Sheets.importSheet(
      model_src, sheet_name_dst, 0)
    model_src.close(True)

This is basically based on practical magic. So I cannot really explain behind the scene.

Process

Multiple CSV

Instead of just loading one sheet, we can utilize this for mass CSV import.

The main process is here. This read multiple CSV files, and load them all into single workbook.

Merge CSV: Calc Macro: Process

We need to continue to retain one non empty sheet. And delete this sheet once we are finished.

  def process(self) -> None:
    # write something, do not delete this line
    first_sheet = self.model_dst.Sheets[0]
    first_sheet.getCellRangeByName(
      "A1").String = "Hello World!"

Then manage, the content of directory, as a feed in the loop.

    # Get a list of CSV files in the input directory
    all_files = sorted(os.listdir(self.input_dir))
    csv_files = [f for f in all_files if f.endswith('.csv')]

    # Iterate over the files in reverse order
    for idx, filename in enumerate(reversed(csv_files), start=1):
        ...

    self.model_dst.Sheets.removeByName(
      first_sheet.Name)

This is the actual loading process.

    # Iterate over the files in reverse order
    # with an index starting from the largest number
    for idx, filename in enumerate(reversed(csv_files), start=1):
        input_csv = os.path.join(self.input_dir, filename)

        # Format the sheet name with
        # a two-digit index followed by year
        output_sheet = f"{len(csv_files) - idx + 1:02d}-" + \
                       f"{filename[7:11]}"

        print(f"Loading: {input_csv} into sheet: {output_sheet}")
        self.load_csv(input_csv, output_sheet)

Class Instance

Path

The macro function is simply just creating class instance with path arguments. Just be aware that path in linux and windows is different.

Merge CSV: Calc Macro: Main Method

def importDir():
  # Configure paths or parameters as needed
  
  # Linux based
  input_dir = '/home/epsi/movies-by-year/'

  # Windows based
  # input_dir = 'D:/movies-by-year/'

  csv_importer = CSVImporterDir(input_dir)
  csv_importer.process()

APSO Console

APSO python console [LibreOffice]

If you need to debug, you can run APSO Console. Just type the function name importDir().

Merge CSV: Calc Macro: APSO Console

>>> importDir()
Loading: /home/epsi/movies-by-year/movies_2019.csv into sheet: 40-2019
Loading: /home/epsi/movies-by-year/movies_2018.csv into sheet: 39-2018
Loading: /home/epsi/movies-by-year/movies_2017.csv into sheet: 38-2017
Loading: /home/epsi/movies-by-year/movies_2016.csv into sheet: 37-2016
Loading: /home/epsi/movies-by-year/movies_2015.csv into sheet: 36-2015
Loading: /home/epsi/movies-by-year/movies_2014.csv into sheet: 35-2014
Loading: /home/epsi/movies-by-year/movies_2013.csv into sheet: 34-2013
Loading: /home/epsi/movies-by-year/movies_2012.csv into sheet: 33-2012
Loading: /home/epsi/movies-by-year/movies_2011.csv into sheet: 32-2011
Loading: /home/epsi/movies-by-year/movies_2010.csv into sheet: 31-2010

Sheet Result

The result of the macro is displayed in the screenshot below:

Merge CSV: Calc Macro: Sheet Result

Now you can save to ODS, for example to movies.ods.

Merge CSV: Calc Macro: Save to ODS

You can download example ODS result here:

We are done here.


2: CSV Importer: By Config

Real world case require more control, such as using TOML config.

TOML Configuration

The TOML Configuration would looks like below:

Merge CSV: Calc Macro: Config: TOML

# Linux Based
path = "/home/epsi/movies-by-year/"

# Windows Based
path = "D:/movies-by-year"

# Included Files
csv_s = [
  "movies_1980",
  "movies_1981",
  "movies_1982",
  ...
  ...
  "movies_2018",
  "movies_2019"
]

Just be aware that path in linux and windows is different.

Macro Skeleton

You can find the macro here.

There is only one additional method.

Merge CSV: Calc Macro: Class Skeleton

import tomli, uno

class CSVImporter:
  def __init__(self, config_path) -> None:
    ...

  def set_config(self) -> None:
    ...

  def load_csv(self, csv_path, sheet_name_dst):
    ...

  def process(self, sheetType) -> None:
    ...

def importCSVs():
  config_path = '/home/epsi/Dev/config.toml'  
  csv_importer = CSVImporter(config_path)
  csv_importer.process('Tabular')

Initialization

This is very similar as before, but we use config.toml as a source instead of directory.

Merge CSV: Calc Macro: Initialization

  def __init__(self, config_path) -> None:
    # config, all parameter arguments from tomli
    self.config_path = config_path
    self.set_config()

    # Get the current LibreOffice context
    self.desktop   = XSCRIPTCONTEXT.getDesktop()
    self.model_dst = self.desktop.loadComponentFromURL(
      "private:factory/scalc", "_blank", 0, ())

Configuration

We need to build list with dictionary:

  • filename (input-expand)
  • sheetname (sheet-expand)

The word expand here, is my internal naming. I have a habit to name my data CSV files variously depend on the process. This is my naming step by step: tabular, expand, pivot, and stat for statistics.

Merge CSV: Calc Macro: TOML Configuration

The list result would be stored in self.filenames.

  def set_config(self) -> None:
    # read toml configuration
    file_obj: TextIO = open(self.config_path, 'rb')
    config_root = tomli.load(file_obj)
    file_obj.close()

    # Get config content.
    path_root = config_root.get('path', '.')
    csv_s     = config_root.get('csv_s', [])

    # Define input and output file paths
    filenames = []
    for index, csv in enumerate(csv_s):
      filenames.append({
        'input-expand' : f'{path_root}/{csv}.csv',
        'sheet-expand' : f'{index+1:02d}-{csv[7:11]}', 
      })

    self.filenames = filenames

The worksheet names would looks like below:

Merge CSV: Calc Macro: Sheetname Tab

Loading CSV

Filter Options

This is exactly the same with previous method.

Merge CSV: Calc Macro: Loading CSV: Filter Options

So no need to give any source code.

Process

This is where we utilize the previously mentioned self.filenames.

Merge CSV: Calc Macro: Process

  def process(self, sheetType) -> None:
    # write something, do not delete this line
    first_sheet = self.model_dst.Sheets[0]
    first_sheet.getCellRangeByName(
      "A1").String = "Hello World!"

    # Main function to load Pivot CSV 
    for filename in reversed(self.filenames):
      input_csv    = filename['input-expand']
      output_sheet = filename['sheet-expand']

      print(f"Loading: {input_csv}")
      self.load_csv(input_csv, output_sheet)

    self.model_dst.Sheets.removeByName(
      first_sheet.Name)

Class Instance

Config Path

The main method is basically just instatiating class, along with prepared parameters.

Instead of folder path, we use config path. This way we can store more variable, than just directory.

Merge CSV: Calc Macro: Main Method

def importCSVs():
  # Configure paths or parameters as needed
  config_path = '/home/epsi/Dev/config.toml'  
  csv_importer = CSVImporter(config_path)
  csv_importer.process('Tabular')

Sheet Result

The result of the macro is similar with previous macro. But this time you have more control.

I’m using this script heavily, but of course my CSV data is not about movies.


3: Tab Colorizer

For a lot of worksheets in a workbook, it would be nice if we can distinct them by pretty color. We can do this by defining new python macro in libreoffice.

This is actually, just a nice to have feature. But I always use this feature, so I would like to share this code for you.

Macro Skeleton

You can find the macro here.

Sheet Tab Colorizer: Macro Skeleton

# Color scale 
blueScale  = { ... }
tealScale  = { ... }
amberScale = { ... }
brownScale = { ... }

class SheetTabColorizer:
  def __init__(self):
    ...

  # Apply Color To Sheets
  def run(self):
    ...

def main():
  # Create the colorizer instance and run it
  colorizer = SheetTabColorizer()
  colorizer.run()

Color Scale

I’m using Google Material Color as usual.

Sheet Tab Colorizer: GMC Color Scale

# Google Material color scale 
blueScale = {
  0: 0xE3F2FD, 1: 0xBBDEFB, 2: 0x90CAF9,
  3: 0x64B5F6, 4: 0x42A5F5, 5: 0x2196F3,
  6: 0x1E88E5, 7: 0x1976D2, 8: 0x1565C0,
  9: 0x0D47A1
}

tealScale = {
  0: 0xE0F2F1, 1: 0xB2DFDB, 2: 0x80CBC4,
  3: 0x4DB6AC, 4: 0x26A69A, 5: 0x009688,
  6: 0x00897B, 7: 0x00796B, 8: 0x00695C,
  9: 0x004D40
}

amberScale = {
  0: 0xFFF8E1, 1: 0xFFECB3, 2: 0xFFE082,
  3: 0xFFD54F, 4: 0xFFCA28, 5: 0xFFC107,
  6: 0xFFB300, 7: 0xFFA000, 8: 0xFF8F00,
  9: 0xFF6F00
}

brownScale = {
  0: 0xEFEBE9, 1: 0xD7CCC8, 2: 0xBCAAA4,
  3: 0xA1887F, 4: 0x8D6E63, 5: 0x795548,
  6: 0x6D4C41, 7: 0x5D4037, 8: 0x4E342E,
  9: 0x3E2723
}

You may change the color scale to suit your need.

Initialization

Just preparing variables.

Sheet Tab Colorizer: Initialization

class SheetTabColorizer:
  def __init__(self):
    # Get the current LibreOffice context and document
    self.document = XSCRIPTCONTEXT.getDocument()

    # List of palettes to cycle through
    self.palettes = [blueScale, tealScale, amberScale, brownScale]

Applying Color

You may change the logic to suit your need. This is just an example.

Sheet Tab Colorizer: Run: Applying Color

  # Apply Color To Sheets
  def run(self):
    # Loop through each sheet and apply color based on index
    sheets = self.document.Sheets

    for i in range(sheets.Count):
      sheet = sheets[i]
      
      # Determine which palette to use based on sheet index
      palette_index = (i // 10) % 4  # Cycle through 4 palettes
      color_index = i % 10  # Cycle through 0 to 9 for each palette

      # Get the color from the selected palette
      tab_color = self.palettes[palette_index][color_index]
      sheet.TabColor = tab_color  # Set the tab color

Class Instance

The main method is basically just instatiating class, along with prepared parameters.

Sheet Tab Colorizer: Main Method: Class Instance

def main():
  # Create the colorizer instance and run it
  colorizer = SheetTabColorizer()
  colorizer.run()

Sheet Result

The worksheet names would looks like below:

Merge CSV: Calc Macro: Sheet Result with Colorized Tab

You can compare with previous preview:

Merge CSV: Calc Macro: Sheetname Tab


What is Next 🤔?

Now we can automate the sheet formatting. Let’s start with a simple-config class, rearangging the column. The basic flow is just formatting single sheet and multiple sheet.

Consider continue reading [ Formatter - Simple Config - One ].