Formatter - Simple Config

Simple Configuration (Original)

Unified Configuration (Revisited)

Local Group

Preface

Goal: Separate formatting code and configuration. Example class using simple configuration.

Now we can automate sheet formatting. Let’s begin with a simple class for rearranging columns. The basic flow will cover formatting both single and multiple sheets.

Why Separate Config?

We often work with different types of source and result data, but we want to avoid modifying the code every time. Here’s a basic principle to follow:

Keep the code static.
Make the configuration flexible.

However, don’t take this rule to extremes. In many cases, it’s faster to directly alter the code to get the result. What we really need is a stable core, with macro code that can be adjusted as needed.

Example Sheet

You can download the example ODS source here:

github.com/…/movies/data/movies.ods

And the example XLSX result here:

github.com/…/movies/data/movies.xlsx

This workbook contains 40 worksheets.

Class Diagram

To provide a clear illustration of the overall structure, the main macro is depicted in a class diagram. While this is not a standard UML representation, djustments have been made for clarity.

The class diagram is created using multiple pages in Inkscape. You can access the source here and adapt it for your own needs.

class-diagram-macro-formatter-steps.svg

We will begin with a simple diagram, and build it up step by step until we reach the complete example.

About

This simple configuration represents my code before a heavy refactoring. Let’s treat this simple configuration as the first version.

As you might expect, the first version can be messy. Every writer knows that good writing requires revision, and the same goes for code. It often needs refactoring.

01: Step One

Step Overview

This section covers the following topics:

Class: FormatterBase, TabularFormatterCommon
Basic Flow: Working with multiple sheets
Formatting Procedure: Inserting column gaps

Macro Source

ThYou can access the macro source in this repository:

github.com/…/movies/steps-01/FormatterTabular01.py

Class Diagram

Why separate the base and tabular?

We begin with this simple class diagram. While it may look complex, let’s break down how it works.

The diagram focuses on visibility specifiers, rather than attributes or operations. To improve clarity, I’ve divided each class into three regions: private, protected, and public.

Class Diagram: Formatting: Simple Configuration: Step 01

The diagram consists of three basic parts:

Formatter Base: Basic Flow
Formatter Tabular Movies: Formatting Procedure
Macro Call: Represents the class in the macro

But why separate base and tabular? In real world scenarios, I work with various types of views. For instance, tabular/expand views may require, different implementations than pivot/summary views. I also handle numerous configurations for tabular data and summaries.

By keeping the core functionality in the base class intact, I can ensure flexibility in the descendant classes, allowing them to be customized for specific use cases.

XSCRIPTCONTEXT

Initialization

Since we now have two classes, and more coming later. It’s important to carefully handle the initialization process.

The core of the initialization is straightforward, and consists of the following:

class FormatterBase:
  def __init__(self, document) -> None:
    self.document   = XSCRIPTCONTEXT.getDocument()
    self.controller = self.document.getCurrentController()

However, the complete macro spans nearly 700 lines of code. As usual, I break down long code into chunks across different files. The catch is that XSCRIPTCONTEXT isn’t part of any external library. It only exists within LibreOffice macro files. This creates a challenge when separating code into multiple files.

Handling XSCRIPTCONTEXT

To work around this, we need to trick the system. Here are two approaches:

Pass it as a function argument
Use a property getter

In this example, I opt for the first approach. Passing XSCRIPTCONTEXT as an argument, to the base class during initialization.

Calc Macro: Step One: Class Initialization

class FormatterBase:
  def __init__(self, document) -> None:
    self.document   = document
    self.controller = self.document.getCurrentController()

In the main macro file, the subclass passes the XSCRIPTCONTEXT during instantiation. All we really need is the document object, so as long as XSCRIPTCONTEXT is accessible in the correct macro file, everything works smoothly.

class FormatterTabularMovies(FormatterBase):
  def __init__(self) -> None:
    super().__init__(XSCRIPTCONTEXT.getDocument())

Initialization

The complete initialization process is shown in the following code:

from com.sun.star.sheet import XSpreadsheetDocument

class FormatterBase:
  def __init__(self, document: XSpreadsheetDocument) -> None:
    self.__document = document
    self._sheet = None
    self._controller = self.__document.getCurrentController()

For clarity and better encapsulation, I apply visibility specifiers to methods and instance variables:

_ for protected (accessible within the class and subclasses)
__ for private (accessible only within the class)

Document in Diagram.

In the diagram above, the document property is marked as private (-), and shown in italic with a tilde (~).

The tilde typically indicates abstract methods or properties.
Abstract properties are rarely private, as they are meant to be overridden by subclasses.
Italics signify either abstract or dynamic properties.

In this context, the property is more like an uninitialized placeholder.

XModel is part of the LibreOffice API. For simplicity, I use XModel in the diagram, even though the actual service name is SpreadsheetDocument.

XModel is part of libreoffice API. For simplicity, I intentionally use XModel in the diagram, although the actual service name is SpreadsheetDocument. You can check the actual type using:

If you need to verify the exact type, you can run the following code:

Internal Diagram

To deepen your understanding of how the LibreOffice API operates internally, let’s visualize the class structure.

This diagram represents the LibreOffice API’s internal architecture. It’s designed to provide clarity rather than strictly follow UML standards.

As usual, the focus is on illustrating conceptual relationships rather than producing a by-the-book UML representation.

Class Diagram: LibreOffice API: XSpreadsheetDocument

Macro Call

Represent the Class in Macro

Since macros can only be called by functions directly, we instantiate the class within the function.

There are two types of processes in this macro:

Single-sheet processing
Multi-sheet processing

Calc Macro: Step One: Macro Calls

Process the active sheet only:

def tabular_single_movies() -> None:
  movies = FormatterTabularMovies()
  movies.process_one()

Process all sheets in the workbook:

def tabular_multi_movies() -> None:
  movies = FormatterTabularMovies()
  movies.process_all()

Looks simple right?

Single and Multiple

In the Macro Dialog, it looks like this:

Calc Macro: Step One: APSO Dialog

In practice, I have numerous functions, such as:

FormatterSummary
- summary_multi_movies
- summary_single_movies
- summary_multi_songs
- summary_single_songs
- summary_multi_movies_analytic
- summary_single_movies_analytic
FormatterTabular
- tabular_multi_movies
- tabular_single_movies
- tabular_multi_songs
- tabular_single_songs
- tabular_multi_movies_filtered
- tabular_single_movies_filtered

Each function corresponds to a configuration. All that’s needed is to adjust the configuration to match the column structure of the source sheet.

Having multiple configurations isn’t a problem. Since they’re usually short and straightforward.

In the APSO console, methods can be invoked by name:

Calc Macro: Step One: APSO Console

>>> tabular_single_movies()
>>>

Formatter Base Class

Basic Flow

All core logic resides here. At this stage, the focus is on format_one_sheet(). Its sole responsibility is resetting column positions.

The initial flow is minimal, a single one-liner of code:

  def __format_one_sheet(self) -> None:
    self._reset_pos_columns()

This method is triggered by either of the following:

process_one()
process_all()

Calc Macro: Step One: Basic Flow

process_one() operates on the active sheet:

  def process_one(self) -> None:
    self._sheet = self._controller.getActiveSheet()
    self.__format_one_sheet()

process_all() iterates through all sheets in the workbook:

  def process_all(self) -> None:
    for sheet in self.__document.Sheets:
      print(sheet.Name)
      self._sheet = sheet
      self.__format_one_sheet()

To enable tracing, each step is printed to the console. For a workbook with 40 sheets, the output will resemble:

Calc Macro: Step One: APSO Console

>>> tabular_multi_movies()
01-1980
02-1981
...
39-2018
40-2019
>>>

At this stage, only sheet names appear in the console. Detailed reports will come later as the code expands.

Formatter Tabular Movies Class

Formatting Procedure

This class handles specific configurations and methods, such as formatting a movie table.

You’ve already seen the initialization. Now let’s highlight the key method that handles column adjustments:

class FormatterTabularMovies(FormatterBase):
  def __init__(self) -> None:
    super().__init__(XSCRIPTCONTEXT.getDocument())

  def _reset_pos_columns(self) -> None:
    ...

The _reset_pos_columns() method is where the magic happens. Responsible for rearranging columns according to the desired layout.

We’ll dive into its implementation next.

Arranging column

What does it do exactly?

Let’s break down the column resetting process.

It’s easier to visualize with examples. Here’s the initial input, plain columns without any modifications:

Calc Sheet: Step One: Input

First, I add a 0.5 cm gap before all columns (column A). The result looks like this:

Calc Sheet: Step One: Output

Next, I insert another 0.5 cm gap as, a divider between column blocks (column H). Finally, I add one last 0.5 cm gap after all columns (column L). Here’s the final result:

Calc Sheet: Step One: Output

Implementation

Hardcoded for Now

For now, this process is hardcoded:

Gap size: 0.5 cm
Scaling factor: Locale-dependent (LibreOffice uses 1/100 mm units)

Calc Macro: Step One: Procedure: Reset Column Positions

  def _reset_pos_columns(self) -> None:
    columns = self._sheet.Columns
    column_width_div = 0.5 * 1000  # Width of 0.5 cm

    # Insert one column at the specified indexes
    columns.insertByIndex( 0, 1) # Column A
    columns.insertByIndex( 7, 1) # Column H
    columns.insertByIndex(11, 1) # Column L

    # Set widths for columns A, H
    columns.getByIndex( 0).Width = column_width_div
    columns.getByIndex( 7).Width = column_width_div
    columns.getByIndex(11).Width = column_width_div

We’ll turn this hardcoded gap into a configurable parameter later. For now, the goal is to implement formatting step by step.

Complete Formatting Procedure

Basic Flow

The final version of format_one_sheet() involves multiple steps, as shown below:

  def __format_one_sheet(self) -> None:
    self._max_row = self.__get_last_used_row()

    if not self.__is_first_column_empty():
      # Rearranging Columns
      print(' * Rearranging Columns')
      self._reset_pos_columns()
      self.__reset_pos_rows()
      self._max_row += 1

Next, the process continues with additional formatting:

    # Apply Sheet Wide
    print(' * Formatting Columns')
    self._set_sheetwide_view()
    self.__set_columns_format()

    # Apply Header Settings
    print(' * Formatting Header')
    self._add_merged_title()
    self._format_head_borders()
    self._format_head_colors()

    # Apply borders to the specified range
    print(' * Formatting Border')
    self._format_data_borders()

We’ll break down each procedure in the following sections.

02: Step Two

After configuring the columns, we will now focus on setting up the rows.

Step Overview

The topics covered in this section include:

Helper: Last Row
Helper: Prevent Reinsertion of Columns
Formatting Procedure: Insert Row Gaps, Set Row Heights

Macro Source

You can find the macro source code in the following repository:

github.com/…/movies/steps-01/FormatterTabular02.py

Class Diagram

In each step, I will highlight new additions such as constants, properties, or methods. This way, you can easily track the changes in the code.

Class Diagram: Formatting: Simple Configuration: Step 02

Basic Flow

After rearranging the columns, we can proceed to insert row gaps and format the rows.

Calc Macro: Step Two: Basic Flow

  def __format_one_sheet(self) -> None:
    self._max_row = self.__get_last_used_row()

    if not self.__is_first_column_empty():
      # Rearranging Columns
      print(' * Rearranging Columns')
      self._reset_pos_columns()
      self.__reset_pos_rows()
      self._max_row += 1

In this flow, the focus is on handling row formatting, in addition to column formatting.

Helper: Prevent Column Rearrangement

The goal of this step is to ensure, that the column and row gap insertion is only performed once, preventing any further changes, that might disrupt the order of columns or rows.

You can freely run the formatting macro multiple times, and the result will generally remain the same, such as modifying properties like color, borders, fonts, or alignment. However, this is not the case when adding gaps. The gap arrangement must be done only once. Any additional attempts to rearrange columns or rows, could alter the intended layout.

To achieve this, we use the function __is_first_column_empty(), to check if the first column has already been occupied by a gap.

Calc Macro: Step Two: Helper: Is First Column Empty

If the column is already filled with an empty cell, it indicates that the gap has already been inserted. We only need to check for this once, and further attempts to adjust the columns should be blocked.

For example, we can check the first few rows (e.g., row 10), to verify if the column has already been modified. Sampling a few rows should be sufficient.

  def __is_first_column_empty(self) -> bool:
    rows = self._sheet.Rows
    max_sampling_row = 10

    for row_index in range(max_sampling_row + 1):
      cell = self._sheet.getCellByPosition(0, row_index)
      # Indicates an empty cell
      if cell.String != "": return False
    return True

This function ensures that the macro doesn’t inadvertently re-insert gaps, preserving the order of the columns and rows after the initial setup.

Helper: Maximum Rows

Getting the end of the rows

The method __get_last_used_row() is commonly used in LibreOffice macros, to determine the last row that contains data, which is important for ensuring that we apply formatting, only to the relevant rows.

Calc Macro: Step Two: Helper: Get Last Row

  def __get_last_used_row(self) -> int:
    cursor = self._sheet.createCursor()
    cursor.gotoEndOfUsedArea(False)
    cursor.gotoStartOfUsedArea(True)
    rows = cursor.getRows()
    
    return len(rows)

This method uses a cursor to navigate through the used area of the sheet. By calling gotoEndOfUsedArea(False) and gotoStartOfUsedArea(True), the cursor identifies the actual range of data. We then retrieve the rows within that range and return their count, effectively determining the last used row.

With this, we can now know the exact range of rows to format, while the column range will be set manually in the code.

This method is essential when you’re working with dynamic data in LibreOffice, as it ensures the macro adapts to any changes in the sheet’s data range.

Formatting: Row Position

In this step, we first insert two rows to create a column group heading.

Calc Sheet: Step Two: Merged Header

Then, we ensure the height of all rows is consistent, setting it to 0.5 cm (the multiplier factor may vary depending on the locale).

Calc Macro: Step Two: Procedure: Reset Rows Positions

  def __reset_pos_rows(self) -> None:
    rows = self._sheet.Rows
    row_height = 0.5 * 1000  # Height of 0.5 cm

    # Range to be processed
    # Omit header and plus one for the last
    range_rows = range(2, self._max_row + 1)

    for row_index in range_rows:
      rows.getByIndex(row_index).Height = row_height

    # Insert one row at the specified indexes
    rows.insertByIndex(0, 2)  # Row 0

    row_height_div = 0.3 * 1000  # Height of 0.3 cm
    rows.getByIndex(0).Height = row_height_div
    rows.getByIndex(self._max_row + 2).Height = row_height_div

Breakdown

Row Height Adjustment: We set the height of each row (from row 2 to the last row) to 0.5 cm. This ensures uniformity across the rows.
Inserting Gap Rows: We insert two rows at the top (index 0) and apply a 0.3 cm gap to the first and last rows. These gap rows are intended to visually separate the table from any other content that might be above or below it.
Maintaining Simple Formatting: The row formatting is kept simple for this example. You can adjust the row heights further if needed, but for now, the formatting uses consistent 0.5 cm and 0.3 cm values for the rows.

This approach keeps your sheet well-organized, adding necessary gaps and making sure that all rows have the same height, while also ensuring the header and footer are appropriately spaced.

Sheet Result

The final result after applying the row and column formatting should look like this:

Calc Sheet: Step Two: Output

Console Logging.

To provide feedback during the execution of the script, I use simple print statements for informational logging. These messages help with status updates, debugging, and traceability of the code’s execution. The messages are displayed in the APSO Console.

For this step, the output in the APSO Console would look like:

>>> tabular_single_movies()
 * Rearranging Columns
>>>

This logging approach helps track the script’s flow, and ensures you can monitor the status of each operation, making debugging and tracking easier.

We will get into this later.

03: Step Three

In addition to formatting individual cells, it’s also important to adjust the sheet’s overall view settings, such as freezing rows and columns, and toggling gridlines. While gridlines are helpful during data entry, they can clutter the presentation when generating reports. Let’s combine these sheet-wide settings in one script for a cleaner look.

Step Overview

This section covers the following topics:

Abstract Method
Formatting Procedure: Show grid, Freeze.

Macro Source

You can obtain the macro source code from the repository:

github.com/…/movies/steps-01/FormatterTabular03.py

Class Diagram

Think slow, and plan your work.

In this diagram, you’ll notice a new highlighted method called set_sheetwide_view(). Additionally, there are two new abstract methods, marked with italic and tilde (~): reset_pos_columns() and set_sheetwide_view(). These denote that they are abstract methods, to be implemented in the derived classes.

Class Diagram: Formatting: Simple Configuration: Step 03

In UML, the abstract methods are denoted by the sharp symbol (#).

Abstract Method

To create an abstract class in Python, we need to use the ABC (Abstract Base Class) library.

The abstract methods are defined with the @abstractmethod decorator, signaling that they must be implemented by any subclass of FormatterBase. These methods will not have any implementation in the base class itself, as their purpose is to define a contract that the subclasses need to follow.

Calc Macro: Step Three: Abstract Methods

Here’s how the abstract methods are defined in the FormatterBase class:

from abc import abstractmethod

class FormatterBase(ABC):
  ...

  @abstractmethod
  def _reset_pos_columns(self) -> None:
    pass

  @abstractmethod
  def _set_sheetwide_view(self) -> None:
    pass

This structure ensures that any subclass of FormatterBase, will be required to implement the _reset_pos_columns() and _set_sheetwide_view() methods.

Omitting ABC

If you write:

class FormatterBase():

instead of:

class FormatterBase(ABC):

It will still work perfectly in Python, because this class will not be an abstract base class, meaning it doesn’t have any special restrictions, or functionality associated with abstract classes.

You can instantiate FormatterBase() directly, and it will behave as a normal class, without enforcing that its methods be overridden.

Basic Flow

Now we can call the function correctly in the basic flow.

Calc Macro: Step Three: Basic Flow

  def __format_one_sheet(self) -> None:
    self._max_row = self.__get_last_used_row()

    if not self.__is_first_column_empty():
      # Rearranging Columns
      print(' * Rearranging Columns')
      self._reset_pos_columns()
      self.__reset_pos_rows()
      self._max_row += 1

    # Apply Sheet Wide
    print(' * Formatting Columns')
    self._set_sheetwide_view()

Then, you implement the required functionality in the descendant class:

class FormatterTabularMovies(FormatterBase):
  ...

  def _set_sheetwide_view(self) -> None:
    ...

  def _reset_pos_columns(self) -> None:
    ...

This structure provides flexibility, ensuring that general formatting logic is kept in the base class, while allowing customization in subclasses.

Formatting: Sheetwide View

The set_sheetwide_view() method is responsible, for configuring two key features:

Hiding Gridlines: Gridlines are useful for data entry but can make reports look cluttered, so we hide them in the final output.
Freezing Panes: Freezing helps keep the header rows and columns visible, while scrolling through large datasets.

Calc Macro: Step Three: Procedure: Sheet Wide View

In the FormatterTabularMovies class, the set_sheetwide_view() method overrides, the abstract method from the FormatterBase class.

class FormatterTabularMovies(FormatterBase):
  def __init__(self) -> None:
    super().__init__(XSCRIPTCONTEXT.getDocument())

  def _set_sheetwide_view(self) -> None:
    # activate sheet
    spreadsheetView = self._controller
    spreadsheetView.setActiveSheet(self._sheet)

    # sheet wide
    spreadsheetView.ShowGrid = False
    spreadsheetView.freezeAtPosition(2, 3)

Customizing Freeze Position

Note that the freeze position (in this case, (2, 3)), can vary depending on the specific project or sheet layout. This position can either be configured dynamically later, or you can keep this method in the descendant class with hardcoded positions. The choice depends on your project requirements.

For now, let’s proceed with this simple implementation, and we can revisit the customization of this feature as needed.

Sheet Result

The resulting sheet will look as shown below:

Calc Sheet: Step Three: Output

Console Logging.

During execution, the following console logs, will be printed to indicate the script’s progress:

>>> tabular_single_movies()
 * Rearranging Columns
 * Formatting Columns
>>>

What is Next 🤔?

After completing the basic macro, the next step is to implement the simple configuration. We’ll explore how far we can take the formatting process, incorporating more customization and refinements along the way.

Additionally, we’ll apply design patterns. Specifically the abstract method pattern, to make the code more modular, extensible, and easier to maintain.

To continue with this approach, you can read more in the next section: [ Formatter - Simple Config- Two ].

Formatter - Simple Config - One

Preface

Why Separate Config?

Example Sheet

Class Diagram

About

01: Step One

Step Overview

Macro Source

Class Diagram

XSCRIPTCONTEXT

Handling XSCRIPTCONTEXT

Initialization

Document in Diagram.

Internal Diagram

Macro Call

Single and Multiple

Formatter Base Class

Formatter Tabular Movies Class

Arranging column

Implementation

Complete Formatting Procedure

02: Step Two

Step Overview

Macro Source

Class Diagram

Basic Flow

Helper: Prevent Column Rearrangement

Helper: Maximum Rows

Formatting: Row Position

Breakdown

Sheet Result

Console Logging.

03: Step Three

Step Overview

Macro Source

Class Diagram

Abstract Method

Omitting ABC

Basic Flow

Formatting: Sheetwide View

Customizing Freeze Position

Sheet Result

Console Logging.

What is Next 🤔?