LibreOffice - CSV Master Detail

LibreOffice

Python - Excel

Local Group

Preface

Goal: Master Detail CSV Reader

Using LibreOffice UNO Macro,
Using Faktur Pajak (Indonesian Tax) as Example,
With Metadata Approach, to avoid too many manual conditional branching.

Local Issue

Indonesia regulation required any VAT (PPn), to be reported using goverment application called ETaxInvoice. This tax application from goverment can be exported into CSV. But this CSV cannot be imported into LibreOffice, because the CSV is a master-detail table relationship.

I also use this case, as an example, of using LibreOffice Macro.

The macro shall be able to import CSV from ETaxInvoice, into LibreOffice.

Indonesian Tax

Bagi teman-teman yang butuh untuk meng-export, berkas CSV dari ETAXInvoice (eFaktur), dapat menggunakan script dari artikel ini, untuk di-ubah-pakai sesuai kebutuhan kantor masing-masing.

Script gratis ini hanya contoh, memang tidak sempurna, walaupun begitu cukup untuk kebutuhan sehari-hari.

Before you read this article, I suggest you to read these three articles as an introduction.

LibreOffice - Decoration Example (prerequisite).
Python - Excel - CSV - Part One (prerequisite).
Python - Excel - CSV - Part Two (optional).

The first two is a prerequisite, the third one is optional.

The second one contain basic CSV manipulation, in console.

Checking Entries: Output

Metadata Approach

To avoid too many hardcoded conditional branch, I keep all setting in metadata as possible.

1: CSV Case: Faktur Keluaran

Consider have a look at this example data.

Input: faktur-keluaran.csv

This data is simply a double quoted comma separated values. You can see the raw text file as below.

CSV: Faktur Keluaran: Plain

Have a look, one by one. There are at least three distinct content.

“FK”: This is the master table.
“FAPR”: This is also one to one relationship with “FK”.
“OF”: This is clearly a detail table of “FAPR”

Using Macro, we can achieve multi row master detail in a worksheet. For example this is the master table.

Python Macro: Worksheet Preview

And here below is the detail table, in the same worksheet.

Python Macro: Worksheet Preview

You can execute macro using APSO extension.

The result can be downloade here:

Output: sample.ods

2: Master Detail Class

It is a good idea to refactor stuff. Consider the tidier code using OOP below:

Source

This example is about only 276 lines of code.

gitlab.com/…/FakturMD/FakturMD.py

Skeleton

Consider begin with the big picture.

class FakturSample:
  def __init__(self, filename):
  def new_sheet(self):
  def split_quotes(self, header):
  def init_header_keys(self):
  def init_field_metadata(self):
  def write_header(self, fields):
  def set_divider_width(self):
  def get_number_format(self, format_string):
  def init_format(self):
  def date_ordinal(self, value, format_source):
  def write_cell(self, cell, metadata, pairs, field_key):
  def write_entry(self, row, fields, keys, values):
  def process_entry(self, row, line):
  def run(self):

def main():

We have a bunch of method definition as shown above.

UNO Import

Apart from common import, UNO constant can be accessed using import

Python Macro: Import

# coding: utf-8
from __future__ import unicode_literals

import re
from datetime import datetime
from com.sun.star.awt.FontWeight import BOLD
from com.sun.star.table.CellHoriJustify import CENTER

I also put, material color value, to be used later as cell background.

blueScale = {
  0: 0xE3F2FD, 1: 0xBBDEFB, 2: 0x90CAF9,
  3: 0x64B5F6, 4: 0x42A5F5, 5: 0x2196F3,
  6: 0x1E88E5, 7: 0x1976D2, 8: 0x1565C0,
  9: 0x0D47A1
}

Here, we import bold and center.

Class Initialization

This is a common part of loading a new blank worksheet.

Python Macro: Class Initialization

class FakturSample:
  def __init__(self, filename):
    # save initial parameter
    self.filename = filename

    # prepare header
    self.init_header_keys()
    self.init_field_metadata()

Nothing special here.

Main Program

The main program execution is just making class instance as below:

Python Macro: Main Program

def main():
  filename = '/home/epsi/Documents'+ \
             '/master-detail/faktur-keluaran.csv'

  sample = FakturSample(filename)
  sample.run()

You can put, your source csv in your own home directory. Or anywhere else to suit your need. Then update the path above.

Execute The Process

The whole process run step by step this run function.

Python Macro: Execute The Process

  def run(self):
    # open blank sheet
    self.new_sheet()

    # write headers
    self.write_header(self.fields_fk)
    self.write_header(self.fields_fapr)
    self.write_header(self.fields_of) 
    self.set_divider_width()

    # parse lines
    with open(self.filename) as f:
      lines = f.readlines()
      f.close()

    # write entries, ignore top headers
    row = 5
    for line in lines[3:]:
      row += 1
      self.process_entry(row, line)

Generally this is the step:

Open New Sheet
Write Headers
Write Entries using Loop

New Blank Sheet

All begin with blank sheet.

Python Macro: New Blank Sheet

  def new_sheet(self):
    # open new sheet
    desktop = XSCRIPTCONTEXT.getDesktop()
    model = desktop.loadComponentFromURL(
      "private:factory/scalc", "_blank", 0, ())

    self.sheet     = model.Sheets[0]
    self.numberfmt = model.NumberFormats
    self.locale    = model.CharLocale

    # post loading properties
    self.init_format()

Number format and locale can only be set, after creating a new blank sheet.

Helper: Split Quotes

Now we begin need to extract the CSV, using regular expression.

And also replace few things, to fix a few bugs, related to the csv source.

Python Macro: Helper: Split Quotes

  def split_quotes(self, header):
    header = header.replace(",\n", ',""')
    header = header.replace("\n", '')

    keys = re.split(r',(?=")', header)
    keys = [key.replace('"', '') for key in keys]
    return keys

Extract Header Field

I need to have verbatim copy from the government standard

Python Macro: Extract Header Field

This part is a copy-and-paste the column header definition.

  def init_header_keys(self):
    header_fk  = \
      '"FK","Kode","Ganti","Faktur","Masa",' + \
      '"Tahun","Tanggal","NPWP","Nama","Alamat",' + \
      '"DPP","PPn","PPnBM","Keterangan","FG",' + \
      '"UM DPP","UM PPn","UM PPnBM","Referensi"'

    header_fapr = \
      '"LT","Nama","Alamat","Perekam",' +\
      '"Wilayah","Timestamp","Hash"'

    header_of  = \
      '"OF","Kode","Nama","Satuan","Jumlah",' +\
      '"Total","Diskon","DPP","PPN","Tarif","PPnBM"'

And also extract the copy above, and make these keys internal properties of the class

    self.keys_fk   = self.split_quotes(header_fk)
    self.keys_fapr = self.split_quotes(header_fapr)
    self.keys_of   = self.split_quotes(header_of)

Consider see the result, so you can imagine how it is going to be.

Python Macro: Worksheet Preview

Field Metadata

The main advantage.

To keep all field tidy, I put all setting in metadata as possible.

You can make your own custom metadata template freely

The metadata control these:

Column Position (Cell Letter)
Column Width
Cell Data Type

This is the master table

Python Macro: Field Metadata

  def init_field_metadata(self):
    self.fields_fk = {
      'FK'       : { 'col': 'B', 'width': 0.3 },
      'Kode'     : { 'col': 'C', 'width': 0.4,
                     'format': '00' },
      'Ganti'    : { 'col': 'D', 'width': 0.4 },
      'Faktur'   : { 'col': 'E', 'width': 1.2, 'type': 'int',
                     'format': '000-00\.00000000' },
      'Lengkap'  : { 'col': 'F', 'width': 1.5 },
      'Masa'     : { 'col': 'G', 'width': 0.4, 'type': 'int', },
      'Tahun'    : { 'col': 'H', 'width': 0.5, 'type': 'int', },
      'Tanggal'  : { 'col': 'I', 'width': 0.8, 'type': 'date',
                     'format': 'DD-MMM-YY;@' },
      'NPWP'     : { 'col': 'J', 'width': 1.5, 'type': 'int',
                     'format': '00\.000\.000\.0-000\.000' },
      'Nama'     : { 'col': 'K', 'width': 3.0 },
      'Alamat'   : { 'col': 'L', 'hidden': True },
      'DPP'      : { 'col': 'M', 'width': 1.4, 'type': 'money' },
      'PPn'      : { 'col': 'N', 'width': 1.4, 'type': 'money' },
      'PPnBM'    : { 'col': 'O', 'width': 0.8, 'type': 'money' },
      'Keterangan' : { 'col': 'P', 'width': 0.8 },
      'FG'       : { 'col': 'Q', 'width': 0.3 },
      'UM DPP'   : { 'col': 'R', 'width': 1.4, 'type': 'money' },
      'UM PPn'   : { 'col': 'S', 'width': 1.4, 'type': 'money' },
      'UM PPnBM' : { 'col': 'T', 'width': 0.8, 'type': 'money' },
      'Referensi': { 'col': 'U', 'width': 0.8 }
    }

Python Macro: Field Metadata

The one to one relationship.

    self.fields_fapr = {
      'LT'       : { 'col': 'W',  'width': 0.5 },
      'Nama'     : { 'col': 'X',  'width': 2.0 },
      'Alamat'   : { 'col': 'Y',  'hidden': True },
      'Perekam'  : { 'col': 'Z',  'width': 1.0 },
      'Wilayah'  : { 'col': 'AA', 'width': 1.4 },
      'Timestamp': { 'col': 'AB', 'width': 0.8, 'type': 'date',
                       'format': 'DD-MMM-YY;@' }
    }

And the detail table.

    self.fields_of = {
      'OF'     : { 'col': 'AD',  'width': 0.3 },
      'Kode'   : { 'col': 'AE',  'width': 0.4 },
      'Nama'   : { 'col': 'AF',  'width': 1.5 },
      'Satuan' : { 'col': 'AG',  'width': 1.4, 'type': 'money' },
      'Jumlah' : { 'col': 'AH',  'width': 0.6, 'type': 'float' },
      'Total'  : { 'col': 'AI',  'width': 1.4, 'type': 'money' },
      'Diskon' : { 'col': 'AJ',  'width': 0.8, 'type': 'money' },
      'DPP'    : { 'col': 'AK',  'width': 1.4, 'type': 'money' },
      'PPn'    : { 'col': 'AL',  'width': 1.4, 'type': 'money' },
      'Tarif'  : { 'col': 'AM',  'width': 0.8, 'type': 'float' },
      'PPnBM'  : { 'col': 'AN',  'width': 0.8, 'type': 'money' },
    }

Writing Header to Sheet

Luckily we can make a generic method, for three headers “FK”, “FAPR”, or “OF”.

Python Macro: Writing Header to Sheet

It is a matter of decorating the cell.

  def write_header(self, fields):
    for key in fields:
      metadata = fields[key]
      letter = metadata['col']

      # take care cell
      cell = self.sheet["%s%d" % (letter, 2)]
      cell.String = key
      cell.CellBackColor = blueScale[3]
      cell.CharWeight    = BOLD
      cell.HoriJustify   = CENTER # or just 2

Python Macro: Writing Header to Sheet

And do not forget the width and visibility.

      # take care column
      column = self.sheet. \
        getColumns().getByName(letter)
      column.CharHeight   = 10
      column.CharFontName = "Arial"

      if 'width' in metadata.keys():
        column.Width = metadata['width'] * 2536

      # take care of visibility
      if ('hidden' in metadata.keys()) \
      and (metadata['hidden']==True):
        column.IsVisible = False

Divider Width

There might be something else not set in metadata.

We should set manually.

Python Macro: Divider Width

  def set_divider_width(self):
    # 2536 unit = 1"
    columns = self.sheet.getColumns()
    columns.getByName('A').Width  = 0.3 * 2536
    columns.getByName('V').Width  = 0.3 * 2536
    columns.getByName('AC').Width = 0.3 * 2536

I’m not sure the unit used in LibreOffice. I think it is about 100 mm. And I like to set up my LO in inch unit. So I end up with 2536 conversion number.

Consider see the preview here:

Python Macro: Worksheet Preview

Helper: Number Format

Since we cannot just set numberFormat. We have to add the format string to LO index first, then assign the index to the cell.

Python Macro: Number Format

    nf = self.numberfmt.queryKey(  \
              format_string, self.locale, True)
    if nf == -1:
       nf = self.numberfmt.addNew( \
              format_string, self.locale)
    return nf

Initialize Formatting String

There are two recurring format,

The money type and
The date type.

Python Macro: Initialize Formatting String

  def init_format(self):
    # beware of the comma or period,
    # depend on locale
    rupiah_string = \
      '" Rp"* #.##0,00 ;' + \
      '"-Rp"* #.##0,00 ;" Rp"* -# ;@ '

    date_string = 'DD-MMM-YY;@'

    self.rupiah_format = self. \
      get_number_format(rupiah_string)

    self.date_format = self. \
      get_number_format(date_string)

The Currency in Rupiah

Python Macro: Worksheet Preview

'" Rp"* #.##0,00 ;"-Rp"* #.##0,00 ;" Rp"* -# ;@'

The Date, using my favorite format

Python Macro: Worksheet Preview

'DD-MMM-YY;@'

Helper: Storing Date Value Programatically

We can make helper to have the value of the date. To be stored later in a cell in a worksheet.

Python Macro: elper: Storing Date Value Programatically

  def date_ordinal(self, value, format_source):
    # Offset of the date value
    # for the date of 1900-01-00
    offset = 693594

    date_value = datetime.strptime(
                   value, format_source)
    return date_value.toordinal() - offset

Writing Content to Cell

The main purpose is to put the right value to the cell, based on metadata, and also put the right format based on metadata.

But this could be tricky, for special case, such as date, that has different date format from the source. One timestamp, and the other is ordinary date.

Python Macro: Writing Content to Cell

  def write_cell(self, \
    cell, metadata, pairs, field_key):

    # extract common variable: value
    value = pairs[field_key]

    # date special case
    if 'type' in metadata.keys() \
    and metadata['type'] == 'date':
      if field_key=='Tanggal':
          cell.Value = self. \
            date_ordinal(value, "%d/%m/%Y")
      elif field_key=='Timestamp':
        if len(value) > 8:
          cell.Value = self. \
            date_ordinal(value[0:8], "%Y%m%d")
      cell.NumberFormat = self.date_format

Python Macro: Writing Content to Cell

The rest is easy. Just put the format according to metadata setting.

    # money special case
    elif 'type' in metadata.keys() \
    and metadata['type'] == 'money':
      cell.Value = float(value)
      cell.NumberFormat = self.rupiah_format

    # take care of format
    elif 'format' in metadata.keys():
      cell.Value = float(value)
      cell.NumberFormat = self. \
        get_number_format(metadata['format'])

    # no date, no number format
    else:  cell.String = value

Writing Each Line of Entry

This function extract the medatada and value.

Python Macro: Writing Each Line of Entry

  def write_entry(self, row, fields, keys, values):
    pairs = dict(zip(keys, values))

    for field_key in fields:
      metadata = fields[field_key]

      letter = metadata['col']
      cell = self.sheet["%s%d" % (letter, row)]

We can additionaly display progress to the console.

      # display progress
      if (values[0]=='FK') and (field_key == 'Nama'):
        print(pairs['Nama'])

And write each cell.

      # take care of value
      if field_key in keys:
        self.write_cell(cell, metadata, pairs, field_key)
      elif field_key=='Lengkap':
        faktur = "%013s" % pairs["Faktur"]
        faktur = faktur[:3] +'-'+ \
                 faktur[3:5] +'.'+ faktur[5:]
        cell.String = pairs["Kode"] + \
                 pairs["Ganti"] + "." + faktur

The only special case is, a field called ‘Lengkap’. This means complete serial number of the tax code. Such as:

010.004-21.31234567

This Lengkap field is a generated field content.

again, we can see the preview as below:

Python Macro: Worksheet Preview

Process Each Entry Line

We need to write three headers, based on the signature: “FK”, “FAPR”, or “OF”.

Python Macro: Process Each Entry Line

  def process_entry(self, row, line):
    values = self.split_quotes(line)

    match values[0]:
      case "FK"  : self.write_entry (row,
        self.fields_fk,   self.keys_fk,   values)
      case "FAPR": self.write_entry (row,
        self.fields_fapr, self.keys_fapr, values)
      case "OF"  : self.write_entry (row,
        self.fields_of,   self.keys_of,   values)

Let me remind the signature: “FK”, “FAPR”, or “OF”, from CSV source here:

CSV: Faktur Keluaran: Plain

All function definition has been explained here.

It is a good time to test the result!

Summary

As a summary, here is what this script do.

Using LibreOffice UNO Macro,
Using Faktur Pajak (Indonesian Tax) as Example,
With Metadata Approach, to avoid too many manual conditional branching.

Sheet Result

How about the output?

Consider to open the sample.ods spreadsheet.

Output: sample.ods

This is going to be very wide sheet. So I took five screenshots.

Python Macro: Worksheet Preview

I think that’s all.

What is Next 🤔?

Still about tax, I have a cool trick to transform plain row based data into cute form view.

Consider continue reading [ LibreOffice - Macro - Copy Range ].

LibreOffice - CSV Master Detail

Preface

Local Issue

Indonesian Tax

Related Article

Metadata Approach

1: CSV Case: Faktur Keluaran

2: Master Detail Class

Source

Skeleton

UNO Import

Class Initialization

Main Program

Execute The Process

New Blank Sheet

Helper: Split Quotes

Extract Header Field

Field Metadata

Writing Header to Sheet

Divider Width

Helper: Number Format

Initialize Formatting String

Helper: Storing Date Value Programatically

Writing Content to Cell

Writing Each Line of Entry

Process Each Entry Line

Summary

Sheet Result

What is Next 🤔?