Preface
Goal: Extract worksheet from one xlsx into multiple xlxs.
The reverse of previous process is easier.
Example Data
What case do I need to work with? Yeps, still my last annual accounting report workbook.
- Balance Sheet, monthly from January until December
- Profit Loss, monthly from January until December
Sheet Naming
I name my balance sheet as BS-01
,
and profit loss sheet as ‘PL-01`.
You can see them all in this sheet tabs:
You can see all the sheet name in sheets tab.
File Naming
Let’s extract some sheet
The expected result is similar to this folder below:
How does it works.
Loop in a loop.
-
The outer loop, we query the sheet name. Each item defined by its index.
-
The Inner loop, we walk for each sheet. Remove unused sheet, and keep sheet with the right index.
Limitation
The same as previous.
1: Query Sheet
Script
Very similar with previous script.
But we need index
to identify which sheet to export.
import openpyxl
from openpyxl import load_workbook
wb = load_workbook('monthly.xlsx')
total = len(wb.worksheets)
print("Total %d sheet(s)." % total)
for index, sheet in enumerate(wb.worksheets, start=1):
print("%2d: %s" % (index, sheet.title))
Result
The result is as simply as below:
Total 24 sheet(s).
1: 01-BS
2: 01-PL
3: 02-BS
4: 02-PL
5: 03-BS
6: 03-PL
7: 04-BS
8: 04-PL
9: 05-BS
10: 05-PL
11: 06-BS
12: 06-PL
13: 07-BS
14: 07-PL
15: 08-BS
16: 08-PL
17: 09-BS
18: 09-PL
19: 10-BS
20: 10-PL
21: 11-BS
22: 11-PL
23: 12-BS
24: 12-PL
2: Extract One
Prepare Directory
First we need to prepare directory.
import os
import openpyxl
from openpyxl import load_workbook
loadname = 'monthly.xlsx'
noext = os.path.splitext(loadname)[0]
try:
os.mkdir(noext)
except OSError as error:
print(error)
Execute the Inner Loop
For testing we use the second sheet.
Here we use hardcoded if index_loop == 2:
wb_loop = load_workbook(loadname)
savename = ""
for index_loop, sheet_loop in enumerate(
wb_loop.worksheets, start=1):
if index_loop == 2:
print("Keep %2d: %s" % (index_loop, sheet_loop.title))
savename = "%s.xlsx" % sheet_loop.title
else:
wb_loop.remove(sheet_loop)
Save
And finally save the sheet.
# Save the file
if savename:
wb_loop.save("./%s/%s" % (noext, savename))
print("%s saved" % savename)
else:
print("No sheet saved")
Result
The result is as simply as below:
Keep 2: 01-PL
01-PL.xlsx saved
The sheet
Also check the file.
there should be only one sheet inside file 01-PL.xlsx
3: Extract All sheet
Very similar with previous script, but using loop in a loop.
Prepare Directory
The same as previous.
Prepare Outer Loop
Using main workbook.
wb_main = load_workbook(loadname)
total = len(wb_main.worksheets)
print("Total %d sheet(s)." % total)
Execute the Outer Loop
This also prepare the inner loop.
for index_main, sheet_main in enumerate(
wb_main.worksheets, start=1):
wb_loop = load_workbook(loadname)
savename = ""
Execute the Inner Loop for Each Sheet
This is executed inside the outer loop (wb.main
).
For real script we compare the index
if index_loop == index_main:
for index_loop, sheet_loop in enumerate(
wb_loop.worksheets, start=1):
if index_loop == index_main:
print("Keep %2d: %s" % (index_loop, sheet_loop.title))
savename = "%s.xlsx" % sheet_loop.title
else:
wb_loop.remove(sheet_loop)
Save Each Sheet
This also is executed inside the outer loop (wb.main
).
And finally save the sheet.
# Save the file
if savename:
wb_loop.save("./%s/%s" % (noext, savename))
print("%s saved" % savename)
else:
print("No sheet saved")
Result
The result is rather long:
Total 24 sheet(s).
Keep 1: 01-BS
01-BS.xlsx saved
Keep 2: 01-PL
01-PL.xlsx saved
Keep 3: 02-BS
02-BS.xlsx saved
Keep 4: 02-PL
02-PL.xlsx saved
Keep 5: 03-BS
The sheet
Also check the file. There should be many sheet inside the folder.
With each file consist only one sheet only.
What is Next 🤔?
We have done with merging. How about extracting?
Consider continue reading [ Python - Excel - Extract Sheets ].