Combine several Excel sheets on Knime

MicrosoftTeams-image (1)

Introduction:

Today, we are going to approach a particular scenario that can appear boring but becomes simple to solve once we settled our logic.

Let’s suppose that we must analyze the sailing evolution of a company on the last three years.

The sailing’s historic is to find on an Excel file, that combines over 36 sheets (each sheet contains the sailings of a given month or year).

To start the analyze, we must then group our different sheets into a unique one.

Doing it manually, is an option but when the number of sheets is as high as it is in our case, it is more judicious to use Knime’s functionalities. Let’s continue!

Inputs:

You’ll receive this Excel sheet as to proceed to our analyze.

It contains 36 sheets grouping sailings from 01-2020 to 12-2022.

How to proceed?

To combine our different sheets, 4 steps are necessary:

  1. Supply an Excel file, containing sheets/tabs that we want to combine.
  2. Convert the Excel sheets into variables and create a loop.
  3. Read the sheet corresponding to the actual variable.
  4. Keep data of the sheet that had just been read and skip to the next variable.
The loop arrives to its end when all the variables had been browsed.

Step by step:

Now that we have incorporated the logic behind our method, it is time to build our workflow:

Step 1:

Double-click on the node Read Excel Sheet Names to give the location and the file which we are working with. That node will have 2 output columns: Path ( location of the sheet ) and Sheet ( appellation of the sheet ).
Yet it is the Sheet column that is important since it will be converted in the next step, into a variable.

Step 2:

The configuration of the node Table row to variable loop starts :

The green box is the location where we insert the columns that we want to convert into variable.

In our case, we want to extract data for each sheet’s name in the file. We then need the sheet’s name, in that case Sheet, to be in the green box to be the variable that leads to the loop. Now that we are configured, we can execute the loop.

Step 3:

Now, we are going to configure the Excel Reader loop.

First, we are going to supply the file from which we want to extract data, then we are going to give the variable that contains the names of the sheets we are trying to combine. To do so, we need to click on the Flow Variables tab.

Click on Settings then select sheet in sheet_name as it is shown:

Here, the sheet’s name option is defined on the variable that we have created during the previous step. We get back data from the file X for the sheet that is now defined by the loop

Step 4:

Last, and easiest step, it is now about configuring the Loop End according to your preferences.

Personally, I wish to have a new row of numbers and don’t need the iteration’s column.

Result:

Now that all our nodes are configured, we can execute our workflow.

Once the loop done, the final table should look like this:

As you can see, we succeed to combine data from different sheets of our Excel file!

Conclusion:

The transformation of the sheets’ name into variable and the creation of a loop are the solution when you end up working with that kind of data. Your time is too precious to waste it on repetitive and tedious tasks when Knime offers functionalities that allows to automate them!

Write a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.