15 November, 2008

Getting Started With Spoon

Conventions Used

Bold Text : represent new / important concept such as type of step and also representing interface components such as button, menu item, etc.

[Bold in Bracket] : represent step name


What is Spoon ?

Spoon is graphical utility in Kettle suite to do these functionalities :
  • design and run job / transformation
  • job / transformation profiling
  • set up database connection, variable/object sharing, configuration, etc
This article will serve as introductory tutorial for those who are new to Spoon and Kettle.

Running Spoon

  • Spoon in Windows can be executed using these 2 files :
    • Binary : kettle.exe
    • Batch script : spoon.bat




  • Execute one of the two files





  • Upon starting, you will be presented a dialog to choose a repository - a relasional database system repository to store job/transfformation object detail. Clik on No repository button for now to choose our filesystem as our storage.







  • A tips dialog will pop up shortly. Click on close to continue.
  • You will be presented a "Welcome to Pentaho Data Integration: Kettle Project" page in Welcome tag.
  • Done

Spoon Interface

  • Pulldown Menu  : a main navigation for spoon. Comprising of all needed operations on designing, previewing and executing job / transformation.







  • Toolbar : contain icons for basic file operation, printing, repository view and job / transformation operations.






  • Left panel  : contain configuration, steps and favorite steps used splitted in Main Tree, Core Objects dan Favorite Steps sections.







  • Workspace Tabs : act as divider between your workspaces. Picture below show the tag collections of  1 welcome page, 2 transformations and 1 job.





Creating a Transformation File

Here I show you an example how to create a transformation that read a csv (comma separated value)  file containing 33 provinces of Indonesia and dump it to another csv file. The propinsi.csv sample file can be downloaded at the end of this article or directly through this link. Save the file in C:\contoh_kettle folder. 

propinsi.csv

Kode_Propinsi,Deskripsi
P01,Bali
P02,Bengkulu
P03,Banten
P04,Gorontalo
P05,Irian Jaya Barat
P06,Papua
P07,Jambi
P08,Jawa Barat
P09,Jawa Tengah
P10,Jawa Timur
P11,Kalimantan Barat
P12,Kalimantan Tengah
P13,Kalimantan Timur
P14,Kalimantan Selatan
P15,Kepulauan Bangka Belitung
P16,Kepulauan Riau
P17,Lampung
P18,Maluku
P19,Maluku Utara
P20,Nusa Tenggara Barat
P21,Nusa Tenggara Timur
P22,Riau
P23,Sulawesi Barat
P24,Sulawesi Tengah
P25,Sulawesi Tenggara
P26,Sulawesi Selatan
P27,Sulawesi Utara
P28,Sumatra Barat
P29,Sumatra Selatan
P30,Sumatra Utara
P31,DI Yogyakarta
P32,DKI Jakarta
P33,Nanggroe Aceh Darussalam


Now here are the steps to create a transformation handling the csv files :
  • To create a transformation you can do it with several ways :
    • Choose File | New | Transformation from the pulldown menu 







    • Click on  New | Transformation on the toolbar





    • With a CTRL + N shortcut key

  • A new Transformation 1 workspace tab will show up, you rename it by saving our newly created transformation. Press CTRL + S and save file as c:\contoh_kettle\baca_propinsi.ktr
  • Notice that now our tab is changed to baca_propinsi.

Transformation Steps

  • Transformation will consist of several steps. Now we will drop a step visually in our workspace to read the content of ourC:\contoh_kettle\propinsi.csv file.
  • In the left panel open up Core Objects | Input category. Here you will find several steps to read input from several file formats.
  • For our need, we will use CSV file input step.




  • Click on CSV file input icon and drag it into our workspace.






  • Double click on the [CSV file input] step until a dialog shows up.
    • In Filename section fill in c:\contoh_kettle\propinsi.csv. Left the other as they are now.
    • Click on Get Fields to retrieve our known fields. Click OK and Close subsequently for another dialogs that pop up.






    • You will a configuration with values shown like picture below. Click OK to return to our workspace.





  • Save our transfomation file

Data Preview

  • We can preview several records read from our step using preview facility in Spoon.
  • To demonstrate this, click on [CSV file input] and click Preview icon in the toolbar then click on Quick Launch button that shows up.





  • In seconds you will have Examine Preview Data dialog with a number data of records previewing in this window. Close it for now by clicking Close button.







  • Done

Joining Step with a Hop

  • From the left panel open Core Objects | Scripting and drag Modified Java Script Value step into your transformation workspace.
  • Hold CTRL key, click on both [CSV File Input] step and [Modified Java Script Value] then right click and choose New Hop. Click OK on dialog  that show up.






  • We just created a hop that bridging the two steps we created before.






Transform our Data

  • One of  Modified Java Script Value step functionality is to change our data using programmatically using several built in operator and functions. If you know Java very well, you can also embedded Java code in this step. Double click on the step and type following code in the editor that appears.





    var Deskripsi_lengkap = "Propinsi " + Deskripsi;
    var No_urut = getProcessCount("r");

  • Make sure that Compatibility mode ? is unchecked
  • Click on Get variables button.
  • You will have a dialog look as below.






  • Click OK.
  • Evaluate this step by previewing data on it.

Output Step

  • Now we will dump our result from [Modified Java Script Value] to a text file, C:\contoh_kettle\propinsi.txt.
  • Again, from the left panel open Core Objects | Output and drag Text file output type step to workspace.
  • Joining [Modified Java Script Value] and [Text file output] with a hop.





  • Double click on [Text file output]
  • In the pop up Text file output dialog  click on file tab and type  C:\contoh_kettle\propinsi in Filename section.
  • Still on the dialog, click on Fields tab and click Get Fields button to have 4 fields show up (Kode_Propinsi, Deskripsi,Deskripsi_lengkapText file output).
  • Click OK.

Running Transformation

  • Now our transformation already has our goals : read a csv text file, change some value and put it into another 2 fields,  and save the combining fields into a new csv text file.
  • Run the transformation by click Run button on the toolbar.








  • Click Launch on the dialog.
  • You will be redirected to a log workspace with running steps detail information, for example how many rows that are read and written in the step. I will not going further by explaining parts of this workspace but please notice at the bottom panel where there are  detailed logs output there. You see in the last lines that our transformation has been successfully executed.





    Execution logs

    2008/08/24 18:31:22 - baca_propinsi - Dispatching started for transformation [baca_propinsi]
    2008/08/24 18:31:23 - Spoon - The transformation has finished!!
    2008/08/24 18:52:54 - Spoon - Launching transformation [baca_propinsi]...
    2008/08/24 18:52:54 - Spoon - Started the transformation execution.
    2008/08/24 18:52:54 - baca_propinsi - Dispatching started for transformation [baca_propinsi]
    2008/08/24 18:52:55 - Spoon - The transformation has finished!!
  • Now take a look at our C:\contoh_kettle folder, we will have 1 more file there. A newly created propinsi.txt. Open the file with your favorite text editor and see the change from the original file.




    propinsi.txt

    Kode_Propinsi;Deskripsi;Deskripsi_lengkap;No_urut
    P01;Bali                     ;Propinsi Bali;00000000000001.00
    P02;Bengkulu                 ;Propinsi Bengkulu;00000000000002.00
    P03;Banten                   ;Propinsi Banten;00000000000003.00
    P04;Gorontalo                ;Propinsi Gorontalo;00000000000004.00
    P05;Irian Jaya Barat         ;Propinsi Irian Jaya Barat;00000000000005.00
    P06;Papua                    ;Propinsi Papua;00000000000006.00
    P07;Jambi                    ;Propinsi Jambi;00000000000007.00
    P08;Jawa Barat               ;Propinsi Jawa Barat;00000000000008.00
    P09;Jawa Tengah              ;Propinsi Jawa Tengah;00000000000009.00
    P10;Jawa Timur               ;Propinsi Jawa Timur;00000000000010.00
    P11;Kalimantan Barat         ;Propinsi Kalimantan Barat;00000000000011.00
    P12;Kalimantan Tengah        ;Propinsi Kalimantan Tengah;00000000000012.00
    P13;Kalimantan Timur         ;Propinsi Kalimantan Timur;00000000000013.00
    P14;Kalimantan Selatan       ;Propinsi Kalimantan Selatan;00000000000014.00
    P15;Kepulauan Bangka Belitung;Propinsi Kepulauan Bangka Belitung;00000000000015.00
    P16;Kepulauan Riau           ;Propinsi Kepulauan Riau;00000000000016.00
    P17;Lampung                  ;Propinsi Lampung;00000000000017.00
    P18;Maluku                   ;Propinsi Maluku;00000000000018.00
    P19;Maluku Utara             ;Propinsi Maluku Utara;00000000000019.00
    P20;Nusa Tenggara Barat      ;Propinsi Nusa Tenggara Barat;00000000000020.00
    P21;Nusa Tenggara Timur      ;Propinsi Nusa Tenggara Timur;00000000000021.00
    P22;Riau                     ;Propinsi Riau;00000000000022.00
    P23;Sulawesi Barat           ;Propinsi Sulawesi Barat;00000000000023.00
    P24;Sulawesi Tengah          ;Propinsi Sulawesi Tengah;00000000000024.00
    P25;Sulawesi Tenggara        ;Propinsi Sulawesi Tenggara;00000000000025.00
    P26;Sulawesi Selatan         ;Propinsi Sulawesi Selatan;00000000000026.00
    P27;Sulawesi Utara           ;Propinsi Sulawesi Utara;00000000000027.00
    P28;Sumatra Barat            ;Propinsi Sumatra Barat;00000000000028.00
    P29;Sumatra Selatan          ;Propinsi Sumatra Selatan;00000000000029.00
    P30;Sumatra Utara            ;Propinsi Sumatra Utara;00000000000030.00
    P31;DI Yogyakarta            ;Propinsi DI Yogyakarta;00000000000031.00
    P32;DKI Jakarta              ;Propinsi DKI Jakarta;00000000000032.00
    P33;Nanggroe Aceh Darussalam ;Propinsi Nanggroe Aceh Darussalam;00000000000033.00

  • Done

Conclusion

Congratulations, you have just created a simple transfomation designed and executed in Spoon. For more reading on transformation you can check on Pentaho wiki. We will also frequently update our samples in Kettle section. Soalways stay tuned :)

If you have any question at this article, feel free to drop us a note at info@phi-integration.com.

You can download the sample files from http://pentaho.phi-integration.com/kettle/spoon.

No comments: