Kettle, the tool in SkyVault Analytics that performs the ETL (Extract, Transform and
Load) jobs, can be configured to start and run automatically.
-
Create a user named kettle to run the automatic Kettle processes, and
ensure that this user has write permissions to the data-integration
directory of the Analytics installation and to all of its
subdirectories:
sudo useradd -m kettle
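The write-permission requirement above can be spot-checked with a small helper. This is a minimal sketch, not part of the product: the function name list_unwritable is hypothetical, and it assumes a POSIX shell and find. It prints every path under a directory that the current user cannot write to, so no output means the whole tree is writable.

```shell
# Hypothetical helper (not shipped with Kettle): print every path under DIR
# that the current user cannot write to. No output means the tree is writable.
# Usage: list_unwritable DIR
list_unwritable() {
    find "$1" -exec sh -c '
        for path in "$@"; do
            # -w tests write permission for the user running this shell
            [ -w "$path" ] || printf "%s\n" "$path"
        done
    ' sh {} +
}
```

Run it as the kettle user, passing the path of your data-integration directory. If any paths are listed, adjust their ownership or permissions before continuing.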
-
Test that you can run Kettle as the kettle user:
sudo su kettle -c 'cd opt//data-integration && ./kitchen.sh /file:ETL/all_fct.kjb > all_fct.log 2>&1'
-
Create a crontab file for the kettle user to run
Kettle at intervals:
crontab -u kettle -e
-
Set the time interval for running Kettle in the crontab file.
For example, to run Kettle every ten minutes, add the following line to the crontab file:
*/10 * * * * cd opt//data-integration && ./kitchen.sh /file:ETL/all_fct.kjb > all_fct.log 2>&1
Set Kettle to run every few minutes so that the ETL runs regularly. If the previous ETL job has not completed when the next run is due, Kettle restarts once the previous job has finished.
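As an annotated sketch of the schedule syntax, the fragment below repeats the ten-minute entry with the field order spelled out, followed by a commented-out five-minute variation. It assumes a cron implementation that accepts step values such as */10, which is typical on Linux. The command and path are copied unchanged from the example above.

```
# Field order: minute  hour  day-of-month  month  day-of-week  command
# "*/10" in the minute field means every tenth minute; "*" matches every value.
*/10 * * * * cd opt//data-integration && ./kitchen.sh /file:ETL/all_fct.kjb > all_fct.log 2>&1

# Variation: every five minutes (use instead of the entry above, not alongside it).
# */5 * * * * cd opt//data-integration && ./kitchen.sh /file:ETL/all_fct.kjb > all_fct.log 2>&1
```

Comments are kept on their own lines because text after the command is passed to the shell as part of the command.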
Note: Leave an empty line at the end of your crontab file, or it might not run. Also check that the second colon-delimited field for the kettle user in the /etc/shadow file does not contain an exclamation mark (!). If it does, replace the exclamation mark with an asterisk (*) before re-saving the crontab file.
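The /etc/shadow check in the note can be scripted. This is a minimal sketch under stated assumptions: the function name password_field_has_bang is hypothetical, and the file is assumed to use the standard colon-delimited shadow layout with the user name in the first field. The function succeeds (exit status 0) when the named user's second field contains an exclamation mark, and fails otherwise.

```shell
# Hypothetical helper: succeed if USER's second colon-delimited field in a
# shadow-format FILE contains "!". FILE defaults to /etc/shadow.
# Usage: password_field_has_bang USER [FILE]
password_field_has_bang() {
    user="$1"
    file="${2:-/etc/shadow}"
    # Field 1 is the user name, field 2 is the password field.
    awk -F: -v u="$user" '
        $1 == u && $2 ~ /!/ { found = 1 }
        END { exit !found }
    ' "$file"
}
```

Reading /etc/shadow normally requires root privileges, so run the check from a root shell, for example: password_field_has_bang kettle && echo "exclamation mark present".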