Step 7. Setting up the database schema and ETL jobs

Configure the ETL (Extract, Transform, and Load) jobs and create the database schema for SkyVault Analytics.

Make sure that you have your Data Integration (DI) database (MySQL or PostgreSQL) set up and working. Use this information to set up Kettle (the tool that runs the ETL jobs) to use this database, and to connect to the SkyVault database and the Analytics server.

- Navigate to the Analytics installation directory. Use the data-integration/ETL/config/config.properties.sample file to create a kettle.properties file in <kettle-user-home>/.kettle/, where <kettle-user-home> is the home directory of the user that will run the Kettle script:
- In the SkyVault DATABASE section, enter the details required to read from the database used by your SkyVault installation.
- In the ETL (WAREHOUSE) DATABASE section, enter the details required to read from and write to the database that you created in Setting up the DI database.
Important: The db_dialect attribute must be set to pgsql or mysql, depending on whether you have PostgreSQL or MySQL configured as your DI database.
- In the BA SERVER CONNECTION section, set the value of ba_serverURL to the URL of the Analytics server that the ETL job uses to flush the cache; for example, http://localhost:8080/ba-server.
- In the same section, set the ba_admin and ba_password values to match the user name and password that you have set for the Analytics administrator user.
- In the ETL Staged Message Purge (Hours) section, you can optionally change the value of etl_purge_msg_older_than_hrs. By default, all processed messages are purged after 72 hours.
- In the BA server cache flush period (mins) section, you can optionally change the value of etl_ba_server_cache_flush. By default, the Mondrian cache is flushed every 360 minutes (6 hours).
You can review the default settings for config.properties.sample in config.properties. Do not edit config.properties directly.
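As an illustration only (not part of the shipped sample), one way to create the file is to copy the sample into place and then edit it:
cp data-integration/ETL/config/config.properties.sample <kettle-user-home>/.kettle/kettle.properties
A completed kettle.properties might then contain entries along the lines of the sketch below. The database connection keys are omitted because their exact names come from config.properties.sample; the ba_admin and ba_password values shown are placeholders, and only the property names called out in the steps above appear here:

# SkyVault DATABASE: copy the exact connection keys (host, port, database name,
# user, password) from config.properties.sample and fill in your SkyVault
# database details here.

# ETL (WAREHOUSE) DATABASE: copy the connection keys from config.properties.sample
# and fill in the DI database details. db_dialect must be pgsql or mysql.
db_dialect=mysql

# BA SERVER CONNECTION (admin credentials below are placeholders)
ba_serverURL=http://localhost:8080/ba-server
ba_admin=admin
ba_password=admin

# ETL Staged Message Purge (Hours); default is 72
etl_purge_msg_older_than_hrs=72

# BA server cache flush period (mins); default is 360 (6 hours)
etl_ba_server_cache_flush=360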
- Verify that the database parameters are correct by running the test_setup.kjb job:
cd data-integration
./kitchen.sh /file:ETL/test_setup.kjb
Alternatively, use this command if the user that created the .kettle/kettle.properties file is different from the current user:
cd data-integration
sudo su <kettle-user> -c './kitchen.sh /file:ETL/test_setup.kjb'
where <kettle-user> is the user that created the kettle.properties file.
The SkyVault, Process, and Pentaho database connections should indicate that processing completed successfully with an output of result=[true]. Here is a sample log from test_setup.kjb:
2015/01/06 16:04:52 - test_setup - Starting entry [Output database details]
2015/01/06 16:04:52 - config - SkyVault: jdbc:mysql://localhost:3306/SkyVault
2015/01/06 16:04:52 - config - Process: jdbc:mysql://localhost:3306/SkyVault
2015/01/06 16:04:52 - config - Pentaho: jdbc:mysql://localhost:3306/pentaho_di
2015/01/06 16:04:52 - test_setup - Starting entry [Check SkyVault Db Connection]
2015/01/06 16:04:52 - test_setup - Starting entry [Check Pentaho Db connection]
2015/01/06 16:04:52 - test_setup - Starting entry [Check Process Db connection]
2015/01/06 16:04:52 - test_setup - Finished job entry [Check Process Db connection] (result=[true])
2015/01/06 16:04:52 - test_setup - Finished job entry [Check Pentaho Db connection] (result=[true])
2015/01/06 16:04:52 - test_setup - Finished job entry [Check SkyVault Db Connection] (result=[true])
2015/01/06 16:04:52 - test_setup - Finished job entry [Output database details] (result=[true])
2015/01/06 16:04:52 - test_setup - Finished job entry [set env] (result=[true])
2015/01/06 16:04:52 - test_setup - Job execution finished
2015/01/06 16:04:52 - Kitchen - Finished!
2015/01/06 16:04:52 - Kitchen - Start=2015/01/06 16:04:49.979, Stop=2015/01/06 16:04:52.881
2015/01/06 16:04:52 - Kitchen - Processing ended after 2 seconds.
You are now ready to invoke, using kitchen.sh, the ETL job that creates the database schema and populates initial data into the data warehouse.
- From the data-integration directory, run the Kettle kitchen.sh script:
./kitchen.sh /file:ETL/schema_setup.kjb
If the user that created the kettle.properties file is different from the current user, you will need to use the command:
cd data-integration
sudo su <kettle-user> -c './kitchen.sh /file:ETL/schema_setup.kjb'
where <kettle-user> is the user that created the kettle.properties file.
This script runs the ETL job to create the schema and to populate initial data into the data warehouse.
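As an optional sanity check (this step is not part of the documented procedure), you can confirm that the warehouse tables were created by listing the tables in the DI database; <di-user> and <di-database> are placeholders for your own values:
mysql -u <di-user> -p <di-database> -e 'SHOW TABLES;'
or, for PostgreSQL:
psql -U <di-user> -d <di-database> -c '\dt'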
- Navigate to ba-server/tomcat/conf/context.xml and check that the details of the jdbc/AlfrescoDataSource Resource element match the settings that you entered in the kettle.properties file.
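For comparison, a Tomcat Resource element for jdbc/AlfrescoDataSource typically looks like the sketch below. The driver class, URL, and credentials are placeholders (the URL simply mirrors the SkyVault database URL shown in the sample log above); compare the attributes against the values you entered in kettle.properties rather than copying this example:
<Resource name="jdbc/AlfrescoDataSource"
          auth="Container"
          type="javax.sql.DataSource"
          driverClassName="com.mysql.jdbc.Driver"
          url="jdbc:mysql://localhost:3306/SkyVault"
          username="<skyvault-db-user>"
          password="<skyvault-db-password>" />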