Scheduled Jobs

SkyVault automatically runs a number of scheduled jobs, for example the content store cleaner job and temporary file cleaner job. It is possible to configure new scheduled jobs.
Information: Scheduled Jobs
Support Status: Full Support
Architecture Information: Platform Architecture
Description: A scheduled job in SkyVault can be compared to a Unix cron job. It is kicked off based on a cron expression and can then execute a piece of Java or JavaScript code. The SkyVault Repository embeds the Quartz job scheduler, which is configured through the Spring Framework's Quartz integration. It works with triggers, jobs, and job details, which together make it possible to define all kinds of scheduled jobs.

To define a new job we start with the job implementation. Create a class with an execute method as follows:

public class ScheduledJobExecuter {
    private static final Logger LOG = LoggerFactory.getLogger(ScheduledJobExecuter.class);

    /**
     * Public API access
     */
    private ServiceRegistry serviceRegistry;

    public void setServiceRegistry(ServiceRegistry serviceRegistry) {
        this.serviceRegistry = serviceRegistry;
    }

    /**
     * Executer implementation
     */
    public void execute() {
        LOG.info("Running the scheduled job");

        // Work/Job implementation goes here...
    }
}

The class can be called anything you like, but it is good practice to name it after the job it is executing. This example is only a template that shows the structure; all it does is write a log statement. Use the ServiceRegistry to get to any Public API services that are needed for the implementation, such as the NodeService.

We then create the Job details class as follows:

public class ScheduledJob extends AbstractScheduledLockedJob implements StatefulJob {
    @Override
    public void executeJob(JobExecutionContext context) throws JobExecutionException {
        JobDataMap jobData = context.getJobDetail().getJobDataMap();

        // Extract the Job executer to use
        Object executerObj = jobData.get("jobExecuter");
        // instanceof is also false for null, so a missing entry is rejected here too
        if (!(executerObj instanceof ScheduledJobExecuter)) {
            throw new SkyVaultRuntimeException(
                    "ScheduledJob data must contain a valid 'jobExecuter' reference");
        }

        final ScheduledJobExecuter jobExecuter = (ScheduledJobExecuter) executerObj;

        AuthenticationUtil.runAs(new AuthenticationUtil.RunAsWork<Object>() {
            public Object doWork() throws Exception {
                jobExecuter.execute();
                return null;
            }
        }, AuthenticationUtil.getSystemUserName());
    }
}

The Job details class extends the AbstractScheduledLockedJob class, which uses the job lock service to lock the job so that it can run safely in a cluster. It is also important that it implements the StatefulJob interface so the job is not triggered concurrently on different nodes. The Job details class expects the Job executer to be passed in to it so it can use it to execute the scheduled job. The runAs section of the code sets which user the job is executed as; in this case it has been set up to use the System user. To use the Admin user instead, change AuthenticationUtil.getSystemUserName() to AuthenticationUtil.getAdminUserName(). This is the only implementation needed; the rest is Spring configuration.
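
The idea of locking a job so that overlapping runs are skipped can be illustrated without any SkyVault classes. The following is only a minimal sketch under stated assumptions: LockedJobSketch and runIfNotLocked are hypothetical names, and an in-memory AtomicBoolean stands in for the cluster-wide job lock. It is not how AbstractScheduledLockedJob or the job lock service is actually implemented.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration: an in-memory flag stands in for the cluster-wide job lock. */
final class LockedJobSketch {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    /** Runs the work only if the lock is free; returns false when the run was skipped. */
    boolean runIfNotLocked(Runnable work) {
        if (!locked.compareAndSet(false, true)) {
            return false; // another execution holds the lock, so this run is skipped
        }
        try {
            work.run();
            return true;
        } finally {
            locked.set(false); // always release the lock, even if the work throws
        }
    }

    public static void main(String[] args) {
        LockedJobSketch job = new LockedJobSketch();
        boolean outer = job.runIfNotLocked(() -> {
            // A second trigger arriving while the first run is active is refused
            boolean nested = job.runIfNotLocked(() -> { });
            System.out.println("overlapping run executed: " + nested);
        });
        System.out.println("first run executed: " + outer);
    }
}
```

In a real cluster the lock has to be shared between nodes, which is the part the job lock service takes care of.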

Start by defining a Spring bean for the Job executer as follows:

<bean id="org.alfresco.tutorial.scheduledjob.actions.ScheduledJobExecuter"
    class="org.alfresco.tutorial.scheduledjob.actions.ScheduledJobExecuter">
  <property name="serviceRegistry">
      <ref bean="ServiceRegistry" />
  </property>
</bean>

Then the Job detail bean:

<bean id="org.alfresco.tutorial.scheduledjob.jobDetail" class="org.springframework.scheduling.quartz.JobDetailBean">
  <property name="jobClass">
      <value>org.alfresco.tutorial.scheduledjob.jobs.ScheduledJob</value>
  </property>
  <property name="jobDataAsMap">
      <map>
          <entry key="jobExecuter">
              <ref bean="org.alfresco.tutorial.scheduledjob.actions.ScheduledJobExecuter" />
          </entry>
          <entry key="jobLockService">
              <ref bean="jobLockService" />
          </entry>
      </map>
  </property>
</bean>

The Job detail bean takes a jobClass representing the Job details implementation and a jobExecuter class that will do the actual work. The jobLockService is passed in to handle locking in a cluster environment.
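
The link between the jobDataAsMap entries and the lookup in executeJob can be sketched with a plain map. This is an illustration only: JobDataLookupSketch, requireExecuter, and the nested Executer interface are hypothetical stand-ins for the Quartz JobDataMap and the ScheduledJobExecuter class, and a standard IllegalStateException replaces the SkyVault exception type.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of looking up and validating an entry from the job data. */
final class JobDataLookupSketch {
    /** Stand-in for the ScheduledJobExecuter class. */
    interface Executer {
        void execute();
    }

    /** Returns the executer stored under the "jobExecuter" key, or fails with a clear message. */
    static Executer requireExecuter(Map<String, Object> jobData) {
        Object candidate = jobData.get("jobExecuter");
        // instanceof is false for null, so a missing entry and a wrong type are both rejected
        if (!(candidate instanceof Executer)) {
            throw new IllegalStateException("Job data must contain a valid 'jobExecuter' entry");
        }
        return (Executer) candidate;
    }

    public static void main(String[] args) {
        Map<String, Object> jobData = new HashMap<>();
        // The key must match the entry key used in the jobDataAsMap configuration
        jobData.put("jobExecuter", (Executer) () -> System.out.println("Running the scheduled job"));
        requireExecuter(jobData).execute();
    }
}
```

The point of the sketch is that the string key used in the Spring configuration and the key used in the Java lookup have to agree; a typo in either place only shows up when the job runs.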

The next step is to define a Job trigger bean:

<bean id="org.alfresco.tutorial.scheduledjob.trigger" class="org.springframework.scheduling.quartz.CronTriggerBean">
  <property name="jobDetail">
      <ref bean="org.alfresco.tutorial.scheduledjob.jobDetail" />
  </property>
  <property name="cronExpression">
      <value>${org.alfresco.tutorial.scheduledjob.cronexpression}</value>
  </property>
  <property name="startDelay">
      <value>${org.alfresco.tutorial.scheduledjob.cronstartdelay}</value>
  </property>
</bean>

In this case we have defined a Cron trigger; there are other triggers, such as SimpleTriggerFactoryBean, that can be used too. The trigger bean takes a reference to the jobDetail bean so it knows what job to kick off. It also takes two parameters: the cron expression and the cron start delay. In this case these parameters are set via external properties. This is good practice, as it lets a System Administrator manage the scheduled jobs. These properties go into the module's SkyVault-global.properties as follows:

org.alfresco.tutorial.scheduledjob.cronexpression=0 0/2 * * * ?
org.alfresco.tutorial.scheduledjob.cronstartdelay=240000

In this case the scheduled job is set up to run every two minutes, with a start delay of 4 minutes (240000 milliseconds). The start delay is important because it makes it possible to delay all scheduled jobs until the SkyVault server has started up properly; otherwise search, for example, might not work properly.
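
To make the two values more concrete, here is a small sketch that expands the minutes field of the cron expression and converts the start delay into minutes. It is not a cron parser: CronFieldSketch and its methods are hypothetical, and only the start/increment form used in this example (0/2) is handled.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; handles only the "start/increment" form of a minutes field. */
final class CronFieldSketch {
    /** Expands a field such as "0/2" into the minutes of the hour on which the job fires. */
    static List<Integer> expandMinutes(String field) {
        String[] parts = field.split("/");
        int start = Integer.parseInt(parts[0]);
        int increment = Integer.parseInt(parts[1]);
        List<Integer> minutes = new ArrayList<>();
        for (int minute = start; minute < 60; minute += increment) {
            minutes.add(minute);
        }
        return minutes;
    }

    /** Converts a start delay given in milliseconds into whole minutes. */
    static long startDelayMinutes(long millis) {
        return Duration.ofMillis(millis).toMinutes();
    }

    public static void main(String[] args) {
        // Field order: seconds, minutes, hours, day of month, month, day of week
        String[] fields = "0 0/2 * * * ?".split(" ");
        System.out.println("fires at second " + fields[0] + " of minutes " + expandMinutes(fields[1]));
        System.out.println("start delay in minutes: " + startDelayMinutes(240000));
    }
}
```

For the values above the job fires on every even minute of the hour, which is 30 times per hour.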

The last thing needed to get this scheduled job going is to pass the trigger to a scheduler. This can be done as follows:

<bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
  <property name="triggers">
      <list>
          <ref bean="org.alfresco.tutorial.scheduledjob.trigger"/>
      </list>
  </property>
</bean>

Sometimes we might want the scheduled job to be applied to a set of nodes determined by a query and the job implementation to be in the form of a Repository Action. Having the job implemented as a Repository Action is handy as it can then be re-used in other places. Implementing this kind of scheduled job usually starts off with the Repository Action class:

public class SimpleRepoActionExecuter extends ActionExecuterAbstractBase {
    private static final Logger LOG = LoggerFactory.getLogger(SimpleRepoActionExecuter.class);

    public static final String PARAM_SIMPLE = "simpleParam";

    /**
     * The SkyVault Service Registry that gives access to all public content services in SkyVault.
     */
    private ServiceRegistry serviceRegistry;

    public void setServiceRegistry(ServiceRegistry serviceRegistry) {
        this.serviceRegistry = serviceRegistry;
    }

    @Override
    protected void addParameterDefinitions(List<ParameterDefinition> paramList) {
        paramList.add(new ParameterDefinitionImpl(
                PARAM_SIMPLE,
                DataTypeDefinition.TEXT,
                true,
                getParamDisplayLabel(PARAM_SIMPLE)));
    }

    @Override
    protected void executeImpl(Action action, NodeRef actionedUponNodeRef) {
        // Get parameter values
        String simpleParam = (String) action.getParameterValue(PARAM_SIMPLE);

        LOG.info("Simple Repo Action called from scheduled Job, [" + PARAM_SIMPLE + "=" + simpleParam + "]");

        if (serviceRegistry.getNodeService().exists(actionedUponNodeRef)) {
            // The implementation of the Repo Action goes here...
            String nodeName = (String)serviceRegistry.getNodeService().getProperty(
                    actionedUponNodeRef, ContentModel.PROP_NAME);

            LOG.info("Simple Repo Action invoked on node [name=" + nodeName + "]");
        }
    }
}

The repository action class extends ActionExecuterAbstractBase as usual and implements the addParameterDefinitions and executeImpl methods that are part of the action interface. See the repository actions documentation for more information about how to implement repo actions. We use the ServiceRegistry to get to Public API services, such as the NodeService. The actionedUponNodeRef will contain a node that is part of the result of a query set up in a Spring bean (we will configure this bean in a bit).

This is the only Java code that needs to be implemented; the rest is Spring bean configuration. Let's start with the Repository Action:

<bean id="simple-action"
	  class="org.alfresco.tutorial.scheduledjob.actions.SimpleRepoActionExecuter"
	  parent="action-executer">
	<property name="serviceRegistry">
		<ref bean="ServiceRegistry" />
	</property>
</bean>

This simple repository action has been given the ID simple-action. Next bean up is:

<bean id="templateActionModelFactory"
      class="org.alfresco.repo.action.scheduled.FreeMarkerWithLuceneExtensionsModelFactory">
    <property name="serviceRegistry">
        <ref bean="ServiceRegistry" />
    </property>
</bean>

This defines a factory implementation that builds suitable models for the FreeMarker templating language. Next is the template action bean definition that will refer to the repository action bean and pass in any needed parameters to the action:

<bean id="org.alfresco.tutorial.scheduledjob.repoaction.simpleTemplateActionDefinition"
          class="org.alfresco.repo.action.scheduled.SimpleTemplateActionDefinition">
        <property name="actionName">
            <value>simple-action</value>
        </property>
        <property name="parameterTemplates">
            <map>
                <entry>
                    <key><value>simpleParam</value></key>
                    <value>Simple param value</value>
                </entry>
            </map>
        </property>
        <property name="templateActionModelFactory">
            <ref bean="templateActionModelFactory" />
        </property>
        <property name="dictionaryService">
            <ref bean="DictionaryService" />
        </property>
        <property name="actionService">
            <ref bean="ActionService" />
        </property>
        <property name="templateService">
            <ref bean="TemplateService" />
        </property>
    </bean>

Here the actionName property references the Repository Action ID, which is simple-action in our case. The parameterTemplates map contains any parameters that the repository action is expecting, such as simpleParam. The last bean we need to define is the one that specifies the node query and the cron expression for when to run the scheduled job:

<bean id="org.alfresco.tutorial.scheduledjob.repoaction.simpleRepoActionCronJob"
          class="org.alfresco.repo.action.scheduled.CronScheduledQueryBasedTemplateActionDefinition">
        <property name="transactionMode">
            <value>UNTIL_FIRST_FAILURE</value>
        </property>
        <property name="compensatingActionMode">
            <value>IGNORE</value>
        </property>
        <property name="searchService">
            <ref bean="SearchService" />
        </property>
        <property name="templateService">
            <ref bean="TemplateService" />
        </property>
        <property name="queryLanguage">
            <value>lucene</value>
        </property>
        <property name="stores">
            <list>
                <value>workspace://SpacesStore</value>
            </list>
        </property>
        <property name="queryTemplate">
            <value>PATH:"/app:company_home/*"</value>
        </property>
        <property name="cronExpression">
            <value>${org.alfresco.tutorial.scheduledjob.repoaction.cronexpression}</value>
        </property>
        <property name="jobName">
            <value>SimpleRepoActionJob</value>
        </property>
        <property name="jobGroup">
            <value>AlfrescoTutorialsJobGroup</value>
        </property>
        <property name="triggerName">
            <value>triggerSimpleRepoAction</value>
        </property>
        <property name="triggerGroup">
            <value>AlfrescoTutorialsTriggers</value>
        </property>
        <property name="scheduler">
            <ref bean="schedulerFactory" />
        </property>
        <property name="actionService">
            <ref bean="ActionService" />
        </property>
        <property name="templateActionModelFactory">
            <ref bean="templateActionModelFactory" />
        </property>
        <property name="templateActionDefinition">
            <ref bean="org.alfresco.tutorial.scheduledjob.repoaction.simpleTemplateActionDefinition" />
        </property>
        <property name="transactionService">
            <ref bean="TransactionService" />
        </property>
        <property name="runAsUser">
            <value>System</value>
        </property>
    </bean>

The queryTemplate property should contain the node query. In this case we have specified a Lucene PATH query, PATH:"/app:company_home/*", that will return all nodes directly under /Company Home. The repository action, which is indirectly specified via the templateActionDefinition property, will be called once for each of these nodes. The whole job is kicked off based on the cronExpression that we specify; in this case we pass it in via an external property that is specified in the SkyVault-global.properties file.
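
The "query first, then one action call per node" flow can be sketched in plain Java. This is a toy model and not the SearchService or the Lucene PATH implementation: QueryPerNodeSketch and its methods are hypothetical, node paths are plain strings, and the sketch assumes that a trailing /* selects direct children only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical illustration of running an action once per node returned by a query. */
final class QueryPerNodeSketch {
    /** Toy stand-in for a "parent/*" PATH query: selects direct children of the parent only. */
    static List<String> directChildren(String parent, List<String> allPaths) {
        String prefix = parent + "/";
        List<String> matches = new ArrayList<>();
        for (String path : allPaths) {
            boolean isBelowParent = path.startsWith(prefix) && path.length() > prefix.length();
            if (isBelowParent && path.indexOf('/', prefix.length()) < 0) {
                matches.add(path);
            }
        }
        return matches;
    }

    /** Calls the action once for each node and returns the number of calls made. */
    static int runForEach(List<String> nodes, Consumer<String> action) {
        int calls = 0;
        for (String node : nodes) {
            action.accept(node);
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> repository = Arrays.asList(
                "/app:company_home/cm:reports",
                "/app:company_home/cm:reports/cm:q1.txt",
                "/app:company_home/app:dictionary",
                "/sys:system/cm:other");
        List<String> nodes = directChildren("/app:company_home", repository);
        int calls = runForEach(nodes, node -> System.out.println("action invoked on " + node));
        System.out.println("action calls: " + calls);
    }
}
```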

When the job is kicked off and the repo action is called for each node matching the query, there are a couple of parameters that can be used to control the behavior. The first is the transactionMode:

  • ISOLATED_TRANSACTIONS - for each node the action is run in an isolated transaction. Failures are logged.
  • UNTIL_FIRST_FAILURE - for each node the action is run in an isolated transaction. The first failure stops the processing of the remaining nodes.
  • ONE_TRANSACTION - the actions for all nodes are run in one transaction. One failure will roll back all.
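
The difference between the three modes is easiest to see with a small simulation. The following is a sketch based only on the descriptions above, not on the actual implementation: TransactionModeSketch is a hypothetical class, nodes are plain strings, and "committed" simply means that the action for that node would not be rolled back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical simulation of the three transaction modes described above. */
final class TransactionModeSketch {
    enum Mode { ISOLATED_TRANSACTIONS, UNTIL_FIRST_FAILURE, ONE_TRANSACTION }

    /** Returns the nodes whose action would end up committed under the given mode. */
    static List<String> run(Mode mode, List<String> nodes, Consumer<String> action) {
        List<String> committed = new ArrayList<>();
        for (String node : nodes) {
            try {
                action.accept(node);
                committed.add(node);
            } catch (RuntimeException e) {
                if (mode == Mode.ONE_TRANSACTION) {
                    return new ArrayList<>(); // single transaction: one failure rolls back all
                }
                if (mode == Mode.UNTIL_FIRST_FAILURE) {
                    break; // earlier nodes stay committed, remaining nodes are not processed
                }
                // ISOLATED_TRANSACTIONS: log the failure and carry on with the next node
                System.out.println("failure logged for node " + node);
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("a", "b", "c");
        Consumer<String> failsOnB = node -> {
            if (node.equals("b")) {
                throw new IllegalStateException("simulated failure");
            }
        };
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + run(mode, nodes, failsOnB));
        }
    }
}
```

With three nodes and a failure on the second one, the isolated mode keeps the first and third node, the until-first-failure mode keeps only the first, and the single-transaction mode keeps none.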

Then there is the compensatingActionMode parameter:

  • IGNORE - This parameter is not used when the action is implemented as a SimpleTemplateActionDefinition.
  • RUN_COMPENSATING_ACTIONS_ON_FAILURE - Indicates that, if the action fails, another (compensating) action should be called. This requires the compensatingTemplateActionDefinition property to be set in the SimpleTemplateActionDefinition definition.

Sometimes it is convenient to have the action implemented as a server-side JavaScript file. This can be done with the following type of SimpleTemplateActionDefinition:

    <bean id="runScriptAction" class="org.alfresco.repo.action.scheduled.SimpleTemplateActionDefinition">
    <property name="actionName">
        <value>script</value>
    </property>
    <property name="parameterTemplates">
        <map>
            <entry>
                <key>
                    <value>script-ref</value>
                </key>
                <value>\$\{selectSingleNode('workspace://SpacesStore', 'lucene', 'PATH:"/app:company_home/app:dictionary/app:scripts/cm:exampleScript.js"' )\}</value>
            </entry>
        </map>
    </property>
    <property name="templateActionModelFactory">
        <ref bean="templateActionModelFactory"/>
    </property>
    <property name="dictionaryService">
        <ref bean="DictionaryService"/>
    </property>
    <property name="actionService">
        <ref bean="ActionService"/>
    </property>
    <property name="templateService">
        <ref bean="TemplateService"/>
    </property>
</bean>

Here the actionName property refers to one of the out-of-the-box repository actions, called script, which can be used to execute a JavaScript file. The script is passed in as the script-ref parameter.

Deployment - App Server: Most suitable for JavaScript-backed jobs using the script template action. For Java-backed jobs use a Repo AMP as described below.
  • Spring Beans: tomcat/shared/classes/alfresco/extension/my-content-model-context.xml (the file name has to end in -context.xml to be picked up as a Spring Bean context file)
  • JavaScript file: Upload to /Company Home/Data Dictionary/Scripts
These file locations are untouched by re-deployments and upgrades.

Deployment - SDK Project:
  • Java job implementations: repo-amp/src/main/java/
  • Job default configuration: repo-amp/src/main/amp/config/alfresco/module/repo-amp/alfresco-global.properties
  • Spring Beans: repo-amp/src/main/amp/config/alfresco/module/repo-amp/context/scheduler-context.xml