Occasionally, an Elasticsearch index can get corrupted and become unusable. All data that are sent to Elasticsearch is stored in the relational database (except if the property event.processing.processed.events.action has been set to delete, in which case the data is lost).
You might have to rebuild the indices when changing the core Elasticsearch settings (for example, number of shards).
Events are stored in the ACT_EVT_LOG table before they are processed. The IS_PROCESSED_ flag is set to 0 when inserting an event and changing it to 1 to process for ElasticSearch. An asynchronous component will move those table rows with 1 for the flag to the PROCESSED_ACTIVITI_EVENTS.
Therefore, to rebuilding the Elasticsearch index, you must do the following:
-
Remove the data from Elasticsearch (deleting the data folders for example in the embedded mode)
-
Copy the rows from PROCESSED_ACTIVITI_EVENTS to ACT_EVT_LOG and setting the IS_PROCESSED flag to 0 again.
To make things a little bit complex, due to historical reasons, the DATA_ column has different types in ACT_EVT_LOG (byte array) and PROCESSED_ACTIVITI_EVENTS (long text). So a data type conversion is needed when moving rows between those tables.
The following code example does that (using Java, but could be done with other languages). It also first writes the content of PROCESSED_ACTIVITI_EVENTS to a .csv file. This is also useful when this table becomes too big in size: store the data in a file and remove the rows from the database table.