# Basics

In this chapter, you will learn how data synchronization and data transfer work.

# Functionality

ELO Replication compares entries between multiple repositories. The relevant repositories can be installed at different sites. ELO Replication transfers the data to the relevant repositories. This means the repositories do not necessarily have to be available to one another.

ELO Replication is a Java-based web app that runs on Apache Tomcat.

# Replication set

The replication set is a characteristic that you assign to individual entries in order to replicate them to another repository. Create sites and add repositories in the web-based configuration. Using the ELO Indexserver URL, specify which repositories data is replicated from. A replication set is created automatically in the configuration when you add a new repository. Each replication set stands for one repository. The individual entries (folders, documents) to be replicated are selected in the ELO Java Client or ELO Web Client using the "Assign replication sets" function. With this function, you select the repository that the entries are replicated to.

# Data synchronization

ELO Replication captures, distributes, and transfers changes to the relevant repositories. An extension to the ELO Indexserver creates a synchronization data set with the changes to a repository. The format of this data set is a compressed stream of JSON objects from the ELO Indexserver API. The data in this stream is selected in the ELO Indexserver based on its synchronization status. The following options are available:

Entries without a replication set: Entries that you do not want to synchronize are not assigned a replication set. A replication set determines which other repositories the entry is synchronized with. Entries without a replication set are not added to the synchronization data set.

Entries with a new replication set: Entries that you want to synchronize are assigned new replication sets. The information from the entries is added to the synchronization data set. All the entry information is only sent to the repositories assigned the new replication set. If entries have already been assigned replication sets, their repositories are simply informed that the entries are now replicated with additional repositories.

When creating the synchronization data set, the ELO Indexserver writes the tstampsync field of each replicated entry in the database. In the database, tstampsync then contains the value that the corresponding tstamp field had at the time the data was read. In the synchronization data set, by contrast, tstampsync contains the value it had itself when the database was read, that is, before this update. This difference plays an important role when importing the data set.

Information

The name of the tstampsync field varies depending on the table.

Entries changed since the last synchronization: Whenever an entry is changed, the ELO Indexserver automatically sets its tstamp field to the current time in UTC (Coordinated Universal Time). To recognize a change, the tstamp and tstampsync fields are compared. If the values differ, the entry has changed since the last synchronization.

Entries unchanged since the last synchronization: The entries were not changed since the last synchronization. For unchanged entries, the values in the tstamp and tstampsync fields are identical. Unchanged entries are not added to the synchronization data set.
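The selection rules above can be sketched roughly as follows. This is a simplified illustration, not the ELO Indexserver implementation: the `Entry` class, the function names, and the timestamp format are all hypothetical, and the sketch does not model which target repositories receive the full information when a new replication set is assigned.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    """Hypothetical stand-in for a repository entry (not the ELO Indexserver API)."""
    guid: str
    tstamp: str                       # time of the last change, UTC
    tstampsync: Optional[str] = None  # tstamp value at the last export, if any
    replication_sets: set = field(default_factory=set)


def needs_export(entry: Entry) -> bool:
    """Return True if the entry belongs in the next synchronization data set."""
    if not entry.replication_sets:
        return False  # no replication set: the entry is never synchronized
    # Unchanged entries have identical tstamp and tstampsync values.
    return entry.tstamp != entry.tstampsync


def build_data_set(entries):
    """Collect the entries to export and mark them as synchronized."""
    data_set = []
    for entry in entries:
        if needs_export(entry):
            # The data set carries the tstampsync value as it was read ...
            data_set.append({
                "guid": entry.guid,
                "tstamp": entry.tstamp,
                "tstampsync": entry.tstampsync,
            })
            # ... while the stored entry is updated to the tstamp that was read.
            entry.tstampsync = entry.tstamp
    return data_set
```

Running `build_data_set` twice in a row without any change in between yields an empty second data set, because every exported entry then has matching tstamp and tstampsync values.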

# Data transfer

The synchronization data set is created by the ELO Indexserver. ELO Replication initiates its creation based on a configured schedule. ELO Replication streams the synchronization data set to the other sites, and from there it is streamed to the ELO Replication instances of the other repositories so it can be imported. During streaming, the data set is processed between the ELO Replication instances. Only the data required at the target is sent.

To compensate for instabilities during transmission, ELO Replication caches the data sets. If disconnected, ELO Replication re-attempts to send the data set once a minute. The data set is transferred all over again, regardless of when the previous transfer was interrupted.
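The retry behavior can be pictured with a small loop. This is only a sketch under stated assumptions: `send_with_retry`, the `send` callback, and the use of `ConnectionError` to signal a dropped connection are hypothetical and do not reflect the actual ELO Replication code.

```python
import time


def send_with_retry(cached_data_set: bytes, send, retry_seconds=60,
                    max_attempts=None, sleep=time.sleep):
    """Send the cached data set, starting over after every failed attempt.

    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            # Always the complete data set, never a resumed partial transfer.
            send(cached_data_set)
            return attempts
        except ConnectionError:
            if max_attempts is not None and attempts >= max_attempts:
                raise
            sleep(retry_seconds)  # wait one minute before the next attempt
```

Because the data set is cached, every attempt can resend it from the beginning, which keeps the retry logic simple at the cost of repeating data that was already transmitted.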

The SSHD library from the Apache MINA project is used for data transfer. The method with public and private keys is used exclusively for authentication. The keys are automatically generated for each site when configuring ELO Replication.

Information

You can add additional options for transferring data by creating and integrating plug-ins.

The ELO Indexserver imports the synchronization data set into the target repository. The following options are available:

Entry does not exist in repository: The GUID is used to check whether an entry already exists in the repository. If the entry does not exist in the repository, it is imported.

Entry already exists in the repository and has not changed since the last synchronization: The values in the tstampsync fields of the synchronization data set and repository entries are identical. The entry is imported into the repository. The entry in the repository is overwritten with the values from the synchronization data set.

Entry already exists in the repository and has competing changes: The values in the tstamp fields of the synchronization data set and repository entries do not match. If an entry has been changed in multiple repositories, the latest change is applied. If the latest change comes from the target repository, the entry in the synchronization data set is ignored. Otherwise, the values of the entry in the repository are replaced with the values in the synchronization data set.

In rare cases, the values in the tstampsync fields may differ. This occurs when a synchronization data set is created in the local repository before the synchronization data set of the other repository has been read and the entry is changed in both repositories. The latest change is applied in this case as well.
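The import decisions described above can be summarized in a short sketch. The function, the dictionary layout, and the string timestamps are hypothetical. In particular, checking `local["tstamp"] == local["tstampsync"]` to detect a locally unchanged entry is an assumption of this sketch, and the bookkeeping of timestamps after the import is not modeled.

```python
def import_entry(repository: dict, incoming: dict) -> str:
    """Apply one data set entry to the target repository.

    `repository` maps GUIDs to entries. Timestamps are assumed to be
    UTC strings that sort chronologically. Returns the action taken.
    """
    guid = incoming["guid"]
    local = repository.get(guid)

    # Case 1: the GUID is unknown in the target repository.
    if local is None:
        repository[guid] = dict(incoming)
        return "imported"

    # Case 2: same synchronization point and no local change since then.
    same_sync_point = incoming["tstampsync"] == local["tstampsync"]
    locally_unchanged = local["tstamp"] == local["tstampsync"]
    if same_sync_point and locally_unchanged:
        repository[guid] = dict(incoming)
        return "overwritten"

    # Case 3: competing changes (or diverging tstampsync values).
    # The latest change wins.
    if incoming["tstamp"] > local["tstamp"]:
        repository[guid] = dict(incoming)
        return "overwritten"
    return "ignored"
```

For example, if the target holds an entry changed at 12:00 and the data set carries a change from 11:00 for the same GUID, the call returns `"ignored"` and the local entry stays as it is.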

# What data is replicated?

The following data is synchronized during replication:

  • Folders
  • Documents
  • Sticky notes
  • Relations
  • Workflows
  • Workflow templates
  • Map data
  • Feed
  • Master data: users and groups (via owners and ACLs), metadata forms, aspects, colors, replication sets

Master data is resolved recursively. For example, if a user is listed in the ACL of a folder, this user's groups are also included in the synchronization data set.

Please note

File version histories are not replicated.

# How is master data replicated?

During data export, the ELO Indexserver checks which master data belongs to a SORD.

Only users that a SORD explicitly refers to are replicated. If you want to replicate a specific user, this user has to be referenced by a SORD, e.g. via permissions (Metadata > Permissions) or owner rights (e.g. Create folder, File document, Apply stamp/annotation, or Start workflow).

Please note

If a user is replicated, the groups the user belongs to are also replicated (without the individual members).

If a group is replicated, for example because it has permission to a SORD, the individual group members are not replicated.
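The rules for users and groups can be illustrated with a small sketch. The data layout is hypothetical: a SORD is a dictionary with an `owner` and an `acl` list, and `user_groups` maps each user name to that user's groups. Any name not found in `user_groups` is treated as a group. Nested groups are not modeled here.

```python
def collect_master_data(sord: dict, user_groups: dict) -> dict:
    """Collect the users and groups that a SORD refers to via owner and ACL."""
    referenced = {sord["owner"], *sord["acl"]}
    users, groups = set(), set()
    for name in referenced:
        if name in user_groups:
            # A referenced user is replicated together with the user's groups.
            users.add(name)
            groups.update(user_groups[name])
        else:
            # A referenced group is replicated without its members.
            groups.add(name)
    return {"users": users, "groups": groups}
```

With `alice` as owner and the group `editors` in the ACL, the result contains `alice` and her groups plus `editors`, but not the other members of `editors`.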

If a metadata form is replicated, a time stamp is set in the masktstampsync field of the docmasks table in the database. Metadata forms that have already been replicated are added to the synchronization data set if changes have been made to the forms.

Aspects are replicated based on their use in replicated metadata forms. Changes are captured via the values of the tstamp and tstampsync fields. In the event of a conflict, the "winning" aspect overwrites the "losing" aspect. From then on, that is, after the initial replication triggered by their use in a metadata form, aspects are replicated after every change.

# How are workflows replicated?

The following section describes the behavior starting with ELO Indexserver version 23.00.

With the default settings, a workflow can only run in one repository. During data export, the workflow is assigned a flag indicating which repository it is running in. In the target repository, the workflow is displayed after replication but it will not continue. You cannot start, edit, or delete the workflow in the target repository.

To use the workflow in the target repository as well, the flag needs to be changed during export. This is done with a server transfer node, which is added in the workflow designer. If a server transfer node is set, the workflow stops at this node. Once the data has been transferred through replication, the workflow continues in the target repository. The entire workflow including all subworkflows is always replicated.

Please note

The server transfer can only take place in a main workflow, not in a subworkflow. If you want to start a subworkflow at site B, the server transfer must take place in the main workflow at site A.

A subworkflow should only run at one site. You should not start a subworkflow at site A and continue it at site B, as conflicts that cannot be resolved automatically can occur if the main workflow and subworkflow are running in different repositories at the same time.
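The role of the flag and the server transfer node can be pictured as follows. This is a conceptual sketch only: the `Workflow` class, its field names, and the idea that a transfer node names a single target repository are assumptions, not the ELO Indexserver data model.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Workflow:
    """Hypothetical model of a replicated workflow."""
    name: str
    running_in: str                        # flag: repository the workflow runs in
    transfer_target: Optional[str] = None  # set while waiting at a server transfer node


def export_workflow(workflow: Workflow) -> Workflow:
    """Return the workflow as it would appear in the synchronization data set."""
    if workflow.transfer_target is not None:
        # A server transfer node changes the flag during export.
        return replace(workflow, running_in=workflow.transfer_target,
                       transfer_target=None)
    return workflow  # default: the flag still points at the source repository


def can_continue(workflow: Workflow, repository: str) -> bool:
    """A workflow only proceeds in the repository its flag points to."""
    return workflow.running_in == repository
```

Without a transfer node, `can_continue(export_workflow(wf), "site-b")` is false for a workflow running at `site-a`, which matches the display-only behavior in the target repository.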

# How are workflow templates replicated?

Workflow templates are first replicated based on their use in replicated workflows. After this, changes to workflow templates are also tracked with the values of the tstamp and tstampsync fields and replicated after every change. In the event of a conflict, the "winning" template overwrites the "losing" template.

# How is map data replicated?

Map data is replicated based on its association with the SORD/document of the respective assigned replication sets. Map data is arranged in different map domains. These domains are also replicated. In each map domain, a flag indicates whether the domain should be replicated. If this flag is set to false, the entire domain and all map data in it are ignored by replication.

Changes are tracked per map and SORD and captured via the values of the tstamp and tstampsync fields. Conflicts during import are therefore handled at the map level, and not for individual map fields. The "winning" map therefore overwrites the "losing" map.
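The following sketch illustrates what map-level conflict handling means in practice. The names and data layout are hypothetical, and the assumption that the map with the later tstamp wins is borrowed from the entry import rules rather than stated for map data.

```python
def replicable_maps(maps: dict, domain_flags: dict) -> dict:
    """Drop all maps whose domain has its replication flag set to False.

    `maps` is keyed by (domain, sord_guid). `domain_flags` is expected to
    contain an entry for every domain that occurs in `maps`.
    """
    return {key: m for key, m in maps.items() if domain_flags[key[0]]}


def resolve_map_conflict(local: dict, incoming: dict) -> dict:
    """Return the winning map as a whole. Individual fields are never merged."""
    # Assumption: the map with the later tstamp wins.
    return incoming if incoming["tstamp"] > local["tstamp"] else local


local = {"tstamp": "2024-01-01 10:00:00", "fields": {"A": "1", "B": "2"}}
incoming = {"tstamp": "2024-01-01 11:00:00", "fields": {"A": "9"}}
winner = resolve_map_conflict(local, incoming)
```

In this example the incoming map wins, so field `B` from the local map is lost even though the incoming change only touched field `A`.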

Last updated: September 12, 2024 at 05:28