# Concepts

This section describes the concepts used for synchronizing data in ELO Sync.

# Pipeline

Synchronization currently uses a two-stage pipeline in which multiple middleware components are registered.

Each stage of the pipeline must be completed before the next stage is started. This ensures that different changes in multiple systems cannot interfere with each other, thereby reducing the complexity required to process these changes.

A major disadvantage of this approach is that all changes are held in memory until the next stage can be processed, which increases memory consumption.

Generally, synchronization is done top down, meaning folders are processed before their contents, ensuring that any changes to folder contents can trust that their parent folders have already been synchronized completely.
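The top-down order can be illustrated with a minimal sketch. It assumes that each element is identified by a slash-separated path; the `Node` class and the `top_down` function are hypothetical names chosen for illustration and are not part of ELO Sync.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    path: str        # hypothetical slash-separated path, e.g. "A/B/file.txt"
    is_folder: bool


def top_down(nodes):
    """Return the nodes ordered so that every folder precedes its contents."""
    # Fewer path segments means closer to the root, so parents sort first.
    return sorted(nodes, key=lambda n: (n.path.count("/"), n.path))


nodes = [Node("A/B/file.txt", False), Node("A", True), Node("A/B", True)]
ordered = [n.path for n in top_down(nodes)]
```

Sorting by depth is only one way to guarantee that a parent is handled before its children; a breadth-first traversal of the folder tree would give the same guarantee.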

The first stage analyzes and applies all changes that are not deletions. The data of all synchronized systems is retrieved and compared with the metadata stored in ELO Sync to determine which changes have been made since the last synchronization. All recognized changes are converted into modifications for later implementation.

If deletions are found, they are first scheduled like any other change and then handed over to the second stage of the pipeline. Deferring them prevents entries that were not actually deleted from being removed. Example: A user moves a file from folder A to folder B and then deletes folder A. If the deletion were not deferred and parents were not guaranteed to be synchronized before their children, deleting folder A would also delete the moved file.
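The following sketch replays the example with a simplified two-stage run. It is an illustration under assumptions, not the ELO Sync implementation: the `Change` class, the change kinds, and `run_pipeline` are hypothetical, and the synchronized structure is reduced to a set of paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    kind: str          # "create", "move" or "delete" (hypothetical change kinds)
    path: str
    target: str = ""   # destination path, only used for "move"


def run_pipeline(tree: set[str], changes: list[Change]) -> set[str]:
    """Apply changes in two stages; deletions wait for the second stage."""
    tree = set(tree)
    deferred = []
    # Stage 1: apply everything except deletions, which are only collected.
    for change in changes:
        if change.kind == "delete":
            deferred.append(change)
        elif change.kind == "create":
            tree.add(change.path)
        elif change.kind == "move":
            tree.discard(change.path)
            tree.add(change.target)
    # Stage 2: apply the deletions, including anything still below the path.
    for change in deferred:
        prefix = change.path + "/"
        tree = {p for p in tree if p != change.path and not p.startswith(prefix)}
    return tree


before = {"A", "A/report.txt", "B"}
# The deletion is listed first on purpose: it is still applied last.
changes = [Change("delete", "A"), Change("move", "A/report.txt", "B/report.txt")]
after = run_pipeline(before, changes)
```

In this sketch, `after` contains `B` and `B/report.txt`: the moved file survives because folder A is removed only after the move has been applied.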

For more information, refer to the Pipeline chapter.

# System

A system describes software, a server, or a part thereof that contains data to be synchronized.

This can be a SaaS system such as OneDrive or SharePoint Online, or a local ELO repository.

# System Provider

A system provider is a software component that implements access to a system. It is needed only when the creation of a system has to be delegated.

# Collection

A collection is a subset of the contents of a system and provides abstract access to these contents.

A system can expose a single collection or several. Several collections are used especially when contents cannot be uniquely identified.

The exact details depend on the data provided by the system.

# Subcollection

The term subcollection is used for collections that are contained in another collection, or that are referenced by an element in a collection.

# Synchronization Entry

A synchronization entry, or simply entry, is an abstract unit that describes something that is synchronized between several systems.

An entry does not contain data of its own. Instead, it groups all items that are synchronized together.

If it is necessary to uniquely identify an entry, we recommend using the identity of the associated data mapping. This identity is maintained as long as the individual entry items exist.
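The relationship between entry, items, and data mapping can be sketched as a small data model. All class and attribute names below are hypothetical, and the use of a UUID as the mapping identity is an assumption made for this illustration only.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    system: str    # e.g. "ELO" or "OneDrive"
    item_id: str   # identifier of the item inside that system (hypothetical)


@dataclass
class DataMapping:
    # Assumption: the mapping carries a stable, generated identity.
    mapping_id: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class Entry:
    mapping: DataMapping
    items: list[Item] = field(default_factory=list)

    @property
    def identity(self) -> str:
        # The entry has no identity of its own; it borrows the mapping's.
        return self.mapping.mapping_id


entry = Entry(DataMapping(), [Item("ELO", "1234"), Item("OneDrive", "abc")])
identity_before = entry.identity
entry.items.append(Item("SharePoint Online", "xyz"))
identity_after = entry.identity
```

The identity stays the same when another item joins the entry, because it is taken from the data mapping rather than from any individual item.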

# Synchronization Item

A synchronization item describes a certain item in a synchronized system, such as a file or a folder. ELO Sync can also synchronize more abstract data such as e-mails or chats.

# Data Mapping

The data mapping contains the metadata of a synchronized entry. In general, it stores hash values or similar values so that changes can be detected at a later time.

The data in the mapping can be stored for a long time and therefore must not contain potentially protected information (such as PII). This is another reason for the use of hash algorithms (or comparable methods) for storing all data in the mapping.
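As a minimal sketch of this idea, the following code stores only a hash of a value and later uses it to detect a change. SHA-256 and the function names `fingerprint` and `has_changed` are assumptions for illustration; the hash algorithm that ELO Sync actually uses is not stated here.

```python
import hashlib


def fingerprint(value: bytes) -> str:
    """Return a hex digest that stands in for the original value."""
    return hashlib.sha256(value).hexdigest()


def has_changed(stored: str, current: bytes) -> bool:
    """Compare the current value against the stored digest only."""
    return fingerprint(current) != stored


# Only the digest is kept in the mapping, never the file name itself.
stored = fingerprint(b"Quarterly report.docx")
unchanged = has_changed(stored, b"Quarterly report.docx")
changed = has_changed(stored, b"Quarterly report v2.docx")
```

The mapping can tell that a value differs from the last run without being able to show what the value was.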

# Structure Mapping

The structure mapping of an entry stores its relative position in the synchronized structure.

In contrast to data mapping, structure mapping does not hash or encrypt data, as it must be reconstructable during later synchronization runs.

# Field

The data of the synchronized elements is accessed through fields. Each field enables access to a single, non-divisible piece of information on the element.

Non-divisible here means that this information should either be synchronized completely or not at all. It does not mean that information cannot be divided and combined.

# System Field

System fields are a special case: every synchronized system must provide them for synchronization to work.

The system fields currently defined are: Name, Content, and Parent.

Of these fields, only Name must have a value.

Content and Parent must be present but can be left blank.
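These rules can be expressed as a short check. The sketch assumes that the fields of an element are available as a dictionary keyed by field name; `validate_system_fields` is a hypothetical helper, not an ELO Sync function.

```python
SYSTEM_FIELDS = ("Name", "Content", "Parent")


def validate_system_fields(fields: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the element is valid."""
    problems = []
    # Every system field must be provided, even if its value is blank.
    for name in SYSTEM_FIELDS:
        if name not in fields:
            problems.append(f"missing system field: {name}")
    # Name is the only system field that must also have a value.
    if "Name" in fields and not fields["Name"]:
        problems.append("Name must not be empty")
    return problems


ok = validate_system_fields({"Name": "a.txt", "Content": None, "Parent": None})
bad = validate_system_fields({"Name": "", "Content": b"x"})
```

Here `ok` is empty, while `bad` reports the missing Parent field and the empty Name.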

# Field Mappings

Field mappings relate individual fields across different systems so that the synchronization process knows which data must be compared and exchanged.

Normally, the user configures the field mappings when creating a synchronization job.

The system fields are excluded from the mappings configured by the user. These fields are always mapped automatically.

In ELO, it is possible to create multiple documents with different metadata forms in the same synchronized folder. This means the field mappings are not simply 1:1 mappings between two systems.

In the synchronization process, fields are always treated as n:m relationships. When a specific element is synchronized, however, the mapping rules should resolve to a 1:1 mapping.

# Example

The mappings defined for the synchronization between an ELO repository and OneDrive are listed here as an example.

```mermaid
flowchart LR
subgraph system-a[ELO]
    subgraph mail[Mask: E-Mail]
        field-a2[Field: Subject]
    end
    subgraph rechnung[Mask: Bill]
        field-a3[Field: Title]
    end
    subgraph document[Mask: *]
        field-a1[Short name]
    end
end
subgraph system-b[OneDrive]
    field-b1[Filename]
end
field-a1 <--> field-b1
field-a2 <--> field-b1
field-a3 <--> field-b1
```

The file in OneDrive has only a file name, whereas the ELO repository offers multiple metadata forms whose fields can be synchronized with this file name.

However, if a certain document is synchronized with the corresponding file in OneDrive, the metadata form of the document is known and the mappings are therefore reduced to a single mapping.

Information: This is just an example. In a production scenario, the file name is a system field and is always synchronized with the Short name system field in ELO.

If the document has the metadata form Bill in ELO, the following synchronization would be performed:

```mermaid
flowchart LR
subgraph system-a[ELO]
    subgraph invoice["Document (Bill)"]
        field-a3[Field: Title]
    end
end
subgraph system-b[OneDrive]
    subgraph file[File]
        field-b1[Filename]
    end
end
field-a3 <--> field-b1
```
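The reduction from several candidate mappings to a single one can be sketched as follows. The `FieldMapping` class and the `resolve` function are hypothetical, and the rule that a mapping for the exact metadata form takes precedence over the `*` mapping is an assumption that merely matches the Bill example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    mask: str            # ELO metadata form; "*" stands for any form
    elo_field: str
    onedrive_field: str


# The three mappings from the example.
MAPPINGS = [
    FieldMapping("E-Mail", "Subject", "Filename"),
    FieldMapping("Bill", "Title", "Filename"),
    FieldMapping("*", "Short name", "Filename"),
]


def resolve(mappings, mask: str, onedrive_field: str) -> FieldMapping:
    """Pick the single mapping that applies to a document with the given form."""
    candidates = [m for m in mappings if m.onedrive_field == onedrive_field]
    exact = [m for m in candidates if m.mask == mask]
    # Assumption: fall back to the wildcard form only if no exact match exists.
    chosen = exact or [m for m in candidates if m.mask == "*"]
    if len(chosen) != 1:
        raise LookupError(f"expected exactly one mapping, found {len(chosen)}")
    return chosen[0]


bill = resolve(MAPPINGS, "Bill", "Filename")
other = resolve(MAPPINGS, "Contract", "Filename")
```

For a document with the form Bill, the sketch selects the Title field. A form without a mapping of its own, such as the hypothetical Contract, falls back to Short name.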
