# Redundancy in the EPICS Archiver Appliance

The EPICS Archiver Appliance has limited support for archiving the same
PV in multiple clusters and merging in the data from both appliances
during data retrieval. This feature allows for a some redundancy when
archiving a small set of critical PVs. At a high level

- Archive the same PV in two independent clusters.
  - The two clusters need not have the same policy. For example, you
    can designate one cluster the **primary** cluster and the other
    one the **secondary** cluster
  - The primary cluster can archive the PV using your usual policies
    while the secondary can store data for a much smaller timeframe.
- Configure one of the clusters ( for performance reasons, the smaller
  of the two, most likely the secondary ) to proxy the other one.
  - When creating the proxy, add a param `mergeDuringRetrieval`. For
    example, add a _External EPICS Archiver Appliance_ proxy with a
    URL that looks like so
    `http://archapp.slac.stanford.edu/retrieval?mergeDuringRetrieval=true`
- Periodically, if needed, merge in the data manually from the
  secondary cluster to the primary using the `mergeInData` BPL.

## Case study

In an ideal world, to achive redundancy when archiving PV\'s, we\'d have
multiple large identical independent clusters archiving the same set of
PVs. For financial and other reasons, this may not be possible for all
installations. This case study outlines a setup for achieving redundancy
for a small subset of critical PVs.

Because we wish to accomplish redundancy for a small set of critical
PVs, we have two asymmetric independent clusters. The **primary**
cluster is your main archiver and will archive millions of PV\'s (
including the small subset of critical PVs ) and process almost all the
data retrieval requests. The **secondary** cluster is an independent
cluster and is the backup archiver and will archive only the small
subset of critical PVs. In this case, the data is stored only for 6
months and then is automatically deleted using a blackhole plugin.

When we add the `mergeDuringRetrieval` proxy to the **secondary**
cluster, all data retrieval requests to the **secondary** cluster will
automatically make a call to the **primary** cluster with the same
parameters and then merge in the data from both clusters. Note that this
does not alter the stored data in any way; the merge is only done for
data retrieval. Thus, data retrieval calls to the **secondary** cluster
will also include data from the **primary** cluster. Folks interested in
a complete data set can make data retrieval calls to the **secondary**
cluster; the data retrieval will be slightly slower because of the
merging operation. Folks making data retrieval calls to the **primary**
will only get data from the primary cluster; but because no merging is
performed, retrieval calls are much faster.

Periodically, one can manually merge in data from the **secondary**
cluster to the **primary** cluster using the `mergeInData` BPL. This
requires data the PV to be paused in the **primary** cluster while the
merge is happening. The `mergeInData` picks up all data from the
**secondary** cluster and merges it into the **primary** cluster.