# Storage Plugins

The starting point for using an alternate storage technology is to create implementations of the [StoragePlugin](../_static/javadoc/org/epics/archiverappliance/StoragePlugin.html) interface and register them in [StoragePluginURLParser](../_static/javadoc/org/epics/archiverappliance/config/StoragePluginURLParser.html). In addition to the StoragePlugin interface, optional interfaces provide additional functionality:

1. [ETLSource](../_static/javadoc/org/epics/archiverappliance/etl/ETLSource.html) -- This lets a StoragePlugin act as a source of data in the ETL process.
2. [ETLDest](../_static/javadoc/org/epics/archiverappliance/etl/ETLDest.html) -- This lets a StoragePlugin act as a destination of data in the ETL process.
3. [StorageMetrics](../_static/javadoc/org/epics/archiverappliance/etl/StorageMetrics.html) -- This lets a StoragePlugin provide metrics that are displayed in the UI and participate in capacity planning.

Writing a new StoragePlugin does take some effort, but with this separation you should be able to support a wide variety of storage technologies. For more details, please see the Javadoc.

## NIO2

The [PlainStoragePlugin](../_static/javadoc/edu/stanford/slac/archiverappliance/plain/PlainStoragePlugin.html) can be viewed as a chunking [StoragePlugin](../_static/javadoc/org/epics/archiverappliance/StoragePlugin.html). It chunks data into well-defined time partitions (instead of individual samples), mapping each time instant to a chunk key; the various business processes in the archiver appliance understand these time partitions and deal with them efficiently. Each chunk has a well-defined [key](#key-mapping), and one can choose to store a chunk in any storage provider that provides block storage.

The PlainStoragePlugin supports multiple backends for serialization:

1. [**Protocol Buffers (PB)**](../developer/pb_pbraw): The default backend, using the `pb:` scheme.
2. [**Apache Parquet**](../developer/parquet): A columnar storage backend, using the `parquet:` scheme.

For existing PVs, you can modify storage parameters or switch backends using the [`/changeStore`](#changestore-bpl) Management BPL action. This allows for updating partition granularity, changing compression settings, or migrating between PB and Parquet without losing data.

The PlainStoragePlugin uses Java [NIO2](http://docs.oracle.com/javase/7/docs/api/java/nio/file/package-summary.html) as the storage API. Java NIO2 has a [documented mechanism](http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/filesystemprovider.html) for developing custom file system providers. Using custom NIO2 file system providers, one can store the chunks generated by the PlainStoragePlugin using storage technologies like

1. Database BLOBs
2. Any key/value store (for example, SciDB)
3. Other technologies that may be more appropriate

To add custom NIO2 file system providers for use in the archiver appliance, please look at the JavaDoc for our version of Java's [Paths](../_static/javadoc/org/epics/archiverappliance/utils/nio/ArchPaths.html).

## Type systems

The archiver appliance uses Google's [Protocol Buffers](https://developers.google.com/protocol-buffers) as the serialization scheme. Plenty of other algorithms offer the same functionality; of particular interest is the serialization scheme used in the [EPICS V4 protocol](http://epics-pvdata.sourceforge.net/pvAccess_Protocol_Specification.html). Support for alternate serialization mechanisms is possible by adding support for [alternate type systems](../_static/javadoc/org/epics/archiverappliance/config/TypeSystem.html). Please contact the collaboration if you'd like to consider using alternate serialization mechanisms.

For more detailed information on implementation and tools for analyzing Parquet files, see the [Parquet backend](../developer/parquet) page.
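To make the chunking idea concrete, the sketch below maps a (PV, time instant) pair to a per-partition chunk key for a few partition granularities. The key formats used here are simplified assumptions for illustration only; the actual key mapping used by the PlainStoragePlugin is described in the key-mapping section.

```python
from datetime import datetime, timezone

# Illustrative only: simplified partition-key formats, assumed for this sketch.
# The real PlainStoragePlugin key mapping may differ.
PARTITION_FORMATS = {
    "PARTITION_HOUR": "%Y_%m_%d_%H",
    "PARTITION_DAY": "%Y_%m_%d",
    "PARTITION_MONTH": "%Y_%m",
    "PARTITION_YEAR": "%Y",
}

def chunk_key(pv_name: str, sample_time: datetime, granularity: str) -> str:
    """Map a (PV, time instant) pair to the key of the chunk holding it."""
    suffix = sample_time.strftime(PARTITION_FORMATS[granularity])
    # The PV name becomes a path-like prefix; the time partition selects the chunk.
    return f"{pv_name.replace(':', '/')}:{suffix}"

t = datetime(2024, 3, 15, 10, 30, tzinfo=timezone.utc)
print(chunk_key("SR:C01:BPM1", t, "PARTITION_DAY"))   # SR/C01/BPM1:2024_03_15
print(chunk_key("SR:C01:BPM1", t, "PARTITION_HOUR"))  # SR/C01/BPM1:2024_03_15_10
```

Because every sample in a partition shares the same key, the ETL and retrieval processes can move or read a whole time range by touching a single chunk rather than individual samples.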
## Configuration

To use the Parquet backend, use the `parquet:` scheme in your `dataStores` definition within `policies.py`. For the protobuf backend, use the `pb:` scheme.

Example:

```python
"dataStores": [
    "pb://localhost?name=STS&rootFolder=${ARCHAPPL_SHORT_TERM_FOLDER}&partitionGranularity=PARTITION_HOUR",
    "parquet://localhost?name=MTS&rootFolder=${ARCHAPPL_MEDIUM_TERM_FOLDER}&partitionGranularity=PARTITION_DAY&compress=ZSTD&zstdLevel=0",
    "parquet://localhost?name=LTS&rootFolder=${ARCHAPPL_LONG_TERM_FOLDER}&partitionGranularity=PARTITION_YEAR&compress=ZSTD&zstdLevel=5"
]
```

### Compression

Parquet supports various compression codecs. The archiver appliance specifically supports:

- **UNCOMPRESSED**: No compression.
- **SNAPPY**: High speed, reasonable compression.
- **ZSTD**: Excellent balance between compression ratio and speed.

#### ZSTD Configuration

When using ZSTD compression, several advanced configuration options are available via the storage plugin URL:

- `zstdBufferPool` (boolean): Enables the use of a buffer pool for ZSTD compression/decompression. Defaults to `false`.
- `zstdLevel` (integer): Sets the ZSTD compression level (typically 1-22). Defaults to `3`.
- `zstdWorkers` (integer): Sets the number of worker threads for ZSTD. Defaults to `0` (single-threaded).

## Conversion

The archiver appliance provides several ways to convert data between different backends or to update storage parameters for existing PVs.

### ConvertFile Utility

The `ConvertFile` utility can be used for ad-hoc conversion of individual files.

```bash
# Convert a PB file to Parquet with ZSTD compression
java -cp ... edu.stanford.slac.archiverappliance.plain.utils.ConvertFile /data/pv.pb PARQUET compress=ZSTD zstdLevel=3
```

### ChangeStore BPL

For a more automated approach, the `/changeStore` [Management BPL action](../../developer/references/mgmt_scriptables.md) can be used to update the storage configuration for a single PV.
This action allows you to:

- **Change the backend**: Convert existing data from one backend to another (e.g., from `pb` to `parquet`).
- **Update the partition granularity**: Change the partition granularity (e.g., from `PARTITION_HOUR` to `PARTITION_DAY`).
- **Update other parameters**: Change any other storage plugin URL parameters, such as the compression settings.
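As a sketch of how `/changeStore` might be driven from a script, the example below builds a request URL against the management webapp. The port and the parameter names (`pv`, `storage`, `plugin_url`) are assumptions for illustration; consult the Management BPL reference for the actual `/changeStore` signature before using this against a real appliance.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, assumed for this sketch;
# check the Management BPL reference for the actual /changeStore signature.
MGMT = "http://localhost:17665/mgmt/bpl"

def change_store_url(pv: str, store_name: str, plugin_url: str) -> str:
    """Build a /changeStore request URL for a single PV."""
    params = urlencode({
        "pv": pv,
        "storage": store_name,
        "plugin_url": plugin_url,
    })
    return f"{MGMT}/changeStore?{params}"

# Example: migrate a PV's LTS from PB to Parquet with ZSTD compression.
new_plugin = (
    "parquet://localhost?name=LTS"
    "&rootFolder=${ARCHAPPL_LONG_TERM_FOLDER}"
    "&partitionGranularity=PARTITION_YEAR&compress=ZSTD&zstdLevel=5"
)
print(change_store_url("SR:C01:BPM1", "LTS", new_plugin))
```

The new storage plugin URL is simply the same string you would otherwise place in the `dataStores` list in `policies.py`, so a migration script can reuse the definitions from your policy file.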