Log file locations
Most deployments have four Tomcat containers per appliance — one each
for the engine, ETL, retrieval and mgmt components. All logs are
typically sent to arch.log or similar files in the logs/ folder of
each container’s CATALINA_BASE. Log levels are controlled using a
log4j2.xml file in the TOMCAT_HOME/lib folder of each container.
Monitoring your EPICS archiver appliance
Here are some aspects of the EPICS archiver appliance that should be monitored:
- Logs
Monitor the logs periodically for Exceptions, OutOfMemory and FATAL error messages. You can use a variation of these commands:
find /arch/tomcats -wholename '*/logs/*' -exec grep -l xception {} \;
find /arch/tomcats -wholename '*/logs/*' -exec grep -l FATAL {} \;
find /arch/tomcats -wholename '*/logs/*' -exec grep -l OutOfMemoryError {} \;
While exceptions in the retrieval and mgmt components could potentially be from user errors, any exceptions/FATAL messages in the ETL/Engine components should immediately be investigated.
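If you would rather run this check from a script, the same search can be sketched in Python. This is only an illustration: the function name scan_logs is hypothetical, the patterns are the ones used in the commands above, and, unlike find, the logs/ test is applied to the path relative to the root you pass in.

```python
from pathlib import Path

# Same search strings as the find/grep commands; "xception" matches both
# "Exception" and "exception".
PATTERNS = ("xception", "FATAL", "OutOfMemoryError")


def scan_logs(root, patterns=PATTERNS):
    """Return {pattern: [files below a logs/ folder under root that contain it]}."""
    root = Path(root)
    hits = {pattern: [] for pattern in patterns}
    for path in sorted(root.rglob("*")):
        # Only look at regular files that sit somewhere below a logs/ directory.
        if not path.is_file() or "logs" not in path.relative_to(root).parts[:-1]:
            continue
        remaining = set(patterns)
        with path.open(errors="replace") as handle:
            for line in handle:
                found = {p for p in remaining if p in line}
                for pattern in found:
                    hits[pattern].append(str(path))
                remaining -= found
                if not remaining:
                    break  # every pattern already seen in this file
    return hits


# Example (path taken from the commands above):
#   for pattern, files in scan_logs("/arch/tomcats").items():
#       print(pattern, files)
```

Like grep -l, this reports each file at most once per pattern, so the output stays short enough to mail to an administrator.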
- Disk free space
Monitor the disk free space in each of your stores (raising alarms if disk usage increases above a certain limit).
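A disk usage check of this kind might look like the following sketch. The function names, the 80% default limit and the store names in the usage comment are assumptions made for illustration; substitute the folders your own stores actually use.

```python
import shutil


def usage_fraction(path):
    """Fraction (0.0 to 1.0) of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo-filesystems report a total size of zero
    return usage.used / usage.total


def stores_over_limit(stores, limit=0.8):
    """Return {store name: fraction used} for every store at or above limit.

    stores maps a label of your choosing to a folder on that store.
    """
    over = {}
    for name, path in stores.items():
        fraction = usage_fraction(path)
        if fraction >= limit:
            over[name] = fraction
    return over


# Example with hypothetical store folders:
#   alarms = stores_over_limit({"sts": "/arch/sts", "mts": "/arch/mts"})
#   if alarms:
#       ...raise an alarm or send a notification...
```

Run periodically (for example from cron), an empty result means no store has crossed the limit.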
- Connected PVs
You can use the /getApplianceMetrics BPL (see samples/checkConnectedPVs.py) to monitor the number of currently disconnected PVs. You can then send an email notification to the system administrators if this is greater than a certain percentage or absolute number.
- Type changes
You can use the /getPVsByDroppedEventsTypeChange BPL (see samples/checkTypeChangedPVs.py) to watch for any PVs that have changed type. If a PV changes type, the EPICS archiver appliance will suspend archiving this PV until the situation is manually resolved.
You can rename the PV to a new name:
1. Pause the PV under the current name.
2. Rename the PV to a new name using the /renamePV BPL or the UI.
3. Delete the PV under the current name.
4. Re-archive under the current name.
This should now archive the PV using the new type; however, requests for the older data (which is of the older type) will have to be made using the older name.
Alternatively, the EPICS archiver appliance has some support for converting data from one type to another. This is not available in all cases, but you should be able to convert most scalars:
1. Pause the PV.
2. If needed, consolidate and make a backup of the data for this PV.
3. Convert to the new type using the /changeTypeForPV BPL.
4. Resume the PV (if the conversion process succeeds).
The /changeTypeForPV BPL alters the data that has already been archived, so you may want to make a backup first.
- Maintaining a clean system
Monitoring connected PVs (see above) is made significantly easier if you maintain a clean system. One strategy that can be used is to pause PVs that have been disconnected for more than a certain time. The /getCurrentlyDisconnectedPVs BPL returns a list of currently disconnected PVs and an indication of when the connection to each PV was lost.
You can (perhaps automatically) pause PVs that have been disconnected for more than a certain period of time.
You can (perhaps automatically) resume PVs that have been paused (obtained using the /getPausedPVsReport) but are now alive.
Optionally, you can delete PVs that have been paused for some time and are still not alive.
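The pause/resume/delete strategy described above can be sketched as a small planning function. Everything named here is hypothetical: the PVStatus record, its fields and plan_cleanup are illustrative only, and building the records from the /getCurrentlyDisconnectedPVs and /getPausedPVsReport output (plus your own check of whether a PV is alive) is left to the caller, since the report formats are not assumed here.

```python
from dataclasses import dataclass


@dataclass
class PVStatus:
    """Hypothetical summary of one PV, assembled by the caller."""
    name: str
    paused: bool             # True if the PV is currently paused in the archiver
    alive: bool              # True if the PV currently responds
    seconds_in_state: float  # time disconnected (not paused) or time paused


def plan_cleanup(statuses, pause_after, delete_after=None):
    """Decide which PVs to pause, resume or delete.

    pause_after:  pause PVs disconnected for at least this many seconds.
    delete_after: delete PVs paused for at least this many seconds and still
                  not alive; None disables deletion.
    """
    plan = {"pause": [], "resume": [], "delete": []}
    for pv in statuses:
        if pv.paused:
            if pv.alive:
                plan["resume"].append(pv.name)
            elif delete_after is not None and pv.seconds_in_state >= delete_after:
                plan["delete"].append(pv.name)
        elif not pv.alive and pv.seconds_in_state >= pause_after:
            plan["pause"].append(pv.name)
    return plan
```

Keeping the decision separate from the BPL calls lets you review, or simply log, the plan before anything is changed; deletion stays off unless delete_after is given.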
Scripting for bulk operations
For bulk operations, most administrators will find the scripting
interface useful. All actions in the management UI (plus a few not
exposed in the UI) are accessible from scripts — see
Scripting for details, and the
management scriptables reference
for the full list of callable operations. Sample scripts are also
available in the tomcat_mgmt/webapps/mgmt/ui/help/samples/ folder.