Alerts
Alert Management
| Action | WebUI Instructions |
|---|---|
| View latest alerts | Bell Icon (toolbar) |
| View all alerts | Bell Icon (toolbar) > View all notifications |
| Navigate to alerted resource | Bell Icon (toolbar) > [Alert], or Bell Icon (toolbar) > View all notifications > [Alert] |
| Acknowledge alert | Bell Icon (toolbar) > [Alert] Overflow Menu > Mark Read, or Bell Icon (toolbar) > View all notifications > [Alert] Overflow Menu > Mark Read |
| Delete alert | Bell Icon (toolbar) > View all notifications > [Alert] Overflow Menu > Delete. Note: Only alerts that have been marked as read can be deleted. |
Alert Definitions
The alerts are listed here by their KeySafe 5 alert type.
The "nShield Monitor Alert" row, where available, indicates that the alert is similar, although not necessarily identical, to a legacy alert from nShield Monitor.
Data marked with "*" are returned only as part of the summary and not as a specific alert record field. For more information about what data is returned, see Labels.
HSM PSU Failure
The HSM "PSU failed" error condition has occurred.

| Valid parameters | for |
|---|---|
| OpenMetrics used | nshield_error_conditions(source="psu_failed") |
| Other data returned | esn |
| nShield Monitor alert | NShieldPowerSupplyFailure |
HSM Fan Failure
The HSM "fan failed" error condition has occurred.

| Valid parameters | for |
|---|---|
| OpenMetrics used | nshield_error_conditions(source="fanX") |
| Other data returned | esn, source* |
HSM Chassis Battery
The HSM chassis battery error condition has occurred.

| Valid parameters | for |
|---|---|
| OpenMetrics used | nshield_error_conditions(source="chassis_battery_low") |
| Other data returned | esn |
HSM Fan Speed
The alert is triggered if a fan speed drops below the minimum limit or exceeds the maximum limit.
| Valid parameters | for, over |
|---|---|
| OpenMetrics used | nshield_fan_speed_rpm, nshield_fan_speed_limit_rpm |
| Other data returned | esn, fan_id |
| nShield Monitor alert | NShieldXCFanSpeedZero |
HSM Memory Usage Percentage
This is the sum of the module kernel and user memory, expressed as a percentage of the total amount of available memory.
| Valid parameters | min, max, for, over |
|---|---|
| OpenMetrics used | nshield_module_mem_bytes, nshield_module_mem_alloc_kernel_bytes, nshield_module_mem_alloc_user_bytes |
| Other data returned | esn |
| nShield Monitor alert | memoryUsageHighAlert / memoryUsageOkAlert |
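
The percentage described above can be sketched in a few lines of Python. This is an illustration only, not KeySafe 5 code: the function name is hypothetical, and it assumes that `nshield_module_mem_bytes` reports the total available memory while the two `alloc` metrics report the kernel and user allocations.

```python
def memory_usage_percent(total_bytes: float, kernel_bytes: float, user_bytes: float) -> float:
    """Return kernel + user memory as a percentage of total memory.

    Hypothetical helper. Assumes total_bytes corresponds to
    nshield_module_mem_bytes and the other two arguments to the
    kernel and user allocation metrics.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * (kernel_bytes + user_bytes) / total_bytes


# Example: 200 bytes kernel + 300 bytes user out of 1000 bytes total.
print(memory_usage_percent(1000, 200, 300))
```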
HSM Temperature Percentage
The temperature of any sensor, expressed as a percentage of its reported maximum value (calculated from 0°C), is over the specified maximum. The maximum allowed value is 150 percent.
| Valid parameters | max, for, over |
|---|---|
| OpenMetrics used | nshield_temperature_celsius, nshield_temperature_limit_celsius |
| Other data returned | esn, sensor* |
| nShield Monitor alert | NShieldTemperaturePeak |
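
As a rough sketch of the calculation, the following Python computes a sensor reading as a percentage of its limit, measured from 0°C, and compares it with a configured maximum. The function names are hypothetical, and the code assumes that the 150 percent figure is the upper bound on the `max` parameter; the source does not spell this out.

```python
MAX_ALLOWED_PERCENT = 150.0  # assumed to be the upper bound for the "max" parameter


def temperature_percent(celsius: float, limit_celsius: float) -> float:
    """Reading as a percentage of the sensor's reported maximum, from 0 degrees C."""
    if limit_celsius <= 0:
        raise ValueError("limit_celsius must be positive")
    return 100.0 * celsius / limit_celsius


def temperature_alert(celsius: float, limit_celsius: float, max_percent: float) -> bool:
    """True when the reading is over the configured maximum percentage."""
    if not 0 < max_percent <= MAX_ALLOWED_PERCENT:
        raise ValueError("max_percent must be in the range (0, 150]")
    return temperature_percent(celsius, limit_celsius) > max_percent
```

For example, a sensor at 85°C with a 100°C limit is at 85 percent, so a `max` of 80 would trigger the alert under these assumptions.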
HSM Queue Percentage
The job queue length, expressed as a percentage of the job queue limit, is under the specified minimum or over the specified maximum value.

| Valid parameters | min, max, for, over |
|---|---|
| OpenMetrics used | nshield_queue_in_progress, nshield_queue_length_limit |
| Other data returned | esn, vcm |
| nShield Monitor alerts | DeviceNShieldUtilizationOverloads / DeviceNShieldUtilizationPeakEvent |
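
The min/max check can be illustrated with a short Python sketch. The names are hypothetical, and the code assumes that either threshold may be left unset and that the comparison is strict (under the minimum, over the maximum), as the description suggests.

```python
from typing import Optional


def queue_percent(in_progress: float, length_limit: float) -> float:
    """Queue length as a percentage of the queue limit."""
    if length_limit <= 0:
        raise ValueError("length_limit must be positive")
    return 100.0 * in_progress / length_limit


def queue_alert(in_progress: float, length_limit: float,
                minimum: Optional[float] = None,
                maximum: Optional[float] = None) -> bool:
    """True when the percentage is under minimum or over maximum (if set)."""
    pct = queue_percent(in_progress, length_limit)
    below = minimum is not None and pct < minimum
    above = maximum is not None and pct > maximum
    return below or above
```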
HSM Objects Count
The number of created objects is under the specified minimum or over the specified maximum value.
| Valid parameters | min, max, for, over |
|---|---|
| OpenMetrics used | nshield_objects_stored_total, nshield_objects_destroyed_total |
| Other data returned | esn, vcm |
| nShield Monitor alert | DeviceNShieldHigHObjectCount |
Host Hardserver
The host hardserver has failed to communicate recently.
| Valid parameters | for |
|---|---|
| OpenMetrics used | nshield_hardserver_liveness |
| Other data returned | host |
| nShield Monitor alert | ClientHostHardserverFailure |
HSM Liveness
The HSM has failed to respond and supply metrics.
| Valid parameters | for |
|---|---|
| OpenMetrics used | nshield_hsm_liveness |
| Other data returned | esn, vcm |
| nShield Monitor alert | DeviceConnStatus |
Host Liveness
The host has failed to communicate recently.

| Valid parameters | for |
|---|---|
| OpenMetrics used | nshield_host_liveness |
| Other data returned | host |
Licence Expiry
KeySafe 5 licence will expire in less than the specified minimum time.
| Valid parameters | min |
|---|---|
| OpenMetrics used | keysafe5_licence_expiry |
| Other data returned | licence |
HSM Client Licences Remaining
The number of crypto client licences remaining is less than the specified minimum.
| Valid parameters | min, for, over |
|---|---|
| OpenMetrics used | nshield_current_crypto_clients, nshield_current_crypto_clients_limit |
| Other data returned | esn |
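
A minimal Python sketch of the comparison follows. It assumes that the remaining count is the client limit minus the current client count; the source names the two metrics but does not state how they are combined, and the function names are hypothetical.

```python
def licences_remaining(current_clients: int, clients_limit: int) -> int:
    """Assumed calculation: licence limit minus clients currently in use."""
    return clients_limit - current_clients


def licences_alert(current_clients: int, clients_limit: int, minimum: int) -> bool:
    """True when fewer than `minimum` client licences remain."""
    return licences_remaining(current_clients, clients_limit) < minimum
```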
Certificate Expiry
KeySafe 5 certificate will expire in less than the specified minimum time.
| Valid parameters | min |
|---|---|
| OpenMetrics used | keysafe5_certificate_expiry |
| Other data returned | type, agent (see note) |

Note: The type and agent values determine which of the following certificates is expiring:
- Central platform will have type "central" or type "ca" (System certificate, System CA certificate)
- Agents will have type "agent" or type "ca" but will have an agent id (Agent <agent id> certificate, Agent <agent id> CA certificate)
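
The mapping from the type and agent values to a certificate can be sketched as a small Python function. The function name is hypothetical, and it assumes that a missing or empty agent value identifies a central platform certificate.

```python
from typing import Optional


def describe_certificate(cert_type: str, agent: Optional[str] = None) -> str:
    """Map the alert's type and agent values to the certificate description."""
    if agent:
        # An agent id is present, so the certificate belongs to that agent.
        if cert_type == "agent":
            return f"Agent {agent} certificate"
        if cert_type == "ca":
            return f"Agent {agent} CA certificate"
    else:
        # No agent id: assumed to be a central platform certificate.
        if cert_type == "central":
            return "System certificate"
        if cert_type == "ca":
            return "System CA certificate"
    raise ValueError(f"unexpected combination: type={cert_type!r}, agent={agent!r}")
```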