Alerts

Alert Management

Each action is listed with its WebUI instructions.

View latest alerts

    • Bell Icon (toolbar)

View all alerts

    • Bell Icon (toolbar) > View all notifications

Navigate to alerted resource

    • Bell Icon (toolbar) > [Alert]

    • Bell Icon (toolbar) > View all notifications > [Alert]

Acknowledge alert

    • Bell Icon (toolbar) > [Alert] Overflow Menu > Mark Read

    • Bell Icon (toolbar) > View all notifications > [Alert] Overflow Menu > Mark Read

Delete alert

    • Bell Icon (toolbar) > View all notifications > [Alert] Overflow Menu > Delete

Note: Only alerts which have been marked as read can be deleted.

Alert Definitions

The alerts are listed here by their KeySafe 5 alert type.

The "nShield Monitor alert" row, where present, indicates that the alert is similar, although not necessarily identical, to a legacy alert from nShield Monitor.

Other data returned

Data marked with "*" are returned only as part of the summary and not as a specific alert record field. For more information about what data is returned, see Labels.

HSM PSU Failure

An HSM PSU failed error condition has occurred.

Valid parameters

for

OpenMetrics used

nshield_error_conditions(source="psu_failed")

Other data returned

esn

nShield Monitor alert

NShieldPowerSupplyFailure

HSM Fan Failure

An HSM fan failed error condition has occurred.

Valid parameters

for

OpenMetrics used

nshield_error_conditions(source="fanX")

Other data returned

esn, source*

HSM Chassis Battery

An HSM chassis battery error condition has occurred.

Valid parameters

for

OpenMetrics used

nshield_error_conditions(source="chassis_battery_low")

Other data returned

esn

HSM Fan Speed

The alert is triggered if a fan speed drops below the minimum limit or exceeds the maximum limit.

Valid parameters

for, over

OpenMetrics used

nshield_fan_speed_rpm, nshield_fan_speed_limit_rpm

Other data returned

esn, fan_id

nShield Monitor alert

NShieldXCFanSpeedZero
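The fan speed check described above can be sketched in Python as follows. This is only an illustration: the function name, the sample values, and the assumption that nshield_fan_speed_limit_rpm yields a (minimum, maximum) pair for each fan_id are hypothetical, and KeySafe 5 may evaluate the limits differently.

```python
from typing import Dict, List, Tuple


def fans_out_of_range(
    speeds_rpm: Dict[str, float],
    limits_rpm: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Return the fan_id of every fan whose speed is outside its limits.

    speeds_rpm maps fan_id -> nshield_fan_speed_rpm value.
    limits_rpm maps fan_id -> (minimum, maximum); this shape is an
    assumption about how nshield_fan_speed_limit_rpm is reported.
    """
    failing = []
    for fan_id, rpm in sorted(speeds_rpm.items()):
        low, high = limits_rpm[fan_id]
        if rpm < low or rpm > high:
            failing.append(fan_id)
    return failing


# Hypothetical sample values, not taken from a real HSM.
speeds = {"fan1": 0.0, "fan2": 5200.0, "fan3": 15000.0}
limits = {fan_id: (1000.0, 12000.0) for fan_id in speeds}
print(fans_out_of_range(speeds, limits))
```

With these sample values, fan1 is below the minimum and fan3 is above the maximum, so both would be reported.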

HSM Memory Usage Percentage

This is the sum of the module kernel and user memory, expressed as a percentage of the total amount of available memory.

Valid parameters

min, max, for, over

OpenMetrics used

nshield_module_mem_bytes,

nshield_module_mem_alloc_kernel_bytes,

nshield_module_mem_alloc_user_bytes

Other data returned

esn

nShield Monitor alert

memoryUsageHighAlert / memoryUsageOkAlert
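As a rough illustration of the calculation described for this alert, the following Python sketch derives the percentage from the three metric values. The function name and sample values are hypothetical, and the sketch assumes that nshield_module_mem_bytes reports the total amount of available memory.

```python
def memory_usage_percent(
    total_bytes: float, kernel_bytes: float, user_bytes: float
) -> float:
    """Return kernel plus user memory as a percentage of total memory.

    The arguments correspond to nshield_module_mem_bytes,
    nshield_module_mem_alloc_kernel_bytes and
    nshield_module_mem_alloc_user_bytes (assumed mapping).
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * (kernel_bytes + user_bytes) / total_bytes


# Hypothetical sample values, not taken from a real HSM.
usage = memory_usage_percent(total_bytes=1000.0, kernel_bytes=300.0, user_bytes=450.0)
print(usage)
```

The resulting percentage is the value that would be compared against the min and max parameters.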

HSM Temperature Percentage

The temperature of any sensor, expressed as a percentage of its reported maximum value (calculated from 0°C), is over the specified maximum. The maximum allowed value is 150 percent.

Valid parameters

max, for, over

OpenMetrics used

nshield_temperature_celsius,

nshield_temperature_limit_celsius

Other data returned

esn, sensor*

nShield Monitor alert

NShieldTemperaturePeak
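The percentage calculation for this alert can be sketched as below. The function names and sample readings are hypothetical, and the sketch assumes that the 150 percent ceiling applies to the max parameter and that "calculated from 0°C" means 0°C corresponds to 0 percent.

```python
MAX_ALLOWED_PERCENT = 150.0  # ceiling from the description; assumed to bound "max"


def temperature_percent(celsius: float, limit_celsius: float) -> float:
    """Temperature as a percentage of the sensor limit, measured from 0 degrees C.

    The arguments correspond to nshield_temperature_celsius and
    nshield_temperature_limit_celsius (assumed mapping).
    """
    if limit_celsius <= 0:
        raise ValueError("limit_celsius must be positive")
    return 100.0 * celsius / limit_celsius


def temperature_alert(celsius: float, limit_celsius: float, max_percent: float) -> bool:
    """Return True when the percentage is over the configured maximum."""
    if not 0 < max_percent <= MAX_ALLOWED_PERCENT:
        raise ValueError("max_percent must be greater than 0 and at most 150")
    return temperature_percent(celsius, limit_celsius) > max_percent


# Hypothetical sample readings.
print(temperature_percent(45.0, 90.0))
print(temperature_alert(85.0, 100.0, max_percent=80.0))
```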

HSM Queue Percentage

The job queue length, expressed as a percentage of the job queue limit, is under the specified minimum or over the specified maximum value.

Valid parameters

min, max, for, over

OpenMetrics used

nshield_queue_in_progress,

nshield_queue_length_limit

Other data returned

esn, vcm

nShield Monitor alerts

DeviceNShieldUtilizationOverloads / DeviceNShieldUtilizationPeakEvent
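A minimal Python sketch of the queue percentage check follows. The function names and sample values are hypothetical, and the sketch assumes that nshield_queue_in_progress is the current queue length and that either bound may be left unset.

```python
from typing import Optional


def queue_percent(in_progress: float, length_limit: float) -> float:
    """Queue length as a percentage of the queue length limit.

    The arguments correspond to nshield_queue_in_progress and
    nshield_queue_length_limit (assumed mapping).
    """
    if length_limit <= 0:
        raise ValueError("length_limit must be positive")
    return 100.0 * in_progress / length_limit


def outside_bounds(
    value: float, minimum: Optional[float] = None, maximum: Optional[float] = None
) -> bool:
    """Return True if value is under minimum or over maximum (unset bounds are ignored)."""
    below = minimum is not None and value < minimum
    above = maximum is not None and value > maximum
    return below or above


# Hypothetical sample values.
percent = queue_percent(in_progress=30, length_limit=120)
print(percent, outside_bounds(percent, maximum=20.0))
```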

HSM Objects Count

The number of created objects is under the specified minimum or over the specified maximum value.

Valid parameters

min, max, for, over

OpenMetrics used

nshield_objects_stored_total,

nshield_objects_destroyed_total

Other data returned

esn, vcm

nShield Monitor alert

DeviceNShieldHigHObjectCount

Host Hardserver

The host hardserver has failed to communicate recently.

Valid parameters

for

OpenMetrics used

nshield_hardserver_liveness

Other data returned

host

nShield Monitor alert

ClientHostHardserverFailure

HSM Liveness

The HSM has failed to respond and supply metrics.

Valid parameters

for

OpenMetrics used

nshield_hsm_liveness

Other data returned

esn, vcm

nShield Monitor alert

DeviceConnStatus

Host Liveness

The host has failed to communicate recently.

Valid parameters

for

OpenMetrics used

nshield_host_liveness

Other data returned

host

Licence Expiry

The KeySafe 5 licence will expire in less than the specified minimum time.

Valid parameters

min

OpenMetrics used

keysafe5_licence_expiry

Other data returned

licence

HSM Client Licences Remaining

The number of crypto client licences remaining is less than the specified minimum.

Valid parameters

min, for, over

OpenMetrics used

nshield_current_crypto_clients,

nshield_current_crypto_clients_limit

Other data returned

esn
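The remaining-licence comparison can be sketched as below. The function names and sample values are hypothetical, and the sketch assumes that nshield_current_crypto_clients counts the licences in use while nshield_current_crypto_clients_limit is the licensed maximum.

```python
def client_licences_remaining(current_clients: int, clients_limit: int) -> int:
    """Licences still available: the limit minus the clients currently in use.

    The arguments correspond to nshield_current_crypto_clients and
    nshield_current_crypto_clients_limit (assumed mapping).
    """
    return clients_limit - current_clients


def licences_alert(current_clients: int, clients_limit: int, minimum: int) -> bool:
    """Return True when fewer than `minimum` licences remain."""
    return client_licences_remaining(current_clients, clients_limit) < minimum


# Hypothetical sample values.
print(client_licences_remaining(8, 10))
print(licences_alert(8, 10, minimum=3))
```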

Certificate Expiry

A KeySafe 5 certificate will expire in less than the specified minimum time.

Valid parameters

min

OpenMetrics used

keysafe5_certificate_expiry

Other data returned

type, agent (see note)

Note: The type and agent values together determine which of the following certificates is expiring:

    • Central platform certificates have type "central" or type "ca" (System certificate, System CA certificate)

    • Agent certificates have type "agent" or type "ca" and also have an agent id (Agent <agent id> certificate, Agent <agent id> CA certificate)
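The mapping from the type and agent values to a certificate can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that central platform certificates carry no agent id, which the text implies but does not state outright.

```python
from typing import Optional


def describe_certificate(cert_type: str, agent: Optional[str] = None) -> str:
    """Name the expiring certificate from its type and optional agent id."""
    if agent:
        # An agent id is present, so the certificate belongs to that agent.
        names = {
            "agent": f"Agent {agent} certificate",
            "ca": f"Agent {agent} CA certificate",
        }
    else:
        # No agent id: assumed to be a central platform certificate.
        names = {
            "central": "System certificate",
            "ca": "System CA certificate",
        }
    if cert_type not in names:
        raise ValueError(f"unexpected type {cert_type!r} for agent {agent!r}")
    return names[cert_type]


# "a1" is a made-up agent id used only for illustration.
print(describe_certificate("ca"))
print(describe_certificate("ca", agent="a1"))
```

The type "ca" appears in both cases, so the presence of an agent id is what distinguishes the System CA certificate from an agent's CA certificate.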