Logging, debugging, and diagnostics

This appendix describes the settings and tools you can use to access the logging and debugging information generated by the Security World Software. It also explains how to obtain system information using the nfdiag command-line utility.

Logging and debugging

The current release of Security World Software uses controls for logging and debugging that differ from those used in previous releases. However, settings you made in previous releases to control logging and debugging are still generally supported in the current release, although in some situations the output is now formatted differently.
Some text editors, such as Notepad, can cause NFLOG to stop working if the NFLOG file is open at the same time as the hardserver is writing the logs.

Environment variables to control logging

The Security World for nShield generates logging information that is configured through a set of four environment variables:

NFLOG_FILE

This environment variable specifies the name of a file (or a file descriptor, if prefixed with the & character) to which logging information is written. The default is stderr (the equivalent of &2).

Ensure that you have permissions to write to the file specified by NFLOG_FILE.

NFLOG_SEVERITY

This environment variable specifies a minimum severity level for logging messages to be written (all log messages less severe than the specified level are ignored). The level can be one of (in order of greatest to least severity):

  1. FATAL

  2. SEVERE

  3. ERROR

  4. WARNING

  5. NOTIFICATION

  6. `DEBUG`N, where N can be an integer from 1 to 10 inclusive that specifies increasing levels of debugging detail, with 10 representing the greatest level of detail, although the type of output depends on the application being debugged.

    The increasingly detailed information provided by different levels of `DEBUG`N is only likely to be useful during debugging, and we recommend not setting the severity level to `DEBUG`N unless you are directed to do so by Support.

    The default severity level is WARNING.
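The threshold behavior described above can be illustrated with a short Python sketch. It only models the ordering of the severity names listed in this section; the function name `is_logged` is hypothetical and is not part of the Security World Software.

```python
# Severity names from greatest to least severity, as listed above.
SEVERITIES = ["FATAL", "SEVERE", "ERROR", "WARNING", "NOTIFICATION"] + [
    f"DEBUG{n}" for n in range(1, 11)
]


def is_logged(message_level: str, threshold: str = "WARNING") -> bool:
    """Return True if a message is at least as severe as the threshold."""
    return SEVERITIES.index(message_level) <= SEVERITIES.index(threshold)


print(is_logged("ERROR"))             # ERROR is more severe than WARNING
print(is_logged("NOTIFICATION"))      # less severe than the default threshold
print(is_logged("DEBUG3", "DEBUG5"))  # DEBUG5 admits DEBUG1 through DEBUG5
```

With the default threshold of WARNING, only FATAL, SEVERE, ERROR, and WARNING messages pass the filter.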

NFLOG_DETAIL

This environment variable takes a hexadecimal value from a bitmask of detail flags as described in the following table (the logdetail flags are also used in the hardserver configuration file to control hardserver logging; see: server_settings):

Hexadecimal flag Function logdetail flags

0x00000001

This flag shows the external time (that is, the time according to your machine’s local clock) with the log entry. It is on by default.

external_time

0x00000002

This flag shows the external date (that is, the date according to your machine’s local clock) with the log entry.

external_date

0x00000004

This flag shows the external process ID with the log entry.

external_pid

0x00000008

This flag shows the external thread ID with the log entry.

external_tid

0x00000010

This flag shows the external time_t (that is, the time in machine clock ticks rather than local time) with the log entry.

external_time_t

0x00000020

This flag shows the stack backtrace with the log entry.

stack_backtrace

0x00000040

This flag shows the stack file with the log entry.

stack_file

0x00000080

This flag shows the stack line number with the log entry.

stack_line

0x00000100

This flag shows the message severity (a severity level as used by the NFLOG_SEVERITY environment variable) with the log entry. It is on by default.

msg_severity

0x00000200

This flag shows the message category (a category as used by the NFLOG_CATEGORIES environment variable) with the log entry.

msg_categories

0x00000400

This flag shows message writeables, extra information that can be written to the log entry, if any such exist. It is on by default.

msg_writeable

0x00000800

This flag shows the message file in the original library. This flag is likely to be most useful in conjunction with Security World Software-supplied example code that has been written to take advantage of this flag.

msg_file

0x00001000

This flag shows the message line number in the original library. This flag is likely to be most useful in conjunction with example code we have supplied that has been written to take advantage of this flag.

msg_line

0x00002000

This flag shows the date and time in UTC (Coordinated Universal Time) instead of local time.

options_utc

0x00004000

This flag shows the full path to the file that issued the log messages.

options_fullpath

0x00008000

This flag includes the number of microseconds in the timestamp.

options_time_us

0x00010000

This flag enables logging of potentially secret values in generic stub log output.

msg_secrets
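Because NFLOG_DETAIL is a bitmask, the value for a combination of flags is the bitwise OR of the individual flag values. The following Python sketch shows the arithmetic, using the flag names and values from the table above. The helper name `combine_detail_flags` is hypothetical, and printing the result with a `0x` prefix is an assumption about the accepted input format.

```python
# Flag values copied from the table above.
LOGDETAIL_FLAGS = {
    "external_time": 0x00000001,
    "external_date": 0x00000002,
    "external_pid": 0x00000004,
    "external_tid": 0x00000008,
    "external_time_t": 0x00000010,
    "stack_backtrace": 0x00000020,
    "stack_file": 0x00000040,
    "stack_line": 0x00000080,
    "msg_severity": 0x00000100,
    "msg_categories": 0x00000200,
    "msg_writeable": 0x00000400,
    "msg_file": 0x00000800,
    "msg_line": 0x00001000,
    "options_utc": 0x00002000,
    "options_fullpath": 0x00004000,
    "options_time_us": 0x00008000,
    "msg_secrets": 0x00010000,
}


def combine_detail_flags(*names: str) -> int:
    """Return the bitwise OR of the named logdetail flags."""
    value = 0
    for name in names:
        value |= LOGDETAIL_FLAGS[name]
    return value


# The three flags documented as "on by default".
defaults = combine_detail_flags("external_time", "msg_severity", "msg_writeable")
print(f"0x{defaults:08x}")  # prints 0x00000501
```

Adding external_date and options_utc to the same selection gives 0x00002503.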

NFLOG_CATEGORIES

This environment variable takes a colon-separated list of categories on which to filter log messages (categories may contain the wild-card characters * and ?). If you do not supply any values, then all categories of messages are logged. This table lists the available categories:

Category Description

nflog

Logs all general messages relating to nflog.

nflog-stack

Logs messages from StackPush and StackPop functions.

memory-host

Logs messages concerning host memory.

memory-module

Logs messages concerning module memory.

gs-stub

Logs general generic stub messages. (Setting this category works like using the dbg_stub flag with the logging functionality found in previous Security World Software releases.)

gs-stubbignum

Logs bignum printing messages. (Setting this category works like using the dbg_stubbignum flag with the logging functionality found in previous Security World Software releases.)

gs-stubinit

Logs generic stub initialization routines. (Setting this category works like using the dbg_stubinit flag with the logging functionality found in previous Security World Software releases.)

gs-dumpenv

Logs environment variable dumps. (Setting this category works like using the dbg_dumpenv flag with the logging functionality found in previous Security World Software releases.)

nfkm-getinfo

Logs nfkm-getinfo messages.

nfkm-newworld

Logs messages about world generation.

nfkm-admin

Logs operations using the Administrator Card Set.

nfkm-kmdata

Logs file operations in the kmdata directory.

nfkm-general

Logs general NFKM library messages.

nfkm-keys

Logs key loading operations.

nfkm-preload

Logs preload operations.

nfkm-ppmk

Logs softcard operations.

serv-general

Logs general messages about the local hardserver.

serv-client

Logs messages relating to clients or remote hardservers.

serv-internal

Logs severe or fatal internal errors.

serv-startup

Logs fatal startup errors.

servdbg-stub

Logs all generic stub debugging messages.

servdbg-env

Logs generic stub environment variable messages.

servdbg-underlay

Logs messages from the OS-specific device driver interface.

servdbg-statemach

Logs information about the server’s internal state machine.

servdbg-perf

Logs messages about the server’s internal queuing.

servdbg-client

Logs external messages generated by the client.

servdbg-messages

Logs server command dumps.

servdbg-sys

Logs OS-specific messages.

pkcs11-sam

Logs all security assurance messages from the PKCS #11 library.

pkcs11

Logs all other messages from the PKCS #11 library.

rqcard-core

Logs all card-loading library operations that involve standard message passing (including slot polling).

rqcard-ui

Logs all card-loading library messages from the current user interface.

rqcard-logic

Logs all card-loading library messages from specific logics.
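To get a feel for how a colon-separated category list with wildcards selects categories, the following Python sketch uses the standard `fnmatch` module. This is only an approximation: it assumes that NFLOG wildcard matching behaves like shell-style globbing for * and ?, which this guide does not state, and `fnmatch` additionally gives special meaning to square brackets. The function name `selected_categories` is hypothetical, and only a subset of the categories is included.

```python
from fnmatch import fnmatchcase

# A subset of the categories listed in the table above.
CATEGORIES = [
    "nflog", "nflog-stack", "serv-general", "serv-client",
    "servdbg-stub", "servdbg-env", "servdbg-perf", "pkcs11-sam", "pkcs11",
]


def selected_categories(spec: str, categories=CATEGORIES) -> list:
    """Return the categories matched by a colon-separated pattern list."""
    patterns = [p for p in spec.split(":") if p]
    if not patterns:
        # No value supplied: all categories are logged.
        return list(categories)
    return [c for c in categories if any(fnmatchcase(c, p) for p in patterns)]


print(selected_categories("servdbg-*:pkcs11"))
print(selected_categories("serv-?lient"))
```

Under these assumptions, `servdbg-*:pkcs11` selects the three servdbg categories in the list plus pkcs11, but not pkcs11-sam.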

You can set a minimum level of hardserver logging by supplying one of the values for the NFLOG_SEVERITY environment variable in the hardserver configuration file, and you can likewise specify one or more values for the NFLOG_CATEGORIES environment variable. For detailed information about the hardserver configuration file settings that control logging, see server_settings.

If none of the four environment variables are set, the default behavior is to log nothing, unless this is overridden by any individual library. If any of the four variables are set, all unset variables are given default values.
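As an illustration, the four variables can be set for a single child process rather than for the whole session. The Python sketch below builds such an environment. The function name `nflog_environment`, the log file name, and the application name in the final comment are hypothetical, and writing NFLOG_DETAIL with a `0x` prefix is an assumption about the accepted format.

```python
import os


def nflog_environment(log_file, severity="WARNING", detail=None, categories=()):
    """Return a copy of the current environment with NFLOG_* variables set."""
    env = dict(os.environ)
    env["NFLOG_FILE"] = log_file
    env["NFLOG_SEVERITY"] = severity
    if detail is not None:
        env["NFLOG_DETAIL"] = f"0x{detail:08x}"
    if categories:
        env["NFLOG_CATEGORIES"] = ":".join(categories)
    return env


env = nflog_environment("nflog.txt", "NOTIFICATION", 0x503, ["nfkm-*", "pkcs11"])
for name in ("NFLOG_FILE", "NFLOG_SEVERITY", "NFLOG_DETAIL", "NFLOG_CATEGORIES"):
    print(f"{name}={env[name]}")

# The environment could then be passed to a child process, for example:
# subprocess.run(["your-application"], env=env)
```

Scoping the variables to one process avoids leaving debug-level logging enabled for other applications on the same host.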

Logging and debugging information for PKCS #11

To produce PKCS #11 debug output, set the CKNFAST_DEBUG variable to a value from 1 through 11. The greater the value, the more detailed the debug information. A value of 7 is a reasonable compromise between too little and too much debug information. A value of 0 switches the debug output off.

The NFLOG_CATEGORIES environment variable takes a colon-separated list of categories on which to filter log messages (categories may contain the wildcard characters * and ?).

The following table maps PKCS #11 debug level numbers to the corresponding NFLOG_SEVERITY value:

| PKCS #11 debug level | PKCS #11 debug meaning | NFLOG_SEVERITY value | Output in log |
|---|---|---|---|
| 0 | DL_None | NONE | |
| 1 | DL_EFatal | FATAL | "Fatal error:" |
| 2 | DL_EError | ERROR | "Error:" |
| 3 | DL_Fixup | WARNING | "Fixup:" |
| 4 | DL_Warning | WARNING | "Warning:" |
| 5 | DL_EApplic | ERROR | "Application error:" |
| 6 | DL_Assumption | NOTIFICATION | "Unsafe assumption:" |
| 7 | DL_Call | DEBUG2 | ">> " |
| 8 | DL_Result | DEBUG3 | "< " |
| 9 | DL_Arg | DEBUG4 | "> " |
| 10 | DL_Detail | DEBUG5 | "D " |
| 11 | DL_DetailMutex | DEBUG6 | "DM " |

Hardserver debugging

Hardserver debugging is controlled by specifying one or more servdbg-* categories (from the NFLOG_CATEGORIES environment variable) in the hardserver configuration file; see server_settings. However, unless you also set the NFAST_DEBUG environment variable to a value in the range 1 – 7, no debugging output is produced (regardless of whether or not you specify servdbg-* categories in the hardserver configuration file). This behavior helps guard against the additional CPU load that debugging imposes: you can set the desired servdbg-* categories in the hardserver configuration file, and then enable or disable debugging by setting the NFAST_DEBUG environment variable.

The NFAST_DEBUG environment variable controls debugging for the generic stub or hardserver. The value is an octal number in the range 1 – 7. It refers bitwise to a number of flags:

| Flag | Result |
|---|---|
| 1 | Generic stub debugging value. |
| 2 | Show bignum values. |
| 4 | Show initial NewClient or ExistingClient command and response. |

For example, if the NFAST_DEBUG environment variable is set to 6, flags 2 and 4 are used.

If the NFAST_DEBUG environment variable value includes flag 1 (Generic stub debugging value), the logdetail value in the hardserver configuration file (one of the values for the NFLOG_DETAIL environment variable) controls the level of detail printed.

Do not set the NFAST_DEBUG environment variable to a value outside the range 1 – 7. If you set it to any other value, the hardserver does not start.
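The bitwise interpretation of NFAST_DEBUG can be checked with a short Python sketch. It decodes an octal value into the flags described above and rejects values outside the range 1 – 7. The function name `decode_nfast_debug` is hypothetical and exists only for illustration.

```python
# Flag bits and their meanings, as described above.
NFAST_DEBUG_FLAGS = {
    1: "Generic stub debugging value",
    2: "Show bignum values",
    4: "Show initial NewClient or ExistingClient command and response",
}


def decode_nfast_debug(value: str) -> list:
    """Decode an octal NFAST_DEBUG value into the flags it enables."""
    number = int(value, 8)  # the value is an octal number
    if not 1 <= number <= 7:
        raise ValueError("NFAST_DEBUG must be in the range 1 to 7")
    return [text for bit, text in NFAST_DEBUG_FLAGS.items() if number & bit]


for flag in decode_nfast_debug("6"):
    print(flag)  # flags 2 and 4
```

Validating the value before starting the hardserver is worthwhile because a value outside the range prevents the hardserver from starting.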

Debugging information for Java

This section describes how you can specify the debugging information generated by Java.

Setting the Java debugging information level

In order to make the Java generic stub output debugging information, set the Java property NFJAVA_DEBUG. The debugging information for NFJAVA, NFAST, and other libraries (for example, KMJAVA) can all use the same log file and have their entries interleaved.

You set the debugging level as a decimal number. To determine this number:

  1. Select the debugging information that you want from the following list:

    NONE = 0x00000000 (debugging off)
    MESS_NOTIFICATIONS = 0x00000001 (occasional messages including important errors)
    MESS_VERBOSE = 0x00000002 (all messages)
    MESS_RESOURCES = 0x00000004 (resource allocations)
    FUNC_TRACE = 0x00000008 (function calls)
    FUNC_VERBOSE = 0x00000010 (function calls + arguments)
    REPORT_CONTEXT = 0x00000020 (calling context, e.g. ThreadID and time)
    FUNC_TIMINGS = 0x00000040 (function timings)
    NFJAVA_DEBUGGING = 0x00000080 (Output NFJAVA debugging info)
  2. Add together the hexadecimal value associated with each type of debugging information.

    For example, to set NFJAVA_DEBUGGING and MESS_NOTIFICATIONS, add 0x00000080 and 0x00000001 to make 0x00000081.

  3. Convert the total to a decimal and specify this as the value for the variable.

    For example, to set NFJAVA_DEBUGGING and MESS_NOTIFICATIONS, include the line:

    NFJAVA_DEBUG=129

    For NFJAVA to produce output, NFJAVA_DEBUG must be set to at least NFJAVA_DEBUGGING + MESS_NOTIFICATIONS. Other typical values are:

    • 255: All output

    • 130: nfjava debugging and all messages (NFJAVA_DEBUGGING and MESS_VERBOSE)

    • 20: function calls and arguments and resource allocations (FUNC_VERBOSE and MESS_RESOURCES)
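The steps above amount to combining the selected flag values and writing the result in decimal. The following Python sketch reproduces the arithmetic for the values quoted in this section. The helper name `nfjava_debug_level` is hypothetical.

```python
# Flag values copied from the list in step 1.
NFJAVA_FLAGS = {
    "MESS_NOTIFICATIONS": 0x00000001,
    "MESS_VERBOSE": 0x00000002,
    "MESS_RESOURCES": 0x00000004,
    "FUNC_TRACE": 0x00000008,
    "FUNC_VERBOSE": 0x00000010,
    "REPORT_CONTEXT": 0x00000020,
    "FUNC_TIMINGS": 0x00000040,
    "NFJAVA_DEBUGGING": 0x00000080,
}


def nfjava_debug_level(*names: str) -> int:
    """Return the decimal NFJAVA_DEBUG value for the named flags."""
    level = 0
    for name in names:
        level |= NFJAVA_FLAGS[name]
    return level


print(nfjava_debug_level("NFJAVA_DEBUGGING", "MESS_NOTIFICATIONS"))  # prints 129
print(nfjava_debug_level("NFJAVA_DEBUGGING", "MESS_VERBOSE"))        # prints 130
print(nfjava_debug_level("FUNC_VERBOSE", "MESS_RESOURCES"))          # prints 20
print(nfjava_debug_level(*NFJAVA_FLAGS))                             # prints 255
```

Because each flag occupies a distinct bit, a bitwise OR gives the same result as the addition described in step 2.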

Setting Java debugging with the command line

You can set the Java debug options on the command line by preceding each of them with -D. Use the NFJAVA_DEBUGFILE property to direct output to a given file name, for example:

java -DNFJAVA_DEBUGFILE=myfile -DNFJAVA_DEBUG=129 -classpath .... classname

Do not set NFJAVA_DEBUG or NFJAVA_DEBUGFILE in the environment because Java does not pick up variables from the environment.

If NFJAVA_DEBUGFILE is not set, the standard error stream System.err is used.

Set these variables only when developing code or at the request of Support.
Debug output contains all commands and replies sent to the hardserver in their entirety, including all plain texts and the corresponding cipher texts as applicable.
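If you launch Java applications from a script, the two properties can be assembled into the command line programmatically. The Python sketch below builds an argument list of that form. The function name `java_debug_command` and the class path and class name in the example are hypothetical placeholders.

```python
def java_debug_command(classname, classpath, debug_level, debug_file=None):
    """Build a java command line that sets the NFJAVA debug properties."""
    command = ["java"]
    if debug_file is not None:
        # Without this property, output goes to System.err.
        command.append(f"-DNFJAVA_DEBUGFILE={debug_file}")
    command.append(f"-DNFJAVA_DEBUG={debug_level}")
    command += ["-classpath", classpath, classname]
    return command


# Hypothetical class path and class name, for illustration only.
print(" ".join(java_debug_command("MyApp", "app.jar", 129, "myfile")))
```

Keeping the debug level as a parameter makes it easy to omit these properties in production, in line with the guidance above.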

Diagnostics and system information

Besides the diagnostic tools described in this section, we also supply a performance tool that you can use to test Web server performance both with and without an nShield HSM. This tool is supplied separately. If you require a copy, contact your Sales representative.

nfdiag: diagnostics utility

The nfdiag command-line utility is a diagnostics tool that gathers information about the system on which it is executed. It can save this information to either a .zip file or a text file.

Under normal operating conditions, you do not need to run nfdiag. You can run nfdiag before contacting Support and include its output file with any problem report.

Usage

Run nfdiag with the standard -h|--help option to display information about the options and parameters that control the program’s behavior.

If you want to supply additional diagnostic files, run:

nfdiag -e|--extrainfo <FILENAME>

You can only attach plaintext files.

The nfdiag command-line utility is an interactive tool. When you run it, it prompts you to supply the following information:

Option Actions to take

which application(s) you are using

Identify all application software installed on the machine on which any problem with the nShield product occurs.

what APIs you are using

Describe any custom software, especially any interaction it has with the nShield security system.

a description of the problem

Include as much detail as possible, including any error messages you have seen.

a Support ticket number (if you have one)

When you contact Support you are supplied with a Support ticket number. Enter this number to help Support expedite the collection of any information you have sent.

a contact email address

Supply an email address that has as few email or spam filters as possible so that any additional files that Support sends to you are not blocked. We use the email address you supply here only for communication directly related to your problem report.

a contact name

Enter your name (or the name of an appropriate person for contact by Support).

a contact telephone number

Include the appropriate country code and any region code for your contact telephone number.

Except for a Support ticket number, nfdiag requires non-NULL answers to all its prompts for information.

Supplying this information helps nfdiag capture as much relevant information as possible for any problem report to Support. As you supply information at each prompt in turn, press Enter to confirm the information and continue to the next prompt. Information you supply at a prompt cannot extend over multiple lines. If you need to supply more detail than fits on one line, include it in an additional file attached with the -e|--extrainfo option.

By default, nfdiag runs in verbose mode, providing feedback on each command that it executes and which log files are available. If the system is unable to execute a command, the verbose output from nfdiag shows where commands are stalling or waiting to time out.

At any time while nfdiag is running, you can type Ctrl-C to cancel its current commands and re-run it.

Output

After you have finished supplying information for each required prompt, nfdiag generates a plain text output file and displays its file name.

If the file exists, nfdiag automatically includes this file in its output. If the file does not exist, nfdiag warns you that it could not process this file. This warning does not affect the validity of the generated output file.

When complete, this output file contains the following:

  • The information supplied interactively to nfdiag when run

  • Details about the client machine

  • Details about any environment variables

  • Output from the following command-line utilities:

    • enquiry

    • stattree

    • ncversions

    • nfkminfo

  • The contents of the following log files (if they are available):

    • hardserver.log

    • keysafe.log

    • cmdadp.log

    • ncsnmpd.log

  • Any attached diagnostic files

Because the contents of the output file are plain text, they are human readable. You can choose to view the output file to ensure no sensitive information has been included.

The nfdiag utility does not capture any passphrases in the output file.

nfkminfo: information utility

The nfkminfo utility displays information about the Security World and the keys and card sets associated with it.

Usage

nfkminfo -w|--world-info [-r|--repeat] [-p|--preload-client-id]
nfkminfo -k|--key-list [<APPNAME> [<IDENT>]]
nfkminfo -l|--name-list [<APPNAME> [<APPNAME>...]]
nfkminfo [-c|--cardset-list]|[-s|--softcard-list] [<TOKENHASH>]
nfkminfo --cardset-list [<TOKENHASH>] --key-list [<APPNAME>[<APPNAME>]]|--name-list [<APPNAME>[<IDENT>...]]

Security World options

-w|--world-info

This option specifies that you want to display general information about the Security World. This option is the default and need not be included explicitly.

-r|--repeat

This option displays the information repeatedly. There is a pause at the end of each set of information. The information is displayed again when you press Enter.

-p|--preload-client-id

This option displays the preloaded client ID value, if any.

Key, card set, and softcard options

-k|--key-list [<APPNAME>[<APPNAME>]]

This option lists keys without key names. If <APPNAME> is specified, only keys for these applications are listed.

-l|--name-list [<APPNAME>[<IDENT>]]

This option lists keys with their names. If <APPNAME> is specified, only keys for these applications are listed. If <IDENT> is specified, only the keys with the specified identifier are listed.

-c|--cardset-list [<TOKENHASH>]

If <TOKENHASH> is not specified, this option lists the card sets associated with the Security World. The output is similar to this:

Cardset list - 1 cardsets: (P)ersistent/(N)ot, (R)emoteable/(L)ocal-only
Operator logical token hash      k/n timeout name
<hash>                           1/1 none-PL <name>

If <TOKENHASH> is specified, this option lists the details of the card set identified by the hash. The output is similar to this:

Cardset
 name        "name"
 k-out-of-n  1/1
 flags       Persistent PINRecoveryForbidden(disabled) !RemoteEnabled
 timeout     none
 card names   ""
 hkltu       hash
 gentime 2005-10-14 10:56:54
Keys protected by cardset hash:
 AppName app Ident keyident
 AppName app Ident keyident
 ...     ...  ...   ...

-s|--softcard-list [<TOKENHASH>]

This option works like the -c|--cardset-list option, except it lists softcards instead of card sets. If <TOKENHASH> is not specified, this option lists the softcards associated with the Security World.

Security World output info

If you run nfkminfo with the -w|--world-info option, it displays information similar to that shown in this example:

generation 1
state      0x70000 Initialised Usable Recovery !PINRecovery
!ExistingClient !RTC !NVRAM !FTO !SEEDebug
n_modules  1
hknso      hash_knso
hkm        hash_km
hkmwk      hash_knwk
hkre       hash_kre
hkra       hash_kra
ex.client  none

...
...
Module #1
 generation    1
 state         0x1 Usable
 flags         0x10000 ShareTarget
 n_slots       2
 esn           34F3-9CB4-753B
 hkml          hash_kml
 Module #1 Slot #0 IC 11
 generation    1
 phystype      SmartCard
 slotlistflags 0x2
 state         0x4 Operator
 flags         0x20000 RemoteEnabled
 shareno       2
 shares
 error         OK
 Cardset
 name          "fred"
 k-out-of-n    1/2
 flags         NotPersistent
 timeout       none
 card names    "" ""
 hkltu         hash_kt
 Module #1 Slot #1 IC 0
 generation    1
 phystype      SmartCard
 slotlistflags 0x2 SupportsAuthentication
 state         0x4 Admin
 flags         0x10000 passphrase
 shareno       1
 shares        LTNSO(PIN) LTM(PIN) LTR(PIN) LTNV(PIN) LTRTC(PIN) LTDSEE(PIN)
 LTFTO(PIN)
 error         OK
 No Cardset

No Pre-Loaded Objects

World

nfkminfo reports the following information about the Security World:

generation

This indicates the internal number.

state

This indicates the status of the current world:

Initialised

This indicates that the Security World has been initialized.

Usable

This indicates that there is at least one usable HSM in this Security World on this host.

!Usable

This indicates that there are no usable HSMs in this Security World on this host.

Recovery

This indicates that the Security World has the OCS and softcard replacement and the key recovery features enabled.

!Recovery

This indicates that the Security World has the OCS and softcard replacement and the key recovery features disabled.

AdminAuthRequired

This indicates that additional authorization is required for the following operations:

  • Key generation

  • Public key import

  • Operator cardset creation

  • Softcard creation

This authorization is supplied by presenting any Operator Card or Administrator Card from the same Security World. A passphrase is not required.

ExistingClient

This indicates that there is a Client ID set, for example, by preload. This Client ID is given in the ex.client output if the --preload-client-id flag was supplied.

!ExistingClient

This indicates that no Client ID is set. The ex.client output will be empty.

AlwaysUseStrongPrimes

This indicates that the Security World always generates RSA keys in a manner compliant with FIPS 186-3.

!AlwaysUseStrongPrimes

This indicates that the Security World leaves the choice of RSA key generation algorithm to individual clients.

SEEDebug

This indicates that the Security World has an SEE Debugging delegation key.

!SEEDebug

This indicates the Security World has no SEE Debugging delegation key.

SEEDebugForAll

This indicates no authorization is required for SEE Debugging.

PINRecovery

This indicates that the Security World has the passphrase replacement feature enabled.

!PINRecovery

This indicates that the Security World has the passphrase replacement feature disabled.

FTO

This indicates that the Security World has an FTO delegation key.

!FTO

This indicates that the Security World has no FTO delegation key.

NVRAM

This indicates that the Security World has an NVRAM delegation key.

!NVRAM

This indicates that the Security World has no NVRAM delegation key.

RTC

This indicates that the Security World has an RTC delegation key.

!RTC

This indicates that the Security World has no RTC delegation key.

AuditLogging

This indicates that Audit Logging is enabled for this Security World.

!AuditLogging

This indicates that Audit Logging is not enabled for this Security World.

n_modules

This indicates the number of nShield HSMs connected to this computer.

hknso

This indicates the SHA-1 hash of the Security Officer’s key.

hkm

This indicates the SHA-1 hash of the Security World key.

hkmwk

This indicates the SHA-1 hash of a dummy key used to load the Administrator Card Set (the dummy key is the same on all HSMs that use Security Worlds and is not secret).

hkre

This indicates the SHA-1 hash of the recovery key pair.

hkra

This indicates the SHA-1 hash of the recovery authorization key.

ex.client

This indicates the ClientID required to use any pre-loaded keys and tokens.

k-out-of-n

This indicates the values of K and N for this Security World.

other quora

This indicates the number (quora) of Administrator Cards (K) required to perform certain other functions as configured for this Security World.

ciphersuite

This indicates the name of the Cipher suite that the Security World uses.

Mode

none

This indicates that the Security World is in an unregulated mode. The Security World can be configured to meet the needs of your security policy. This includes, but is not limited to, creating a Security World that is compliant with FIPS 140 Level 2.

fips1402level3

This indicates that the Security World is in a mode compliant with FIPS 140 Level 3.

commoncriteriacmts

This indicates that the Security World is in a mode compliant with Common Criteria Protection Profile EN 419 221-5, for Cryptographic Modules for Trust Services.

Assigned Keys

max usage

This indicates the maximum key usage reauthorization condition for Assigned Keys (common-criteria-cmts mode only).

max timeout

This indicates the maximum key timeout reauthorization condition for Assigned Keys (common-criteria-cmts mode only).
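If you need to check the Security World state flags from a script, the state line of the nfkminfo output can be split into flag names, with a leading ! marking a flag that is not set. The Python sketch below does this for text laid out like the example output in this section. It assumes that the state flags may wrap onto a following line, as they do in that example, and the function name `parse_world_state` is hypothetical. Actual output may be laid out differently.

```python
# State flag names described in this section.
KNOWN_FLAGS = {
    "Initialised", "Usable", "Recovery", "AdminAuthRequired", "ExistingClient",
    "AlwaysUseStrongPrimes", "SEEDebug", "SEEDebugForAll", "PINRecovery",
    "FTO", "NVRAM", "RTC", "AuditLogging",
}

SAMPLE = """generation 1
state      0x70000 Initialised Usable Recovery !PINRecovery
!ExistingClient !RTC !NVRAM !FTO !SEEDebug
n_modules  1
"""


def parse_world_state(output: str) -> dict:
    """Map each world state flag to True (set) or False (prefixed with !)."""
    flags = {}
    in_state = False
    for line in output.splitlines():
        words = line.split()
        if not in_state:
            if words[:1] != ["state"]:
                continue
            in_state = True
            words = words[2:]  # drop the field name and the hexadecimal value
        elif not words or any(w.lstrip("!") not in KNOWN_FLAGS for w in words):
            break  # first line that is not a continuation of the flags
        for word in words:
            flags[word.lstrip("!")] = not word.startswith("!")
    return flags


state = parse_world_state(SAMPLE)
print(state["Usable"], state["PINRecovery"])  # prints True False
```

Only the first state line is read, so the per-module and per-slot state lines later in the output are not mixed into the result.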

Module

For each HSM in the Security World, nfkminfo reports:

generation

This indicates the version of the HSM data.

state

This indicates one of the following:

PreInitMode

This indicates that the HSM is in the pre-initialization state.

InitMode

This indicates that the HSM is in the initialization state.

Unknown

This indicates that the HSM’s state could not be determined.

Usable

This indicates that the HSM is programmed in the current Security World and can be used.

Uninitialized

This indicates that the HSM does not have the Security Officer’s key set and that the HSM must be initialized before use.

Factory

This indicates that the HSM has module key zero only and that the Security Officer’s key is set to the factory default.

Foreign

This indicates that the HSM is from an unknown Security World.

AccelOnly

This indicates that the HSM is acceleration only.

Unchecked

This indicates that, although the HSM appears to be in the current Security World, nfkminfo could not find a module initialization certificate (a module_<ESN> file) for this HSM.

Failed

This indicates that the HSM has failed.

For nShield PCIe HSMs running firmware 2.61.2 and above, use the enquiry utility for further information about the failure reason.

MaintMode

This indicates that the HSM is in the maintenance state.

flags

This displays ShareTarget if the HSM has been initialized to allow reading of remote card sets.

n_slots

This indicates the number of slots on the HSM (there is one slot for each physical smart card reader, one slot for each soft token, one slot for each available Remote Operator slot, and one slot for each associated Dynamic Slot).

esn

This indicates the electronic serial number of the HSM (if the HSM is not in the Usable state, the electronic serial number may not be available).

hkml

This indicates the hash of the HSM signing key (if the HSM is not in the Usable state, this value may not be available).

Slot

For each slot on the HSM, nfkminfo reports:

IC

This indicates the insertion count for this slot (which is 0 if there is no card in the slot).

generation

This indicates the version of the slotinfo structure.

phystype

This indicates the type of slot, which can be one of:

  • SmartCard

  • SoftToken

slotlistflags

These are flags describing the capabilities of the slot. Single letters in parentheses are the flag codes reported by the slotinfo utility:

0x2

(A) SupportsAuthentication
This indicates that the slot supports token-level challenge-response authentication.

0x40000

(R) RemoteSlot
This indicates that the slot is a Remote Operator slot that has been imported from a remote HSM.

0x80000

(D) DynamicSlot
This indicates that it is a Dynamic Slot.

0x100000

(a) Associated
This indicates that a Remote Administration Client has associated a card reader with this Dynamic Slot.

0x200000

(t) TimedOut
This indicates that no response has been received from the smartcard in this Dynamic Slot within the configured timeout.

0x400000

(f) SecureChannelFailed
This indicates that the secure channel between the HSM and the smartcard in this Dynamic Slot has failed in some way.

state

This can be one or more of the following flags:

Blank

This indicates that the smart card in the reader is unformatted.

Admin

This indicates that the smart card in the reader is part of the Administrator Card Set.

Empty

This indicates that there is no smart card in the reader.

Error

This indicates that the smart card in the reader could not be read (the card may be from a different Security World).

Operator

This indicates that the smart card in the reader is an Operator Card.

flags

This displays passphrase if the smart card requires a passphrase.

shareno

This indicates the number of the card within the card set.

shares

If the card in the slot is an Operator Card, no values are displayed for shares.

If the card in the slot is an Administrator Card, values are displayed indicating what key shares are stored on the card. Each share is prefixed with the letters LT (Logical Token), and the remaining letters identify the key (for example, the value LTNSO indicates that a share of KNSO, the Security Officer’s key, is stored on the card).

error

This indicates the error status encountered if the smart card could not be read:

OK

This indicates that there were no errors.

TokenAuthFailed

This indicates that the smart card in the reader failed challenge response authentication (the card may come from a different Security World).

PhysTokenNotPresent

This indicates that there is no card in the reader.

If you purchased a developer kit, you can refer to the relevant developer documentation for a full list of error codes.
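The slotlistflags value reported for a slot is a bitmask of the capability flags listed earlier in this section. The following Python sketch decodes such a value into flag names, using only the values documented here. The function name `decode_slotlistflags` is hypothetical, and any bits not listed in this section are ignored.

```python
# (value, slotinfo flag code, name) as listed under slotlistflags.
SLOTLIST_FLAGS = [
    (0x2, "A", "SupportsAuthentication"),
    (0x40000, "R", "RemoteSlot"),
    (0x80000, "D", "DynamicSlot"),
    (0x100000, "a", "Associated"),
    (0x200000, "t", "TimedOut"),
    (0x400000, "f", "SecureChannelFailed"),
]


def decode_slotlistflags(value: int) -> list:
    """Return the names of the documented flags set in value."""
    return [name for bit, _code, name in SLOTLIST_FLAGS if value & bit]


print(decode_slotlistflags(0x2))       # prints ['SupportsAuthentication']
print(decode_slotlistflags(0x180002))
```

In the second call, the value combines 0x2, 0x80000, and 0x100000, which corresponds to a Dynamic Slot that supports authentication and has a card reader associated with it.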

Card set

If there is an Operator Card in the reader, nfkminfo reports:

name

This indicates the name given to this card set.

k-out-of-n

This indicates the values of K and N for this card set.

flags

This displays one flag from each of the following pairs of flags:

NotPersistent

This indicates that the Operator Card is not persistent.

Persistent

This indicates that the Operator Card is persistent.

NotRemoteEnabled

This indicates that the card in the slot is not from a Remote Operator Card Set.

RemoteEnabled

This indicates that the card in the slot is from a Remote Operator Card Set.

PINRecoveryForbidden(disabled)

This indicates that the card in the slot does not have passphrase replacement enabled. This is always true if passphrase replacement is disabled for the Security World.

PINRecoveryRequired(enabled)

This indicates that the card in the slot does have passphrase replacement enabled.

timeout

This indicates the period of time in seconds after which the HSM automatically removes the Operator Card Set. If timeout is set to none, the Operator Card Set does not time out.

card names

This lists the names of the cards in the set. Not all software can give names to individual cards in a set.

hkltu

This indicates the SHA-1 hash of the secret on the card.

perfcheck: performance measurement checking tool

Use the perfcheck command-line utility to run various tests measuring the cryptographic performance of an nShield HSM.

Run perfcheck with the standard -h|--help option to display information about the options and parameters that control the program’s behavior.

The available tests are grouped into suites:

  • kx (key exchange)

  • keygen (key generation)

  • signing (signing)

  • verify (verification)

  • enc (encryption)

  • dec (decryption)

  • misc (miscellaneous).

To see the list of tests available in a particular suite, run a command of the form:

perfcheck --list suite

For example, to list all the signing tests, run the command:

perfcheck --list signing
>>> Suite `signing' -- Signing (222 tests)
>>>    1 - DSA using RIPEMD160 with 512-bit p and 160-bit q.
>>>    2 - DSA using RIPEMD160 with 1024-bit p and 160-bit q.
>>>    3 - DSA using RIPEMD160 with 2048-bit p and 160-bit q.
>>>    4 - DSA using RIPEMD160 with 3072-bit p and 160-bit q.
>>> ...

In the output, each listed test in the suite is identified with a number.

You can reference a test either by its number or by its name:

  • by test number:

    perfcheck suite:test_number

    To use test 16 of the signing suite:

    perfcheck signing:16

  • by test name:

    perfcheck "exact name"

    Example:

    perfcheck "signing:RSA using RSApPKCS1 with 2048-bit n."

The test numbers change between releases. If you want to rerun tests for comparison, reference the tests by their names.
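
Because the numbers are not stable, it can help to record the names behind the numbers you used. The following Python sketch is illustrative only: it assumes that the output of perfcheck --list keeps the ">>> number - name" layout shown above, and the helper names parse_perfcheck_list and name_reference are hypothetical, not part of the Security World Software.

```python
import re

# Matches lines such as ">>>    2 - DSA using RIPEMD160 with 1024-bit p and 160-bit q."
# Assumption: the listing keeps the ">>> <number> - <name>" layout shown in this guide.
_TEST_LINE = re.compile(r"^>>>\s+(\d+)\s+-\s+(.*\S)\s*$")


def parse_perfcheck_list(text):
    """Return {test_number: test_name} parsed from `perfcheck --list <suite>` output."""
    tests = {}
    for line in text.splitlines():
        match = _TEST_LINE.match(line)
        if match:  # the suite header and the ">>> ..." line do not match
            tests[int(match.group(1))] = match.group(2)
    return tests


def name_reference(suite, tests, number):
    """Build the "suite:name" reference for a test that was chosen by number."""
    return f"{suite}:{tests[number]}"


sample = """\
>>> Suite `signing' -- Signing (222 tests)
>>>    1 - DSA using RIPEMD160 with 512-bit p and 160-bit q.
>>>    2 - DSA using RIPEMD160 with 1024-bit p and 160-bit q.
>>> ...
"""
tests = parse_perfcheck_list(sample)
print(name_reference("signing", tests, 2))
```

The string that is printed can be passed to perfcheck in quotation marks, in the same way as the by-name example above.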

perfcheck prints the results of individual tests to output as it goes along, and then prints a full report at the end. By default, perfcheck runs each test three times for both minimum and maximum queue sizes, and then collates the results in the final report. See --help for the options to adjust this behavior.

Optionally, perfcheck can write its output in multiple formats to a directory that you specify with the --outputdir option. By default, perfcheck creates a new subdirectory under the specified directory and writes the output there. Add the --nosubdir option to write the output directly to the specified directory, in which case that directory must not already exist. The output directory contains perfcheck.html, perfcheck.txt, perfcheck.csv, and perfcheck.json files that contain the report in HTML, text, CSV, and JSON format respectively. JSON files that contain the detailed results of individual tests are also written to the output directory.

Output reports from test suites include the following information about each test:

Value Description

Queue

This value is the number of outstanding jobs in the queue when the test was run.

By default, most tests run both with a queue of 1 and with a full module queue, to give an indication of both one-at-a-time performance and the bandwidth for the operation. You can set a different queue length with the --queue option, in which case tests run only with that queue length, except for some misc suite tests, which set their own queue.

Rate (Units/s)

This value is a measure of throughput. It is calculated by dividing the number of repetitions by total time.

If a test has been rerun to improve accuracy, as is the case by default, then this is the mean across all the runs.

Some tests, for example those in the enc suite, set the unit to something other than an operation, for example KB, to indicate the amount of data that can be encrypted.

Min latency (ms)

This value is the time in milliseconds that the quickest individual job across all the test runs took to round-trip.

Mean latency (ms)

This value is the mean time in milliseconds that jobs took to round-trip.

If a test has been rerun, this is the mean of the mean latency values from each run.

Max latency (ms)

This value is the time in milliseconds that the slowest individual job across all the test runs took to round-trip.

CV (%)

This value is the coefficient of variation expressed as a percentage of the mean latency. It gives an indication of the variability in the time it takes individual jobs to complete.

If a test has been rerun, this is the mean of the CV (%) values from each run.

Min rate (tps)

This is the estimated lower bound of the throughput for this queue size in transactions per second.

The value becomes more accurate if more test runs of the same test are done. When it is compared against Mean rate (tps) and Max rate (tps), Min rate (tps) gives an indication of the variability between runs.

Mean rate (tps)

This is a measure of throughput. Unlike Rate (Units/s), it is expressed in transactions per second, that is, as the number of jobs that round-trip per second.

Mean rate (tps) is included for comparison against the Min rate (tps) and Max rate (tps) figures.

Max rate (tps)

This is the estimated upper bound of the throughput for this queue size in transactions per second.

The value becomes more accurate if more test runs of the same test are done. When it is compared against Min rate (tps) and Mean rate (tps), Max rate (tps) gives an indication of the variability between runs.

Reps

This value is the number of repetitions that were actually carried out, that is, the number of jobs that were round-tripped over all tests of this operation for this queue size.

If a test was rerun, this is the sum of the repetitions for each run. You can set the target number of repetitions for an individual run with the --repetitions option. In most cases, however, perfcheck runs more repetitions than the target, depending on the --accuracy setting, provided that the timeout is not reached. To control the accuracy of a test, we recommend setting --accuracy rather than adjusting --repetitions.

How perfcheck calculates statistics

When an nCore command is submitted to an HSM by a client application, it is processed as follows:

  1. The command is passed to the hardserver.

  2. The client hardserver encrypts the command.

  3. When the HSM is free, the command is submitted from the hardserver queue.

  4. The command is executed by the HSM, and the reply is given to the hardserver.

  5. The unit hardserver queues the reply.

  6. The unit hardserver sends the reply back to the client hardserver over the network.

  7. When the client application is ready, the queued reply is returned to it.

Because an HSM can execute several commands at once, throughput is maximized by ensuring there is always at least one command in the hardserver queue (so that there are always commands available to give to the HSM).

The perfcheck utility sends multiple simultaneous nCore commands to keep the HSM busy. It can send more commands if a required number of repetitions has not yet been reached.

After sending some initial commands, perfcheck begins marking commands with the time at which they are submitted. When a command comes back with a timestamp, perfcheck checks the amount of time needed to complete the command and updates the values for Std dev and Latency. The value of Total time is the amount of time from sending the first job to receiving the final one.
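
The relationship between these timestamps and the reported values can be sketched in Python as follows. This is an illustration of the definitions given in this section, not the perfcheck implementation: the function name summarize and the input format (one pair of submit and completion times per job) are hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics


def summarize(jobs):
    """Summarize one run; jobs is a list of (submit_time_s, complete_time_s) pairs."""
    latencies_ms = [(done - sent) * 1000.0 for sent, done in jobs]
    # Total time: from sending the first job to receiving the final one.
    total_time_s = max(done for _, done in jobs) - min(sent for sent, _ in jobs)
    mean_ms = statistics.fmean(latencies_ms)
    # Assumption: population standard deviation; perfcheck may use the sample form.
    std_ms = statistics.pstdev(latencies_ms)
    return {
        "reps": len(jobs),
        "rate_per_s": len(jobs) / total_time_s,  # repetitions divided by total time
        "min_latency_ms": min(latencies_ms),
        "mean_latency_ms": mean_ms,
        "max_latency_ms": max(latencies_ms),
        "cv_percent": 100.0 * std_ms / mean_ms,  # std dev as a percentage of the mean
    }


# Four jobs, two in flight at a time (times in seconds).
result = summarize([(0.000, 0.010), (0.000, 0.020), (0.010, 0.040), (0.020, 0.040)])
for key, value in result.items():
    print(key, round(value, 2))
```

In this example the individual latencies are 10, 20, 30, and 20 ms, but because two jobs are always outstanding the four jobs finish in 40 ms in total, which gives a rate of about 100 jobs per second. This is why a larger queue can raise throughput without reducing the latency of any single job.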

stattree: information utility

The stattree utility returns the statistics gathered by the hardserver and HSMs.

Usage

stattree [<node> [<node> [...]]]

Output

Running the stattree utility displays a snapshot of statistics currently available on the host machine. Statistics are gathered both by the hardserver (relating to the server itself, and its current clients) and by each attached HSM.

Times are listed in seconds. Other values are integers that represent plain numbers, IP addresses, or counters. For example, a result of -CmdCount 74897 means that 74,897 commands have been submitted.

A typical fragment of output from stattree looks like this:

+PerModule:
  +#1:
     +ModuleObjStats:
        -ObjectCount          5
        -ObjectsCreated       5
        -ObjectsDestroyed     0
     +ModuleEnvStats:
        -MemTotal             15327232
        -MemAllocKernel       126976
        -MemAllocUser         0
     +ModuleJobStats:
        -CmdCount             169780
        -ReplyCount           169778
        -CmdBytes             3538812
        -ReplyBytes           4492764
        -HostWriteCount       169772
        -HostWriteErrors      0
        -HostReadCount        437472
        -HostReadErrors       0
        -HostReadEmpty        100128
        -HostReadDeferred     167578
        -HostReadTerminated   0
        -PFNIssued            102578
        -PFNRejected          1
        -PFNCompleted         102577
        -ANIssued             1
        -CPULoadPercent       0
     +ModuleSerialStats:
        -HostReadCount        437476
        -HostReadDeferred     167580
        -HostReadReconnect    167579
        -HostReadErrors       0
        -HostWriteCount       169774
        -HostWriteErrors      0
     +ModuleDriverStats:
        -DriverIRQs           2547906
        -DriverReadIRQs       1274069
        -DriverWriteIRQs      1276373
        -DriverWriteFails     0
        -DriverWriteBlocks    1276373
        -DriverWriteBytes     49625888
        -DriverReadFails      0
        -DriverReadBlocks     0
        -DriverReadBytes      0
        -DriverEnsureFail     0
        -DriverEnsure         1274065

PerModule, ModuleObjStats, and ModuleEnvStats are node tags that identify classes of statistics. #1 identifies an instance node.

ObjectCount, MemTotal, and the remaining items at the same level are statistics IDs. Each has a corresponding value.
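
If you want to process these values in a script, the tree structure can be read from the indentation and the leading + and - characters. The following Python sketch is a minimal illustration under that assumption. The function name parse_stattree is hypothetical, and the sketch assumes that nodes start with +, that statistics start with -, and that nesting is expressed by indentation alone.

```python
def parse_stattree(text):
    """Parse stattree-style output into nested dicts keyed by node tag or statistics ID."""
    root = {}
    stack = [(-1, root)]  # (indent, node) pairs for the nodes that are currently open
    for raw in text.splitlines():
        line = raw.rstrip()
        stripped = line.lstrip()
        if not stripped or stripped[0] not in "+-":
            continue  # skip blank lines and anything that is not tree output
        indent = len(line) - len(stripped)
        # Close every open node that is not an ancestor of this line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if stripped[0] == "+":
            # Node tag or instance node, for example "+PerModule:" or "+#1:".
            name = stripped[1:].rstrip(":").lstrip("#")
            stack.append((indent, parent.setdefault(name, {})))
        else:
            # Statistic line of the form "-<ID> <value>".
            parts = stripped[1:].split(None, 1)
            if len(parts) != 2:
                continue
            try:
                parent[parts[0]] = int(parts[1])
            except ValueError:
                parent[parts[0]] = parts[1]  # keep non-integer values as text
    return root


sample = """\
+PerModule:
  +#1:
     +ModuleObjStats:
        -ObjectCount          5
        -ObjectsCreated       5
        -ObjectsDestroyed     0
     +ModuleEnvStats:
        -MemTotal             15327232
"""
tree = parse_stattree(sample)
print(tree["PerModule"]["1"]["ModuleObjStats"]["ObjectCount"])
```

The sketch strips the # prefix from node names, so an instance node such as #1 is stored under the key "1".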

If <node> is provided, stattree uses the value given as the starting point of the tree and displays only information at or below that node in the tree. Values for <node> can be numeric or textual. For example, to view the object counts for local module number 3:

$ stattree PerModule 3 ModuleObjStats
+#PerModule:
  +#3:
     +#ModuleObjStats:
        -ObjectCount          6
        -ObjectsCreated       334
        -ObjectsDestroyed     328

The value of <node> must be a node tag; it must identify a node in the tree and not an individual statistic. Thus, the following command does not work:

$ stattree PerModule 3 ModuleObjStats ObjectCount
+#PerModule:
  +#3:
     +#ModuleObjStats:
Unable to convert 'ObjectCount' to number or tag name.

ModuleDriverStats fields:

Field Description

DriverIRQs

Total number of interrupts

DriverReadIRQs

Read interrupts

DriverWriteIRQs

Write interrupts

DriverWriteFails

Write failures

DriverWriteBlocks

Blocks written

DriverWriteBytes

Bytes written

DriverReadFails

Read failures

DriverReadBlocks

Blocks read

DriverReadBytes

Bytes read

DriverEnsureFail

Read request failures

DriverEnsure

Read requests

Node tags

Node tags identify the following categories of statistics:

Category Contains

ModuleJobStats

This tag holds statistics for the Security World Software commands (jobs) processed by this HSM.

ModulePCIStats

This tag does not apply to nShield USB-attached HSMs.

ServerGlobals

Aggregate statistics for all commands processed by the hardserver since it started. The standard statistics (as described below) apply to the commands sent from the hardserver to HSMs. Commands processed internally by the server are not included here. The Uptime statistic gives the total running time of the server so far.

Connections

Statistics for connections between clients and the hardserver. There is one node for each currently active connection. Each node has an instance number that matches the log message generated by the server when that client connected. For example, when the hardserver message is Information: New client #24 connected, the client’s statistics appear under node #24 in the stattree output.

PerModule

Statistics kept by the HSMs. There is one instance node for each HSM, numbered using the standard HSM numbering. The statistics provided by each HSM depend on the HSM type and firmware version.

ModuleObjStats

Statistics for the HSM’s Object Store, which contains keys and other resources. These statistics may be useful in debugging applications that leak key handles, for example.

ModuleEnvStats

General statistics for the HSM’s operating environment.

Statistics IDs

ID Value

Uptime

The length of time (in seconds) since an HSM was last reset, the hardserver was started, or a client connection was made.

CmdCount

The total number of commands sent for processing from a client to the server, or from the server to an HSM. This count includes the commands that are currently being processed.

ReplyCount

The total number of replies returned from server to client, or from HSM to server.

CmdBytes

The total length of all the command blocks sent for processing.

ReplyBytes

The total length of all the reply blocks received after completion.

CmdMarshalErrors

The number of times a command block was not understood when it was received. A nonzero value indicates either that the parties at each end of a connection have mismatched version numbers (for example, a more recent hardserver has sent a command to a less recent HSM that the HSM does not understand), or that the data transfer mechanism is faulty.

ReplyMarshalErrors

The number of times a reply was not understood when it was received. A nonzero value indicates either that the parties at each end of a connection have mismatched version numbers (for example, a more recent hardserver has sent a command to a less recent HSM that the HSM does not understand), or that the data transfer mechanism is faulty.

ClientCount

The number of client connections currently made to the server. This appears in the hardserver statistics.

MaxClients

The maximum number of client connections ever in use simultaneously to the hardserver. This gives an indication of the peak load experienced so far by the server.

DeviceFails

The number of times the hardserver has declared a device to have failed. The hardserver provides a diagnostic message when this occurs.

DeviceRestarts

The number of times the hardserver has attempted to restart an HSM after it has failed. The hardserver provides a Notice message when this occurs. The message does not indicate that the attempt was successful.

QOutstanding

The number of commands waiting for an HSM to become available on the specified client connection. When an HSM accepts a command from a client, this number decreases by 1 and DevOutstanding increases by 1. Commands that are processed purely by the server are never included in this count.

DevOutstanding

The number of commands sent by the specified client that are currently executing on one or more HSMs. When an HSM accepts a command from a client, this number increases by 1 and QOutstanding decreases by 1. Commands that are processed purely by the server are never included in this count.

LongOutstanding

The number of LongJobs sent by the specified client that are currently executing on one or more HSMs. When an HSM accepts a LongJobs command from a client, this number increases by 1 and QOutstanding decreases by 1. Commands that are processed purely by the server are never included in this count.

RemoteIPAddress

The remote IP address of the client that owns this connection. A local client has the address 0.0.0.0.

HostWriteCount

The number of write operations (used to submit new commands) that have been received by the HSM from the host machine. One write operation may contain more than one command block. The operation is most efficient when this is the case.

HostWriteErrors

The number of times the HSM rejected the write data from the host. A nonzero value may indicate that data is being corrupted in transfer, or that the hardserver or device driver is out of sync with the HSM’s interface.

HostWriteBadData

Not currently reported by the HSM. Attempts to write bad data to the HSM are reflected in HostWriteErrors.

HostWriteOverruns

Not currently reported by the HSM. Write overruns are reflected in HostWriteErrors.

HostWriteNoMemory

Not currently reported by the HSM. Write failures due to a lack of memory are reflected in HostWriteErrors.

HostReadCount

The number of times a read operation to the HSM was attempted. The HSM can defer a read if it has no replies at the time, but expects some to be available later. Typically the HSM reports HostReadCount in two places: the number under ModuleJobStats counts a deferred read twice, once when it is initially deferred, and once when it finally returns some data. The number under ModulePCIStats counts this as one operation.

HostReadErrors

The number of times a read to an HSM failed because the parameters supplied with the read were incorrect. A nonzero value here typically indicates some problem with the host interface or device driver.

HostReadEmpty

The number of times a read from the HSM returned no data because there were no commands waiting for completion. In general, this only happens infrequently during HSM startup or reset. It can also happen if PauseForNotifications is disabled.

HostReadUnderruns

Not currently reported by the HSM.

HostReadDeferred

The number of times a read operation to the HSM was suspended because it was waiting for more replies to become available. When the HSM is working at full capacity, a sizeable proportion of the total reads are likely to be deferred.

HostReadTerminated

The number of times an HSM had to cancel a read operation which has been deferred. This normally happens only if the clear key is pressed while the HSM is executing commands. Otherwise it might indicate a device driver, interface, or firmware problem.

PFNIssued

The number of PauseForNotifications commands accepted by the HSM from the hardserver. This normally increases at a rate of roughly one every two seconds. If the hardserver has this facility disabled (or is a very early version), this does not occur.

PFNRejected

The number of PauseForNotifications commands rejected by the HSM when received from the hardserver. This can happen during HSM startup or reset, but not in normal use. It indicates a hardserver bug or configuration problem.

PFNCompleted

The number of PauseForNotifications commands that have been completed by the HSM. Normally, this is one less than the PFNIssued figure because there is normally one such command outstanding.

ANIssued

The number of Asynchronous Notification messages issued by the HSM to the hardserver. These messages indicate such things as the clear key being pressed and the HSM being reset. In later firmware revisions, inserting or removing the smart card or changing the non-volatile memory also generates asynchronous notifications.

ChanJobsIssued

The number of fast channel jobs issued to the HSM. The fast channel facility is unsupported on current HSMs. This number should always be 0.

ChanJobsCompleted

The number of fast channel jobs completed by the HSM. The fast channel facility is unsupported on current HSMs. This number should always be 0.

CPULoadPercent

The current processing load on the HSM, represented as a number between 0 and 100. Because an HSM typically contains a number of different types of processing resources (for example, main CPU, and RSA acceleration), this figure is hard to interpret precisely. In general, HSMs report 100% CPU load when all RSA processing capacity is occupied; when performing non-RSA tasks the main CPU or another resource (such as the random number generator) can be saturated without this statistic reaching 100%.

HostIRQs

On PCI HSMs, the total number of interrupts received from the host. On current HSMs, approximately equal to the total of HostReadCount and HostWriteCount.

ChanJobErrors

The number of low-level (principally data transport) errors encountered while processing fast channel jobs. Should always be 0 on current HSMs.

HostDebugIRQs

On PCI HSMs, the number of debug interrupts received. This is used only for driver testing, and should be 0 in any production environment.

HostUnhandledIRQs

On PCI HSMs, the number of unidentified interrupts from the host. If this is nonzero, a driver or PCI bus problem is likely.

HostReadReconnect

On PCI HSMs, the number of deferred reads that have now completed. This should be the same as HostReadDeferred, or one less if a read is currently deferred.

ObjectsCreated

The number of times a new object has been put into the object store. This appears under the HSM’s ModuleObjStats node.

ObjectsDestroyed

The number of items in the HSM’s object store that have been deleted and their corresponding memory released.

ObjectCount

The current number of objects (keys, logical tokens, buffers, SEE Worlds) in the object store. This is equal to ObjectsCreated minus ObjectsDestroyed. An empty HSM contains a small number of objects that are always present.

CurrentTempC

The current temperature (in degrees Celsius) of the HSM main circuit board. First-generation HSMs do not have a temperature sensor and do not return temperature statistics.

MaxTempC

The maximum temperature recorded by the HSM’s temperature sensor. This is stored in non-volatile memory, which is cleared only when the unit is initialized. First-generation HSMs do not have a temperature sensor and do not return temperature statistics.

MinTempC

The minimum temperature recorded by the HSM’s temperature sensor. This is stored in non-volatile memory, which is cleared only when the unit is initialized. First-generation HSMs do not have a temperature sensor and do not return temperature statistics.

MemTotal

The total amount of RAM (both allocated and free) available to the HSM. This is the installed RAM size minus various fixed overheads.
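
Several of the relationships described in this table can be checked automatically. The following Python sketch is an illustration only: the function names check_module_stats and object_growth are hypothetical, the inputs are plain dictionaries of statistics IDs and values for one HSM, and the findings are hints for further investigation rather than definitive error reports.

```python
def check_module_stats(obj_stats, job_stats):
    """Return a list of findings for one HSM, based on the descriptions in the table."""
    findings = []

    # ObjectCount is equal to ObjectsCreated minus ObjectsDestroyed.
    expected = obj_stats["ObjectsCreated"] - obj_stats["ObjectsDestroyed"]
    if obj_stats["ObjectCount"] != expected:
        findings.append(f"ObjectCount is {obj_stats['ObjectCount']}, expected {expected}")

    # These counters are described as problem indicators when nonzero,
    # or as always 0 on current HSMs. Missing IDs are treated as 0.
    for stat_id in ("HostWriteErrors", "HostReadErrors", "CmdMarshalErrors",
                    "ReplyMarshalErrors", "ChanJobsIssued", "ChanJobsCompleted"):
        value = job_stats.get(stat_id, 0)
        if value != 0:
            findings.append(f"{stat_id} is {value}, expected 0")

    # PFNCompleted is normally one less than PFNIssued.
    # Assumption: a difference of 0 is also tolerated, for a snapshot taken between commands.
    pending = job_stats["PFNIssued"] - job_stats["PFNCompleted"]
    if pending not in (0, 1):
        findings.append(f"{pending} PauseForNotifications commands outstanding")
    return findings


def object_growth(before, after):
    """Change in ObjectCount between two snapshots; steady growth may suggest leaked handles."""
    return after["ObjectCount"] - before["ObjectCount"]


# Values taken from the sample stattree output earlier in this section.
obj = {"ObjectCount": 5, "ObjectsCreated": 5, "ObjectsDestroyed": 0}
job = {"HostWriteErrors": 0, "HostReadErrors": 0, "PFNIssued": 102578, "PFNCompleted": 102577}
print(check_module_stats(obj, job))
```

For the sample values, the function returns an empty list. Comparing ObjectCount across snapshots that are taken while an application is idle is one way to look for the key handle leaks mentioned under ModuleObjStats.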

How data is affected when a module loses power and restarts

nShield modules use standard RAM to store many kinds of data, and data stored in such RAM is lost in the event that a module loses power (either intentionally, because you turned off power to it, or accidentally because of a power failure).

Therefore, after restoring power to a module, you must reload any keys that had been loaded onto it before it lost power. After reloading, the KeyIDs are different.

Likewise, after restoring power to a module, you must reload any cards that were loaded onto it before it lost power.

However, data stored in NVRAM is unaffected when a module loses power.

If you are using multiple nShield modules in the same Security World, and have the same key (or keys) loaded onto each module as part of a load-sharing configuration, loss of power to one module does not affect key availability (as long as at least one other module onto which the keys are loaded remains operational). However, in such a multiple-module system, after restoring power to a module, you must still reload any keys to that module before they can be available from that module.