Debugging SEE machines
This chapter provides some guidance on debugging an SEE machine.
Debugging settings and output
To debug an SEE application effectively, you must have:
- Enabled SEE debugging when creating the Security World in which the application is to run; see new-world (dsee and dseeall options).
- Set Cmd_CreateSEEWorld_Args_flags_EnableDebug when creating the SEE World.
If you try to set the Cmd_CreateSEEWorld_Args_flags_EnableDebug flag in a Security World that does not allow SEE debugging, the CreateSEEWorld command returns AccessDenied. This also occurs if you call CreateSEEWorld in a Security World where SEE debugging is restricted and an appropriate certifier is not present.
Debugging authorization
Access to the SEE trace buffer is controlled by the Security World in which the SEE machine runs. Every Security World has exactly one of the following properties:
- Restricted SEE debugging
This is the default setting. When SEE debugging is restricted, there is no delegation key from KNSO for accessing the SEE trace buffer. All Security Worlds created by software released before the introduction of SEE have restricted SEE debugging. A full quorum of Administrator Cards is required to access the SEE trace buffer in such Security Worlds.
- Authorized SEE debugging
In this case, a delegation key from KNSO exists to allow access to the SEE trace buffer. A subset of a full quorum of the Administrator Cards is required to access the SEE trace buffer in such Security Worlds. This delegation key must have been created, and the number of cards required to authorize access to the SEE trace buffer must have been specified, when the Security World was created.
- No access-control SEE debugging
In this case, no authorization of any kind is required for accessing the SEE trace buffer. No cards are required to access the SEE trace buffer in such Security Worlds. This property must have been specified when the Security World was created.
Obtaining debugging output
For SEE machines that require support from a host-side see-*-serv utility, you can run the see-*-serv utilities with the --trace or --plain-trace option to perform tracing automatically.
For SEE machines using the SEElib architecture, the TraceSEEWorld() command can be used to return debugging information. An example of this is provided in the a3a8 host-side example code. See A3A8 example.
Data written to standard output and standard error on the HSM is written to the SEE World’s Trace Buffer.
The Trace Buffer is a 3000-character circular buffer: if more than 3000 characters are written to it without being retrieved, the oldest characters are lost first (first-in/first-out).
The TraceSEEWorld command retrieves the contents of the buffer so that the host can analyze or display them.
If the SEE machine crashes, a SEE register dump is printed to the SEE Trace Buffer for the nShield Solo, but not for the nShield Solo XC.
For example, assume that the HSM code calls the following command:
printf("Hello World!\n");
The string Hello World!\n is pushed into the Trace Buffer.
A host-side call to TraceSEEWorld would then return this string and empty the buffer.
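The overwrite and retrieve-then-empty behavior described above can be modeled on the host with a short sketch. This is a toy model only: the TraceBufferModel class and its write and retrieve methods are hypothetical names introduced for illustration and are not part of the nCore API or of TraceSEEWorld itself.

```python
from collections import deque


class TraceBufferModel:
    """Toy model of a fixed-size circular trace buffer (not the nCore API)."""

    def __init__(self, capacity=3000):
        # A deque with maxlen drops the oldest items once it is full.
        self._chars = deque(maxlen=capacity)

    def write(self, text):
        # Writing past capacity silently discards the oldest characters.
        self._chars.extend(text)

    def retrieve(self):
        # Return everything currently buffered and leave the buffer empty.
        contents = "".join(self._chars)
        self._chars.clear()
        return contents


buf = TraceBufferModel()
buf.write("Hello World!\n")
first = buf.retrieve()     # the string written above
second = buf.retrieve()    # empty, because the first call drained the buffer

buf.write("A" * 2999 + "BC")   # 3001 characters, one more than fits
overflowed = buf.retrieve()    # the oldest character has been dropped
print(len(first), len(second), len(overflowed))
```

The practical consequence is that a host which polls too rarely sees only the most recent 3000 characters, so verbose SEE code should be paired with frequent retrieval.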
If a SEE World is terminated by the HSM (for instance, if its last remaining thread exits or it causes a fatal signal to be raised), a diagnostic message is usually sent to the Trace Buffer to help debug the problem.
Example Debug
If an illegal access violation (segmentation fault) occurs, the tail of the Trace Buffer looks similar to this:
*** World exits: thread 28 caused CPU exception
DSI exception:
Exception vector 00300h
r0 =001D9E40h r1 =001D9F38h r2 =00C4E090h r3 =00000008h
r4 =00000000h r5 =00C00444h r6 =00000000h r7 =001C21B1h
r8 =00C39CB8h r9 =00000019h r10=40000000h r11=00002000h
r12=00000000h r13=00D08048h r14=00000000h r15=00000000h
r16=00000000h r17=00000000h r18=00000000h r19=00000000h
r20=00000000h r21=00000000h r22=00000000h r23=00C40000h
r24=FFFC5CD0h r25=00C3A750h r26=00C40000h r27=00C40000h
r28=00000000h r29=00000000h r30=00000000h r31=00D00000h
XER=20000000h CR =20000000h LR =00C00444h CTR=00C39B9Ch
PC =00C00448h MSR=0000F030h
f0 =0000000000000000h f1 =0000000000000000h
f2 =0000000000000000h f3 =0000000000000000h
f4 =0000000000000000h f5 =0000000000000000h
f6 =0000000000000000h f7 =0000000000000000h
f8 =0000000000000000h f9 =0000000000000000h
f10 =0000000000000000h f11 =0000000000000000h
f12 =0000000000000000h f13 =0000000000000000h
f14 =0000000000000000h f15 =0000000000000000h
f16 =0000000000000000h f17 =0000000000000000h
f18 =0000000000000000h f19 =0000000000000000h
f20 =0000000000000000h f21 =0000000000000000h
f22 =0000000000000000h f23 =0000000000000000h
f24 =0000000000000000h f25 =0000000000000000h
f26 =0000000000000000h f27 =0000000000000000h
f28 =0000000000000000h f29 =0000000000000000h
f30 =0000000000000000h f31 =0000000000000000h
FPSCR=00000000h
The program counter (PC), which is at 00C00448h in this PowerPC-based build, shows where the access occurs.
The following excerpt from the PowerPC-based map file created at application link time (by specifying the -map option to the linker) indicates that the problem address is in usermain.o:
.text 0x00c00000 0x3a0ac
*(.text.stub.text.*.gnu.linkonce.t.*)
.text 0x00c00000 0xa5c usermain.o
0x00c00160 main
.text 0x00c00a5c 0x544 .\lib-ppc-gcc\seelib.a(nfstrerr.o)
0x00c00a5c NFast_StrError
To find out which instruction is causing the segmentation fault, calculate the offset into usermain.o.
The formula is:
program_counter - object_base_address
The calculation is as follows:
00C00448h -
00C00000h
---------
00000448h
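The same lookup and subtraction can be scripted when there are many map entries to search. The following is a minimal sketch: the MAP_ENTRIES list is typed in by hand from the map excerpt above, and the locate function is a hypothetical helper, not part of any supplied tool.

```python
# (base address, size, object name) copied by hand from the map excerpt
MAP_ENTRIES = [
    (0x00C00000, 0x0A5C, "usermain.o"),
    (0x00C00A5C, 0x0544, "seelib.a(nfstrerr.o)"),
]


def locate(pc, entries):
    """Return (object name, offset into object) for the entry containing pc."""
    for base, size, name in entries:
        if base <= pc < base + size:
            return name, pc - base
    return None  # pc does not fall inside any listed object


result = locate(0x00C00448, MAP_ENTRIES)
name, offset = result
print(name, hex(offset))
```

The offset is relative to the start of the object's .text contribution, which is why it can be compared directly with the addresses in a disassembly of that object file.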
Once the problem has been located in this way, investigate it as follows:
- Recompile the source with the -g option and no optimization (if you did not originally compile it with these options).
- Run an object dump utility on the object files powerpc-codesafe-linux-gnu-objcopy.
The generated listing is now similar to the following for PowerPC-based objects:
434: 38 7a 03 34 addi r3,r26,820
438: 38 80 00 08 li r4,8
43c: 4c c6 31 82 crclr 4*cr1+eq
440: 48 00 00 01 bl 440 <main+0x2e0>
444: 38 60 00 08 li r3,8
448: 80 03 00 00 lwz r0,0(r3)
44c: 4b ff fe 74 b 2c0 <main+0x160>
450: 3c 80 00 00 lis r4,0
From this output it is possible to see that the segmentation fault is caused by an illegal access through the pointer held in r3 (which the register dump shows to be 00000008h, an obviously invalid user mode memory address).
The source shows plainly that the instruction at offset 0448h in usermain.o is trying to access *i, but i has not been allocated.
The bug can now be fixed and the program rebuilt.
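To cross-check which register a faulting load or store uses, the instruction word from the disassembly can be decoded by hand. The sketch below splits the bytes 80 03 00 00 (the lwz at offset 448 in the listing) into the standard PowerPC D-form fields and combines the base register with the value shown for it in the register dump. The decode_d_form helper and the registers dictionary are illustrative names, not part of any supplied tool.

```python
def decode_d_form(word):
    """Split a 32-bit PowerPC D-form instruction into its fields."""
    opcode = (word >> 26) & 0x3F   # primary opcode, top 6 bits
    rt = (word >> 21) & 0x1F       # target (or source) register
    ra = (word >> 16) & 0x1F       # base address register
    d = word & 0xFFFF              # 16-bit displacement
    if d & 0x8000:                 # displacement is signed
        d -= 0x10000
    return opcode, rt, ra, d


LWZ_OPCODE = 32  # primary opcode of lwz (load word and zero)

# Bytes "80 03 00 00" at offset 448 in the disassembly
opcode, rt, ra, d = decode_d_form(0x80030000)

# Value copied from the register dump; only the base register is needed
registers = {3: 0x00000008}

# For a non-zero RA, lwz loads from the address (RA) + D
fault_address = (registers[ra] + d) & 0xFFFFFFFF
print(opcode == LWZ_OPCODE, rt, ra, d, hex(fault_address))
```

Here the base register is r3 and the displacement is zero, so the address being read is simply the value of r3 in the dump.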
Finding memory leaks with stattree
You can use the stattree command-line utility to find memory leaks.
Run the command:
Linux
stattree | grep Mem
Windows
stattree | find "Mem"
For each HSM in the Security World, this command produces output that reports values for the total memory (MemTotal), the memory currently allocated to the kernel (MemAllocKernel), and the memory currently allocated to the loaded SEE machine (MemAllocUser).
If no SEE machine is loaded, the output from this stattree command (if there is only one HSM) looks similar to the following:
-MemTotal 128921600
-MemAllocKernel 1355776
-MemAllocUser 0
If an SEE machine is loaded, the output from this stattree command (if there is only one HSM) looks similar to the following:
-MemTotal 128921600
-MemAllocKernel 1355776
-MemAllocUser 1032192
You can monitor a loaded SEE machine’s memory usage either by repeatedly running stattree and checking its output or by writing code to call the nCore statistics APIs directly.
In either case, if any reported memory value appears to be growing continuously over time, this probably indicates some kind of memory leak.
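Repeated checks of this kind are easy to automate once the filtered stattree output has been captured as text. The following is a minimal sketch that assumes a single HSM and the output format shown above; parse_mem_stats and grows_continuously are hypothetical helper names, and the growth samples are made-up values used only to exercise the check.

```python
import re


def parse_mem_stats(output):
    """Collect the Mem* counters from filtered stattree output (one HSM)."""
    stats = {}
    for line in output.splitlines():
        match = re.match(r"\s*-(Mem\w+)\s+(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def grows_continuously(samples, key="MemAllocUser"):
    """True if the counter rose between every pair of consecutive samples."""
    values = [sample[key] for sample in samples]
    return len(values) >= 3 and all(b > a for a, b in zip(values, values[1:]))


# Output in the format shown above, captured with an SEE machine loaded
sample_text = """-MemTotal 128921600
-MemAllocKernel 1355776
-MemAllocUser 1032192"""
stats = parse_mem_stats(sample_text)

# Hypothetical later readings, for illustration only
leaking = [{"MemAllocUser": v} for v in (1032192, 1040384, 1048576)]
steady = [{"MemAllocUser": v} for v in (1032192, 1032192, 1040384)]
print(stats["MemAllocUser"], grows_continuously(leaking), grows_continuously(steady))
```

Requiring a rise at every step is deliberately strict; a real monitor would probably sample over a longer period and tolerate short plateaus before reporting a suspected leak.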
Segment addresses for Solo
SEE executables are non-relocatable; that is, they are loaded in memory at the addresses specified in the image. Ensure that you choose these addresses carefully so that they map onto usable RAM and do not overlap with memory being used by the kernel. Typically, this means you must choose an address at the high end of RAM.
Different HSM types have different mappable memory ranges.
-
The CodeSafe compiler sets all values for Solo XC and later HSM models.
-
You have to set the ranges in the CodeSafe application code if you are developing for Solo +.
The rest of this section describes guidelines for Solo +.
To determine your HSM type, run the enquiry command-line utility and check the SEE Machine Type output.
You can then determine where the mappable memory range starts from this table:
SEE Machine Type | Start of mappable range |
---|---|
 | 0x00400000 |
These ranges follow the approximately 4MB of RAM reserved for use by the kernel.
You can use the stattree command-line utility to find the total length of the mappable range.
Run the command:
Linux
stattree | grep MemTotal
Windows
stattree | find "MemTotal"
This command produces output that reports values for the total memory (MemTotal) for each HSM in the Security World.
For Solo +, we recommend the following segment addresses as starting points:
SEE Machine Type |
PowerPCSXF |
|
|
|
|
Arguments to the linker |
|
For large SEE machines more space may be needed in the text segment, causing a linker error of the following form:
powerpc-codesafe-linux-gnu-ld: section .data [00d00000 -> 00d0327f] overlaps section .text [00c00000 -> 00d7bd8b]
powerpc-codesafe-linux-gnu-ld: section .sdata [00d03280 -> 00d035ef] overlaps section .text [00c00000 -> 00d7bd8b]
powerpc-codesafe-linux-gnu-ld: section .sbss [00d035f0 -> 00d036ab] overlaps section .text [00c00000 -> 00d7bd8b]
powerpc-codesafe-linux-gnu-ld: section .bss [00d036b0 -> 00d0854f] overlaps section .text [00c00000 -> 00d7bd8b]
To resolve this example error, you could move the data segment start point upward (for example, to 0x00e00000) as necessary to prevent the overlap.
Alternatively (or additionally), you could move the text segment start point downward.
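A candidate start address for the data segment can be derived from the linker messages themselves. The sketch below parses the overlap errors in the format shown above and rounds the end of the overlapped section up to the next boundary. The 1 MB alignment is an assumption chosen for illustration, and next_free_start is a hypothetical helper, not a linker feature; the result must still fall inside the usable RAM for your HSM.

```python
import re

# Matches: section A [start -> end] overlaps section B [start -> end]
OVERLAP = re.compile(
    r"section (\S+) \[([0-9a-fA-F]+) -> ([0-9a-fA-F]+)\] "
    r"overlaps section (\S+) \[([0-9a-fA-F]+) -> ([0-9a-fA-F]+)\]"
)


def next_free_start(ld_output, align=0x100000):
    """Lowest aligned address above the end of the overlapped section."""
    ends = [int(m.group(6), 16) for m in OVERLAP.finditer(ld_output)]
    if not ends:
        return None  # no overlap errors found
    return (max(ends) // align + 1) * align


errors = """\
powerpc-codesafe-linux-gnu-ld: section .data [00d00000 -> 00d0327f] overlaps section .text [00c00000 -> 00d7bd8b]
powerpc-codesafe-linux-gnu-ld: section .bss [00d036b0 -> 00d0854f] overlaps section .text [00c00000 -> 00d7bd8b]
"""
suggested = next_free_start(errors)
print(hex(suggested))
```

For the example messages, the text segment ends at 00d7bd8b, so the next 1 MB boundary is the 0x00e00000 address mentioned above.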
Vulnerability test harness
We supply a test harness called vulnerability.o that can be used for debugging SEE machines.
It supplies a standard set of command-line arguments and environment variables to the SEE environment, as well as providing the standard stdioe and socket support.
Because the vulnerability.o test harness is insecure, we recommend that you do not link vulnerability.o into a production SEE machine.
Troubleshooting guide
Symptom | Possible problems | Solution |
---|---|---|
|
The SEE machine has deadlocked or entered an infinite loop which prevents the job from returning and causes the |
Check the code for possible deadlocks or infinite loops.
Non-obvious problems can be debugged by writing progress reports to the Trace Buffer and calling |
|
No SEE machine is loaded. |
Load an SEE machine |
SEE machine loading fails with |
The file being loaded is not a correctly formatted SAR file. |
Ensure that the correct SEE machine file is being loaded. Ensure that the SEE machine has been properly processed by the Trusted Code Tool into a SAR file. |
The SEE machine file is corrupted. |
Rebuild the SEE machine, or revert to a known good back-up. |
|
The SEE machine has been compiled or linked with the wrong options. |
SEE machines must be nonexecutable, uncompressed, non-relocatable AIFs or SXFs, packaged as SAR files. |
|
|
The machine signing hash on |
Ensure the correct SEE machine with the correct signatures is loaded. Ensure the correct user data is being passed to Ensure the user data signatures are correct. |
SEE machine loading fails with |
The SEE machine signatures were created incorrectly. |
SEE machine signatures must be created with the machine key specification |
The SEE machine crashes, and Trace Buffer output shows raised signal. |
Dependent on signal number. |
Check |
|
SEE World debugging is not available in Security World. |
Check the Security World’s SEE debugging policy. |
SEE machine is returning |
Check the SEE machine set-up code to see where it might be passing |
|
All |
|
If you are using |
|
Segment addresses clash with kernel pages. |
Adjust segment positions away from kernel RAM; see Segment addresses for Solo |
Segment addresses overlap. |
Adjust segments away from each other; see Segment addresses for Solo |
|
Segment addresses are not usable RAM. |
Adjust segment positions to usable RAM; see Segment addresses for Solo |
|
|
Userdata has been specified but is not expected. |
Exclude the userdata. |
The previous SEE machine has not been cleared |
Clear the previous SEE machine; see clearing a SEE machine from the front panel or clearing a SEE machine remotely |
|
Error from link: |
Segment addresses overlap. |
Adjust segments away from each other; see Segment addresses for Solo |