HP (Hewlett-Packard) V2500 user manual / service manual
HP Diagnostics Guide V2500 Server First Edition A5075-96006 HP Diagnostics Guide: V2500 Server Customer Order Number: A5075-90006 December 1998 Printed in: USA.
Revision History
Edition: First
Document Number: A5075-90006
Remarks: Initial release. December 1998.
Notice
Copyright Hewlett-Packard Company 1998. All Rights Reserved. Reproduction, adaptation, or translation without prior written permission is prohibited, except as allowed under the copyright laws.
Contents
Preface
Notational conventions

FPGA configuration and status
Board over-temperature
Fan sensing

LCD messages
Node status line
Processor status line

Main menu
Test Configuration menu
Example of running diagnostics from Test Controller command line

Teststation setup
pdcfl commands
7 cpu3000

Error messages
Type one error format
Type two errors

address decode
AutoRaid recovery map (arrm)
Starting arrm

ver
Event processing
event_logger
Figures
Figure 1 Location of the Utilities board
Figure 2 Utilities board

Figure 39 V2500 DIMM locations
Figure 40 Format of parameter 6
Tables
Table 1 Environmental conditions monitored by the SMUC and power-on circuit
Table 2 Processor initialization steps
Table 3 Processor run-time status codes

Table 41 io3000 Class 16 subtests
Table 42 io3000 test parameters
Table 43 io3000 user test parameter word 0 bit definition

Table 85 kill_by_name options
Table 86 sppdsh parameters
Preface
This document describes the offline diagnostics for V2500 servers. It is not intended as a tutorial or troubleshooting guide, but as a reference guide that contains information on all utilities and scripts used to troubleshoot these systems.
Notational conventions
NOTE A note highlights important supplemental information.
CAUTION A caution highlights procedures or information necessary to avoid damage to equipment, damage to software, loss of data, or invalid test results.
1 Introduction
This chapter presents an overview of the diagnostic mechanism for V2500 servers.
Utilities board
The diagnostic mechanism in the V2500 servers is centered around the Stingray Core Utilities board (SCUB). The SCUB is mounted under the MidPlane Interconnect board (MIB) toward the front of the system.
Figure 1 Location of the Utilities board (labels: Power board, MidPlane, Utilities board)
The following devices connect to the Utilities board:
• Core logic bus
• Environmental sensors
• Test points
• Liquid crystal display (LCD)
• Attention lightbar
• Teststation
The teststation connects to the system via the ethernet and RS232 connections.
The microprocessor-controlled JTAG interface captures incoming command packets and sends out scan information packets across the ethernet connection to the teststation. Through the teststation connection, one can read and write every CSR in the system.
Core logic
The core logic contains initialization and booting firmware and is described in the following sections.
Flash memory
The core logic contains four MBytes of electrically erasable programmable read-only memory (EEPROM) storage for Processor-Dependent Code (PDC).
Console ethernet
The ethernet I/O port provides a connection to the teststation over LAN1.
Attention lightbar and LCD
The attention lightbar displays environmental information, such as the source of an environmental error that caused the Utilities board to power down the node.
SMUC environmental monitoring
The following environmental conditions are monitored:
• ASIC installation error sensing
• FPGA configuration and status
• Thermal sensing
• Fan sensing
• Power failure sensing
• 48-V failure
• 48-V maintenance
• Ambient air temperature sensing
Environmental condition detected by power-on function
The power-on function detects environmental errors (such as ASIC Not Installed OK or FPGA Not OK). It does not turn on power to the node until the conditions are corrected.
The environmental error interrupt and the 1.2-second delay give the system adequate time to read CSRs to determine the cause of the error, log the condition in NVRAM, and display the condition on the attention lightbar.
Teststation interface
The teststation can be a PA-RISC based workstation. The interface to the teststation is an ethernet AUI port for flexibility in connecting to many workstations. It is also easily expandable.
System displays
The V2500 server provides two means of displaying status and reporting errors: an LCD and an Attention light bar.
Figure 3 System displays
Front panel LCD
The front panel is a 20-character by 4-line liquid crystal display as shown in Figure 4.
Figure 4 Front panel LCD
When the node key switch is turned on, the LCD powers up but is initially blank. Power-On Self Test (POST) starts displaying output to the LCD.
Table 3 Processor run-time status codes
Message display line
The message display line shows the POST initialization progress. This is updated by the monarch processor. The system console also shows detail for some of these steps.
Table 4 Message display line
Power supply indicators
When the keyswitch on the operator panel is in the DC ON position, both the AC power (amber) LED and the DC power (green) LED on each of the power supplies should be on.
Attention light bar
The Attention light bar is located at the top left corner on the front of the HP 9000 V2500 server as shown in Figure 3 on page 12.
1  26-2F  48-V error, NPSLR failure, PWRUP=0-9
1  30-39  48-V error, no supply failure, PWRUP=0-9
1  3A     48-V yo-yo error
1  3B     MIB power failure (PB)
1  3C     Clock f.
SCUB 3.3-Volt error
This error indicates that the SCUB 3.3-Volt power supply has failed, but the 5-Volt supply has not.
ASIC installation error
Each ASIC in the node has ASIC Install lines to prevent power-up if an ASIC is installed incorrectly (such as a SPAC installed in an ERAC position).
FPGA configuration and status
The SMUC is programmed by a serial data transfer from EEPROM upon Utilities board power-up. If the transfer does not complete properly, the SMUC cannot configure itself and many environmental conditions cannot be monitored.
the SMUC, which reports the environmental warning to the processors. The power-on circuit displays the “highest priority” 48-Volt supply that failed.
2 Configuration management
The teststation allows the user to configure the node using the ts_config utility. ts_config configures the teststation to communicate with the node. The teststation daemon, ccmd, monitors the node and reports back configuration information, error information, and general status.
Teststation
The teststation is used for configuring, monitoring, testing, and error logging. It is not required for normal operation of a node. The teststation communicates with the JTAG interface in the nodes.
ts_config
ts_config [-display display name]
V2500 nodes added to the teststation must be configured by ts_config to enable diagnostic and scan capabilities, environmental and hard-error monitoring, and console access.
NOTE For shells that are run from the teststation desktop, the DISPLAY variable is set (at shell start-up) to the local teststation display.
ts_config operation
The ts_config utility displays an active list of nodes that are powered up and connected to the teststation diagnostic LAN.
The ts_config window title includes, in parentheses, the name of the effective user ID running ts_config, either root or sppuser. The ts_config display shows the configuration status of the nodes. Table 6 shows the possible status values.
Configuration Procedures
NOTE This procedure does not need to be performed unless the status shows “Upgrade JTAG firmware.” If the node shows “Not Configured,” skip this section.
The following procedures provide additional details about each configuration action.
Upgrade JTAG firmware
Step 1. Select the node from the list in the display panel. For example, clicking on node 0 in the list highlights that line as shown in Figure 6.
Figure 7 ts_config “Upgrade JTAG firmware” selection
Step 3. A message panel appears like the one shown in Figure 8. Read the message. If this is the desired action, click “Yes” to begin the upgrade.
Figure 9 ts_config power-cycle panel
When the node is powered up, the “Configuration Status” should change to “Not Configured.”
Configure a Node
Step 1. Select the desired node from the list of available nodes.
Figure 11 ts_config “Configure Node” selection
After invoking ts_config to configure the node, a node configuration panel appears like the one in Figure 12.
Figure 12 ts_config node configuration panel
Step 3.
Step 4. Select an appropriate serial connection for the V2500 console from the pop-down option menu in the node configuration panel.
Figure 14 ts_config indicating Node 0 is configured
Step 7. Restart the Workspace Manager: Click the right mouse button on the desktop background to activate the root menu. Select the “Restart” or “Restart Workspace Manager” option, then “OK” to activate the new desktop menu.
Figure 15 ts_config “Configure ‘scub_ip’ address” selection
ts_config checks the scub_ip address stored in NVRAM in the node. If the scub_ip address is correct, no action is required. If the node is not detected and scanned by ccmd, ts_config may ask you to try again later.
Figure 17 ts_config scub_ip address set confirmation panel
Initiate a node reset to activate the new scub_ip address.
Reset the Node
Step 1. Select the desired node from the list of available nodes.
Step 2.
Figure 19 ts_config node reset panel
Step 3. In the Node Reset panel, select the desired “Reset Level” and “Boot Options,” then click “Reset.”
Deconfigure a Node
Deconfiguring a node removes the selected node from the teststation configuration.
Figure 20 ts_config “Add/Configure Terminal Mux” selection
A panel appears like the one shown in Figure 21. This panel requires the terminal mux IP address.
Figure 21 ts_config terminal mux IP address panel
Step 3.
Figure 22 Terminal mux IP address entered into panel
Remove terminal mux
ts_config does not remove the terminal mux if any node consoles are assigned to terminal mux ports.
Step 1. Select “Actions,” then “Configure Terminal Mux.
Teststation-to-system communications
This section describes how the teststation communicates with the system using the utilities presented in Chapter 11, “Utilities.” Figure 23 depicts the V-Class server to teststation communications using HP-UX.
The hardware components located on the SCUB are shown in the diagram on the left side of the node or system. They include three ethernet ports and one DUART. A layer of firmware between HP-UX and OBP called spp_pdc allows the HP-UX kernel to communicate with OBP.
ccmd
ccmd builds a configuration information database on the teststation. The board names and revisions, the device names and revisions, and the start-up information generated by POST are all read and stored in memory for use by other diagnostic tools.
If ccmd detects a hard error, it starts the hard_logger script to extract additional information from the node through the JTAG interface. After hard_logger runs, ccmd resets the node or complex that failed.
xconfig
xconfig is the graphical tool that can also modify the parameters initialized by POST to reconfigure a node.
Figure 24 xconfig window: physical location names
Figure 25 xconfig window: logical names
As buttons are clicked, the item selected changes state and color. A legend on the screen explains the colors and statuses. The change is recorded in the teststation’s image of the node.
The main xconfig window has three sections:
• Menu bar: Provides additional capability and functions.
• Node configuration map: Provides the status of the node.
• Node control panel: Provides the capability to select a node and control the way data flows to it.
Node configuration map
The node configuration map is a representation of the left and right side views of a node as shown in Figure 27.
The button boxes are positioned to represent the actual boards as viewed from the left and right sides. Each of the configurable components of the node is in the display. The buttons are used as follows:
• Green button: Indicates that the component is present and enabled.
Figure 28 xconfig window node control panel
The node number is shown in the node box. A new number can be selected by clicking on the node box and selecting the node from the pull-down menu. A new complex can be selected by clicking on the complex box and selecting it from the pull-down menu.
When a new node is selected and available, its data is automatically read and the node configuration map is updated. The data image is kept on the teststation until it is rebuilt on the node using the Replace button.
Configuration utilities
V2500 diagnostics provides utilities that assist the user with configuration management.
NOTE If there is a node_#.pwr file that is older than the node_#.cfg file, existing node configuration files do not need to be updated.
est_config also generates a complex_uts.cfg file that can be compared against a complex.
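The age comparison in the note about node_#.pwr and node_#.cfg can be sketched in Python as follows. This is only an illustration: the function name and the file paths are hypothetical, the directory where the teststation keeps these files is not stated here, and treating a missing file as "not up to date" is an assumption.

```python
import os


def cfg_is_up_to_date(pwr_path, cfg_path):
    """Return True when the .pwr file is older than the .cfg file.

    Assumption: if either file is missing, the configuration is
    treated as not up to date (the text does not cover this case).
    """
    if not (os.path.exists(pwr_path) and os.path.exists(cfg_path)):
        return False
    # Compare modification times: an older .pwr file means the
    # existing configuration files do not need to be updated.
    return os.path.getmtime(pwr_path) < os.path.getmtime(cfg_path)
```

A caller would pass the two paths for one node, for example a hypothetical `node_0.pwr` and `node_0.cfg` pair.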
3 Power-On Self Test
POST is the Power-On Self Test firmware for the V-Class platform. POST provides processor and system hardware initialization functionality, as well as basic processor selftest and Utilities board SRAM pattern test capability.
Overview
Upon power up, all processors and hardware must be initialized before the node proceeds with booting. POST begins executing and brings up the node from an indeterminate state and then calls OBP. None of the POST modules can be directly controlled via a user interface.
• Hard reset: If a client had execution control before the hard reset, it invokes POST to initialize the hardware.
POST modules
POST executes the modules listed below in chronological order:
• Processor Initialization and Selftest: Each processor initializes itself on power up or reset in parallel with the other processors.
• Page Deallocation Table Support: POST supports reading the page deallocation table (PDT) and remapping memory if it detects a bad page in the HP-UX good-memory region. It updates all entries to reflect the new memory layout if remapping occurs.
Interactive mode
POST for the V2500 provides a command line interface for configuration and debugging. The command line interface is invoked if boot_module is set to “interactive,” by a soft reset, or by a TOC during POST execution.
Configuration parameters
The following parameters control the runtime operation of POST:
• ts_ip: Specifies the teststation IP address for LAN messaging. The value should be set to the IP address of the diagnostics LAN port on the teststation.
Table 9 Name of CTI cache size IP address for listed utilities
• boot_module: Specifies which client to turn execution control over to at the completion of POST execution.
Table 12 Name of scuba test enable for listed utilities
• master_error_enable: Determines whether POST enables errors. This is used in conjunction with use_error_overrides to determine how errors are enabled.
Table 15 Name of sforce monarch for listed utilities
• monarch_number: Specifies the monarch processor when force_monarch is enabled.
Messages
POST has three types of messages: LCD, console, and error. This section discusses each type.
LCD messages
Each node has an LCD display. Figure 29 shows the display and indicates what each line on the display means.
Table 17 Processor initialization steps
Table 18 Processor run-time status codes
Step  Description
0     Processor internal diagnostic register initialization
1     Processor early data cache initialization
2     Processor stack SRAM test
Message display line
The message display line shows the POST initialization progress. This is updated by the monarch processor. The system console also shows detail for some of these steps. Table 19 shows the code definitions.
Console messages
POST provides several messages that are displayed on the teststation console. This section describes these console messages.
Type-of-boot
This message reports the type of boot for the current POST execution, and the node ID and monarch processor.
Main memory initialization
This message reports that main memory initialization has started. For example:
Starting main memory initialization.
Each character indicates the physical location of the DIMM and the logical size of the DIMM. The memory information is encoded as follows:
Value  Memory Type
.      32MB
:      64MB
|      128MB
_      Empty
#      Hardware deconfigured
$      Software (user) deconfigured
For example:
r0 r1 r2 r3 PB0L_A MB0L [.
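The character encoding listed above can be decoded with a small lookup table. The following Python sketch is illustrative only: the function names are hypothetical, and the sample string in the usage note is made up rather than taken from real POST output, since the layout of a full memory map line is not shown here.

```python
# Symbol-to-meaning table copied from the legend in the text.
DIMM_LEGEND = {
    ".": "32MB",
    ":": "64MB",
    "|": "128MB",
    "_": "Empty",
    "#": "Hardware deconfigured",
    "$": "Software (user) deconfigured",
}

# Sizes in megabytes for the symbols that represent usable DIMMs.
DIMM_SIZES_MB = {".": 32, ":": 64, "|": 128}


def decode_dimm_symbols(symbols):
    """Translate a run of legend characters into readable states."""
    decoded = []
    for ch in symbols:
        if ch.isspace():
            continue  # ignore spacing between symbols
        if ch not in DIMM_LEGEND:
            raise ValueError(f"unknown DIMM symbol: {ch!r}")
        decoded.append(DIMM_LEGEND[ch])
    return decoded


def configured_megabytes(symbols):
    """Sum the sizes of DIMMs that are present and configured."""
    return sum(DIMM_SIZES_MB.get(ch, 0) for ch in symbols)
```

For a hypothetical symbol run such as `".:|_"`, `decode_dimm_symbols` lists one 32MB, one 64MB, and one 128MB DIMM plus an empty slot, and `configured_megabytes` adds the three sizes together.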
Booting Boombox
Interactive boot
This message indicates that POST is entering its interactive mode. POST provides a console interface for system configuration and debug.
the checksum and was rebuilt to the default structure. For example:
Test Station Parameters checksum FAILED, rebuilding.
Memory board deconfiguration
This message indicates that the specified memory board is deconfigured. This can be due to a memory board being found on one side of memory without a corresponding pair, since boards must be used in pairs of even/odd boards.
PB0L_B failed to go idle after memory init
Unable to force CPU PB2L_A into idle loop
Monarch completing memory initialization
This message indicates that the monarch processor is completing the memory initialization assigned to the specified processor.
Contiguous memory block not found
This message indicates that POST could not find a block of contiguous memory to place at address zero to achieve good memory. POST will report no main memory to the OBP for this failure.
For example:
cpu PB1R_A deconfigured due to PB1R_B shutdown.
New monarch processor selected
This message indicates that the previous monarch processor was deconfigured and a new one was selected.
4 Test Controller
The Test Controller is an EEPROM-based utility that provides the environment for executing the offline diagnostic tests.
Test Controller modes
There are three basic operational modes for this utility:
• Stand-alone mode
• Interactive mode
• I/O Utility mode
In stand-alone mode, cxtest invokes the Test Controller.
User interface
The Test Controller provides for the control of offline diagnostic test execution. It utilizes a set of parameters to control its operation.
• Read and write the 128 words of test-specific information
• Select the hardware to test
• Display the current parameter selections
Main menu
Test.
• 3=Resume Test Controller Execution: Continues execution from the point of interruption.
• 4=Switch CPU: Allows the user to start the Test Controller on the specified processor. The previously used processor starts executing the command wait loop code.
• 8=CPU Summary display: Displays a summary of the current processor and testing information. An example of the display is shown below:
Example CPU summary display
MAIN Menu - CPU Summary Display
Total Failures = 0
Configuration Map
=================
CPUs : 0 1 2 3* 4 5 6 7 8 9 10 11 12 13 .
The possible states in the CPU Summary Display are described in Table 20.
Table 20 Processor States
• 9=Display CPU Errors: Displays the errors for the currently selected processor.
Example Test Parameters display
Test Configuration Menu - Test Parameters Display
CPUs: ( 1) 0 1 2 3* 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 1.
Test Selection display
MAIN Menu - Test Selection Display
0=Return to Main Menu
1=*Memory test
2=not available
3=not available
4=not available
5=I/O test.
• Selection 1 queries for the 40-bit address to read as follows:
Enter 40-bit address:
• Selection 2 queries for the 40-bit address and then for the 32 bits of data to write:
Enter 32-bit data:
• Selection 3 queries for the 40-bit address to read.
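The address and data prompts above imply range limits of 40 bits and 32 bits. As a minimal sketch of how a script might validate such values before typing them at the prompts, the Python helper below checks the width of a value. It assumes the values are written in hexadecimal, which is not stated in the text, and the function name is hypothetical.

```python
ADDRESS_BITS = 40  # width of the address prompt
DATA_BITS = 32     # width of the data prompt


def parse_hex_field(text, bits):
    """Parse a hexadecimal string and check that it fits in `bits` bits.

    Assumption: input is hexadecimal, with or without a 0x prefix.
    """
    value = int(text, 16)
    if value < 0 or value >= (1 << bits):
        raise ValueError(f"{text!r} does not fit in {bits} bits")
    return value
```

For instance, `parse_hex_field("ffffffffff", ADDRESS_BITS)` accepts the largest 40-bit address, while an 11-digit hexadecimal value is rejected.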
Test Configuration menu
The Test Configuration menu is shown below:
Test Configuration menu
Test Configuration Menu
0=Return to Main Menu
A=Hardware .
Test Configuration menu - Subtest display
Test Configuration Menu - Subtest Display
Subtest Description
0 subtest 0 description
1 subtest 1 description
. . . .
n* subtest n description
An asterisk following the subtest number denotes that it is selected for execution.
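The asterisk convention in the Subtest display lends itself to simple text processing, for example when reviewing captured console output. The sketch below is a hypothetical helper, not part of the Test Controller. It assumes each subtest row starts with a decimal subtest number, optionally followed by an asterisk, and the sample rows in the usage note are invented.

```python
def selected_subtests(display_lines):
    """Return the subtest numbers marked with a trailing asterisk."""
    selected = []
    for line in display_lines:
        fields = line.split(None, 1)  # first token is the subtest column
        if not fields:
            continue  # skip blank lines
        token = fields[0]
        # A selected row looks like "12* description".
        if token.endswith("*") and token[:-1].isdigit():
            selected.append(int(token[:-1]))
    return selected
```

Given rows such as `"0 subtest 0 description"` and `"1* subtest 1 description"`, the helper returns only the numbers of the starred rows. Header and separator lines are ignored because their first token is not a number followed by an asterisk.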
• 5=Read All Test Parameters: Reads all 128 words that make up the test parameter set and displays this information.
Table 21 Parameter Defaults
• 9=Display Test Configuration: Displays the current values of the processor parameters. An example of the display is shown in the example below. An asterisk denotes the current selections.
Test Configuration menu - Test Parameters display
Test Configuration Menu - Test Parameters Display
CPUs: ( 1) 0 1 2 3* 4 5 6 7 8 9 A B C D E F 10 11 1.
• Multiple hardware component numbers separated by commas or spaces, for example 1,+2,-3. The format 2 or +2 selects this hardware component for testing. The format -2 excludes this hardware component from testing.
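The selection format described above (plain or plus-prefixed numbers to include a component, minus-prefixed numbers to exclude it) can be modeled as updates to a set. The Python sketch below is an illustration under assumptions: the function name is hypothetical, component numbers are parsed as decimal, and entries are applied left to right. The Test Controller's actual parsing rules are not documented in this section.

```python
import re


def apply_selection(spec, enabled=None):
    """Apply a selection string such as "1,+2,-3" to a set of components.

    `enabled` is the set of component numbers selected beforehand.
    """
    result = set(enabled or ())
    # Entries may be separated by commas, spaces, or both.
    for token in re.split(r"[,\s]+", spec.strip()):
        if not token:
            continue
        if token.startswith("-"):
            result.discard(int(token[1:]))      # -N: do not use component N
        else:
            result.add(int(token.lstrip("+")))  # N or +N: use component N
    return result
```

Starting from a set that already contains component 3, the string `"1,+2,-3"` adds components 1 and 2 and removes component 3.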
Pause at Test Start (0=disabled, 1=enabled):
• F=Pause at Test End: Allows the user to modify the pause at test end flag. This flag results in the Test Controller pausing the testing on this processor after the last subtest has completed execution and all cleanup is complete.
Example of running diagnostics from Test Controller command line
This example shows how to run mem3000 from the Test Controller command line within the following scenario:
• Configure mem3000 to run on a system with four memory boards installed.
Step 2. From the Test Selection menu shown below, select Memory test, option 1.
Step 5. From the menu, select Memory test, option 1. This opens the Test Configuration menu shown belo.
Step 7. From the Hardware Selection menu shown below, select CPUs, option 1.
Step 3. From the Test Configuration menu, select Display Subtests, option 2.
Step 4. Select all appropriate subtests. Table 22 lists the test patterns for subtests 230 through 238.
Starting tests
To run the tests selected from the Test Controller main menu, select Begin Test Controller Execution, option 1. The output is shown in the example below:
Example of mem3000 execution
% Enter command: 1
Execution Starting.
5 cxtest
The cxtest program is a graphical front end and a command line interpreter for the test controller. It is a standalone program that runs independently of any diagnostic tests loaded in the EEPROM on the Utilities board.
Overview
The cxtest program runs on the teststation and communicates with the test controller via the NVRAM configuration parameters on the Utilities board. Depending on the command line, cxtest either starts the graphics display or runs as a command line interpreter.
• Retrieving error information from the test controller
The test controller operates in the standalone mode when running in conjunction with cxtest. This is true whether one is using the command line version of cxtest or the graphics interface.
Graphics interface
To start the cxtest graphics interface, specify the -d option on the command line as follows:
% cxtest -d
This causes cxtest to open a window on the display. The display on which the window opens is set by the environment variable $DISPLAY.
File menu
The File menu has the following options:
• Save Selections
• Restore Selections
• Log to File/Close Log File
• Clear Log
• Exit
Save Selections
The Save Selections option saves specific tests or configurations.
The selections presented are based on whether the Test Controller has built a Subtest table and Class table in its tc_test_info_struct structure.
Class menus
Selecting a test opens a window that displays all classes for the test.
The Defaults button installs test default values into all the parameters. If a class of tests has no parameters associated with it, the rightmost button (the square one) is not shown.
Global Test Parameters menu
cxtest provides the ability to loop on a number of tests by setting the Loop Enable count.
Command menu
The Command menu is used to perform actions on the node or complex being tested. These actions include:
• Go
• Reset Machine
• Read Boot Config Map
The Go selection starts the subtests.
Chapter 5 109 cxtest Graphics interface Figure 33 System configuration window Help menu The Help menu has two entries: About and Contents . The About selection displays the version number of cxtest running and the Contents selection opens a browser that can scroll through the help file.
Powering down the system
When using cxtest in a troubleshooting environment, it is not necessary to exit and restart cxtest each time the power is cycled. To remove power from the system (for example, to move a board), power the system down, leaving cxtest running.
Command line interface
cxtest is a utility that allows the user to run tests loaded into the Test Controller. Tests can be specified on the command line, or a graphical user interface can be started to simplify test selection.
Command line test selections
The command line interface interprets the following switches to select tests.
• -mem —Memory diagnostic.
To set the number of times a test loops, use the -lt <number> option.
Example of cxtest -lt option
cxtest -mem -lt 3 -c 4 -io -c 2
The looping specification applies only to the memory test, which runs the class-4 tests three times.
To specify a list of subtests, place a comma between the numbers. For example, -s 100,150,140 runs subtest 100, then subtest 150, and finally subtest 140.
Command line parameter specifications
To specify the value of a parameter for a test, use the -pa# <val> option.
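The examples above suggest that options following a test switch apply to that test, and that a subtest list runs in the order typed. A minimal Python sketch of that grouping follows. It is not cxtest's actual parser: the function name, the dictionary layout, and the restriction to the -mem, -io, -lt, -c, and -s options seen in this section are all assumptions made for illustration.

```python
# Hypothetical sketch: group cxtest-style options under the test switch
# that precedes them. Only switches shown in this section are handled.
TEST_SWITCHES = {"-mem", "-io"}  # assumed subset of the test selections
VALUE_OPTIONS = {"-lt": "loops", "-c": "class", "-s": "subtests"}


def group_by_test(argv):
    """Return one dict per test switch, in command-line order."""
    tests = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in TEST_SWITCHES:
            tests.append({"test": arg[1:]})
            i += 1
        elif arg in VALUE_OPTIONS:
            if not tests:
                raise ValueError(f"{arg} given before any test switch")
            if i + 1 >= len(argv):
                raise ValueError(f"{arg} needs a value")
            value = argv[i + 1]
            if arg == "-s":
                # Comma-separated list; the typed order is preserved.
                parsed = [int(n) for n in value.split(",")]
            else:
                parsed = int(value)
            tests[-1][VALUE_OPTIONS[arg]] = parsed
            i += 2
        else:
            raise ValueError(f"unrecognized option: {arg}")
    return tests
```

Applied to the -lt example, the loop count lands only on the memory test entry, which matches the behavior described for that command line.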
Example of running diagnostics from cxtest window
The following example procedure shows the user how to use mem3000 from cxtest. It assumes that the node configuration has been set up using the main cxtest window.
Figure 35 mem3000 Class 1 Subtest Selections window
Step 4. In the Subtest Selections window for each class, click the button for each subtest to be executed. Any combination of subtests may be executed.
Step 6. To start the selected tests and subtests, click the Go option in the Command menu in the cxtest main window.
Step 7. View the results in the lower window pane of the cxtest main window.
Chapter 6 Processor-dependent code firmware loader
The processor-dependent code firmware loader (pdcfl) is a firmware module capable of loading other firmware modules into FLASH. It is intended to speed up download of POST and OBP on newly manufactured or malfunctioning utility boards.
pdcfl loading, booting, and setup
NOTE This step should not be necessary under normal circumstances.
This requires adding entries to the following files:
To /etc/services make the following entry:
tftp 69/udp Trivial File Transfer Protocol
To /etc/inetd.
pdcfl commands
From the pdcfl prompt, the following commands are supported:
• printenv [variable] —Prints configuration variables from NVRAM.
• setenv variable value —Allows setting configuration variables in NVRAM.
An example of the fload command
PDCFL> fload post.fw POST
TFTP server : 15.99.103.191
CUB IP : 15.99.111.150
Reading : post.fw
Writing : POST (each '.' represents 4K copied)
Sector erased 0xF0020000 .
Chapter 7 cpu3000
This chapter describes the cpu3000 processor test. cpu3000 runs via the test controller and provides a basic test of the functionality of the PA8500. cpu3000 requires a minimum of one processor with its associated SPAC and two EWMBs.
cpu3000 classes and subtests
cpu3000 consists of a series of tests grouped together in classes, beginning with verification of the most basic functionality and progressing toward more complex functionality.
Table 26 cpu3000 Class 1 subtests
Subtest Name Description
100 Processor basic Verifies the majority of registers and a basic set of instructions. Chassis code: 0x41020.
101 Processor-ALU Verifies the processor and arithmetic logic unit (ALU) functionality.
Table 27 cpu3000 Class 2 subtests
Table 28 cpu3000 Class 3 subtests
Table 29 cpu3000 Class 4 subtests
140 Diagnostic register Verifies the local Diagnose Registers. Chassis code: 0x4102a.
141 Remote diagnostics registers Verifies the remote Diagnose Registers.
Table 30 cpu3000 Class 5 subtests
Subtest Name Description
500 Late-early self test (LST-EST) Runs subtests 100, 101, 102, 103, 104, 105, 120, 130, and 150, first in main memory and then in the Icache.
540 Dcache miss Verifies that data can be encached from coherent memory. Chassis code: 0x44060.
560 TLB transfer Verifies TLB hits and misses, as well as access rights and protection ID validation.
cpu3000 errors
When a failure occurs, the chassis code is available through the test controller, along with the progress value.
Chapter 8 io3000
The I/O diagnostic supports Symbios 875 HVD SCSI controllers, Symbios 895 LVD SCSI controllers, and Tachyon Fibre Channel controllers. io3000 requires a node with a minimum of one processor, one SIOB with associated SPACs, and two EWMBs with associated SMACs.
io3000 classes and subtests
io3000 consists of a series of tests grouped together in classes, beginning with verification of the most basic functionality and progressing toward more complex functionality.
io3000 subtests
The io3000 subtests are listed in Table 32 through Table 41.
Table 32 io3000 Class 1 subtests
11 SAGA SCSI Tape Interface Test Verifies the ability to successfully issue SCSI commands to every selected tape drive.
Table 33 io3000 Class 2 subtests
Subtest Name Description
200 Context/shared memory read/write Writes to the first 64-bit location of each context SRAM and reads them back to verify that they can be uniquely accessed.
235 Prefetch memory march C- Verifies writes and reads to all of prefetch memory using a bitwise march C- algorithm. The default option runs a shortened version of the march C- algorithm by using a limited pattern set.
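Subtest 235 names a bitwise march C- algorithm without spelling it out. For readers unfamiliar with it, the following Python sketch shows the commonly published March C- sequence of six march elements run over a simulated array of one-bit cells. The pattern set io3000 actually uses, its shortened default variant, and the prefetch memory layout are not described in this section, so the function name, the read/write callbacks, and the fault model below are illustrative assumptions only.

```python
# Sketch of the textbook March C- sequence on simulated one-bit cells.
def march_c_minus(read, write, size):
    """Run March C- over `size` cells; return failing (element, addr) pairs."""
    failures = []
    up = range(size)
    down = range(size - 1, -1, -1)

    def element(step, order, expect, new):
        # Visit each address once: optional read-compare, then optional write.
        for addr in order:
            if expect is not None and read(addr) != expect:
                failures.append((step, addr))
            if new is not None:
                write(addr, new)

    element(0, up, None, 0)    # M0: write 0 (direction does not matter)
    element(1, up, 0, 1)       # M1: ascending, read 0 then write 1
    element(2, up, 1, 0)       # M2: ascending, read 1 then write 0
    element(3, down, 0, 1)     # M3: descending, read 0 then write 1
    element(4, down, 1, 0)     # M4: descending, read 1 then write 0
    element(5, up, 0, None)    # M5: read 0 (direction does not matter)
    return failures


class StuckAtOne:
    """Simulated memory whose cell `bad` always reads back 1 (hypothetical fault)."""

    def __init__(self, size, bad):
        self.cells = [0] * size
        self.bad = bad

    def read(self, addr):
        return 1 if addr == self.bad else self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value
```

A stuck-at-1 cell fails every element that expects to read 0, so the failure list points at the same address in elements 1, 3, and 5.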
Table 34 io3000 Class 5 subtests
Subtest Name Description
500 SCSI disk test unit ready A SCSI test unit ready command is issued to all selected devices at least twice. The first time, it should return with a SCSI check condition (not reported to the user) since the SCSI bus has been reset.
Table 35 io3000 Class 6 subtests
Subtest Name Description
600 Channel init, ATPR = 0x0
625 Channel init, write tlb, data prefetch, ATPR = 0xa
630 Cha.
Subtests 600-645 create channels by writing to the SAGA channel builder CSR. The method of channel creation and the specific mode (ATPR setting) are specified in the subtest's one-line description. Each test writes data to the disk, reads it back, and verifies it.
725 Jump outside of a page (TLB not encached) Verifies a DMA jump outside of a page. The TLB for the destination page is not encached in context SRAM. This means that SAGA must fetch a new TLB before the transfer can continue.
Table 37 io3000 Class 8 subtests
Subtest Name Description
800 Multidisk nonmixed traffic Issues simultaneous SCSI writes and then SCSI reads to all selected devices. The channels are programmed in virtual mode, with data and TLB prefetch turned on.
Table 38 io3000 Class 11 subtests
Subtest Name Description
1100 SCSI tape test unit ready Issues a SCSI test unit ready command to all selected devices at least three times. The first time, the SCSI bus will have been reset.
Table 39 io3000 Class 12 subtests
Subtest Name Description
1200 Symbios PCI configuration space test Verifies the ability of the SAGA to access the Symbios SCSI controller by way of the PCI configuration space.
1230 Symbios SCSI Scripts RAM test Performs a simple data-equals-address pattern test of the SCRIPTS RAM.
1240 Symbios SCSI Interrupt test Copies a simple SCRIPTS instruction to SCRIPTS RAM on the Symbios controller.
Table 40 io3000 Class 15 subtests
NOTE Class 15 subtests will also test DVD drives.
Subtest Name Description
1500 SCSI CDROM test unit ready Issues a SCSI test unit ready command to all selected devices at least twice.
Table 41 io3000 Class 16 subtests
User parameters
The test controller provides io3000 with up to 37 user parameter words.
Table 42 io3000 test parameters
Words Description
0 See Table 43.
1 Device write enable mask—Each bit in the mask corresponds with a device. Bit 0 (MSB or leftmost bit in the parameter word) corresponds to device 0, bit 29 corresponds to the last (29th) device.
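Because bit 0 is the most significant bit here, the mask value is easy to get backwards. The Python sketch below builds a device write enable mask under that MSB-first numbering. It assumes a 32-bit parameter word, which this section does not state explicitly; the constant and function names are hypothetical.

```python
WORD_BITS = 32        # assumed width of a user parameter word
LAST_DEVICE_BIT = 29  # highest device bit named in Table 42


def device_write_enable_mask(devices):
    """Return a mask in which bit 0 is the MSB and maps to device 0."""
    mask = 0
    for dev in devices:
        if not 0 <= dev <= LAST_DEVICE_BIT:
            raise ValueError(f"device {dev} is outside 0..{LAST_DEVICE_BIT}")
        # MSB-first numbering: device 0 sets the leftmost bit of the word.
        mask |= 1 << (WORD_BITS - 1 - dev)
    return mask
```

Under these assumptions, enabling writes to devices 0 and 1 gives 0xC0000000 rather than 0x3.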
Table 43 io3000 user test parameter word 0 bit definition
Bit Description
0-23 Unused
24 Force code copy enable—Setting this bit causes all subtests that use encached routines to copy the code segment from flash into main memory.
Device specification
Due to Core Logic SRAM space limitations, only 20 devices per SAGA can be tested at a time. Up to 24 SCSI devices can be specified using parameter words 8-19. Each of these parameter words contains two device specifications, as shown in Figure 37.
Table 44 io3000 bit definition for direct SCSI device specification (words 8-19)
Figure 38 io3000 test parameter device specification for Fibre Channel attached SCSI targets (words 20-37)
Fields within each parameter word specify the devices as shown in Table 45.
Table 45 io3000 bit definition for Fibre Channel attached SCSI device specification (words 29-37)
Devices are numbered according to their position in the parameter list. A device can be specified in any of the device specification locations in user parameter space.
Table 46 io3000 SAGA name to number correlation
SAGA name SAGA number
IOLF_A 4
IOLF_B 0
IOLR_A 5
IOLR_B 1
IORR_A 6
IORR_B 2
IORF_A 7
IORF_B 3
io3000 error codes
When a failure is encountered, an event code is set along with an error message. The least significant 12 bits of the event code contain the error code. Table 47 lists the io3000 error codes.
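Extracting the error code from an event code is a single mask operation. The short Python sketch below illustrates it; the sample event code is invented for the example, and the overall width of an event code is not given in this section.

```python
ERROR_CODE_MASK = 0xFFF  # least significant 12 bits of the event code


def error_code(event_code):
    """Return the 12-bit error code carried in an event code."""
    return event_code & ERROR_CODE_MASK


def error_code_hex(event_code):
    """Format the error code as three hex digits for table lookup."""
    return f"{error_code(event_code):03x}"
```

For a hypothetical event code of 0x12345, the error code is 0x345.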
io3000 device specification errors
io3000 device specification errors post the following error message:
SAGA_name/ctlr_num/tgt_num/lun_num
Example of io3000 device specification error message:
IOLF_A/ct0/idf/lu0
Table 48 shows each io3000 general error code.
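When collecting these messages in a script, it can help to split them into named fields and attach the SAGA number from Table 46. The following Python sketch does only that. The function name and dictionary keys are hypothetical, and the controller, target, and LUN fields are kept as raw strings because this section does not define how their prefixes and digits are encoded.

```python
# SAGA name to number correlation, copied from Table 46.
SAGA_NUMBERS = {
    "IOLF_A": 4, "IOLF_B": 0,
    "IOLR_A": 5, "IOLR_B": 1,
    "IORR_A": 6, "IORR_B": 2,
    "IORF_A": 7, "IORF_B": 3,
}


def parse_device_spec_error(message):
    """Split SAGA_name/ctlr_num/tgt_num/lun_num into labeled fields."""
    fields = message.strip().split("/")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    saga, ctlr, tgt, lun = fields
    if saga not in SAGA_NUMBERS:
        raise ValueError(f"unknown SAGA name: {saga}")
    return {
        "saga_name": saga,
        "saga_number": SAGA_NUMBERS[saga],
        "controller": ctlr,  # raw field, for example "ct0"
        "target": tgt,       # raw field, for example "idf"
        "lun": lun,          # raw field, for example "lu0"
    }
```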
Table 49 io3000 SAGA general errors
io3000 SAGA CSR errors
io3000 SAGA CSR error codes post the following error message:
SAGA_name/address/act_val/exp_val
Example of io3000 SAGA CSR error message:
IOLF_B/fc010008/00e0000f0c000000/00e0000f0c100000
Table 50 shows each io3000 SAGA CSR error code.
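Actual and expected values in messages of this form often differ in only a bit or two, which is hard to spot by eye in 16 hex digits. The Python sketch below XORs the two values and lists the differing bit positions. It numbers bits from the least significant end (bit 0 = LSB), which is an assumption of the sketch and the opposite of the MSB-first numbering used for the parameter words; the function names are hypothetical.

```python
def differing_bits(act_val, exp_val):
    """Return LSB-0 positions where two hex strings differ."""
    diff = int(act_val, 16) ^ int(exp_val, 16)
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


def csr_error_diff(message):
    """Parse SAGA_name/address/act_val/exp_val and report differing bits."""
    saga, address, act_val, exp_val = message.strip().split("/")
    return saga, address, differing_bits(act_val, exp_val)
```

For the example message above, the only difference is bit 20: the expected value has it set and the actual value does not.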
io3000 SAGA ErrorInfo CSR error
The io3000 ErrorInfo CSR error code posts the following error message:
SAGA_name/cause_bit/address/act_val
Example of io3000 SAGA ErrorInfo CSR error:
IOLF_A/5/fc210098/10e0000f0c000000
Table 51 shows the io3000 SAGA ErrorInfo CSR error code.
Table 52 io3000 SAGA ErrorCause CSR errors
io3000 SAGA SRAM errors
io3000 SAGA SRAM error codes post the following error message:
SAGA_name/address/act_val/exp_val
Example of io3000 SAGA SRAM error message:
IOLF_A/f81fc00080/5555555555555555/55f5555555555555
Table 53 shows each io3000 SAGA SRAM error code.
io3000 controller general errors
io3000 controller general error codes post the following error message:
SAGA_name/ctlr_num
Example of io3000 controller general error message:
IOLF_B/ct0
Table 54 shows each io3000 general controller error code.
Table 55 io3000 PCI errors
io3000 controller command errors
io3000 controller command error codes post the following error message:
SAGA_name/ctlr_num/tgt_num/.
io3000 DMA error
The io3000 DMA error code posts the following error message:
SAGA_name/ctlr_num/tgt_num/lun_num/address/act_val/exp_val
Example of io3000 DMA error message:
IOLF_A/ct0/idf/lu0/0004148200/a5a5a5a4/a5a5a5a5
Table 57 shows the io3000 DMA error code.
Example of io3000 Symbios controller specific error message:
IOLF_B/ct1/f804000010/ffffff01/00000001
Table 59 shows each io3000 Symbios controller specific error code.
io3000 DIODC driver errors
io3000 Diagnostic I/O Dependent Code (DIODC) driver error codes post the following error message:
SAGA_name/ctlr_num/tgt_num/lun_num/.
Notes on io3000
io3000 dumps trace data into Core Logic SRAM to help troubleshoot failures. A script called io_tr, provided with io3000, displays this trace data; it is located in the scripts directory (/spp/scripts at the time of this writing).
Chapter 9 mem3000
This chapter describes mem3000, a memory test for V2500 systems. mem3000 is a core logic flash-based memory diagnostic that verifies the functionality of the memory subsystem. mem3000 requires a node with a minimum of one processor with two memory boards.
mem3000 classes and subtests
mem3000 verifies the V2500 memory subsystem using the Test Controller. mem3000 requires one node with a minimum of one processor with associated SPAC and two EWMBs with associated SMACs.
mem3000 subtests
The mem3000 subtests are listed in Table 64 through Table 69.
Table 64 mem3000 class 1 subtests
Table 65 mem3000 class 2 subtest.
Table 66 mem3000 class 3 subtests
Table 67 mem3000 class 4 subtests
Table 68 mem3000 class 5 subtests
Subtest Description
300 Verifies the memor.
Table 69 mem3000 class 6 subtests
510 Verifies ECC double bit data errors are detected and logged using coherent operations
520 Verifies ECC dou.
V2500 memory configurations
In the V2500 server, Excalibur Pluggable Memory Boards (EPMBs) are installed in 16 DIMM connectors on the EWMBs. A V2500 memory board is organized by quadrants, rows, and buses.
Table 70 DIMM row/bus table
V2500 DIMM quadrant designations
Memory boards can be populated in increments of four DIMMs called quadrants.
Figure 39 V2500 DIMM locations
Example: Q2B3: Quadrant 2, Bank 3
V2500 DIMM configuration rules
Use the following rules to plan the memory board DIMM configuration:
• All memory boards must be populated identically.
• DIMMs in quadrant 1 can be of a different size than DIMMs in quadrant 2 or 3 without degrading performance.
• DIMMs in quadrant 0 and 1 should be the same size for maximum performance.
• DIMMs in quadrant 2 and 3 should be the same size for maximum performance.
User parameters
The Test Controller allows mem3000 20 user parameters. Table 73 defines these parameters:
Table 73 User parameter definitions
Parameter 4 defaults to the value 2, causing the test to automatically probe all known DIMMs to determine their type: 80- or 88-bit DIMMs.
Figure 40 Format of parameter 6
Parameter 7 contains the masks for boards 4-7 in the order shown in Figure 41.
Figure 41 Format of parameter 7
As an example, the Octant Mask for board 0 is encoded in the first two digits of Parameter 6.
mem3000 error codes
When a failure is encountered, an event code is set along with an error message. The least significant 12 bits of the event code contain the error code. Table 74 lists the mem3000 error codes.
033 SMAC did not log the occurrence of a single bit ECC failure
035 SMAC did not log the occurrence of a double bit ECC failure
040 Data miscompare error occu.
The asterisks next to the error codes listed in Table 74 indicate a range of events, as shown in Table 75.
Table 76 Patterns used in specified subtests
Error messages
When a failure is encountered, an event code is set along with an error message. The least significant 12 bits of the event code contain the error code.
Figure 42 Type one error message format
There are six fields separated by / symbols. The meaning of each field is as follows:
• Field 1—Specifies the .
The two fields of the type two error are as follows:
• Field 1—Specifies the EWMB to which the information pertains
• Field 2—Specifies the type of.
Notes on mem3000
mem3000 depends on POST to initialize the memory system. The test uses many of the CSR values from POST and does not reconfigure the system. There are some exceptions in which CSR values need to be changed in order for the test to run.
Chapter 10 Scan test
The Exemplar scan test (est) is a diagnostic utility that uses the system scan hardware, making it possible to perform connectivity tests and to test gate array internal registers. The est utility runs on the teststation and sends scan instructions to a given node by way of the Ethernet.
est utility test environment
est is started on the teststation and is located in /spp/bin/est. The user has the option of either starting up a user interface or having the est utility run a script.
To perform ID and ring checks in the utility system, the user should turn off the power control feature, either through the command line argument -p or through a runtime option command (power_control).
Running the est GUI
The est GUI may be started at the command prompt. The following is the est command usage:
/spp/bin/est [-option] node_number
For example, to bring up the GUI and test node 0, enter the following command:
% /spp/bin/est -x 0
Table 77 on page 200 provides a complete list of options.
The lower set of buttons allows the user to quickly and easily run the scan tests in a wholesale fashion. The test can be modified to run fewer patterns, to loop continuously or for a finite number of times, to test non-default limits, etc.
Files button
Clicking the Files button opens a pop-up menu with three selections:
• Execute Scripts—Runs a file containing est commands.
• Reset Log File—Clears the log file.
• Exit—Closes the est main window and exits the program.
Clocks button
Clicking the Clocks button opens a pop-up menu with four selections:
• Upper—Sets the upper limit of the system clocks.
• Nominal—Sets the system clocks to their nominal values.
• External—Selects an external clock from the ECUB.
• Command Menu—Opens the command line window, which allows the user to enter est commands directly from the GUI system.
• Scan Debug Menu—Opens the debug window.
• Connectivity Test Menu—Opens the connectivity test window.
Figure 48 est connectivity window
To select a connectivity test, click on either the dc or ac button in the Connectivity Test panel. In the Pattern panel, clicking the All button runs each test pattern.
Gate array test window
The gate array test window provides a means to test all gate arrays in the Exemplar system.
The next lower panel determines which and how many patterns are used in the gate array test. The test normally uses all patterns, but, for troubleshooting, you may set the starting and ending patterns, set the maximum number of patterns (a range of patterns), or set a single, custom pattern.
Scan window
The scan window provides a means of testing the system scan rings. Figure 50 shows the est scan window.
NOTE For more information on scan rings and modes, see the IEEE 1149.1 JTAG specification.
Clicking the buttons in the Scan panel sets the scan paths. All scan modes can be selected, or the test can be set up to test the individual pathways as follows:
• All—Tests all scan modes.
• Bypass—Tests the bypass ring.
SCI cable test window
The SCI cable test window provides a means to test the cables that connect the scalable coherent interfaces between nodes. All cables are tested by default, but an individual cable can be tested using this window.
Help
Clicking the Help button opens a pop-up menu with five topic selections:
• Overview
• Commands
• GUI
• Input Files
• Options
Clicking on one of these options opens the Help window shown in Figure 52.
Figure 52 est Help window.
Figure 53 est Help browser window.
Running est from command line
The following is the command line usage for est:
est [-options] <node_number>
For example, to test node 0, enter:
% est 0
est reads configuration information from files stored in /spp/data (e.
Some examples of est usage are:
est -v
est -l -f my_script 0
est -o ./my_log_file 0
The est utility uses certain data and vector files located in the /spp/est directory. Unless disabled or redirected, the est utility will generate a log file, est.
Example of output when est is started:
% est 0
Excalibur Scan Test 1.0.0.2 1998/11/25 10:32:58 Steven Terry
......................... .....
General EST Tests:
c ... compare id’s to config file
r .
Example output when using the est -h option:
% est -h
Excalibur Scan Test 1.0.0.2 1998/11/25 10:32:58 Steven Terry
usage: est [-options] [server] node [-cp port] [-sp port]
options:
-h ... print this help message
-v .
Table 78 AC Connectivity test options
Bypass test
The Bypass test format is:
b
The Bypass test places the scan ring hardware into bypass mode.
DC Connectivity test
The DC Connectivity test format is:
d [-s -p #]
Table 79 shows the options for this test.
Table 80 Gate Array test options
By default, the g command tests all arrays. When the -r, -b, -j, or -t options are used, only arrays that meet all criteria are tested.
When an error occurs, parallel scans into the scan hardware may result in bus conflicts on TDO pins.
SCI test
The sci utility tests the Coherent Toroidal Interface (CTI) cables between nodes. The term SCI (Scalable Coherent Interface) is often used in place of the term CTI; the terms are interchangeable.
SCI_all test
The sci_all utility tests all SCI cables in a complex. The usage of sci_all is as follows:
sci_all [test]
where:
test Refers to the specific test: dc, dc_clk, ac.
With the dc test, the clock from the receiver node is used.
• -c high —Displays the upper clock limit.
• -p 1 nom —Sets the supply 1 margin to nominal.
There are four power supplies, 1 through 4. Table 81 shows the valid values for clock and power.
Table 82 est runtime option commands
Command Description Default argument
log_file Turn on/off writing to the log file. On
stop_on_error Stops the test when an error is detected. On
limit_patterns Runs a limited set of patterns when testing arrays.
est command flags and options
There are a number of flags or options that operate on and enhance the est commands. Some of these flags and options perform the same functions as the runtime option commands.
An example file might contain the following lines:
# check the rings
r
# show pattern pass/fail steps
F P
#limit dc testing to 3 patterns
F D 3.
Chapter 11 Utilities
This chapter details most of the diagnostic utilities, which include:
• address_decode
• arrm
• consolebar
• dcm
• dfdutil
• dump_rdrs
• fwcp
• fw_init
• g.
address decode
address_decode decodes a 40-bit virtual address into the physical node, smac, row, bus, and bank.
AutoRaid recovery map (arrm)
The arrm utility is used only with an AR-12H (C5447A) disk array that displays the status "No address table" on the front panel rather than the usual status of "Ready .
0/1/0.5.0
If the EPIC number is outside of the range 0 to 7, the slot number is outside of the range 0 to 2, or the target number is outside of the range 0 to 15, an error message is displayed and the operator is prompted to reenter the address.
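The three range checks described above can be summarized in a few lines of Python. This is only a sketch of the stated limits, not arrm's own code: the function name is hypothetical, and it takes the three numbers as already-separated integers because this section does not say which positions of the address string hold them.

```python
# Inclusive limits quoted in the text above.
ADDRESS_LIMITS = {
    "EPIC number": (0, 7),
    "slot number": (0, 2),
    "target number": (0, 15),
}


def address_errors(epic, slot, target):
    """Return a list of range violations; an empty list means valid."""
    values = {
        "EPIC number": epic,
        "slot number": slot,
        "target number": target,
    }
    errors = []
    for name, value in values.items():
        low, high = ADDRESS_LIMITS[name]
        if not low <= value <= high:
            errors.append(f"{name} {value} is outside the range {low} to {high}")
    return errors
```

A caller would re-prompt for the address whenever the returned list is non-empty.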
Example of unsuccessful recovery message
Utility Compatibility Check Unsuccessful.
The Product firmware may not support RECOVER!
Do you want to attempt recover anyway ([y]/n)?
In all cases of this type, respond with a y, Y, n, or N followed by ENTER, or just ENTER.
where xx is a number between 0 and 100. This message indicates the percentage of the volume set that has been recovered and is updated approximately once per second. The recovery operation can take several minutes depending on the amount of data in the volume set.
consolebar
The consolebar utility is an X application that provides a simple interface capable of starting console windows to all V2500 nodes configured on the teststation.
dcm
dcm dumps the boot configuration map information for the specified node. There are two main reporting modes: one for general hardware configuration and one for the DIMM type. The general hardware mode reports processors, ASICs, and memory size information.
Output table using dcm <node_id>
Acquiring Boot Configuration Map...
Stingray Configuration Map Dump: Node: 0 (hw2a-0000)
=============================================================
VERSION: 1.0 compiled: 1998/12/16 18:35:00
CheckSum:0xf407a073
Boot Config Map Size:164 words
POST Revision:1.
MB5L_T - EMPTY
MB6R_T - EMPTY
MB7R_T - EMPTY
Memory:
=======
Physical: L=128MB, M=64MB, S=16MB
Logical: l=128MB, m=64MB, s=16MB
(If logical memory not specified, then it m.
Output table using dcm -d all <node_id>
Stingray Configuration Map DIMM Info: Node: 0(hw2b-0000)
=============================================================
VERSION: 0.
dfdutil
dfdutil is a standalone offline utility that downloads firmware to SCSI devices, including disks, arrays, and fibrechannel devices such as SCSI MUX and fibrechannel arrays. The firmware image(s) are contained in a Logical Interchange Format (LIF) volume on the teststation at /spp/firmware/DFDUTIL.
Example of dfdutil output when loading
Loading file dfdutil.fw ................................... ............................................ .......................................................... ...............
Example of dfdutil output (continued)
Indx Path Product ID Bus Size Rev
---- ------------------- ------------------- ---- ------ ------
0 5/0.8.0.255.7.12.0 HP HPA3308 FC 0 d373
1 5/0.8.0.124.0.14.0 DGC DISK FCMUX 4006 0860
1.
• b—slot number
• c—path level (always 0)
• d—always 8 for FC storage
• e—upper 4 bits of loop address
• f—lower 4 bits of loop address
• g—LUN number
If the device is attached to an FC MUX, the path is formatted as a/b.
dfdutil LIF file table
The descriptions of the fields in the LIF file table are as follows:
• Filename—Specifies the name of the file in the LIF volume. The operator specifies this name when issuing download commands to the devices.
DOWNLOAD command
Use the DOWNLOAD command to download firmware to a particular device. DOWNLOAD transfers the contents of a particular firmware file to a device. It prompts the user for any arguments that were not specified on the command line.
DISPMAP <disk index>
The user may enter the index number of a single device; using no index number causes DISPMAP to list all devices. This command displays the bootable device table that is shown when dfdutil is started.
DISPFILES command
The DISPFILES command displays a list of all available firmware files found on a LIF device. The command displays:
• File name
• Intended pro.
Entering HELP without a command name displays a list of all available dfdutil commands. Entering the specific command name after HELP outputs specific information about the command.
Notes and cautions about dfdutil
This section presents some limitations and cautions concerning dfdutil.
Shared SCSI Buses
If dfdutil is running on a system that shares any of its SCSI buses with another system or systems, the other system or systems must be halted while this program is running.
dump_rdrs
The dump_rdrs utility automatically resets the specified node and directs it to boot the RDR dumper firmware module. Once it detects that the RDR dumper firmware has completed, it scans out the results and places a formatted RDR dump of each processor in /spp/data/<complex>/nodeX.
fwcp

fwcp is an OBP command that upgrades system firmware. A single firmware package may be loaded by the following command:

    % fwcp <filename>

To load all system firmware packages, use the following master download script:

    source /core@f0,f0000000/lan@0,d30000;15.
fw_init

fw_init provides an automatic means for downloading firmware to each node and initializing certain data structures in NVRAM. Using this script prevents problems that could occur when executing this procedure manually.
fw_init message example 3

    Loading Diagnostic LIF header on "hw2a-0000".

fw_init message example 4

    Loading JTAG firmware on "hw2a-0000".

fw_init message example 5

    The "hw2a" complex will now be reset to OBP.
get_node_info

The get_node_info utility provides a mechanism for scripts or programs to access the teststation configuration information generated by the ts_config configuration tool.
[OPTIONS] include the following:

• -a: Display all fields (default)
• -A: Display all configured nodes

The selected fields will be printed in the orde.
hard_logger

hard_logger is a script that invokes the interrogators and extractors to log all error information on a node. The usage of the script is:

    hard_logger [node number]

[node number] is a hex number.
To interrogate the controllers, hard_logger calls the ASIC-specific interrogator located in /spp/scripts/<asic>. For example, the SMAC interrogator is located in /spp/scripts/smac. The interrogator returns a list of extractors to run on that ASIC in /spp/data/<COMPLEX_NAME>/hl/inter_n$node.
lcd

lcd prints the current contents of the liquid crystal display for node 0 of the current complex. It has the following format:

    lcd

The complex can be changed by using the set_complex utility. The output is sent to stdout.
load_eprom

The load_eprom utility resides on the teststation. It downloads the core firmware products into the EEPROM on the Utilities board through the scan interface. It can also update the JTAG scan interface controller firmware.
Table 83 load_eprom options

As an example, entering the following reads the file /spp/firmware/post.fw and updates the POST section of Flash EEPROM on the Utilities board.

    xns3_d% load_eprom -n hw2a-0000 -p /spp/firmware/post.
Example output of load_eprom -n hw2a-0000 -p entry.pdc command

    Reading file "entry.pdc": 4253 (0x109d) bytes read.
    Using default SPAC (P0L).
    Erasing sector 0 (0xf0000000) OK
    Writing sector 0 (0xf0000000) .. OK

Example output of load_eprom -n hw2a-0000 -p post.
pim_dumper

pim_dumper is a utility used to display Processor Internal Memory (PIM) information after a TOC, LPMC, or HPMC.
The TOC/LPMC/HPMC options are mutually exclusive. Specify at most one of these options; if none is specified, the default mode dumps all TOC/LPMC/HPMC data. If pim_dumper is able to accomplish the desired action, it returns zero.
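A script that wraps pim_dumper can rely on the zero return value described above. The following is a minimal sketch, assuming that any nonzero exit status indicates failure (the guide excerpt does not define specific nonzero values). The helper name is hypothetical, and stand-in commands are used so the sketch runs without a teststation:

```python
import subprocess
import sys


def succeeded(argv):
    """Run a command and report whether it exited with status zero."""
    return subprocess.run(argv).returncode == 0


# Stand-in commands so the sketch runs anywhere; on a teststation argv
# would name pim_dumper and its arguments instead.
ok_cmd = [sys.executable, "-c", "raise SystemExit(0)"]
bad_cmd = [sys.executable, "-c", "raise SystemExit(3)"]

print(succeeded(ok_cmd))   # True
print(succeeded(bad_cmd))  # False
```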
set_complex

The set_complex utility sets the default V2500 complex name in the current shell environment.

    set_complex [COMPLEX_NAME]

Once set, teststation diagnostic or console utilities that are run from within the shell operate on the specified complex.
set_complex can be invoked anytime the user wants to change the shell default complex. If the user enters an invalid COMPLEX_NAME, the default complex becomes unset and the prompt string indicates this condition.
soft_decode

soft_decode is a Perl script that decodes single-bit ECC error data. It prompts for syndrome, row, and address information, which is parsed, decoded, and displayed in an easy-to-read format that can be cut and pasted into quasar.
sppconsole

sppconsole connects the user to the console for a specified node. sppconsole has the following format:

    % sppconsole node [opt1, ..., optN]

There are several ways to initiate the sppconsole interface.

• Run the sppconsole command in a shell on the teststation.
Example of sppconsole boot output

    joker-t(hw2b)% sppconsole
    [enter `^Ec?' for help]
    [no, sppuser@joker-t is attached]
    [replay]
    POST Hard Boot on [0:PB0L_A]
    HP9000/V2500 POST Revision 1.
Example of OBP output while booting

    OBP Power-On Boot on [0:0]
    -------------------------------------------------------------------------------
    PDC Firmware Version Information
    PDC_ENTRY version 4.1.0.9
    POST Revision: 1.
The following message appears in the console window:

    [0:1] ok
    [read-only -- use `^Ecf' to attach, `^Ec?' for help]

Attach to the node by entering Ctrl-e c f. Press the Ctrl key and e simultaneously; do not hold the Ctrl key while pressing c and f.
tc_init

tc_init determines the node ID, ethernet address, and IP address for all nodes in the complex. This information is then stored in the NVRAM of all nodes as one 12-byte entry per node.
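To make the 12-byte entry concrete, here is a minimal sketch of how three such fields could fit into 12 bytes. The layout is an assumption chosen only because the sizes add up (2-byte node ID, 6-byte ethernet address, 4-byte IPv4 address); the guide excerpt does not give the real field order, widths, or byte order, and the function names and sample values are hypothetical.

```python
import socket
import struct

# Hypothetical layout (not from the guide): 2-byte node ID, 6-byte ethernet
# address, 4-byte IPv4 address, big-endian, for a total of 12 bytes.
ENTRY = struct.Struct(">H6s4s")


def pack_entry(node_id, mac, ip):
    """Pack one node's fields into a 12-byte record."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    return ENTRY.pack(node_id, mac_bytes, socket.inet_aton(ip))


def unpack_entry(raw):
    """Unpack a 12-byte record back into (node_id, mac, ip)."""
    node_id, mac_bytes, ip_bytes = ENTRY.unpack(raw)
    mac = ":".join(f"{b:02x}" for b in mac_bytes)
    return node_id, mac, socket.inet_ntoa(ip_bytes)


entry = pack_entry(0, "08:00:09:12:34:56", "15.99.111.116")
print(len(entry))           # 12
print(unpack_entry(entry))
```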
Execute tc_init after the node has been configured by jf-node_ip_set and xconfig. ccmd must finish the scan database generation. Once ccmd executes, the changes become effective the next time test_controller is running.
tc_ioutil

tc_ioutil resets the node and requests that the Test Controller load (via tftp) and boot the specified file.
tc_show_struct

The tc_show_struct tool examines certain structures that the test controller uses to set up and run tests.
The tc_cpu_info_struct structure displays the status or state of each processor and the current subtest. The tc_show_struct tool takes two arguments: the first is the test of interest, the second is the node of interest.
    104) 0x00000000
    105) 0x00000000
    106) 0x00000000
    107) 0x00000000
    108) 0x00000000
    109) 0x00000000
    110) 0x00000000
    111) 0x00000000
    112) 0x00000000
    113) 0x00000000
    .
Version utilities

This section describes the three version utilities.

diag_version

The diag_version utility displays the product name and the version of the current teststation software. For example:

    $ diag_version
    HP9000/V2500 Diagnostics, Version 1.
ver

ver is a teststation utility that reads and displays the version information built into each diagnostic product.
Event processing

This section discusses three event processing utilities:

• event_logger
• log_event

event_logger

The event_logger utility is the teststat.
event_logger never terminates on its own; it must be killed. If a second copy of event_logger is started, it attempts to kill the existing copy. Only one copy of event_logger should be running at any one time.
The -c option displays event information output on the console as well. If the event severity is high enough, this happens automatically. event_logger displays any events that have a severity greater than the warning level.
Miscellaneous tools

The following miscellaneous tools are described in this section:

• kill_by_name
• fix_boot_sector

kill_by_name

The kill_by_name script kills processes by name rather than by process identification.
12 Scan tools

This chapter details most of the scan tools, which include:

• sppdsh
• do_reset
• jf-node_info
• jf-ccmd_info
• jf-reserve_info
sppdsh

sppdsh is an enhanced version of the Korn Shell (ksh) with all of the functionality of ksh, as well as new commands that are suited to a diagnostic environment. sppdsh resides on the teststation in /spp/bin/sppdsh.
Definitions

The following definitions help with the operation of sppdsh:

• node id: An identification (ID) that can be either the node IP name or a node number. To distinguish one node number from another, the environment variable COMPLEX_NAME indicates the complex.
Table 86 sppdsh parameters

    Parameter                    Value
    Unknown                      0xff
    Reserved                     0x00
    Pass                         0x01
    Fail                         0x10
    Deconfigured by POST         0x20
    Empty                        0x30
    Deconfigured by software     0x40 (a)

1.
a. System memory can be modified through partial deconfiguration.

• buf[1..4]: A buffer is a 4K byte block of memory on the test station that is used as a temporary holding area.
• backplane_serial_number: Identifies a specific board on the diagnostic network. This number may be read with the COP command. It is used to assign new node numbers or complex serial numbers.
• complex_serial_number: Identifies all the nodes in a complex.
• Device_name: Refers to a major electrical component or subsection of a node. Examples of device names are:
  • SPAC: Processor agent chip
  • SMAC: Memory chip.
• memory size: An argument used to deconfigure larger amounts of memory across all memory boards on a node.
• net cache size: Refers to the memory shared between nodes in each node's network cache. The network cache should be the same across all nodes in a complex.
• power <node id> supply[1..4] [low|nom|up]: Changes the power margin on the supply indicated across all nodes in contact with the test station.
NOTE: For clarity, a 0x0 style notation is returned by the shell rather than the 16#0 notation of ksh. The 16#0 notation is acceptable for data that can be expressed in 32 bits or less.
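The two notations in the note carry the same value and differ only in spelling. As a minimal sketch in Python (the helper name is hypothetical and not part of sppdsh), a ksh-style base#digits string can be normalized to 0x notation like this:

```python
def to_hex_notation(text: str) -> str:
    """Convert ksh base#digits notation (for example 16#ff) to 0x notation."""
    if "#" in text:
        base, digits = text.split("#", 1)
        value = int(digits, int(base))
    else:
        value = int(text, 0)  # accepts 0x-prefixed and plain decimal input
    return hex(value)


print(to_hex_notation("16#ff"))  # 0xff
print(to_hex_notation("16#0"))   # 0x0
print(to_hex_notation("0x55"))   # 0x55
```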
• bput [-q] <part>:<field> <value>: Inserts data into the locked scan ring image. When the -q option is used, the results are displayed without the scan field name.
• bunlock n<node_number>:<ring>:<path>: Writes the scan ring image and unlocks it.
• ecc_cpy <address> <data> [size]: Copies the data into the ECC associated with the cache line of address and repeats for size cache lines.

Data conversion commands

Data conversion commands manipulate, evaluate, or interpret data within the diagnostic shell.
l_sub <arg1> <arg2>: Left subtract two data arguments. For example:

    abc=`l_sub 0x55 0x1`

l_mod <arg1> <arg2>: Left modulo two data arguments. For example:

    abc=`l_mod 0x55 0x1`

l_mult <arg1> <arg2>: Left multiply two data arguments.
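The arithmetic in these examples can be sketched as follows. This is only an illustration under the assumption that the commands perform ordinary integer arithmetic on their arguments and report the result in 0x notation; the guide excerpt does not spell out the exact semantics (for instance, operand width or the meaning of "left"), so the Python functions below are stand-ins rather than a description of sppdsh itself.

```python
def l_sub(arg1: str, arg2: str) -> str:
    """Subtract arg2 from arg1 (assumed plain integer arithmetic)."""
    return hex(int(arg1, 0) - int(arg2, 0))


def l_mod(arg1: str, arg2: str) -> str:
    """Remainder of arg1 divided by arg2 (assumed plain integer arithmetic)."""
    return hex(int(arg1, 0) % int(arg2, 0))


def l_mult(arg1: str, arg2: str) -> str:
    """Product of arg1 and arg2 (assumed plain integer arithmetic)."""
    return hex(int(arg1, 0) * int(arg2, 0))


print(l_sub("0x55", "0x1"))   # 0x54
print(l_mod("0x55", "0x1"))   # 0x0
print(l_mult("0x55", "0x2"))  # 0xaa
```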
node <node_number>: Sets the default node to node_number in the current complex.

fi_node: Finds all available nodes in the current complex.

fi_cpu [-v] [-q] <node_number>: Finds all available processors of node_number in the current complex.
I/O buffering commands

This section presents a list of the sppdsh I/O buffering commands. For these commands, four default buffers are created: buf1 - buf4.

buf_cmp buf1 buf2: Compares two buffers. Null is returned if they are the same.
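The comparison behavior described for buf_cmp can be illustrated with a minimal Python sketch. It models each buffer as a 4K-byte block, as defined earlier in the chapter, and returns an empty string when the buffers match. What the real command reports on a mismatch is not stated in this excerpt, so the offset message below is an assumption, and the function is a stand-in rather than the sppdsh command.

```python
BUF_SIZE = 4096  # the guide describes each buffer as a 4K-byte block


def buf_cmp(buf_a, buf_b) -> str:
    """Return an empty string if the buffers match, else a short description."""
    if buf_a == buf_b:
        return ""
    for offset, (a, b) in enumerate(zip(buf_a, buf_b)):
        if a != b:
            # Assumed report format; the real command's output is not shown.
            return f"differ at offset {offset:#x}"
    return "differ in length"


buf1 = bytearray(BUF_SIZE)
buf2 = bytearray(BUF_SIZE)
print(repr(buf_cmp(buf1, buf2)))  # ''

buf2[0x10] = 0xFF
print(buf_cmp(buf1, buf2))  # differ at offset 0x10
```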
mem_cmp addr1 addr2 size: Compares the memory from addr1 to (addr1 + size) with the memory at addr2.

mem_cmp addr1 buf1 size: Compares the memory from addr1 to (addr1 + size) with the contents of buf1.

mem_dump addr [size]: Dumps the memory starting at addr.
    6     pb6l, p6l, pb6r      [pcxu], spac6, [pcxu]
    7     pb7r, p7l, pb7l      [pcxu], spac7, [pcxu]
    8     mb0l_m, mb0l_t       smac0, [stac0]
    9     mb1l_m, mb1l_t       smac1, [stac1]
    10    mb2r_m, mb2r_t       smac.
do_reset

do_reset performs one of four levels of reset on a node or complex. The first argument is either a node ID, complex, or the keyword all, which resets all nodes. If no nodes are specified, the default is to reset all nodes in contact with the teststation.
jf-node_info

jf-node_info displays the IP address, UDP port, and JTAG firmware version string for each node in a complex. The -e option adds the ethernet address to the display. The -c option adds the core version to the display.
jf-ccmd_info

jf-ccmd_info displays information about active V2500 nodes connected to the diagnostic LAN. It has the following format:

    jf-ccmd_info

The display i.
jf-reserve_info

Before using the JTAG scan interface on the Utilities board, teststation utilities must reserve the JTAG hardware on a time-sharing basis.
Appendix A List of diagnostics

This appendix provides a list of all utilities and diagnostics in this book and where they are located.

Table 89 List of diagnostics

    Name                 Locations
    address_decode.
    hard_logger          Page 240
    io3000               Chapter 8, page 133
    io_tr                Page 164
    jf-ccmd_info         Page 286
    jf-node_info         Page 285
    jf-reserve_info      Page 287
    kill_by_name         Page 266
    lcd                  P.
    ts_config            Page 23
    ver                  Page 262
    xconfig              Page 42
    xsecure              Page 51
Index

A
AC Connectivity test, 203
AC test of a node, 11
address
  IP, 40
address decode, 213, 214, 216, 217, 218
arrm, 213, 215
Attention lightbar, 4, 7

B
Boot Configuration map,.
io3000 SAGA ErrorCause CSR error, 157
io3000 SAGA general errors, 155
io3000 SAGA SRAM errors, 158
io3000 SCSI inquiry error, 161
mem3000 error codes, 176
mem3000 extended error codes,.
processor init steps, table, 13
processor run-time status, table, 14
Processor status line, 13
LEDs
  attention light bar, 12
LIF file table, 228
Liquid crystal display (LCD), 4, 6, 7.
sppdsh, 7, 266, 268
  configuration commands, 280
  data conversion commands, 278
  data transfer commands, 275
  I/O buffering commands, 281
  map of alternate names, 282
  memory transfer com.