HP-UX provides the nettl command to perform network tracing and logging. By default, the facility is started at boot time with a basic level of logging and no active traces:

    root@hpeos004[] nettl -status
    Logging Information:
    Log Filename:                 /var/adm/nettl.LOG*
    Max Log file size(Kbytes):    1000
    Console Logging:              On
    User's ID:                    0
    Buffer Size:                  8192
    Messages Dropped:             0
    Messages Queued:              0
    Subsystem Name:               Log Class:
    NS_LS_LOGGING                 ERROR DISASTER
    NS_LS_NFT                     ERROR DISASTER
    NS_LS_LOOPBACK                ERROR DISASTER
    NS_LS_NI                      ERROR DISASTER
    NS_LS_IPC                     ERROR DISASTER
    NS_LS_SOCKREGD                ERROR DISASTER
    NS_LS_TCP                     ERROR DISASTER
    NS_LS_PXP                     ERROR DISASTER
    NS_LS_UDP                     ERROR DISASTER
    NS_LS_IP                      ERROR DISASTER
    NS_LS_PROBE                   ERROR DISASTER
    NS_LS_DRIVER                  ERROR DISASTER
    NS_LS_RLBD                    ERROR DISASTER
    NS_LS_BUFS                    ERROR DISASTER
    NS_LS_CASE21                  ERROR DISASTER
    NS_LS_ROUTER21                ERROR DISASTER
    NS_LS_NFS                     ERROR DISASTER
    NS_LS_NETISR                  ERROR DISASTER
    NS_LS_NSE                     ERROR DISASTER
    NS_LS_STRLOG                  ERROR DISASTER
    NS_LS_TIRDWR                  ERROR DISASTER
    NS_LS_TIMOD                   ERROR DISASTER
    NS_LS_ICMP                    ERROR DISASTER
    FILTER                        ERROR DISASTER
    NAME                          ERROR DISASTER
    NS_LS_IGMP                    ERROR DISASTER
    FORMATTER                     ERROR DISASTER
    STREAMS                       ERROR DISASTER
    LAN100                        ERROR DISASTER
    PCI_FDDI                      ERROR DISASTER
    GELAN                         ERROR DISASTER
    HP_APA                        ERROR DISASTER
    HP_APAPORT                    ERROR DISASTER
    HP_APALACP                    ERROR DISASTER
    BTLAN                         ERROR DISASTER
    NS_LS_IPV6                    ERROR DISASTER
    NS_LS_ICMPV6                  ERROR DISASTER
    NS_LS_LOOPBACK6               ERROR DISASTER
    IGELAN                        ERROR DISASTER
    Tracing Information:
    Trace Filename:
    Max Trace file size(Kbytes):  0
    No Subsystems Active
    root@hpeos004[]

As we can see, the ERROR and DISASTER classes of events are logged to /var/adm/nettl.LOG* and to the system console; for example, you will get a message on the console whenever a network cable fails or is removed from an active LAN card. The output above also shows that there are quite a few subsystems (otherwise known as entities) that we can trace, everything from the network driver (-e NS_LS_DRIVER) to the IP layer (-e NS_LS_IP), all the way up to upper-layer protocols like TCP (-e NS_LS_TCP).
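Because every subsystem name in the status listing is a candidate for the -e option, it can be handy to pull just the names out of the listing. The following is a minimal sketch, not part of nettl itself: the function name list_logged_subsystems is hypothetical, and it assumes the status output prints one subsystem per line, name first, under a "Subsystem Name:" header within the "Logging Information:" section. The sample text is an abbreviated excerpt of the listing above so that the sketch can be run on a system without nettl.

```shell
#!/bin/sh
# list_logged_subsystems (hypothetical helper): read `nettl -status` text on
# stdin and print the subsystem names from the logging section, one per line.
# Assumption: each subsystem row is "NAME CLASS [CLASS...]".
list_logged_subsystems() {
    awk '
        /Logging Information:/ { section = "log";   in_list = 0; next }
        /Tracing Information:/ { section = "trace"; in_list = 0; next }
        # The header row marks the start of the subsystem list.
        section == "log" && /Subsystem Name:/ { in_list = 1; next }
        # Every following row in the logging section starts with a name.
        section == "log" && in_list && NF >= 2 { print $1 }
    '
}

# Abbreviated sample of the status listing, for illustration only.
sample='Logging Information:
Log Filename:                 /var/adm/nettl.LOG*
Max Log file size(Kbytes):    1000
Subsystem Name:               Log Class:
NS_LS_IP                      ERROR DISASTER
NS_LS_TCP                     ERROR DISASTER
Tracing Information:
Trace Filename:
No Subsystems Active'

names=$(printf '%s\n' "$sample" | list_logged_subsystems)
printf '%s\n' "$names"
```

On an HP-UX host the same function could be fed the live listing with nettl -status | list_logged_subsystems, provided the real output matches the layout assumed here.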
The type of problem we are experiencing will determine which subsystem we trace. The second part of setting up a trace is deciding what type of information to capture. This is known as a trace mask. A trace mask limits the trace to the selected types of packets, e.g., state, error, or logging packets. The most common trace mask includes both pduout (Outbound Protocol Data Unit, including header and data) and pduin (Inbound Protocol Data Unit, including header and data) packets. Finally, we will normally send the output of the trace to a file instead of the default, stdout. Here, I am starting a trace at the IP level on pduin and pduout packets. The output file will be called /tmp/trace.TRC*:

    root@hpeos004[] nettl -e NS_LS_IP -tn pduout pduin -f /tmp/trace
    root@hpeos004[] nettl -status
    Logging Information:
    Log Filename:                 /var/adm/nettl.LOG*
    Max Log file size(Kbytes):    1000
    Console Logging:              On
    User's ID:                    0
    Buffer Size:                  8192
    Messages Dropped:             0
    Messages Queued:              0
    Subsystem Name:               Log Class:
    NS_LS_LOGGING                 ERROR DISASTER
    NS_LS_NFT                     ERROR DISASTER
    NS_LS_LOOPBACK                ERROR DISASTER
    NS_LS_NI                      ERROR DISASTER
    NS_LS_IPC                     ERROR DISASTER
    NS_LS_SOCKREGD                ERROR DISASTER
    NS_LS_TCP                     ERROR DISASTER
    NS_LS_PXP                     ERROR DISASTER
    NS_LS_UDP                     ERROR DISASTER
    NS_LS_IP                      ERROR DISASTER
    NS_LS_PROBE                   ERROR DISASTER
    NS_LS_DRIVER                  ERROR DISASTER
    NS_LS_RLBD                    ERROR DISASTER
    NS_LS_BUFS                    ERROR DISASTER
    NS_LS_CASE21                  ERROR DISASTER
    NS_LS_ROUTER21                ERROR DISASTER
    NS_LS_NFS                     ERROR DISASTER
    NS_LS_NETISR                  ERROR DISASTER
    NS_LS_NSE                     ERROR DISASTER
    NS_LS_STRLOG                  ERROR DISASTER
    NS_LS_TIRDWR                  ERROR DISASTER
    NS_LS_TIMOD                   ERROR DISASTER
    NS_LS_ICMP                    ERROR DISASTER
    FILTER                        ERROR DISASTER
    NAME                          ERROR DISASTER
    NS_LS_IGMP                    ERROR DISASTER
    FORMATTER                     ERROR DISASTER
    STREAMS                       ERROR DISASTER
    LAN100                        ERROR DISASTER
    PCI_FDDI                      ERROR DISASTER
    GELAN                         ERROR DISASTER
    HP_APA                        ERROR DISASTER
    HP_APAPORT                    ERROR DISASTER
    HP_APALACP                    ERROR DISASTER
    BTLAN                         ERROR DISASTER
    NS_LS_IPV6                    ERROR DISASTER
    NS_LS_ICMPV6                  ERROR DISASTER
    NS_LS_LOOPBACK6               ERROR DISASTER
    IGELAN                        ERROR DISASTER
    Tracing Information:
    Trace Filename:               /tmp/trace.TRC*
    Max Trace file size(Kbytes):  1000
    User's ID:                    0
    Buffer Size:                  69632
    Messages Dropped:             0
    Messages Queued:              0
    Subsystem Name:               Trace Mask:
    NS_LS_IP                      0x30000000
    root@hpeos004[]

We should not run this trace for long because the output file will grow very quickly. If we have a known, reproducible problem, we would normally start the trace, reproduce the problem, and then turn the trace off. The resulting binary output file can be formatted into readable text with the netfmt command. We can then analyze the trace to try to establish what the problem is. Here, I have turned the trace OFF after reproducing the known problem:

    root@hpeos004[] nettl -tf -e all
    root@hpeos004[]

In my example, I performed a telnet between two nodes while the trace was running. To cut down the amount of output we are looking at, we can use a formatting/filter file. The filter can work at the MAC, IP, protocol, or even port-number level. Here is a simple filter file that filters on source and destination IP addresses:

    root@hpeos004[] cat .netfmt.conf
    filter ip_saddr 192.168.0.66
    filter ip_daddr 192.168.0.65
    root@hpeos004[]

Now I can format the binary output file:

    root@hpeos004[] netfmt -c .netfmt.conf -f /tmp/trace.TRC000 | more
    ---------------------- SUBSYSTEM FILTERS IN EFFECT -----------------
    ---------------- LAYER 1 -----------------
    ---------------- LAYER 2 -----------------
    ---------------- LAYER 3 -----------------
    filter ip_saddr hpeos004
    filter ip_daddr hpeos003
    ---------------- LAYER 4 -----------------
    ---------------- LAYER 5 -----------------
    ---------------------- END SUBSYSTEM FILTERS -----------------------
    vvvvvvvvvvvvvvvvvvvvvvvvvvvvARPA/9000 NETWORKINGvvvvvvvvvvvvvvvvvvvvvvvvvv@#%
      Timestamp            : Fri Oct 17 BST 2003 11:44:36.184066
      Process ID           : [ICS]               Subsystem        : NS_LS_IP
      User ID ( UID )      : -1                  Trace Kind       : PDU OUT TRACE
      Device ID            : -1                  Path ID          : 0
      Connection ID        : 0
      Location             : 00123
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Transmitted 40 bytes via IP   Fri Oct 17 11:44:36.184066 BST 2003  pid=[ICS]
       0: 45 00 00 28 f6 a5 40 00 40 06 c2 51 c0 a8 00 42  E..(..@.@..Q...B
      16: c0 a8 00 46 c1 5f 17 70 88 68 45 71 6c d3 af 73  ...F._.p.hEql..s
      32: 50 10 80 00 eb 0a 00 00 -- -- -- -- -- -- -- --  P...............
    vvvvvvvvvvvvvvvvvvvvvvvvvvvvARPA/9000 NETWORKINGvvvvvvvvvvvvvvvvvvvvvvvvvv@#%
      Timestamp            : Fri Oct 17 BST 2003 11:44:36.314090
      Process ID           : 4220                Subsystem        : NS_LS_IP
      User ID ( UID )      : 0                   Trace Kind       : PDU OUT TRACE
      Device ID            : -1                  Path ID          : 0
      Connection ID        : 0
      Location             : 00123
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Standard input
    root@hpeos004[]
    root@hpeos004[] ll /tmp/trace.TRC000
    -rw-------   1 root       sys         336197 Oct 17 11:51 /tmp/trace.TRC000
    root@hpeos004[]

As you can see, we have lots of information to go through in an individual trace; this one was running for approximately 2 minutes while I logged into a remote host and then exited again. We should supply the entire trace (and any other supporting information relating to the problem) to the Response Center to help them analyze the problem further.
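To give a feel for what is inside one of these records, the 40-byte packet in the hex dump can be decoded by hand. The following is a minimal sketch using only the shell; it is not something netfmt runs for us. The bytes are copied from the dump above, and the field offsets assume the standard IPv4 header layout with no IP options (the first byte, 45, indicates a 20-byte header), followed directly by the TCP source and destination port fields. The variable names are purely illustrative.

```shell
#!/bin/sh
# First 24 bytes of the packet, copied from the hex dump in the trace:
# the 20-byte IPv4 header followed by the two TCP port fields.
set -- 45 00 00 28 f6 a5 40 00 40 06 c2 51 c0 a8 00 42 \
       c0 a8 00 46 c1 5f 17 70

# Positional parameter N holds the byte at offset N-1.
total_len=$(( 0x${3}${4} ))     # offsets 2-3: IP total length
proto=$(( 0x${10} ))            # offset 9: protocol number (6 = TCP)
src=$(printf '%d.%d.%d.%d' "0x${13}" "0x${14}" "0x${15}" "0x${16}")  # offsets 12-15
dst=$(printf '%d.%d.%d.%d' "0x${17}" "0x${18}" "0x${19}" "0x${20}")  # offsets 16-19
sport=$(( 0x${21}${22} ))       # offsets 20-21: TCP source port
dport=$(( 0x${23}${24} ))       # offsets 22-23: TCP destination port

echo "length=$total_len proto=$proto $src:$sport -> $dst:$dport"
# prints: length=40 proto=6 192.168.0.66:49503 -> 192.168.0.70:6000
```

The decoded total length agrees with the "Transmitted 40 bytes via IP" line in the record, and the source address is the one named by the ip_saddr filter, which is why this record appears in the filtered output.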