12.2 The Fault Tree
The fault tree presented in this section is for diagnosing and fixing problems that occur when you're installing and reconfiguring Samba. It's an expanded form of the trouble and diagnostic document
DIAGNOSIS.txt
, which is part of the Samba distribution.
Before you set out to troubleshoot any part of the Samba suite, you should know the following information:
-
Your client IP address (we use 192.168.236.10)
-
Your server IP address (we use 192.168.236.86)
-
The
netmask
for your network (typically 255.255.255.0)
-
Whether the systems are all on the same subnet (ours are)
For clarity, we've
renamed
the server in the following examples to
server.example.com
, and the client system to
client.example.com
.
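As a quick sanity check on the facts listed above, the following minimal sketch uses Python's standard ipaddress module to confirm that two addresses fall on the same subnet under a given netmask. The function name same_subnet is ours, chosen for illustration; the addresses and netmask are the example values used in this chapter.

```python
import ipaddress


def same_subnet(addr_a: str, addr_b: str, netmask: str) -> bool:
    """Return True if both addresses belong to the same network under netmask."""
    # strict=False lets us pass a host address and have the network derived from it.
    net_a = ipaddress.ip_network(f"{addr_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{netmask}", strict=False)
    return net_a == net_b


# Client and server addresses from this chapter, with the typical netmask.
print(same_subnet("192.168.236.10", "192.168.236.86", "255.255.255.0"))  # True
```

If this prints False for your own addresses, the systems are on different subnets, and the routing-related diagnoses later in the fault tree become more likely.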
12.2.1 How to Use the Fault Tree
Start the tests here, without skipping forward; it won't take long (about 5 minutes) and might actually save you time backtracking. Whenever a test succeeds, you will be given a
name
of a section to which you can safely skip.
12.2.2 Troubleshooting Low-Level IP
The first series of tests covers the low-level services that Samba needs to run. The tests in this section verify that the IP software works, that the network hardware works, and that basic name service is in place.
Subsequent
sections add TCP software, the Samba daemons
smbd
and
nmbd
, host-based access control, authentication and per-
user
access control, file services, and browsing. The tests are described in considerable detail to make them understandable by both technically oriented end users and
experienced
systems and network administrators.
12.2.2.1 Testing the networking software with ping
The first command to enter on both the server and the client is
ping
127.0.0.1
. This
pings
the loopback address and indicates whether any networking support is functioning. On Unix, you can use
ping
127.0.0.1
with the statistics option and interrupt it after a few lines. On Sun workstations, the command is typically
/usr/etc/ping
-s
127.0.0.1
; on Linux, just
ping
127.0.0.1
. On Windows
clients
, run
ping
127.0.0.1
in an MS-DOS (command prompt) window, and it will stop by itself after four lines.
Here is an example on a Linux server:
$ ping 127.0.0.1
PING localhost: 56 data bytes
64 bytes from localhost (127.0.0.1): icmp-seq=0. time=1. ms
64 bytes from localhost (127.0.0.1): icmp-seq=1. time=0. ms
64 bytes from localhost (127.0.0.1): icmp-seq=2. time=1. ms
^C
----127.0.0.1 PING Statistics----
3 packets transmitted, 3 packets received, 0% packet loss
round-trip (ms) min/avg/max = 0/0/1
If you get "ping: no answer from . . . " or "100% packet loss," you have no IP networking installed on the system. The address
127.0.0.1
is the internal loopback address and doesn't depend on the computer being physically connected to a network. If this test fails, you have a serious local problem. TCP/IP either isn't installed or is seriously misconfigured. See your operating system documentation if it's a Unix server. If it's a Windows client, follow the instructions in Chapter 3 to install networking support.
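If you want to script this check rather than read the output by eye, the sketch below pulls the packet-loss figure out of a ping summary. It assumes the summary contains the phrase "N% packet loss", as in the example output above; other ping implementations may word it differently, and the function name packet_loss_percent is our own.

```python
import re


def packet_loss_percent(ping_output: str):
    """Return the packet-loss percentage from ping's summary, or None if not found."""
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    return float(match.group(1)) if match else None


summary = "3 packets transmitted, 3 packets received, 0% packet loss"
loss = packet_loss_percent(summary)
if loss is None:
    print("no summary found; ping may not have run")
elif loss == 100.0:
    print("loopback test failed: check the local TCP/IP installation")
else:
    print(f"loopback answered ({loss}% loss)")
```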
If you're the network manager, some good references are Craig Hunt's TCP/IP Network Administration, Chapter 11, and Craig Hunt and Robert Bruce Thompson's Windows NT TCP/IP Network Administration, both published by O'Reilly.
12.2.2.2 Testing local name services with ping
Next
, try to ping
localhost
on the Samba server. The
localhost
hostname is the conventional hostname for the
127.0.0.1
loopback interface, and it should resolve to that address. After typing
ping
localhost
, you should see output similar to the following:
$ ping localhost
PING localhost: 56 data bytes
64 bytes from localhost (127.0.0.1): icmp-seq=0. time=0. ms
64 bytes from localhost (127.0.0.1): icmp-seq=1. time=0. ms
64 bytes from localhost (127.0.0.1): icmp-seq=2. time=0. ms
^C
If this succeeds, try the same test on the client. Otherwise:
-
If you get "unknown host: localhost," there is a problem resolving the hostname
localhost
into a valid IP address. (This might be as simple as a missing entry in a local
hosts
file.) From here, skip down to Section 12.2.7 later in this chapter.
-
If you get "ping: no answer," or "100% packet loss," but pinging
127.0.0.1
worked, name services is resolving to an address, but it isn't the correct one. Check the file or database (typically
/etc/hosts
on a Unix system) that the name service is using to resolve addresses to ensure that the entry is correct.
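The two failure cases above (no resolution at all, or resolution to the wrong address) can be sketched with Python's standard socket module. This is only an illustration: the function name check_localhost and the wording of its return values are ours, and the resolve parameter exists so the logic can be exercised without touching the real name service.

```python
import socket


def check_localhost(resolve=socket.gethostbyname) -> str:
    """Classify how the name 'localhost' resolves on this system."""
    try:
        address = resolve("localhost")
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised for unknown hosts.
        return "unknown host: name service cannot resolve localhost"
    if address != "127.0.0.1":
        return f"wrong address: localhost resolves to {address}"
    return "ok"


print(check_localhost())
```

A "wrong address" result corresponds to the second bullet: the name service answers, but the entry it is using needs to be corrected.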
12.2.2.3 Testing the networking hardware with ping
Next, ping the server's network IP address from itself. This should get you exactly the same results as pinging
127.0.0.1
:
$ ping 192.168.236.86
PING 192.168.236.86: 56 data bytes
64 bytes from 192.168.236.86 (192.168.236.86): icmp-seq=0. time=1. ms
64 bytes from 192.168.236.86 (192.168.236.86): icmp-seq=1. time=0. ms
64 bytes from 192.168.236.86 (192.168.236.86): icmp-seq=2. time=1. ms
^C
----192.168.236.86 PING Statistics----
3 packets transmitted, 3 packets received, 0% packet loss
round-trip (ms) min/avg/max = 0/0/1
If this works on the server, repeat it for the client. Otherwise:
-
If
ping
network_ip
fails on either the server or client, but
ping
127.0.0.1
works on that system, you have a TCP/IP problem that is specific to the Ethernet network interface card on the computer. Check with the documentation for the network card or host operating system to determine how to configure it correctly. However, be aware that on some operating systems, the
ping
command appears to work even if the network is disconnected, so this test doesn't always diagnose all hardware problems.
12.2.2.4 Testing connections with ping
Now, ping the server by name (instead of its IP address), once from the server and once from the client. This is the general test for working network hardware:
$ ping server
PING server.example.com: 56 data bytes
64 bytes from server.example.com (192.168.236.86): icmp-seq=0. time=1. ms
64 bytes from server.example.com (192.168.236.86): icmp-seq=1. time=0. ms
64 bytes from server.example.com (192.168.236.86): icmp-seq=2. time=1. ms
^C
----server.example.com PING Statistics----
3 packets transmitted, 3 packets received, 0% packet loss
round-trip (ms) min/avg/max = 0/0/1
If successful, this test
tells
us five things:
-
The hostname (e.g.,
server
) is being found by your local name server.
-
The hostname has been expanded to the full name (e.g.,
server.example.com
).
-
Its address is being returned (
192.168.236.86
).
-
The client has sent the Samba server a series of 56-byte ICMP echo request packets (three in the example above).
-
The Samba server has replied to every packet.
If this test isn't successful, one of several things can be wrong with the network:
-
First, if you get
ping
:
no
answer
, or
100%
packet
loss
, you're not connecting to the network, the other system isn't connecting, or one of the addresses is incorrect. Check the addresses that the
ping
command
reports
on each system, and ensure that they match the ones you set up initially.
If not, there is at least one mismatched address between the two systems. Try entering the command
arp
-a
, and see if there is an entry for the other system. (The
arp
command stands for the Address Resolution Protocol. The
arp
-a
command lists all the addresses known on the local system.) Here are some things to try:
-
If you receive a message like
192.168.236.86
at
(incomplete)
, the Ethernet address of 192.168.236.86 is unknown. This indicates a complete lack of connectivity, and you're likely having a problem at the very bottom of the TCP/IP protocol stack: the Ethernet interface layer. This is discussed in Chapters 5 and 6 of
TCP/IP Network Administration
(O'Reilly).
-
If you receive a response similar to server
(192.168.236.86)
at
8:0:20:12:7c:94
, the server has been reached at some time, or another system is answering on its
behalf
. However, this means that
ping
should have worked: you may have an intermittent networking or ARP problem.
-
If the IP address from ARP doesn't match the addresses you expected, investigate and correct the addresses manually.
-
If each system can ping itself but not another, something is wrong on the network between them.
-
If you get
ping
:
network
unreachable
or
ICMP
Host
Unreachable
, you're not receiving an answer, and more than one network is probably involved.
In principle, you shouldn't try to troubleshoot SMB clients and servers on different networks. Try to test a server and client that are on the same network:
-
First, perform the tests for
ping
:
no
answer
described earlier in this section. If this doesn't identify the problem, the remaining possibilities are the following: an address is wrong, your netmask is wrong, a network is down, or the packets have been
stopped
by a firewall.
-
Check both the address and the
netmasks
on source and destination systems to see if something is obviously wrong.
Assuming
both systems really are on the same network, they both should have the same netmasks, and
ping
should report the correct addresses. If the addresses are wrong, you'll need to correct them. If they are correct, the programs might be
confused
by an incorrect netmask. See Section 12.2.8.1, later in this chapter.
-
If the commands are still reporting that the network is unreachable and
neither
of the previous two conditions are in error, one network really might be unreachable from the other. This, too, is an issue for the network manager.
-
If you get
ICMP
Administratively
Prohibited
, you've struck a firewall of some
sort
or a misconfigured router. You will need to speak to your network security officer.
-
If you get
ICMP
Host
redirect
and
ping
reports packets getting through, this is
generally
harmless: you're simply being rerouted over the network.
-
If you get a host redirect and no
ping
responses, you are being redirected, but no one is responding. Treat this just like the
Network
unreachable
response, and check your addresses and netmasks.
-
If you get
ICMP
Host
Unreachable
from
gateway
gateway
name
, ping packets are being routed to another network, but the other system isn't responding and the router is reporting the problem on its behalf. Again, treat this like a
Network
unreachable
response, and start checking addresses and netmasks.
-
If you get
ping
:
unknown
host
hostname
, your system's name is not known. This tends to
indicate
a name service problem, which didn't affect
localhost
. Have a look at Section 12.2.7, later in this chapter.
-
If you get a partial success, with some pings failing but others succeeding, you have either an intermittent problem between the systems or an overloaded network. Ping a bit longer, and see if more than about three percent of the packets fail. If so, check it with your network manager: a problem might just be starting. However, if only a few fail, or if you happen to know some massive network program is running, don't worry unduly. ICMP, the protocol used by ping, is allowed to drop occasional packets, just as UDP is.
-
If you get a response such as
smtsvr.antares.net
is
alive
when you actually pinged
client.example.com
, either you're using someone else's address or the system has multiple
names
and addresses. If the address is wrong, the name service is clearly the culprit; you'll need to change the address in the name service database to refer to the correct system. This is discussed in Section 12.2.7, later in this chapter.
Servers are often
multihomed
, i.e., connected to more than one network, with different names on each net. If you are getting a response from an unexpected name on a multihomed server, look at the address and see if it's on your network (see Section 12.2.8.1, later in this chapter). If so, you should use that address, rather than one on a different network, for both performance and reliability reasons.
Servers can also have multiple names for a single Ethernet address,
especially
if they are web servers. This is harmless, albeit startling. You probably will want to use the official (and permanent) name, rather than an alias that might change.
-
If everything works but the IP address
reported
is
127.0.0.1
, you have a name service error. This typically occurs when an operating-system installation program generates an
/etc/hosts
line similar to
127.0.0.1
localhost
hostname.domainname
. The localhost line should say
127.0.0.1
localhost
or
127.0.0.1
localhost
loghost
. Correct it, lest it cause failures to negotiate who is the master browse list holder and who is the master browser. It can also cause (ambiguous) errors in later tests.
If this worked from the server, repeat it from the client.
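The last problem above, a loopback line in /etc/hosts that also carries the system's real hostname, is easy to look for mechanically. The following sketch scans hosts-file text and reports any names on a 127.0.0.1 line other than localhost and loghost. The function name and the allowed tuple are our assumptions; some systems legitimately add other aliases (for example, localhost.localdomain), so treat the result as a prompt to look, not a verdict.

```python
def suspicious_loopback_names(hosts_text: str, allowed=("localhost", "loghost")) -> list:
    """Return names attached to 127.0.0.1 that are not in the allowed set."""
    extras = []
    for line in hosts_text.splitlines():
        # Drop comments, then split the entry into address and names.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == "127.0.0.1":
            extras.extend(name for name in fields[1:] if name not in allowed)
    return extras


sample = """\
127.0.0.1       localhost server.example.com
192.168.236.86  server.example.com server
"""
print(suspicious_loopback_names(sample))  # ['server.example.com']
```

On a real system, you would pass in the contents of /etc/hosts, for instance with open("/etc/hosts").read().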
12.2.3 Troubleshooting TCP
Now that you've
tested
IP, ICMP, and a name service with
ping
, it's time to test TCP. Browsing and
ping
use ICMP and UDP; file and print services (shares) use TCP. Both depend on IP as a lower layer, and all four depend on name services. Testing TCP is most conveniently done using the FTP program.
12.2.3.1 Testing TCP with FTP
Try connecting via FTP, once from the server to itself, and once from the client to the server:
$
ftp server
Connected to server.example.com.
220 server.example.com FTP server (Version 6.2/OpenBSD/Linux-0.10) ready.
Name (server:davecb):
331 Password required for davecb.
Password:
230 User davecb logged in.
ftp>
quit
221 Goodbye.
If this worked, skip to the next section, Section 12.2.4. Otherwise:
-
If you received the message
server
:
unknown
host
, name service has failed. Go back to the corresponding
ping
step, Section 12.2.2.2, and rerun those tests to see why name lookup failed.
-
If you received
ftp
:
connect
:
Connection
refused
, the system isn't running an FTP daemon. This is mildly unusual on Unix servers.
Optionally
, you might try this test by connecting to the system using
telnet
instead of
ftp
; the messages are very similar, and
telnet
uses TCP as well.
-
If there was a long pause, and then
ftp
:
connect
:
Connection
timed
out
, the system isn't
reachable
. Return to Section 12.2.2.4.
-
If you received
530
Logon
Incorrect
, you connected successfully, but you've just found a different problem. You likely provided an incorrect username or password. Try again, making sure you use your username from the Unix server and type your password correctly.
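If no FTP or telnet client is handy, the same three network-level outcomes (connected, refused, timed out) can be observed with a bare TCP connection. The sketch below uses Python's standard socket module; the function name probe_tcp and its return strings are ours. Port 21 is the conventional FTP port, and the hostname server is the example name used in this chapter.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Try a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "refused"        # host reachable, but nothing is listening on the port
    except socket.timeout:
        return "timed out"      # host not reachable; go back to the ping tests
    except socket.gaierror:
        return "unknown host"   # name service failed
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    print(probe_tcp("server", 21))
```

This checks only that TCP works; unlike the FTP test, it does not exercise usernames or passwords.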
12.2.4 Troubleshooting Server Daemons
Once you've confirmed that TCP networking is working properly, the next step is to make sure the daemons are running on the server. This takes three separate tests because no single one of the following will decisively
prove
that they're working correctly.
To be sure they're running, you need to find out whether the daemons:
-
Have started
-
Are registered or bound to a TCP/IP port by the operating system
-
Are actually paying attention
12.2.4.1 Tracking daemon startup
First, check the Samba logs. If you've started the daemons, the message
smbd
version
number
started
should appear. If it doesn't, you need to restart the Samba daemons.
If the daemon reports that it has indeed started, look out for
bind
failed
on
port
139
socket_addr=0
(Address
already
in
use)
. This means another daemon has been started on port 139 (
smbd
). Also,
nmbd
will report a similar failure if it cannot bind to port 137. Either you've started them twice, or the
inetd
server has tried to provide a daemon for you. If it's the latter, we'll diagnose that in a moment.
12.2.4.2 Looking for daemon processes with ps
Another way to make sure the daemons are running is to check their processes on the system. Use the
ps
command on the server with the "long" option for your system type (commonly
ps
ax
or
ps
-ef
), and see whether
smbd
and
nmbd
are already running. This often looks like the following:
$
ps ax
PID TTY STAT TIME COMMAND
1 ? S 0:03 init [2]
2 ? SW 0:00 (kflushd)
(...many lines of processes...)
234 ? S 0:14 nmbd -D3
237 ? S 0:11 smbd -D3
(...more lines, possibly including more smbd lines...)
This example illustrates that
smbd
and
nmbd
have already started as standalone daemons (the
-D
option) at log level 3.
12.2.4.3 Looking for daemons bound to ports
Next, the daemons have to be registered with the operating system so that they can get access to TCP/IP ports. The
netstat
command will tell you if this has been done. Run the command
netstat
-a
on the server, and look for lines mentioning
netbios
,
137
, or
139
:
$
netstat -a
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
udp 0 0 *.137 *.*
tcp 0 0 *.139 *.* LISTEN
tcp 8370 8760 server.139 client.1439 ESTABLISHED
Among similar lines, there should be at least one UDP line for
*.netbios-
or
*.137
. This indicates that the
nmbd
server is registered and (we hope) is waiting to answer
requests
. There should also be at least one TCP line mentioning
*.netbios-
or
*.139
, and it will probably be in the LISTEN state. This means that
smbd
is up and listening for connections.
There might be other TCP lines indicating connections from
smbd
to clients, one for each client. These are usually in the ESTABLISHED state. If there are
smbd
lines in the ESTABLISHED state,
smbd
is definitely running. If there is only one line in the LISTEN state, we're not sure yet. If both of the lines are missing, a daemon has not succeeded in starting, so it's time to check the logs and then go back to Chapter 2.
If there is a line for each client, it might be coming either from a Samba daemon or from the master IP daemon,
inetd
. It's quite possible that your
inetd
startup file contains lines that start Samba daemons without your
realizing
it; for instance, the lines might have been placed there if you installed Samba as part of a Linux distribution. The daemons started by
inetd
prevent ours from running. This problem typically produces log messages such as
bind
failed
on
port
139
socket
addr=0
(Address
already
in
use)
.
Check your
/etc/inetd.conf
; unless you're intentionally starting the daemons from there,
netbios-ns
(UDP port 137) or
netbios-ssn
(TCP port 139) servers should not be mentioned there. If your system is providing an SMB daemon via
inetd
, lines such as the following will appear in the
inetd.conf
file:
netbios-ssn stream tcp nowait root /usr/local/samba/bin/smbd smbd
netbios-ns dgram udp wait root /usr/local/samba/bin/nmbd nmbd
If your system uses
xinetd
instead of
inetd
, see Chapter 2 for details concerning its configuration.
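To check inetd.conf without reading it line by line, a sketch like the following lists any uncommented entries for the two NetBIOS services named above. It assumes the classic inetd.conf layout in which the service name is the first field and comment lines begin with #; it does not cover xinetd. The function name inetd_samba_entries is ours.

```python
def inetd_samba_entries(inetd_conf_text: str) -> list:
    """Return the NetBIOS service names that have active inetd.conf entries."""
    services = {"netbios-ssn", "netbios-ns"}
    active = []
    for line in inetd_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and commented-out entries
        service = stripped.split()[0]
        if service in services:
            active.append(service)
    return active


sample = """\
netbios-ssn stream tcp nowait root /usr/local/samba/bin/smbd smbd
netbios-ns dgram udp wait root /usr/local/samba/bin/nmbd nmbd
"""
print(inetd_samba_entries(sample))  # ['netbios-ssn', 'netbios-ns']
```

An empty list is what you want to see if you intend to run smbd and nmbd as standalone daemons.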
12.2.4.4 Checking smbd with telnet
Ironically, the
easiest
way to test that the
smbd
server is actually working is to send it a meaningless message and see if it is rejected. Try something such as the following:
$ echo "hello" | telnet localhost 139
Trying 192.168.236.86 ...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.
This sends an erroneous but harmless message to
smbd
. If you get a
Connected
message followed by a
Connection
closed
message, the test was a success. You have an
smbd
daemon listening on the port and rejecting improper connection messages. On the other hand, if you get
telnet
:
connect
:
Connection
refused
, most likely no daemon is present. Check the logs and go back to Chapter 2.
Regrettably, there isn't an easy test for
nmbd
. If the
telnet
test and the
netstat
test both say that an
smbd
is running, there is a good chance that
netstat
will also be correct about
nmbd
running.
12.2.4.5 Testing daemons with testparm
Once you know there's a daemon, you should always run
testparm
, in hopes of getting something such as the following:
$ testparm
Load smb config files from /opt/samba/lib/smb.conf
Processing section "[homes]"
Processing section "[printers]"
...
Processing section "[tmp]"
Loaded services file OK.
...
The
testparm
program normally reports the processing of a series of sections and responds with
Loaded
services
file
OK
if it succeeds. If not, it reports one or more of the following messages, which also appear in the logs as noted:
-
Allow/Deny connection from account (n) to service
-
A testparm-only message produced if you have valid users or invalid users options set in your smb.conf. Make sure that you are on the valid users list and that root, bin, etc., are on the invalid users list. Otherwise, you will not be able to connect, or users who shouldn't be able to connect will be.
-
Warning: You have some share names that are longer than eight chars
-
For
anyone
using Windows for Workgroups and older clients. They fail to connect to shares with long names, producing an overflow message that sounds confusingly like a memory overflow.
-
Warning: [name] service MUST be printable!
-
A printer share lacks a
printable
=
yes
option.
-
No
path
in service name using [name]
-
A file share doesn't know which directory to provide to the user, or a print share doesn't know which directory to use for spooling. If no path is specified, the service will try to run with a path of
/tmp
, which might not be what you want.
-
Note: Servicename is flagged unavailable
-
Just a reminder that you have used the
available
=
no
option in a share.
-
Can't find include file [name]
-
A configuration file referred to by an
include
option did not exist. If you were including the file unconditionally, this is an error and probably a serious one: the share will not have the configuration you intended. If you were including it based on one of the
%
variables
, such as
%a
(architecture), you will need to decide whether, for example, a missing Windows for Workgroups configuration file is a problem. It often isn't.
-
Can't copy service name, unable to copy to itself
-
You tried to copy an
smb.conf
section into itself.
-
Unable to copy service - source not found: [name]
-
Indicates a missing or
misspelled
section in a
copy
=
option.
-
Ignoring unknown parameter name
-
Typically indicates an obsolete, misspelled, or unsupported option.
-
Global parameter name found in service section
-
Indicates that a global-only parameter has been used in an individual share. Samba ignores the parameter.
After the
testparm
test, repeat it with (exactly) three parameters: the name of your
smb.conf
file, the name of your client, and its IP address:
#
testparm /usr/local/samba/lib/smb.conf client 192.168.236.10
This will run one more test that checks the hostname and address against
hosts
allow
and
hosts
deny
options and might produce the
Allow
connection
from
hostname
to
service
and/or
Deny
connection
from
hostname
to
service
messages for the client system. A Deny message means that the hosts allow and/or hosts deny options in your smb.conf prohibit access from the client system.
12.2.5 Troubleshooting SMB Connections
Now that you know the servers are up, you need to make sure they're running properly. We start by placing a simple
smb.conf
file in the
/usr/local/samba/lib
directory.
12.2.5.1 A minimal smb.conf file
In the following tests, we assume you have a
[temp]
share suitable for testing, plus at least one account. An
smb.conf
file that includes just these is as
follows
:
[global]
    workgroup = EXAMPLE
    security = user
    browsable = yes
    local master = yes
[homes]
    guest ok = no
    browsable = no
[temp]
    path = /tmp
    public = yes
The public = yes option in the [temp] share is just for testing. You probably don't want people without accounts storing things on your Samba server, so you should comment it out when you're done.
12.2.5.2 Testing locally with smbclient
The first test is to ensure that the server can list its own services (shares). Run the command
smbclient
-L
localhost
-U%
to connect to the server from itself, and specify the guest user. You should see the following:
$ smbclient -L localhost -U%
Server time is Wed May 27 17:57:40 2002
Timezone is UTC-4.0
Server=[localhost]
User=[davecb]
Workgroup=[EXAMPLE]
Domain=[EXAMPLE]
Sharename Type Comment
--------- ----- ----------
temp Disk
IPC$ IPC IPC Service (Samba 1.9.18)
homes Disk Home directories
This machine does not have a browse list
If you received this output, move on to the next section, Section 12.2.5.3. On the other hand, if you receive an error, check the following:
-
If you get
Get_hostbyname
:
unknown
host
localhost
, either you've spelled its name wrong or there actually is a problem (which should have been seen back in Section 12.2.2.2). In the latter case, move on to Section 12.2.7, later in this chapter.
-
If you get
Connect
error
:
Connection
refused
, the server was found, but it wasn't running an
smbd
daemon. Skip back to Section 12.2.4, earlier in this chapter, and retest the daemons.
-
If you get the message
Your
server
software
is
being
unfriendly
, the initial session request packet got a garbage response from the server. The server might have crashed or started improperly. The common causes of this can be
discovered
by scanning the logs for the following:
-
Invalid command-line parameters to
smbd
; see the
smbd
manual page.
-
A fatal problem with the
smb.conf
file that
prevents
the startup of
smbd
. Always check your changes with
testparm
, as was done in Section 12.2.4.5, earlier in this chapter.
-
Missing directories where Samba is supposed to keep its log and lock files.
-
The presence of a server already on the port (139 for
smbd
, 137 for
nmbd
), preventing the daemon from starting.
-
If you're using
inetd
(or xinetd ) instead of standalone daemons, be sure to check your
/etc/inetd.conf
(or xinetd configuration files) and
/etc/services
entries against their manual pages for errors as well.
-
If you get a
Password
: prompt, your guest account is not set up properly. The
-U%
option tells
smbclient
to do a "null login," which requires that the guest account be present but does not require it to have any privileges.
-
If you get the message
SMBtconX
failed
.
ERRSRV--ERRaccess
, you aren't permitted access to the server. This normally means you have a
hosts
allow
option that doesn't include the server or a
hosts
deny
option that does. Recheck with the command
testparm
smb.conf
your_hostname
your_ip_address
(see Section 12.2.4.5), and correct any
unintended
prohibitions.
12.2.5.3 Testing connections with smbclient
Run the command smbclient '\\server\temp' to connect to the server's [temp] share and to see if you can connect to a file service. You should get the following response:
$ smbclient '\\server\temp'
Server time is Tue May 5 09:49:32 2002
Timezone is UTC-4.0
Password:
smb: \> quit
You might receive the following errors:
-
If you get
Get_Hostbyname
:
Unknown
host
name
,
Connect
error
:
Connection
refused
, or
Your
server
software
is
being
unfriendly
, see the previous section, Section 12.2.5.2, for the diagnoses.
-
If you get the message
servertemp
:
Not
enough
`\
'
characters
in
service
, you likely didn't quote the address, so the Unix shell stripped off backslashes. You can also write the command:
smbclient \\\\server\\temp
or:
smbclient //server/temp
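To see why the quoting matters, the sketch below uses Python's standard shlex module, which approximates POSIX shell word splitting, to show what argument a program would receive for each way of writing the share name. It is an approximation of shell behavior, not the shell itself, and the function name argument_seen is ours.

```python
import shlex


def argument_seen(command_line: str) -> str:
    """Return the first argument a program would receive after shell-style parsing."""
    return shlex.split(command_line)[1]


for command in (
    r"smbclient \\server\temp",        # unquoted: backslashes are consumed as escapes
    r"smbclient '\\server\temp'",      # single-quoted: passed through untouched
    r"smbclient \\\\server\\temp",     # each backslash doubled
    r"smbclient //server/temp",        # forward slashes need no quoting
):
    print(f"{command:32} -> {argument_seen(command)}")
```

The unquoted form arrives as \servertemp, which no longer contains enough backslashes to name a server and a share; the other three forms arrive intact.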
Now, provide your Unix account password to the
Password
: prompt. If you then get an
smb
:
\>
prompt, it worked. Enter
quit
and continue on to the next section, Section 12.2.5.4. If you got
SMBtconX
failed
.
ERRSRV--ERRinvnetname
, the problem can be any of the following:
-
A wrong share name: you might have spelled it wrong, it might be too long, it might be in mixed case, or it might not be available. Check that it's what you expect with
testparm
(see the earlier section, Section 12.2.4.5).
-
A
security
=
share
parameter in your Samba configuration file, in which case you might have to add
-U
your_account
to the
smbclient
command.
-
An erroneous username.
-
An erroneous password.
-
An
invalid
users
or
valid
users
option in your
smb.conf
file that doesn't allow your account to connect. Recheck using
testparm
smb.conf
your_hostname your_ip_address
(see the earlier section, Section 12.2.4.5).
-
A hosts allow option that doesn't include the server, or a hosts deny option that does. Also test this with testparm.
-
A problem in authentication, such as when shadow passwords or Pluggable Authentication Modules (PAM) are used on the server, but Samba is not compiled to use them. This is rare, but it occasionally happens when a SunOS 4 Samba binary (with no shadow passwords) is run without recompilation on a Solaris system (with shadow passwords).
-
The
encrypt
passwords
=
yes
option is in the configuration file, but no password for your account is in the
smbpasswd
file.
-
You have a null password entry, either in Unix
/etc/passwd
or in the
smbpasswd
file.
-
You are connecting to
[temp]
, and you do not have the
guest
ok
=
yes
option in the
[temp]
section of the
smb.conf
file.
-
You are connecting to
[temp]
before connecting to your home directory, and your guest account isn't set up correctly. If you can connect to your home directory and then connect to
[temp]
, that's the problem. See Chapter 2 for more information on creating a basic Samba configuration file.
A bad guest account will also prevent you from printing or browsing until after you've logged in to your home directory.
There is one more reason for this failure that has nothing at all to do with passwords: the
path
parameter in your
smb.conf
file might point somewhere that doesn't exist. This will not be diagnosed by
testparm
, and most SMB clients can't distinguish it from other types of bad user accounts. You will have to check it manually.
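Because this check is manual, a short script can help. The sketch below reads smb.conf-style text with Python's standard configparser module and lists shares whose path is set but is not an existing directory. It is a rough illustration under assumptions: configparser is not a Samba parser, so it handles only simple files (no backslash line continuations or include directives), and paths containing % variables are skipped because they cannot be checked without a live connection. The function name shares_with_missing_paths is ours.

```python
import configparser
import os


def shares_with_missing_paths(smb_conf_text: str) -> list:
    """Return share names whose 'path' option names a directory that does not exist."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # Strip indentation so indented options are not read as continuation lines.
    parser.read_string("\n".join(line.strip() for line in smb_conf_text.splitlines()))
    missing = []
    for share in parser.sections():
        path = parser.get(share, "path", fallback=None)
        if path is None or "%" in path:
            continue  # no explicit path, or one that depends on a % variable
        if not os.path.isdir(path):
            missing.append(share)
    return missing


sample = """\
[global]
    workgroup = EXAMPLE
[temp]
    path = /no/such/directory/for/this/example
"""
print(shares_with_missing_paths(sample))
```

On a real server, you would pass in the contents of your own smb.conf and run the script on the server itself, because the paths must be checked on the system that exports them.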
Once you have connected to
[temp]
successfully, repeat the test, this time logging in to your home directory (e.g., map network drive
\\server\davecb
). If you have to change anything to get that to work, retest
[temp]
again afterward.
12.2.5.4 Testing connections with net use
Run the command
net use * \\server\temp
on the Windows client to see if it can connect to the server. You should be prompted for a password, then receive the response
The
command
was
completed
successfully
.
If that worked, continue with the steps in the next section, Section 12.2.5.5. Otherwise:
-
If you get
The
specified
shared
directory
cannot
be
found
, or
Cannot
locate
specified
share
name
, the directory name is either misspelled or not in the
smb.conf
file. This message can also warn of a name that is in mixed case, including spaces, or that is longer than eight characters.
-
If you get
The
computer
name
specified
in
the
network
path
cannot
be
located
or
Cannot
locate
specified
computer
, the server name has been misspelled, the name service has failed, there is a networking problem, or the
hosts
deny
option includes your host.
-
If it is not a spelling mistake, you need to double back at least to Section 12.2.5.3 to investigate why it doesn't connect.
-
If
smbclient
does work, there is a name service problem with the client name service, and you need to go forward to Section 12.2.6.2 and see if you can look up both the client and server with
nmblookup
.
-
If you get
The
password
is
invalid
for
\\server\username
, the locally cached copy of your password on the client doesn't match the one on the server. You will be prompted for a replacement.
Each Windows 95/98/Me client keeps a local password file, but it's really just a cached copy of the password it sends to Samba and NT/2000/XP servers to authenticate you. That's what is being prompted for here. You can still log on to a Windows system without a password (but not to NT/2000/XP).
If you provide your password and it still fails, your password is not being matched on the server, you have a
valid
users
or
invalid
users
list
denying
you permission, NetBEUI is interfering, or the encrypted password problem described in the next paragraph exists.
-
If your client is Windows NT 4.0, NT 3.5 with Patch 3, Windows 95 with Patch 3, Windows 98, any of these with Internet Explorer 4.0, or any subsequent version of Windows, the system will default to Microsoft encryption for passwords. In general, if you have installed a major Microsoft product on any of the older Windows versions, you might have applied an update and turned on encrypted passwords. If the client is defaulting to encrypted passwords, you will need to specify
encrypt
passwords
=
yes
in your Samba configuration file if you are using a version of Samba prior to Samba 3.0.
Because of Internet Explorer's willingness to honor URLs such as file://somehost/somefile by making SMB connections, clients up to and including Windows 95 Patch Level 2 would happily send your password, in plain text, to SMB servers anywhere on the Internet. This was considered a bad idea, and Microsoft switched to using only encrypted passwords in the SMB protocol. All subsequent releases of Microsoft's products have included this correction.
-
If you have a mixed-case password on Unix, the client is probably sending it in all one case. If changing your password to all one case works, this was the problem. Regrettably, all but the oldest clients support uppercase passwords, so Samba will try once with the password in uppercase and once in lowercase. If you wish to use mixed-case passwords, see the
password
level
option in Chapter 9 for a workaround.
-
You might have a
valid
users
problem, as tested with
smbclient
(see the earlier section, Section 12.2.5.3).
-
You might have the NetBEUI protocol bound to the Microsoft client. This often produces long timeouts and erratic failures and is known to have caused failures to accept passwords in the past. Unless you
absolutely
need the NetBEUI protocol, remove it.
|
The
term
"bind" is used here to mean connecting one piece of software to another. When configured correctly, the Microsoft SMB client is "bound to" TCP/IP in the bindings section of the TCP/IP properties panel under the Windows 95/98/Me Network icon in the Control Panel. TCP/IP in
turn
is bound to an Ethernet card. This is not the same sense of the word as binding an SMB daemon to a TCP/IP port.
|
|
12.2.5.5 Testing connections with Windows Explorer
Start Windows Explorer (not Internet Explorer), select Map Network Drive from the Tools menu, and specify the UNC for one of your shares on the Samba server to see if you can make Explorer connect to it. If so, you've succeeded and can skip to the next section, Section 12.2.6.
Windows Explorer is a rather poor diagnostic tool: it tells you that something's wrong, but rarely what it is. If you get a failure, you'll need to track it down with the Windows
net use
command, which has far
superior
error reporting:
-
If you get
The
password
for
this
connection
that
is
in
your
password
file
is
no
longer
correct
, you might have any of the following:
-
Your locally cached copy on the client doesn't match the one on the server.
-
You didn't provide a username and password when logging on to the client. Some versions of Explorer will continue to send a null username and password, even if you provide a password.
-
You have misspelled the password.
-
You have an
invalid
users
or
valid
users
list denying permission.
-
Your client is defaulting to encrypted passwords, but Samba is configured with the
encrypt
passwords
=
no
configuration file parameter.
-
You have a mixed-case password, which the client is supplying in all one case.
-
If you get
The
network
name
is
either
incorrect
,
or
a
network
to
which
you
do
not
have
full
access
, or
Cannot
locate
specified
computer
, you might have any of the following:
-
If you get
You
must
supply
a
password
to
make
this
connection
, the password on the client is out of synchronization with the server, or this is the first time you've tried from this client system and the client hasn't cached it locally yet.
-
If you get
Cannot
locate
specified
share
name
, you have a wrong share name or a syntax error in specifying it, a share name longer than eight characters, or one containing spaces or in mixed case.
Once you can reliably connect to the share, try again, this time using your home directory. If you have to change something to get home directories working, retest with the first share, and vice versa, as we showed in the earlier section, "Testing connections with net use." As always, if Explorer fails, drop back to that section and debug the connection there.
12.2.6 Troubleshooting Browsing
Finally, we come to browsing. We've left this for last, not because it is the most difficult, but because it's both optional and partially dependent on a protocol that doesn't guarantee delivery of a packet. Browsing is hard to diagnose if you don't already know that all the other services are running.
Browsing is purely optional: it's just a way to find the servers on your network and the shares that they provide. Unix has nothing of the sort and happily does without. Browsing also assumes all your systems are on a local area network (LAN) where broadcasts are
allowable
.
First, the browsing mechanism identifies a system using the unreliable UDP protocol; it then makes a normal (reliable) TCP/IP connection to list the shares the system provides.
12.2.6.1 Testing browsing with smbclient
We'll start with testing the reliable connection first. From the server, try listing its own shares using
smbclient
with a
-L
option and your server's name. You should get something resembling the following:
$
smbclient -L server
Added interface ip=192.168.236.86 bcast=192.168.236.255 nmask=255.255.255.0
Server time is Tue Apr 28 09:57:28 2002 Timezone is UTC-4.0
Password:
Domain=[EXAMPLE] OS=[Unix] Server=[Samba 2.2.5]
Sharename Type Comment
--------- ---- -------
cdrom Disk CD-ROM
cl Printer Color Printer 1
davecb Disk Home Directories
Server Comment
--------- -------
SERVER Samba 2.2.5
Workgroup Master
--------- -------
EXAMPLE SERVER
-
If you didn't get a Sharename list, the server is not allowing you to browse any shares. This should not be the case if you've tested any of the shares with Windows Explorer or the
net use
command. If you haven't done the
smbclient
-L
localhost
-U%
test yet (see the earlier section, Section 12.2.5.2), do it now. An erroneous guest account can prevent the shares from being seen. Also, check the
smb.conf
file to make sure you do not have the option
browsable
=
no
anywhere in it: we suggest using a minimal
smb.conf
file (see the earlier section, Section 12.2.5.1). You need to have
browsable
enabled (which is the default) to see the share.
-
If you didn't get a browse list, the server is not providing information about the systems on the network. At least one system on the net must support browse lists. Make sure you have
local
master
=
yes
in the
smb.conf
file if you want Samba to be the local master browser.
-
If you got a browse list but didn't get
/tmp
, you probably have a
smb.conf
problem. Go back to Section 12.2.4.5.
-
If you didn't get a workgroup list with your workgroup name in it, it is possible that your workgroup is set incorrectly in the
smb.conf
file.
-
If you didn't get a workgroup list at all, ensure that
workgroup
=
EXAMPLE
is present in the
smb.conf
file.
-
If you get nothing, try once more with the options
-I
ip_address
-n
netbios_name
-W
workgroup
-d3
with the NetBIOS and workgroup name in uppercase. (The
-d3
option sets the log/debugging level to 3.) Then check the Samba logs for clues.
If you're still getting nothing, you shouldn't have gotten this far; double back to at least Section 12.2.3.1, or perhaps Section 12.2.2.4. On the other hand:
-
If you get
SMBtconX
failed
.
ERRSRV--ERRaccess
, you aren't permitted access to the server. This normally means you have a
hosts
allow
option that doesn't include the server or a
hosts
deny
option that does.
-
If you get
Bad
password
, you presumably have one of the following:
-
An incorrect
hosts
allow
or
hosts
deny
line
-
An incorrect
invalid
users
or
valid
users
line
-
A lowercase password and OS/2 or Windows for Workgroups clients
-
A missing or invalid guest account
Check what your guest account is (see the earlier section, Section 12.2.5.2), change or comment out any
hosts
allow
,
hosts
deny
,
valid
users
, or
invalid
users
lines, and verify your
smb.conf
file with
testparm
smb.conf
your_hostname your_ip_address
(see the earlier section, Section 12.2.4.5).
-
If you get
Connection
refused
, the
smbd
server is not running or has crashed. Check that it's up, running, and listening to the network with
netstat
. See the earlier section, Section 12.2.4.
-
If you get
Get_Hostbyname
:
Unknown
host
name
, you've made a spelling error, there is a mismatch between the Unix and NetBIOS hostname, or there is a name service problem. Start name service debugging as discussed in the earlier section, Section 12.2.5.4. If this works, suspect a name mismatch, and go to the later section, Section 12.2.9.
-
If you get
Session
request
failed
, the server refused the connection. This usually indicates an internal error, such as insufficient memory to fork a process.
-
If you get
Your
server
software
is
being
unfriendly
, the initial session request packet received a garbage response from the server. The server might have crashed or started improperly. Go back to Section 12.2.5.2, where the problem is first
analyzed
.
-
If you suspect the server is not running, go back to Section 12.2.4.2 to see why the server daemon isn't responding.
12.2.6.2 Testing the server with nmblookup
This will test the "advertising" system used for Windows name services and browsing. Advertising works by broadcasting one's presence or willingness to provide services. It is the part of browsing that uses an unreliable protocol (UDP) and works only on broadcast networks such as Ethernets. The
nmblookup
program broadcasts name queries for the hostname you provide and returns its IP address and the name of the system, much as
nslookup
does with DNS. Here, the
-d
(debug or log-level) and
-B
(broadcast address) options direct queries to specific systems.
First, we check the server from itself. Run
nmblookup
with a
-B
option of your server's name (to tell it to send the query to the Samba server) and a parameter of
__SAMBA__
as the symbolic name to look up. You should get:
$
nmblookup -B server __SAMBA__
Added interface ip=192.168.236.86 bcast=192.168.236.255 nmask=255.255.255.0
Sending queries to 192.168.236.86
192.168.236.86 __SAMBA__
You should get the IP address of the server, followed by the name
__SAMBA__
, which means that the server has successfully advertised that it has a service called
__SAMBA__
, and therefore at least part of NetBIOS name service works.
-
If you get
Name_query
failed
to
find
name
__SAMBA__
, you might have specified the server name to the
-B
option, or
nmbd
is not running. The
-B
option actually takes a broadcast address: we're using a computer name to get a unicast address and to ask the server if it has claimed
__SAMBA__
. Try again with
nmblookup
-B
ip_address
, and if that fails too,
nmbd
isn't claiming the name. Go back
briefly
to the earlier section, "Testing daemons with testparm," to see if
nmbd
is running. If so, it might not be claiming names; this means that Samba is not providing the browsing service, which is a configuration problem. If that is the case, make sure that
smb.conf
doesn't contain the option
browsing
=
no
.
12.2.6.3 Testing the client with nmblookup
Next, check the IP address of the client from the server with
nmblookup
using the
-B
option for the client's name and a parameter of '
*
' meaning "anything," as shown here:
$
nmblookup -B client '*'
Sending queries to 192.168.236.10 192.168.236.10 *
Got a positive name query response from 192.168.236.10 (192.168.236.10)
You might get the following error:
-
If you receive
Name_query
failed
to
find
name
*
, you have made a spelling mistake, or the client software on the PC isn't installed, started, or bound to TCP/IP. Double back to Chapter 3 and ensure that you have a client installed that is listening to the network.
Repeat the command with the following options if you had any failures:
-
If
nmblookup
-B
client_IP_address
succeeds but
nmblookup
-B
client_name
fails, there is a name service problem with the client's name; go to Section 12.2.7, later in this chapter.
-
If
nmblookup
-B
127.0.0.1
'
*
' succeeds, but
nmblookup
-B
client_IP_address
fails, there is a hardware problem, and
ping
should have failed. See your network manager.
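The branching in this test can be written down as a small decision function. The following is only a sketch: the function name and the three boolean arguments are our own invention, standing for the outcome of `nmblookup -B client_name '*'`, `nmblookup -B client_IP_address '*'`, and `nmblookup -B 127.0.0.1 '*'` respectively, and the returned strings simply paraphrase the advice given above.

```python
def diagnose_client_lookup(by_name_ok, by_ip_ok, by_loopback_ok):
    """Map the outcome of the three nmblookup tests to the next step.

    Each argument is True if the corresponding nmblookup run returned a
    positive name query response (hypothetical inputs; the caller runs
    the commands and records the results).
    """
    if by_name_ok:
        return "client answers by name; continue with the broadcast test"
    if by_ip_ok:
        # Works by address but not by name: the name lookup is at fault.
        return "name service problem with the client's name; see Section 12.2.7"
    if by_loopback_ok:
        # Local stack answers, remote address does not.
        return "hardware or network problem; ping should have failed too"
    return "no responses at all; recheck spelling and the client's TCP/IP binding (Chapter 3)"
```

For example, `diagnose_client_lookup(False, True, True)` points you at the name service section, which matches the first bullet above.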
12.2.6.4 Testing the network with nmblookup
Run the command
nmblookup
again with a
-d2
option (for a debug level of 2) and a parameter of '
*
'. This time we are testing the ability of programs (such as
nmbd
) to use broadcast. It's
essentially
a connectivity test, done via a broadcast to the default broadcast address.
A number of NetBIOS over TCP/IP hosts on the network should respond with
got
a
positive
name
query
response
messages. Samba might not catch all the responses in the short time it listens, so you won't always see all the SMB clients on the network. However, you should see most of them:
$
nmblookup -d 2 '*'
Added interface ip=192.168.236.86 bcast=192.168.236.255 nmask=255.255.255.0
Sending queries to 192.168.236.255
Got a positive name query response from 192.168.236.191 (192.168.236.191)
Got a positive name query response from 192.168.236.228 (192.168.236.228)
Got a positive name query response from 192.168.236.75 (192.168.236.75)
Got a positive name query response from 192.168.236.79 (192.168.236.79)
Got a positive name query response from 192.168.236.206 (192.168.236.206)
Got a positive name query response from 192.168.236.207 (192.168.236.207)
Got a positive name query response from 192.168.236.217 (192.168.236.217)
Got a positive name query response from 192.168.236.72 (192.168.236.72)
192.168.236.86 *
However:
-
If this doesn't give at least the client address you previously tested, the default broadcast address is wrong. Try
nmblookup
-B
255.255.255.255
-d
2
'
*
', which is a last-ditch variant (using a broadcast address of all 1s). If this draws responses, the broadcast address you've been using before is wrong. Troubleshooting these is discussed in Section 12.2.8.2, later in this chapter.
-
If the address 255.255.255.255 fails too, check your notes to see if your PC and server are on different subnets, as discovered in the earlier section, Section 12.2.2.4. You should try to diagnose this step with a server and client on the same subnet, but if you can't, you can try specifying the remote subnet's broadcast address with
-B
. Finding that address is discussed in Section 12.2.8.2, later in this chapter. The
-B
option will work if your router supports directed broadcasts; if it doesn't, you might be forced to test with a client on the same network.
As usual, you can check the Samba log files for additional clues.
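When many hosts answer, it can be easier to let a script pick the responders out of the output. The sketch below assumes the output has the form shown in this section (lines containing "Got a positive name query response from" followed by an address); the function names are ours, and other Samba versions might word the message differently.

```python
import re

# Matches the response lines shown in the sample output above.
RESPONSE = re.compile(
    r"got a positive name query response from (\d+\.\d+\.\d+\.\d+)",
    re.IGNORECASE,
)

def responders(nmblookup_output):
    """Return the IP addresses that answered, in the order reported."""
    return RESPONSE.findall(nmblookup_output)

def client_answered(nmblookup_output, client_ip):
    """True if the client tested earlier is among the responders."""
    return client_ip in responders(nmblookup_output)
```

Save the output of the broadcast query to a file, read it into a string, and pass it to `client_answered` together with the client address you tested in the previous section.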
12.2.6.5 Testing client browsing with net view
On the client, run the command
net view \\server
in an MS-DOS (command prompt) window to see if the client can connect to the server and ask what shares it provides. You should get back a list of available shares on the server.
If this works, continue with Section 12.3.1, later in this chapter. Otherwise:
-
If you get
Network
name
not
found
for the name you just tested in the earlier section, Section 12.2.6.3, there is a problem with the client software itself. Double-check this by running
nmblookup
on the client; if it works and
net view
doesn't, the client is at fault.
-
If
nmblookup
fails, there is a NetBIOS name service problem, as discussed in the later section, Section 12.2.9.
-
If you get
You
do
not
have
the
necessary
access
rights
, or
This
server
is
not
configured
to
list
shared
resources
, either your guest account is misconfigured (see the earlier section, Section 12.2.5.2) or you have a
hosts
allow
or
hosts
deny
line that prohibits connections from your system. These problems should have been
detected
by the
smbclient
tests starting in the earlier section, Section 12.2.6.1.
-
If you get
The
specified
computer
is
not
receiving
requests
, you have misspelled the name, the system is unreachable by broadcast (tested in the earlier section, Section 12.2.6.4), or it's not running
nmbd
.
-
If you get
Bad
password
error
, you're probably encountering the Microsoft-encrypted password problem, as discussed earlier in this chapter and in Chapter 9, with its corrections.
12.2.6.6 Browsing the server from the client
From the Windows Network Neighborhood (or My Network Places in
newer
releases), try to browse the server. Your Samba server should appear in the browse list of your local workgroup. You should be able to double-click the name of the server to get a list of shares.
-
If you get an
Invalid
password
error, it's most likely the encryption problem again.
-
If you receive an
Unable
to
browse
the
network
error, one of the following has occurred:
-
You have
looked
too soon, before the broadcasts and updates have completed. Wait 30 seconds and try again.
-
There is a network problem you've not yet diagnosed.
-
There is no browse master. Add the configuration option
local
master
=
yes
to your
smb.conf
file.
-
No shares are made browsable in the
smb.conf
file.
-
If you receive the message
\\server
is
not
accessible
then:
-
You have the encrypted password problem.
-
The system really isn't accessible.
-
The system doesn't support browsing.
If you've made it this far and the problem is not yet
solved
, either the problem is one we've not yet seen, or it is a problem related to a topic we have already covered, and further analysis is required. Name resolution is often related to difficulties with Samba, so we cover it in more detail in the next sections. If you know your problem is not
related
to name resolution, skip to the Section 12.3 at the end of the chapter.
12.2.7 Troubleshooting Name Services
This section looks at simple troubleshooting of all the name services you'll encounter, but only for the common problems that affect Samba.
There are several good references for troubleshooting particular name services: Paul Albitz and Cricket Liu's
DNS and BIND
(O'Reilly) covers DNS, Hal Stern's
NFS and NIS
(O'Reilly) covers NIS ("Yellow pages"), while Windows Internet Name Service (WINS),
hosts/LMHOSTS
files, and NIS+ are best covered by their respective
vendors
' manuals.
The problems addressed in this section are as follows:
-
Name services are identified.
-
A hostname can't be looked up.
-
The long (FQDN) form of a hostname works but the short form doesn't.
-
The short form of the name works, but the long form doesn't.
-
A long delay occurs before the expected result.
12.2.7.1 Identifying what's in use
First, see if both the server and the client are using DNS, WINS, NIS, or
hosts
files to look up IP addresses when you give them a name. Each kind of system has a different preference:
-
Windows 95/98/Me
tries
WINS and the
LMHOSTS
file first, then broadcast, and finally DNS and
HOSTS
files.
-
Windows NT/2000/XP tries WINS, then broadcast, then the
LMHOSTS
file, and finally
HOSTS
and DNS.
-
Windows programs using the WINSOCK standard use the HOSTS file, DNS, WINS, and then broadcast. Don't assume that if a different program's name service works, the SMB client program's name service will!
-
Samba daemons use
lmhosts
, WINS, the Unix system's name resolution, and then broadcast.
-
Unix systems can be configured to use any combination of DNS,
HOSTS
files, NIS or NIS+, and winbind, generally in any order.
We recommend that the client systems be configured to use WINS and DNS, the Samba daemons to use WINS and DNS, and the Unix server to use DNS,
hosts
files, and perhaps NIS+. You'll have to look at your notes and the actual systems to see which is in use.
On the clients, the name services are all set in the TCP/IP Properties panel of the Networking Control Panel, as discussed in Chapter 3. You might need to check there to see what you've actually turned on. On the server, see if a
/etc/resolv.conf
file exists. If it does, you're using DNS. You might be using the others as well, though. You'll need to check for NIS and combinations of services.
Check for a
/etc/nsswitch.conf
file on Solaris and other System V Unix operating systems. If you have one, look for a line that begins with
hosts
: followed by one or more of
files
,
bind
,
nis
, or
nis+
. These are the name services to use, in order, with optional extra material in square brackets. The
files
keyword is for using
HOSTS
files, while
bind
(the Berkeley Internet Name Daemon) refers to using DNS.
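If you check several servers, a few lines of Python can pull the lookup order out of the file for you. This is a sketch under assumptions: it expects the line to be spelled `hosts:` as in stock `nsswitch.conf` files, treats anything in square brackets as the optional extra material mentioned above, and uses a function name of our own choosing.

```python
import re

def hosts_lookup_order(nsswitch_text):
    """Return the services named on the hosts line, in lookup order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()         # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            rest = re.sub(r"\[[^\]]*\]", " ", rest)   # drop bracketed actions
            return rest.split()
    return []                                         # no hosts line found
```

Reading `/etc/nsswitch.conf` into a string and passing it to `hosts_lookup_order` gives you the list to compare against what the clients use.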
If the client and server
differ
, the first thing to do is to get them in sync. Clients can use DNS, WINS,
HOSTS
, and
LMHOSTS
files, but not NIS or NIS+. Servers can use
HOSTS
and
LMHOSTS
files, DNS, NIS or NIS+, and winbind, but not WINS, even if your Samba server provides WINS services. If you can't get all the systems to use the same services, you'll have to check the server and the client
carefully
for the same data.
You can also make use of the
-R
(resolve order) option for
smbclient
. If you want to troubleshoot WINS, for example, you'd say:
$
smbclient -L
server
-R wins
The possible settings are
hosts
(which means whatever the Unix system is using, not just
/etc/hosts
files),
lmhosts
,
wins
, and
bcast
(broadcast).
In the following sections, we use the term
long name
for a fully qualified domain name (FQDN), such as
server.example.com
, and the term
short name
for the host part of an FQDN, such as
server
.
12.2.7.2 Cannot look up hostnames
Try the following:
-
DNS
-
Run
nslookup
name
. If this fails, look for a
resolv.conf
error, a downed DNS server, or a short/long name problem (see the next section). Try the following:
-
Your
/etc/resolv.conf
file should contain one or more
nameserver
lines, each with an IP address. These are the addresses of your DNS servers.
-
Ping each server address you find. If this fails for one, suspect the system. If it fails for each, suspect your network.
-
Retry the lookup using the full domain name (e.g.,
server.example.com
) if you tried the short name first, or the short name if you tried the long name first. If results differ, skip to the next section.
-
Broadcast/WINS
-
Broadcast/WINS handles only short names such as
server
, and not long ones, such as
server.example.com
. Run
nmblookup
-S
server
. This reports everything broadcast has registered for the name. In our example, it looks like this:
$
nmblookup -S server
Looking up status of 192.168.236.86
received 10 names
SERVER <00> - M <ACTIVE>
SERVER <03> - M <ACTIVE>
SERVER <1f> - M <ACTIVE>
SERVER <20> - M <ACTIVE>
..__MSBROWSE__. <01> - <GROUP> M <ACTIVE>
MYGROUP <00> - <GROUP> M <ACTIVE>
MYGROUP <1b> - M <ACTIVE>
MYGROUP <1c> - <GROUP> M <ACTIVE>
MYGROUP <1d> - M <ACTIVE>
MYGROUP <1e> - <GROUP> M <ACTIVE>
The required entry is
SERVER
<00>
, which identifies
server
as being this system's NetBIOS name. You should also see your workgroup mentioned one or more times. If these lines are missing, Broadcast/WINS cannot look up names and will need attention.
|
The
numbers
in angle brackets in the previous output identify NetBIOS names as being workgroups, workstations, and file users of the messenger service, master browsers, domain master browsers, domain controllers, and a
plethora
of others. We primarily use
<00>
to identify system and workgroup names and
<20>
to identify systems as servers. The complete list is available at http://support.microsoft.com/support/kb/articles/q163/4/09.asp.
|
|
-
NIS
-
Try
ypmatch
name
hosts
. If this fails, NIS is down. Find out the NIS server's name by running
ypwhich
, and ping the system to see if it's accessible.
-
NIS+
-
If you're running NIS+, try
nismatch
name
hosts
. If this fails, NIS+ is down. Find out the NIS+ server's name by running
niswhich
, and ping that system to see if it's accessible.
-
hosts and HOSTS files
-
Inspect the
HOSTS
file on the client (
C:\Windows\Hosts
on Windows 95/98/Me, and
C:\WINNT\system32\drivers\etc\hosts
on Windows NT/2000/XP). Each line should have an IP number and one or more names, the primary name first, then any optional aliases. An example follows:
127.0.0.1 localhost
192.168.236.1 dns.svc.example.com
192.168.236.10 client.example.com client
192.168.236.11 backup.example.com loghost
192.168.236.86 server.example.com server
192.168.236.254 router.svc.example.com
On Unix,
localhost
should always be 127.0.0.1, although it might be just an alias for a hostname on the PC. On the client, check that there are no
#XXX
directives at the ends of the lines; these are LAN Manager/NetBIOS directives and should appear only in
LMHOSTS
files.
-
LMHOSTS files
-
This file is a local source for LAN Manager (NetBIOS) names. It has a format similar to
hosts
files, but it does not support
long-form
domain names (e.g.,
server.example.com
) and can have a number of optional
#XXX
directives following the NetBIOS names. There is usually an
lmhosts.sam
(for sample) file located in
C:\Windows
on Windows 95/98/Me, and in
C:\WINNT\system32\drivers\etc
on Windows NT/2000/XP, but it's not used unless it is renamed to
Lmhosts
in the same directory.
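The checks described above for hosts files (an address, a primary name, optional aliases, and no LAN Manager directives) can be sketched in a few lines. The code below is illustrative only: the function name is ours, and the pattern used to spot directives (a `#` immediately followed by two or more capital letters, as in `#PRE` or `#DOM`) is a heuristic we assume, not a rule taken from any specification.

```python
import re

# Heuristic: LMHOSTS directives such as #PRE or #DOM start with '#'
# followed directly by capital letters; ordinary comments rarely do.
DIRECTIVE = re.compile(r"#[A-Z]{2,}")

def parse_hosts(text):
    """Return (entries, suspects) for the contents of a hosts file.

    entries  -- list of (ip, primary_name, aliases) tuples
    suspects -- lines whose trailing comment looks like an LMHOSTS directive
    """
    entries, suspects = [], []
    for raw in text.splitlines():
        body, _, comment = raw.partition("#")
        fields = body.split()
        if len(fields) < 2:
            continue                       # blank or comment-only line
        if DIRECTIVE.match("#" + comment):
            suspects.append(raw.strip())
        entries.append((fields[0], fields[1], fields[2:]))
    return entries, suspects
```

Any line reported in `suspects` is a candidate for moving to the `LMHOSTS` file instead.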
12.2.7.3 Long and short hostnames
Where the long (FQDN) form of a hostname works but the short name doesn't (for example,
client.example.com
works but
client
doesn't), consider the following:
-
DNS
-
This usually indicates that there is no default domain in which to look up the short names. Look for a
domain
line in
/etc/resolv.conf
on the Samba server with your domain in it, or look for a
search
line with one or more domains in it. One or the other might need to be present to make short names usable; which one depends on the vendor and version of the DNS resolver. Try adding
domain
your_domain
to
resolv.conf
, and ask your network or DNS administrator what should be in the file.
-
Broadcast/WINS
-
Broadcast/WINS doesn't support long names; it won't suffer from this problem.
-
NIS
-
Try the command
ypmatch
hostname
hosts
. If you don't get a match, your tables don't include short names. Speak to your network manager; short names might be missing by
accident
or might be unsupported as a matter of policy. Some sites don't ever use (ambiguous) short names.
-
NIS+
-
Try
nismatch
hostname
hosts
, and treat failure exactly as with NIS.
-
hosts
-
If the short name is not in
/etc/hosts
, consider adding it as an alias. Avoid, if you can, short names as primary names (the first one on a line). Have them as aliases if your system
permits
.
-
LMHOSTS
-
LAN Manager doesn't support long names, so it won't suffer from this problem.
On the other hand, if the short form of the name works and the long form doesn't, consider the following:
-
DNS
-
This is bizarre; see your network or DNS administrator, as this is probably a DNS setup error.
-
Broadcast/WINS
-
This is normal; Broadcast/WINS can't use the long form. Optionally, consider DNS. (Be aware that Microsoft has stated that it will eventually switch entirely to DNS, even though DNS does not provide name types such as <00>.)
-
NIS
-
If you can use
ypmatch
to look up the short form but not the long, consider adding the long form to the table as at least an alias.
-
NIS+
-
Same as NIS, except you use
nismatch
instead of
ypmatch
to look up names.
-
hosts and HOSTS
-
Add the long name as at least an alias, and preferably as the primary form. Also consider using DNS if it's practical.
-
LMHOSTS
-
This is normal. LAN Manager can't use the long form; consider switching to DNS or
hosts
.
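For the DNS case, the question of whether short names can work at all comes down to what is in `resolv.conf`. The following sketch reads the `nameserver`, `domain`, and `search` lines out of the file's text; the function names are ours, and the code assumes the common layout of one keyword per line with `#` or `;` starting a comment line.

```python
def resolver_settings(resolv_conf_text):
    """Collect nameserver, domain, and search entries from resolv.conf text."""
    settings = {"nameserver": [], "domain": [], "search": []}
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue                           # blank line or comment
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] in settings:
            settings[fields[0]].extend(fields[1:])
    return settings

def short_names_usable(settings):
    """Short names need a default domain or a search list to be completed."""
    return bool(settings["domain"] or settings["search"])
```

If `short_names_usable` returns False for your server's file, that is consistent with long names resolving while short ones fail.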
12.2.7.4 Unusual delays
When there is a long delay before the expected result:
-
DNS
-
Test the same name with the
nslookup
command on the system that is slow (client or server). If
nslookup
is also slow, you have a DNS problem. If it's slower on a client, you might have too many protocols bound to the Ethernet card. Eliminate NetBEUI, which is infamously slow, and, optionally, Novell, assuming you don't need them. This is especially important on Windows 95, which is particularly sensitive to excess protocols.
-
Broadcast/WINS
-
Test the client using
nmblookup
; if it's faster, you probably have the protocols problem as mentioned in the previous item.
-
NIS
-
Try
ypmatch
; if it's slow, report the problem to your network manager.
-
NIS+
-
Try
nismatch
, similarly.
-
hosts and HOSTS
-
The
hosts
files, if of reasonable
size
, are always fast. You probably have the protocols problem mentioned previously under DNS.
-
lmhosts and LMHOSTS
-
This is not a name lookup problem;
LMHOSTS
files are as fast as
hosts
and
HOSTS
files.
12.2.7.5 Localhost issues
When localhost isn't 127.0.0.1, try the following:
-
DNS
-
There is probably no record for
localhost
.
A
127.0.0.1
. Arrange to add one, as well as a reverse entry,
1.0.0.127.IN-ADDR.ARPA
PTR
localhost
.
-
Broadcast/WINS
-
Not
applicable
.
-
NIS
-
If
localhost
isn't in the table, add it.
-
NIS+
-
If
localhost
isn't in the table, add it.
-
hosts and HOSTS
-
Add a line that says
127.0.0.1
localhost
.
-
LMHOSTS
-
Not applicable.
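A quick way to see what the system's own resolver returns for `localhost` is to ask it from a script. This is a minimal sketch: the function name is ours, and the `resolve` parameter exists only so the check can be exercised with a stand-in resolver; by default it uses Python's standard `socket.gethostbyname`, which follows whatever name services the Unix system is configured to use.

```python
import socket

def localhost_is_loopback(resolve=socket.gethostbyname):
    """Return True if the name localhost resolves to 127.0.0.1."""
    try:
        return resolve("localhost") == "127.0.0.1"
    except OSError:
        # Lookup failed outright (socket.gaierror is an OSError subclass).
        return False
```

If this returns False on the server, work through the list above for whichever name service the system consults first.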
12.2.8 Troubleshooting Network Addresses
A number of common problems are caused by incorrect routing of Internet addresses or by the incorrect assignment of addresses. This section helps you determine what your addresses are.
12.2.8.1 Netmasks
Using the netmask, it is possible to determine which addresses can be reached directly (i.e., which are on the local network) and which addresses require forwarding packets through a router. If the netmask is wrong, the systems will make one of two mistakes. One is to route local packets via a router, which is an expensive waste of time: it might work reasonably fast, it might run slowly, or it might fail utterly. The second mistake is to fail to send packets destined for a remote system to the router, which prevents them from being forwarded to that system.
The netmask is a number like an IP address, with one-bits for the network part of an address and zero-bits for the host portion. It is used as a bitmask to mask off
parts
of the address inside the TCP/IP code. If the mask is 255.255.0.0, the first 2 bytes are the network part and the last 2 are the host part. More common is 255.255.255.0, in which the first 3 bytes are the network part and the last one is the host part.
For example, let's say your IP address is 192.168.0.10 and the Samba server is 192.168.236.86. If your netmask happens to be 255.255.255.0, the network part of the address is the first 3 bytes, and the host part is the last byte. In this case, the network parts are different, and the systems are on different networks:
| Network part | Host part |
| 192 168 000 | 10 |
| 192 168 236 | 86 |
If your netmask happens to be 255.255.0.0, the network part is just the first 2 bytes. In this case, the network parts match, and so the two systems are on the same network:
| Network part | Host part |
| 192 168 | 000 10 |
| 192 168 | 236 86 |
Make sure the netmask in use on each system matches the structure of your network. On every subnet, the netmask should be identical on each system.
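The comparison in the two tables above is a bitwise AND of each address with the netmask. Here is a minimal sketch using Python's standard `ipaddress` module; the function names are ours, chosen for illustration.

```python
import ipaddress

def network_part(address, netmask):
    """AND the address with the netmask, leaving only the network part."""
    masked = int(ipaddress.IPv4Address(address)) & int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(masked))

def same_network(addr_a, addr_b, netmask):
    """True if both addresses have the same network part under this netmask."""
    return network_part(addr_a, netmask) == network_part(addr_b, netmask)
```

With the addresses from the example, `same_network("192.168.0.10", "192.168.236.86", "255.255.255.0")` is False, while the same call with a netmask of `"255.255.0.0"` is True.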
12.2.8.2 Broadcast addresses
The broadcast address is a normal address, with the host part all one-bits. It means "all hosts on your network." You can compute it easily from your netmask and address: take the address and put one-bits in it for all the bits that are zero at the end of the netmask (the host part). The following table illustrates this:
| | Network part | Host part |
| IP address | 192 168 236 | 86 |
| Netmask | 255 255 255 | 000 |
| Broadcast | 192 168 236 | 255 |
In this example, the broadcast address on the 192.168.236 network is 192.168.236.255. There is also an old "universal" broadcast address, 255.255.255.255. Routers are prohibited from forwarding these, but most systems on your local network will respond to broadcasts to this address.
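The same computation can be done with a few bit operations. The sketch below uses Python's standard `ipaddress` module to convert between dotted notation and integers; the function name is our own.

```python
import ipaddress

def broadcast_address(address, netmask):
    """Set every host bit (the zero-bits of the netmask) to one."""
    addr = int(ipaddress.IPv4Address(address))
    mask = int(ipaddress.IPv4Address(netmask))
    host_bits = ~mask & 0xFFFFFFFF      # one-bits wherever the netmask has zero-bits
    return str(ipaddress.IPv4Address(addr | host_bits))
```

Calling `broadcast_address("192.168.236.86", "255.255.255.0")` gives `192.168.236.255`, matching the table.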
12.2.8.3 Network address ranges
A number of address ranges have been reserved for testing and for nonconnected networks; we use these for the examples in this book. If you don't have an address yet, feel free to use one of these to start. They include one class A network, 10.*.*.*, a range of class B network addresses, 172.16.*.* through 172.31.*.*, and 256 class C networks, 192.168.0.* through 192.168.255.*. The domain
example.com
is also reserved for unconnected networks, explanatory examples, and books.
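To check whether an address falls in one of these reserved ranges, you can test it against the corresponding blocks. The sketch below assumes the ranges meant here are the private-address blocks of RFC 1918, written in CIDR notation; the constant and function names are ours.

```python
import ipaddress

# Assumed to correspond to the reserved ranges described above (RFC 1918).
PRIVATE_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_reserved_private(address):
    """True if the address lies in one of the reserved private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)
```

The server address used throughout this chapter, 192.168.236.86, falls in the last block.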
If you're actually connecting to the Internet, you'll need to get an appropriate IP address and a domain name, probably through the same company that provides your connection.
12.2.8.4 Finding your network address
If you haven't recorded your IP address, you can learn it through the
ifconfig
command on Unix or the
ipconfig
command on Windows. (Check your manual pages for any options required by your brand of Unix. For example,
ifconfig
-a
works on Solaris.) You should see output similar to the following:
$ ifconfig -a
le0: flags=63<UP,BROADCAST,NOTRAILERS,RUNNING>
        inet 192.168.236.11 netmask ffffff00 broadcast 192.168.236.255
lo0: flags=49<UP,LOOPBACK,RUNNING>
        inet 127.0.0.1 netmask ff000000
One of the interfaces will be loopback (in our examples, lo0), and the other will be the regular IP interface. The flags should show that the interface is running, and Ethernet interfaces will also say they support broadcasts (PPP interfaces don't). The other places to look for IP addresses are /etc/hosts files, Windows HOSTS files, Windows LMHOSTS files, NIS, NIS+, and DNS.
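Solaris-style ifconfig output reports the netmask in hexadecimal, which is easy to misread. The following Python sketch pulls the address, netmask, and flags out of output shaped like the listing above and converts the netmask to dotted form. The sample text is a tidied copy of that listing; the function names are our own, and the regular expressions assume this particular output layout, which differs between Unix variants.

```python
import ipaddress
import re

SAMPLE = """\
le0: flags=63<UP,BROADCAST,NOTRAILERS,RUNNING>
        inet 192.168.236.11 netmask ffffff00 broadcast 192.168.236.255
lo0: flags=49<UP,LOOPBACK,RUNNING>
        inet 127.0.0.1 netmask ff000000
"""

def hex_netmask_to_dotted(hex_mask: str) -> str:
    """Convert a hexadecimal netmask such as ffffff00 to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(hex_mask, 16)))

def parse_ifconfig(text: str) -> dict:
    """Map each interface name to its flags, address, and dotted netmask."""
    interfaces = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(\S+): flags=\d+<([^>]*)>", line)
        if header:
            current = header.group(1)
            interfaces[current] = {"flags": header.group(2).split(",")}
            continue
        inet = re.search(r"inet (\S+) netmask ([0-9a-fA-F]+)", line)
        if inet and current:
            interfaces[current]["address"] = inet.group(1)
            interfaces[current]["netmask"] = hex_netmask_to_dotted(inet.group(2))
    return interfaces

for name, info in parse_ifconfig(SAMPLE).items():
    kind = "loopback" if "LOOPBACK" in info["flags"] else "regular"
    print(name, kind, info["address"], info["netmask"])
```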
12.2.9 Troubleshooting NetBIOS Names
Historically, SMB protocols have depended on the NetBIOS name system, also called the LAN Manager name system. This was a simple scheme where each system had a unique name of up to 15 characters and broadcast it on the LAN for everyone to know. With TCP/IP, we tend to use names such as client.example.com, stored in /etc/hosts files or resolved through DNS or WINS.
The usual mapping of domain names such as server.example.com to NetBIOS names simply uses the server part as the NetBIOS name and converts it to uppercase. Alas, this doesn't always work, especially if the host part of the name is longer than 15 characters, and not everyone uses the same NetBIOS and DNS names. For example, corpvm1 along with vm1.corp.com is not unusual.
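The usual mapping can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the length check assumes the 15-character limit on usable NetBIOS names (the sixteenth byte of the name is reserved for a type suffix).

```python
MAX_NETBIOS_NAME_LENGTH = 15  # assumption: 16-byte names, last byte is a type suffix

def default_netbios_name(dns_name: str) -> str:
    """Take the first label of a DNS name and convert it to uppercase."""
    host_part = dns_name.split(".", 1)[0]
    return host_part.upper()

def fits_netbios(name: str) -> bool:
    """Return True if the name is non-empty and short enough for NetBIOS."""
    return 0 < len(name) <= MAX_NETBIOS_NAME_LENGTH

for dns_name in ("server.example.com", "vm1.corp.com"):
    name = default_netbios_name(dns_name)
    print(dns_name, "->", name, fits_netbios(name))
```

For vm1.corp.com the default mapping yields VM1, so a system whose NetBIOS name was actually set to CORPVM1 would not be found under the name you would guess from DNS.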
A system with a different NetBIOS name and domain name is confusing when you're troubleshooting; we recommend that you try to avoid this wherever possible. NetBIOS names are discoverable with smbclient:
- If you can list shares on your Samba server with smbclient -L short_name, the short name is the NetBIOS name.
- If you get Get_Hostbyname: Unknown host name, there is probably a mismatch. Check in the smb.conf file to see if the NetBIOS name is explicitly set.
- Try to list shares again, specifying -I and the IP address of the Samba server (e.g., smbclient -L server -I 192.168.236.86). This overrides the name lookup and forces the packets to go to the IP address. If this works, there was a mismatch.
- Try with -I and the full domain name of the server (e.g., smbclient -L server -I server.example.com). This tests the lookup of the domain name, using whatever scheme the Samba server uses (e.g., DNS). If it fails, you have a name service problem. You should reread the earlier section, Section 12.2.7, after you finish troubleshooting the NetBIOS names.
- Try with the -n (NetBIOS name) option, giving it the name you expect to work (e.g., smbclient -n server -L server-12), but without overriding the IP address through -I. If this works, the name you specified with -n is the actual NetBIOS name of the server. If you receive Get_Hostbyname: Unknown host SERVER, it's not the right server yet.
- If nothing is working so far, repeat the tests specifying -U username and -W workgroup, with the username and workgroup in uppercase, to make sure you're not being derailed by a user or workgroup mismatch.
- If still nothing works and you had evidence of a name service problem, troubleshoot the name service (see the earlier section, Section 12.2.7) and then return to the NetBIOS name service.
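If you find yourself repeating these checks, it can help to generate the command lines in order so that none is skipped. The following Python sketch only builds and prints the smbclient invocations described in the list above; it does not run them. The function name `smbclient_checks` and its parameters are our own, and only the options named in the text (-L, -I, -n, -U, -W) are used.

```python
import shlex

def smbclient_checks(short_name, server_ip, server_fqdn,
                     expected_name=None, username=None, workgroup=None):
    """Build (description, argv) pairs for the checks in the list above."""
    checks = [
        ("list shares by short name",
         ["smbclient", "-L", short_name]),
        ("bypass name lookup with the server's IP address",
         ["smbclient", "-L", short_name, "-I", server_ip]),
        ("resolve the server's full domain name",
         ["smbclient", "-L", short_name, "-I", server_fqdn]),
    ]
    if expected_name:
        checks.append(("try the NetBIOS name you expect to work",
                       ["smbclient", "-n", expected_name, "-L", short_name]))
    if username and workgroup:
        # The text suggests giving the username and workgroup in uppercase.
        extra = ["-U", username.upper(), "-W", workgroup.upper()]
        checks += [(desc + " (explicit user and workgroup)", argv + extra)
                   for desc, argv in list(checks)]
    return checks

for description, argv in smbclient_checks(
        "server", "192.168.236.86", "server.example.com"):
    print(f"{description}: {shlex.join(argv)}")
```

Run each printed command by hand and stop at the first one whose result changes from the previous step; that transition tells you which of the cases in the list applies.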
|