Domain Join Problems
Troubleshoot and solve domain-join problems
This topic covers how to resolve domain-join problems.
Top 10 reasons domain-join fails
Top reasons for domainjoin-cli failures:
- Root or sudo was not used to run the domain-join command.
- The user name or password of the account used to join the domain is incorrect.
- The join account lacks permissions to create or move the computer object.
Note
For more information, see Delegate domain join permissions.
- The name of the domain, OU, or user is mistyped.
- The local hostname is invalid.
- Incorrect /etc/hosts file entries for DNS servers or entries overriding DNS results.
- Incorrect or missing domain controller forward and reverse address (PTR) records. Verify that the nslookup results for the FQDN and the IP address match.
- The domain controller is unreachable from the client because of a firewall or because the NTP service is not running on the domain controller.
Note
For more information, see Diagnose NTP on port 123.
- A local time service is running and is not synced to the DC. Disable the service or use the domainjoin-cli --notimesync option.
- SELinux is set to either enforcing or permissive on a system without a prebuilt policy. Use audit2allow to build a new policy.
Note
To turn off SELinux, see the SELinux man page.
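Two of the causes above, a missing root context and an invalid local hostname, can be ruled out before running the join. The following is a minimal sketch and is not part of AD Bridge. The function names are hypothetical, and the hostname rule (letters, digits, and hyphens only, and not localhost) is an assumption about what counts as a usable name.

```shell
#!/bin/sh
# Hypothetical pre-join checks; these helpers are not part of AD Bridge.

# Succeeds when the given numeric user ID is 0 (root).
is_root_uid() {
    [ "$1" -eq 0 ]
}

# Succeeds when the short hostname looks usable: non-empty, not "localhost",
# no leading or trailing hyphen, and only letters, digits, and hyphens.
# This rule is an assumption for illustration.
hostname_looks_valid() {
    name=${1%%.*}
    case "$name" in
        ""|localhost|-*|*-) return 1 ;;
        *[!A-Za-z0-9-]*) return 1 ;;
    esac
    return 0
}

# Example use:
#   is_root_uid "$(id -u)" || echo "run the join as root or with sudo"
#   hostname_looks_valid "$(hostname)" || echo "fix the local hostname first"
```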
Solve domain-join problems
To troubleshoot problems with joining a Linux computer to a domain, perform the following series of diagnostic tests sequentially on the Linux computer with a root account.
The tests can also be used to troubleshoot domain-join problems on a Unix computer; however, the syntax of the commands on Unix might be slightly different.
The procedures in this topic assume that you have already checked whether the problem falls under the Top 10 Reasons Domain Join Fails (see above). We also recommend that you generate a domain-join log.
Note
For more information, see Generate debug logs for AD Bridge services.
Verify that the name server can find the domain
Run the following command as root:
nslookup YourADrootDomain.com
Ensure the client can reach the domain controller
You can verify that your computer can reach the domain controller by pinging it:
ping YourDomainName
Check DNS connectivity
The computer might be using the wrong DNS server or none at all. Make sure the nameserver entry in /etc/resolv.conf contains the IP address of a DNS server that can resolve the name of the domain you are trying to join. The IP address is likely to be that of one of your domain controllers.
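As a quick way to see which DNS servers the computer is configured to use, the nameserver entries can be pulled out of the file. This is a minimal sketch; list_nameservers is a hypothetical helper name, not an AD Bridge command.

```shell
#!/bin/sh
# Hypothetical helper: print the nameserver addresses from a
# resolv.conf-style file. Defaults to /etc/resolv.conf.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example use: query each configured server for the domain in turn.
#   for ns in $(list_nameservers); do
#       nslookup YourADrootDomain.com "$ns"
#   done
```

If the function prints nothing, no nameserver is configured. If a listed server cannot resolve the domain, that entry is a likely cause of the failure.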
Ensure nsswitch.conf is configured to check DNS for host names
The /etc/nsswitch.conf file must contain the following line. (On AIX, the file is /etc/netsvc.conf.)
hosts: files dns
Computers running Solaris, in particular, may not contain this line in nsswitch.conf until you add it.
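The check for the hosts line can be scripted. The sketch below assumes the nsswitch.conf layout shown above; it does not handle the different syntax of /etc/netsvc.conf on AIX. The name hosts_uses_dns is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the "hosts:" line of an
# nsswitch.conf-style file lists dns as one of its sources.
hosts_uses_dns() {
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) if ($i == "dns") found = 1 }
         END { exit (found ? 0 : 1) }' "${1:-/etc/nsswitch.conf}"
}

# Example use:
#   hosts_uses_dns || echo "add dns to the hosts line in /etc/nsswitch.conf"
```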
Ensure that DNS queries use the correct network interface card
If the computer is multi-homed, the DNS queries might be going out the wrong network interface card.
Temporarily disable all the NICs except for the card on the same subnet as your domain controller or DNS server and then test DNS lookups to the AD domain.
If this works, re-enable all the NICs and edit the local or network routing tables so that the AD domain controllers are accessible from the host.
Determine if DNS server is configured to return SRV records
Your DNS server must be set to return SRV records so the domain controller can be located. Non-Windows DNS servers, such as BIND, are often not configured to return SRV records.
Diagnose it by executing the following command:
nslookup -q=srv _ldap._tcp.ADdomainToJoin.com
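To make the result easier to read, the SRV answers can be reduced to host and port pairs. This sketch assumes the answer format printed by the BIND nslookup on Linux, where each record appears as "name service = priority weight port target"; other nslookup builds may print a different layout. The name srv_targets is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read nslookup SRV output on stdin and print one
# "target:port" pair per record. Assumes BIND-style answer lines such as:
#   _ldap._tcp.example.com  service = 0 100 389 sales-dc.example.com.
srv_targets() {
    awk '$2 == "service" && $3 == "=" { sub(/\.$/, "", $7); print $7 ":" $6 }'
}

# Example use:
#   nslookup -q=srv _ldap._tcp.ADdomainToJoin.com | srv_targets
```

Empty output suggests that the DNS server returned no SRV records for the domain.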
Determine if DNS server is configured to return PTR records
Your DNS server must be set to return PTR records so GSS can work properly.
Diagnose it by executing the following command:
nslookup DOMAINCONTROLLERIPADDRESS
Ensure that the Global Catalog is accessible
The global catalog for Active Directory must be accessible. A global catalog in a different zone might not show up in DNS. Diagnose it by executing the following command:
nslookup -q=srv _ldap._tcp.gc._msdcs.ADrootDomain.com
From the list of IP addresses in the results, choose one or more addresses and test whether they are accessible on Port 3268 using telnet.
telnet 192.168.100.20 3268
Trying 192.168.100.20...
Connected to sales-dc.example.com (192.168.100.20).
Escape character is '^]'.
Press the Enter key to close the connection:
Connection closed by foreign host.
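On systems where telnet is not installed, the same reachability test can be approximated from the shell. This is a sketch under two assumptions: bash is available with its /dev/tcp redirection enabled, and the coreutils timeout command is present. The name tcp_port_open is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a TCP connection to host ($1) on
# port ($2) opens within 3 seconds. Uses bash's /dev/tcp redirection
# and the coreutils timeout command.
tcp_port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example use with the address from the telnet test:
#   tcp_port_open 192.168.100.20 3268 && echo "global catalog port reachable"
```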
Verify that the client can connect to the domain on Port 123
The following test checks whether the client can connect to the domain controller on Port 123 and whether the Network Time Protocol (NTP) service is running on the domain controller. For the client to join the domain, NTP, the Windows time service, must be running on the domain controller.
On a Linux computer, run the following command as root:
ntpdate -d -u DC_hostname
Example
ntpdate -d -u sales-dc
Note
For more information, see Diagnose NTP on port 123.
In addition, check the logs on the domain controller for errors from the source named w32tm, which is the Windows time service.
Ignore inaccessible trusts
An inaccessible trust can block you from successfully joining a domain. If you know that there are inaccessible trusts in your Active Directory network, you can set AD Bridge to ignore all the trusts before you try to join a domain. To do so, use the config tool to modify the values of the DomainManagerIgnoreAllTrusts setting.
- List the available trust settings:
/opt/pbis/bin/config --list | grep -i trust
The results will look something like this. The setting at issue is DomainManagerIgnoreAllTrusts.
DomainManagerIgnoreAllTrusts
DomainManagerIncludeTrustsList
DomainManagerExcludeTrustsList
- List the details of the DomainManagerIgnoreAllTrusts setting to see the values it accepts:
[root@rhel5d bin]# ./config --details DomainManagerIgnoreAllTrusts
Name: DomainManagerIgnoreAllTrusts
Description: When true, ignore all trusts during domain enumeration.
Type: boolean
Current Value: false
Accepted Values: true, false
Current Value is determined by local policy.
- Change the setting to true so that AD Bridge will ignore trusts when you try to join a domain.
[root@rhel5d bin]# ./config DomainManagerIgnoreAllTrusts true
- Check to make sure the change took effect:
[root@rhel5d bin]# ./config --show DomainManagerIgnoreAllTrusts
boolean
true
local policy
Now try to join the domain again. If the join succeeds, keep in mind that only users and groups in the local domain will be able to log on to the computer.
In the example output above that shows the setting's current values, local policy is listed, meaning that the setting is managed locally through config because an AD Bridge Group Policy setting is not managing the setting. Typically, with AD Bridge, you would manage the DomainManagerIgnoreAllTrusts setting by using the corresponding Group Policy setting, but you cannot apply Group Policy Objects (GPOs) to the computer until after it is added to the domain. The corresponding AD Bridge policy setting is named Lsass: Ignore all trusts during domain enumeration.
For information on the arguments of config, run the following command:
/opt/pbis/bin/config --help
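The check-then-set sequence above can be scripted so that the setting is changed only when needed. This sketch assumes that config --show prints the three-line layout shown in the example output (type, value, source). The name config_value is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `config --show` output on stdin and print the
# current value, assumed to be the second line (type, value, source).
config_value() {
    sed -n '2p'
}

# Example use, run as root:
#   current=$(/opt/pbis/bin/config --show DomainManagerIgnoreAllTrusts | config_value)
#   [ "$current" = "true" ] || /opt/pbis/bin/config DomainManagerIgnoreAllTrusts true
```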
Resolve common error messages
This section lists solutions to common errors that can occur when you try to join a domain.
Configuration of krb5
Error message:
Warning: A resumable error occurred while processing a module.
Even though the configuration of 'krb5' was executed, the configuration did not
fully complete. Please contact BeyondTrust support.
Solution:
Back up the current krb5 files and try to join the domain again.
mv /etc/krb5.conf /etc/krb5.conf.old
mv /etc/krb5.conf.d/krb5.conf /etc/krb5.conf.d/krb5.conf.old
mv /etc/krb5.keytab /etc/krb5.keytab.old
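Not every system has all three krb5 files, and mv reports an error for any file that is absent. The following sketch renames only the files that exist. The name backup_if_present is hypothetical, and the .old suffix follows the convention used above.

```shell
#!/bin/sh
# Hypothetical helper: rename each listed file to <name>.old,
# silently skipping files that do not exist.
backup_if_present() {
    for f in "$@"; do
        if [ -e "$f" ]; then
            mv "$f" "$f.old"
        fi
    done
}

# Example use, run as root:
#   backup_if_present /etc/krb5.conf /etc/krb5.conf.d/krb5.conf /etc/krb5.keytab
```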
Chkconfig failed
This error can occur when you try to join a domain or you try to execute the domain-join command with an option but the netlogond daemon is not already running.
Error message:
Error: chkconfig failed [code 0x00080019]
Description: An error occurred while using chkconfig to process the netlogond daemon, which must be added to the list of processes to start when the computer is rebooted. The problem may be caused by startup scripts in the /etc/rc.d/ tree that are not LSB-compliant.
Verification: Running the following command as root can provide information about the error:
chkconfig --add netlogond
Solution:
Remove startup scripts that are not LSB-compliant from the /etc/rc.d/ tree.
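One way to find candidate scripts is to look for init scripts that lack the LSB comment block that begins with "### BEGIN INIT INFO". This is a rough sketch: a missing header is only one sign of non-compliance, and the directory to scan (for example /etc/rc.d/init.d) is an assumption that depends on the distribution. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the regular files in directory $1 that do not
# contain an LSB "### BEGIN INIT INFO" header line.
list_scripts_without_lsb_header() {
    for script in "$1"/*; do
        [ -f "$script" ] || continue
        grep -q '^### BEGIN INIT INFO' "$script" || printf '%s\n' "$script"
    done
}

# Example use (directory is an assumption):
#   list_scripts_without_lsb_header /etc/rc.d/init.d
```

Review each reported script before removing it, because the header check alone does not prove that a script is the cause of the chkconfig error.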
Replication issues
The following error might occur if there are replication delays in your environment. A replication delay might occur when the client is in the same site as an RODC.
Error message:
Error: LW_ERROR_KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN [code 0x0000a309]
Client not found in Kerberos database
[root@rhel6-1 ~]# echo $?
1
[root@rhel6-1 ~]# /opt/pbis/bin/domainjoin-cli query
Error: LW_ERROR_KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN [code 0x0000a309]
Client not found in Kerberos database
Solution:
After the error occurs, wait 15 minutes, and then run the following command to restart AD Bridge:
/opt/pbis/bin/lwsm restart lwreg
Diagnose NTP on port 123
When you use the AD Bridge domain-join utility to join a Linux or Unix client to a domain, the utility might be unable to contact the domain controller on Port 123 with UDP. The AD Bridge agent requires that Port 123 be open on the client so that it can receive NTP data from the domain controller. In addition, the time service must be running on the domain controller.
You can diagnose NTP connectivity by executing the following command as root at the shell prompt of your Linux computer:
ntpdate -d -u DC_hostname
Example
ntpdate -d -u sales-dc
If all is well, the result should look like this:
[root@rhel44id ~]# ntpdate -d -u sales-dc
2 May 14:19:20 ntpdate[20232]: ntpdate 4.2.0a@1.1190-r Thu Apr 20 11:28:37 EDT 2006 (1)
Looking for host sales-dc and service ntp
host found : sales-dc.example.com
transmit(192.168.100.20)
receive(192.168.100.20)
transmit(192.168.100.20)
receive(192.168.100.20)
transmit(192.168.100.20)
receive(192.168.100.20)
transmit(192.168.100.20)
receive(192.168.100.20)
transmit(192.168.100.20)
server 192.168.100.20, port 123
stratum 1, precision -6, leap 00, trust 000
refid [LOCL], delay 0.04173, dispersion 0.00182
transmitted 4, in filter 4
reference time: cbc5d3b8.b7439581 Fri, May 2 2008 10:54:00.715
originate timestamp: cbc603d8.df333333 Fri, May 2 2008 14:19:20.871
transmit timestamp: cbc603d8.dda43782 Fri, May 2 2008 14:19:20.865
filter delay: 0.04207 0.04173 0.04335 0.04178
0.00000 0.00000 0.00000 0.00000
filter offset: 0.009522 0.008734 0.007347 0.005818
0.000000 0.000000 0.000000 0.000000
delay 0.04173, dispersion 0.00182
offset 0.008734
2 May 14:19:20 ntpdate[20232]: adjust time server 192.168.100.20 offset 0.008734 sec
Output when there is no NTP service
If the domain controller is not running NTP on Port 123, the command returns a response such as no server suitable for synchronization found, as in the following output:
5 May 16:00:41 ntpdate[8557]: ntpdate 4.2.0a@1.1190-r Thu Apr 20 11:28:37 EDT 2006 (1)
Looking for host RHEL44ID and service ntp
host found : rhel44id.example.com
transmit(127.0.0.1)
transmit(127.0.0.1)
transmit(127.0.0.1)
transmit(127.0.0.1)
transmit(127.0.0.1)
127.0.0.1: Server dropped: no data
server 127.0.0.1, port 123
stratum 0, precision 0, leap 00, trust 000
refid [127.0.0.1], delay 0.00000, dispersion 64.00000
transmitted 4, in filter 4
reference time: 00000000.00000000 Wed, Feb 6 2036 22:28:16.000
originate timestamp: 00000000.00000000 Wed, Feb 6 2036 22:28:16.000
transmit timestamp: cbca101c.914a2b9d Mon, May 5 2008 16:00:44.567
filter delay: 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
filter offset: 0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000
delay 0.00000, dispersion 64.00000
offset 0.000000
5 May 16:00:45 ntpdate[8557]: no server suitable for synchronization found
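The two outcomes shown above can be told apart automatically by looking at the final status line. This is a minimal sketch that keys on the message text in the sample output; the "step time server" variant is included on the assumption that ntpdate prints it for larger offsets. The name ntp_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ntpdate -d output on stdin and print
#   ok       when a server answered (adjust/step time server line seen)
#   no-ntp   when ntpdate reports no suitable server
#   unknown  when neither message appears
ntp_status() {
    awk '/no server suitable for synchronization found/ { bad = 1 }
         /(adjust|step) time server/ { good = 1 }
         END {
             if (bad) print "no-ntp"
             else if (good) print "ok"
             else print "unknown"
         }'
}

# Example use, run as root:
#   ntpdate -d -u sales-dc 2>&1 | ntp_status
```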
Turn off Apache to join a domain
The Apache web server locks the keytab file, which can block an attempt to join a domain. If the computer is running Apache, stop Apache, join the domain, and then restart Apache.