Building LDAP Filters | Searching with the DirectorySearcher

The filter in an LDAP query restricts the objects that the search will return. It is the equivalent of the WHERE clause in the ubiquitous SQL SELECT statement in the RDBMS world. For example, we could use a filter to limit a search of an entire Active Directory domain naming context to find a specific user's object by their login name.

We rarely want to search for every object in the directory or within a certain scope, so it is critical to learn to use LDAP filters effectively.

The grammar and rules for LDAP filters are part of the LDAP version 3 specification and are defined in RFC 2254 (since superseded by RFC 4515). The syntax is quite simple compared to that of many other query languages, and we can be productive creating LDAP filters in just a few minutes.

We specify the DirectorySearcher's filter using its Filter property. If no value is specified, the default (objectClass=*) filter is used. This filter matches any object and is similar to using SELECT * in the SQL world.

Basic Syntax

At their most basic level, filters are composed of what are called filter comparisons or components. Here is the basic format for any individual comparison:

(<attribute name><filter type><attribute value>)

For example:

(displayName=John Doe)

In this example, the attribute name is displayName, the filter type is =, and the attribute value is John Doe. The surrounding parentheses, (), are required to delimit an individual filter component.

To create a complex filter, individual components are composed into a filter list using logical AND, OR, and NOT operations (specified as &, |, and !). For a filter with two components that must both be true, the filter list might look like this:

(&(displayName=John Doe)(telephoneNumber=5551212))

Here the two comparisons are nested inside an & operation, so both must be true to satisfy the filter. This nesting can be done to arbitrary depths to create some very complex expressions. Here is a more complicated example.

(|(&(displayName=John Doe)(telephoneNumber=5551212))
(&(displayName=Mike Smith)(telephoneNumber=5551000)))

This filter requires that the object have either a displayName of John Doe and a phone number of 5551212, or a displayName of Mike Smith and a phone number of 5551000.
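
The nesting rules above can be sketched as a few string helpers. This is a minimal sketch rather than anything provided by System.DirectoryServices; the FilterBuilder class and its And, Or, and Not methods are hypothetical names introduced only for illustration, and each argument is assumed to be an already well-formed component, parentheses included.

```csharp
using System;

// Hypothetical helpers for illustration; not part of System.DirectoryServices.
public static class FilterBuilder
{
    // All components must match: (&(a)(b)...)
    public static string And(params string[] components)
    {
        return "(&" + string.Concat(components) + ")";
    }

    // At least one component must match: (|(a)(b)...)
    public static string Or(params string[] components)
    {
        return "(|" + string.Concat(components) + ")";
    }

    // NOT applies to exactly one component: (!(a))
    public static string Not(string component)
    {
        return "(!" + component + ")";
    }
}
```

Calling FilterBuilder.Or with two FilterBuilder.And results, one for each name and phone number pair, produces the same filter as the example above on a single line.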

Filter Types

The comparison itself is often called the filter type, and there are seven valid filter types, as listed in Table 4.1.

Table 4.1. Filter Types
Filter Type Symbol	Filter Type Description
`=`	Equal
`~=`	Approximately equal
`>=`	Greater than or equal
`<=`	Less than or equal
`attrib:matchingrule`	Extensible
`=*`	Presence
`= [initial] any [final]`	Substring

Equal

This filter type is the most straightforward. It simply checks for equality and can be used with all attribute types if the value part can be interpreted correctly. Here are some examples:

(sn=dunn) Last Name equals "dunn"
(isDeleted=TRUE) Boolean equality

Approximately Equal

This filter type is intended to indicate that a value is approximately equal to the actual value. In practice, this filter type does not seem to produce predictable results and it is not used frequently.

Greater Than or Equal

This filter type checks whether the attribute's value is greater than or equal to the value supplied in the filter. For instance:

(lockoutTime>=1)

Note that not all attribute syntaxes may use this filter type, although it is generally available to strings, dates, and numbers.

Less Than or Equal

This filter type is similar to >= and has the same restrictions on usage. Note that LDAP filters do not support the simple > and < semantics. The = logic is always included. As such, we must sometimes remember to add or subtract one from our comparisons.
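
For integer attributes, the add-or-subtract-one adjustment can be sketched as follows. The StrictComparison class and its method names are hypothetical, introduced here only to illustrate the idea; they are not part of any library.

```csharp
using System.Globalization;

// Hypothetical helper for illustration; not part of System.DirectoryServices.
public static class StrictComparison
{
    // attribute > value, written as attribute >= value + 1
    public static string GreaterThan(string attribute, long value)
    {
        // checked() throws OverflowException instead of wrapping at long.MaxValue
        return string.Format(
            CultureInfo.InvariantCulture,
            "({0}>={1})", attribute, checked(value + 1));
    }

    // attribute < value, written as attribute <= value - 1
    public static string LessThan(string attribute, long value)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "({0}<={1})", attribute, checked(value - 1));
    }
}
```

With this sketch, StrictComparison.GreaterThan("badPwdCount", 5) yields (badPwdCount>=6). The adjustment only makes sense for whole-number syntaxes; strings and dates need a different boundary value.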

Extensible

Extensible filters allow provider-specific matching rules to be used. The rule is identified by an object identifier (OID), using syntax like this:

(userAccountControl:1.2.840.113556.1.4.803:=2)

Active Directory and ADAM define two extensible matching rules for doing bitwise AND and OR comparisons on numeric attributes. In this example, we are searching for objects that have the bit with value 2 (0x2) set, indicating in this case that the account is disabled.

Presence

This filter type allows us simply to check whether an object has a specific attribute. The presence filter type does not specify a value in the comparison. The value is not *. Instead, the whole filter type is =*. This may seem like a minor point, but a presence filter type is different from a substring filter type, even though they both use *. Please do not confuse them.

As an example, the following filter will find objects with a displayName attribute:

(displayName=*)

All attribute syntaxes may use presence filter types.

Substring

The substring filter type allows us to match part of a string using the familiar * character as a wildcard placeholder. For example, the following filter would find anyone named Frank or Frances:

(givenName=fran*)

The wildcard character may be placed anywhere in the string, including at the beginning or end, and multiple wildcard characters may be used. As such, all of these are valid as well:

(givenName=*rank)
(givenName=Fr*nk)
(givenName=F*an*)

The value must have at least one character other than the wildcard placeholder in order to differentiate a substring filter type from a presence filter type.

Not all attribute syntaxes support substring searches, but all of the standard string syntaxes do.

Substring Performance Tips

Substring searches perform best when the wildcard is at the end of the value and not at the beginning or in the middle. The server is unable to use the standard indices unless the wildcard is placed at the end of the value.

A new option introduced to Active Directory in Windows 2003 and included in ADAM allows a special "tuple" index to be built. This greatly improves the performance of substring searches where the wildcard comes at the beginning of the value. Note that these types of indices consume significantly more resources than a typical index and most attributes do not use this feature. Chapter 7 discusses how these types of indices are specified.

Reserved Characters in Values

Much as we would in any other language, we need to escape reserved characters if we wish to search for the intrinsic value itself. To escape a reserved character, we replace it with its ASCII hex value equivalent. Table 4.2 lists the reserved characters and their escape sequences. Notice that the escape sequence comprises just the \ character (indicating binary data) and the hex of the ASCII character equivalent.

Table 4.2. Reserved Characters
Character	Escape Sequence
`*`	`\2A`
`(`	`\28`
`)`	`\29`
`\`	`\5C`
`NUL`	`\00`
`/`	`\2F`

For example, we can use the following to search for an object that has the * character in its description attribute:

(description=A description with \2A)
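
When filter values come from user input, it helps to apply these escapes in one place. The following is a minimal sketch, assuming each reserved character is replaced by a backslash followed by its two-digit ASCII hex code; FilterEscaper and Escape are hypothetical names, not part of System.DirectoryServices.

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of System.DirectoryServices.
public static class FilterEscaper
{
    // Replaces each reserved character with a backslash plus its hex code.
    public static string Escape(string value)
    {
        StringBuilder sb = new StringBuilder(value.Length);

        foreach (char c in value)
        {
            switch (c)
            {
                case '*':  sb.Append(@"\2A"); break;
                case '(':  sb.Append(@"\28"); break;
                case ')':  sb.Append(@"\29"); break;
                case '\\': sb.Append(@"\5C"); break;
                case '\0': sb.Append(@"\00"); break;
                case '/':  sb.Append(@"\2F"); break;
                default:   sb.Append(c);      break;
            }
        }
        return sb.ToString();
    }
}
```

Only the value portion of a comparison should be passed through such a helper. Escaping an entire filter would also escape the parentheses and wildcards that give it structure.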

Specifying Comparison Values in Search Filters

In Chapter 1, we discussed how LDAP attributes have a defined syntax that represents the primitive data type of the attribute. The LDAP filter syntax defines the query filter syntax for each of these attribute syntaxes. We drill down on the various syntaxes in Chapter 6 extensively, but let's take a quick look at them now in terms of query filters.

Table 4.3 summarizes the valid Active Directory and ADAM attribute syntaxes and the rules governing them.

Table 4.3. Summary of Attribute Filter Syntaxes and Allowed Operators
Syntax Name	Syntax OID	Filter Description	Allows `>=` and `<=`	Allows Substring
`Object(DS-DN)`	2.5.5.1	The string version of the DN	No	No
`String(Object-Identifier)`	2.5.5.2	The string version of the OID	No	No
`String(Teletex)`	2.5.5.4	The string itself	Yes	Yes
`String(Printable)`	2.5.5.5	The string itself	Yes	Yes
`String(IA5)`	2.5.5.5	The string itself	Yes	Yes
`String(Numeric)`	2.5.5.6	The string version of the number	Yes	Yes
`Object(DN-Binary)`	2.5.5.7	The binary filter string encoding of the standard string representation of the attribute	No	No
`Boolean`	2.5.5.8	TRUE or FALSE	No	No
`Integer`	2.5.5.9	The number as a string	Yes	No
`String(Octet)`	2.5.5.10	The binary filter string encoding of the binary data	Yes	Yes
`String(UTC-Time)`	2.5.5.11	The string in the format YYYYMMDDHHMMSS.0Z	Yes	No
`String(Generalized-Time)`	2.5.5.11	The string in the format YYYYMMDDHHMMSS.0Z	Yes	No
`String(Unicode)`	2.5.5.12	The string itself	Yes	Yes
`Object(DN-String)`	2.5.5.14	The binary encoded version of the standard string representation of the attribute	No	No
`String(NT-Sec-Desc)`	2.5.5.15	nTSecurityDescriptor is an operational attribute and we cannot use it in a filter	NA	NA
`Interval/LargeInteger`	2.5.5.16	The number as a string	Yes	No
`String(Sid)`	2.5.5.17	The binary filter string encoding of the binary data	Yes	No

As we can see, different attribute syntaxes require different formatting rules for specifying the value in the LDAP filter. Additionally, not all syntaxes support every operator.

We will now dig into the details of these rules and show examples of how to format all of these different strings.

Searching for Strings

Searching for attributes with one of the standard string syntaxes is pretty straightforward. By standard strings, we mean Teletex, Printable, IA5, Numeric, and Unicode. The comparison value is just the string itself and this hardly warrants a full-blown code sample, so we just show a filter showing a search for a string value:

(displayName=John Doe)

All of these string syntaxes allow the >= and <= operators as well as substring matches.

There are a few additional details to remember.

Teletex, IA5, and Printable strings are case sensitive for searching. This means that the value in the query filter must match the attribute value case exactly.
Unicode strings are not case sensitive for searching. The case in the query filter does not matter.
Teletex, Printable, IA5, and Numeric strings have limited character sets.
Unicode strings may contain any Unicode character. UTF8 encoding should be used for non-ASCII values and some values may require escape sequences as a result.

Searching for Numbers

Creating filters for numeric attributes is trivial, as the value is just the decimal version of the number. For example:

(badPwdCount=5)

The same rules apply for normal and LargeInteger syntaxes. Note that when LargeInteger values actually represent time values, we may need to do a little bit more work to format them correctly. We cover how to do this in the upcoming section, Searching for Time Values.

Numeric values support the >= and <= operators, but do not support substring searches.

Searching for Boolean Data

Searching for Boolean data is actually very simple, yet it seems to be a constant source of confusion. The only thing to remember is that a Boolean value is case sensitive. We must use either TRUE or FALSE to represent the value for a Boolean search filter. Notice that the values are all uppercased. Thus, we can have this:

(isDeleted=TRUE)

...or this:

(isDeleted=FALSE)
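
In .NET code, the casing rule is easy to get wrong because bool.ToString() returns "True" or "False" rather than the uppercase form. A minimal sketch of a helper that sidesteps this follows; the BooleanFilter class and Build method are hypothetical names used only for illustration.

```csharp
// Hypothetical helper for illustration; not part of System.DirectoryServices.
public static class BooleanFilter
{
    public static string Build(string attribute, bool value)
    {
        // Emit the literal uppercase tokens instead of calling value.ToString(),
        // which would produce "True" or "False".
        return string.Format("({0}={1})", attribute, value ? "TRUE" : "FALSE");
    }
}
```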

Searching for Distinguished Names and Object Identifiers

Distinguished names (syntax 2.5.5.1 in Table 4.3) and object identifiers are similar to strings. To match a DN, our filter might look like this:

(distinguishedName=CN=Users,DC=domain,DC=com)

To match an OID, we might do this:

(attributeSyntax=2.5.5.1)

This looks very similar to the standard string syntaxes, but there are two key differences. These attributes do not allow substring matches or the >= and <= filter types. The value must be supplied exactly.

These limitations are not obvious at first and they can cause confusion. For example, we might want to find all of the members of a group who are in a certain Organizational Unit (OU). We would like to do this:

(member=*,OU=MyOU,DC=domain,DC=com)

Unfortunately, that does not work.

Searching for Binary Data

A number of attributes in Active Directory and ADAM contain binary data. These attributes use the basic 2.5.5.10 Octet String syntax for arbitrary binary data, or the 2.5.5.17 SID syntax that is used specifically for security identifiers (SIDs), as shown in Table 4.3. It might seem quite difficult to search for binary data, given its nature. However, it is possible, and in the case of GUIDs and SIDs, it is used quite often.

Listing 3.5 from Chapter 3 demonstrates a utility function called BuildOctetString that converts binary data into the native LDAP Octet String format. Search filters also use octet strings to specify binary data, but they must escape each byte with an additional \ character in order to work. Listing 4.2 demonstrates another version of this function especially designed for search filters.

Listing 4.2. Converting Binary to String for Search Filters

using System.Text;

private string BuildFilterOctetString(byte[] bytes)
{
    StringBuilder sb = new StringBuilder();

    for (int i = 0; i < bytes.Length; i++)
    {
        //each byte becomes a backslash plus two hex digits;
        //the verbatim string keeps the backslash literal
        sb.AppendFormat(
            @"\{0}",
            bytes[i].ToString("X2")
            );
    }
    return sb.ToString();
}

Once we have the binary data in string format, it is a simple matter to use it to search. Let's suppose we know that this is a particular object's GUID:

{4a5a0fa7-1200-4198-a3a7-31ee9ba10fc9}

Listing 4.3 shows how we can use the BuildFilterOctetString function to generate the appropriate filter value.

Listing 4.3. Converting a GUID to a Filter String

Guid objectGuid = new
    Guid("4a5a0fa7-1200-4198-a3a7-31ee9ba10fc9");
string filter = string.Format(
    "(objectGUID={0})",
    BuildFilterOctetString(objectGuid.ToByteArray())
    );
Console.WriteLine(filter);
//OUT: (objectGUID=
// \A7\0F\5A\4A\00\12\98\41\A3\A7\31\EE\9B\A1\0F\C9)

We can easily apply this to any other type of binary data stored in the directory, including SIDs or even more esoteric data like JPEGs and X509 certificates, if we wish.

One other interesting point to mention about these attribute syntaxes is that they allow the >= and <= operators. Note, though, that standard octet strings also support substring searches, but SIDs do not. We rarely need to use these special filter types with binary data, although they are occasionally useful. We are uncertain as to why SIDs do not support substring searches and normal octet strings do, but we cannot think of a good reason to search for SIDs this way either, so perhaps that explains it.

Searching for Time Values

Searching by date/time is a fairly common task in Active Directory and ADAM. However, the syntax representations for dates/times in a search filter are not immediately obvious.

The main issue here is that Active Directory actually uses several different syntaxes to represent a date value. These break down into two categories:

Dates stored as Generalized or UTC time values
Dates encapsulated in the Windows FILETIME format that are stored as LargeInteger values (equivalent to a .NET Int64)

The reason for this is largely historic. The LDAP specification defines the Generalized and UTC time syntaxes and specifies many standard attributes that use them, so Active Directory uses the LDAP specification to represent those values. However, Windows also uses the FILETIME structure extensively for storing and processing values. For many of the features in Active Directory that integrate directly with Windows, such as account expiration dates, it makes sense to store those values in the native format to avoid any possible loss during translation.

Once we know what syntax the attribute in question actually uses, we know how to proceed.

Searching for Generalized and UTC Time Values. Generalized and UTC time syntaxes use a human-readable string in the following format:

YYYYMMDDHHMMSS.0[+/-]HHMM

In case you were wondering what the +/- relates to, Active Directory and ADAM date/time values are always stored in the Greenwich Mean Time (GMT) standard time zone. By adding or subtracting our time zone offset relative to GMT, we can specify the correct time for our location.

For example, the Eastern Standard Time (EST) offset is -0500, so August 1, 2005, at 6:00 AM would be represented as follows:

20050801060000.0-0500

If we want to use the GMT time zone itself, the offset is specified as .0Z. The same time in GMT would be:

20050801060000.0Z

With some languages and platforms, we might need to perform tedious string concatenation operations to build this format. However, the .NET Framework makes this easy. Listing 4.4 demonstrates how to use DateTime.ToString to accomplish this.

Listing 4.4. Generating UTC and Generalized Time Filters

public static string GetUtcFilter(DateTime date)
{
    //HH is the 24-hour clock; hh would emit a 12-hour value
    return date.ToString("yyyyMMddHHmmss'.0Z'");
}

public static string GetGeneralizedFilter(
    DateTime date,
    TimeSpan offset
    )
{
    //a negative TimeSpan reports negative Hours and Minutes,
    //so emit the sign once and format the absolute values
    string sign = offset < TimeSpan.Zero ? "-" : "+";
    return string.Format(
        "{0}.0{1}{2}{3}",
        date.ToString("yyyyMMddHHmmss"),
        sign,
        Math.Abs(offset.Hours).ToString("00"),
        Math.Abs(offset.Minutes).ToString("00")
        );
}

The UTC version is a bit more straightforward, so it makes sense to use it and to convert the date parameter to UTC in advance (for example, with DateTime.ToUniversalTime).
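
One way to make the conversion to UTC explicit is to accept a DateTimeOffset, which carries its own offset and so removes any guesswork about the time zone of the input. This is a sketch under that assumption; DateTimeOffset is not used in the listings above, and the GeneralizedTime class and ToUtcFilterValue method are hypothetical names.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of System.DirectoryServices.
public static class GeneralizedTime
{
    // Converts the instant to UTC, then emits the YYYYMMDDHHMMSS.0Z form.
    public static string ToUtcFilterValue(DateTimeOffset instant)
    {
        // The quoted '.0Z' is copied literally; HH is the 24-hour clock.
        return instant.UtcDateTime.ToString(
            "yyyyMMddHHmmss'.0Z'",
            CultureInfo.InvariantCulture);
    }
}
```

For the earlier example of August 1, 2005, at 6:00 AM with a -0500 offset, this sketch returns 20050801110000.0Z, which names the same instant as 20050801060000.0-0500.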

Creating Filters for Dates in FILETIME Format. The 2.5.5.16 LargeInteger syntax can be used to represent any standard 8-byte integer value, just like the .NET Int64 (long) type. Because the number syntaxes in filters are just specified as the standard decimal string representation of the number, there is not much to the filter itself:

(accountExpires<=127787436516581785)

However, when the LargeInteger syntax value actually represents a FILETIME and we wish to create the filter given a date value, we must convert the date into a number representing the FILETIME first. Since most LargeInteger syntax attributes in Active Directory and ADAM represent FILETIME values or time spans, this is what we usually need to do.

Luckily, .NET makes this easy too. The ToFileTime method on the DateTime struct returns an Int64. This is exactly what we need. Listing 4.5 shows a complete sample of applying this to find passwords that are more than 30 days old.

Listing 4.5. Creating a LargeInteger Date Filter to Find Old Passwords

using System;
using System.DirectoryServices;

string adsPath = "LDAP://dc=domain,dc=com";

//Explicitly create our SearchRoot
DirectoryEntry searchRoot = new DirectoryEntry(
    adsPath,
    null,
    null,
    AuthenticationTypes.Secure
    );

using (searchRoot) //we are responsible for Disposing
{
    //find anything with a password older than 30 days
    string qry = String.Format(
        "(pwdLastSet<={0})",
        DateTime.Now.AddDays(-30).ToFileTime() //30 days ago
        );

    DirectorySearcher ds = new DirectorySearcher(
        searchRoot,
        qry
        );

    using (SearchResultCollection src = ds.FindAll())
    {
        Console.WriteLine("Returning {0}", src.Count);

        foreach (SearchResult sr in src)
        {
            Console.WriteLine(sr.Path);
        }
    }
}
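
The date-to-FILETIME step can also be isolated from the directory code so that it is easy to check on its own. The following is a minimal sketch; FileTimeFilter and OnOrBefore are hypothetical names, and the cutoff is assumed to be supplied as a UTC DateTime.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of System.DirectoryServices.
public static class FileTimeFilter
{
    // Builds (attribute<=N), where N is the cutoff expressed as a FILETIME:
    // the count of 100-nanosecond intervals since January 1, 1601 (UTC).
    public static string OnOrBefore(string attribute, DateTime cutoffUtc)
    {
        long fileTime = cutoffUtc.ToFileTimeUtc();
        return string.Format(
            CultureInfo.InvariantCulture,
            "({0}<={1})", attribute, fileTime);
    }
}
```

Going the other way, DateTime.FromFileTimeUtc converts a LargeInteger value read from the directory back into a DateTime, which is handy for checking that a generated filter value represents the intended date.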

Bitwise Operations

Some attributes in Active Directory and ADAM are numbers that represent a collection of bitwise flags. Most developers will deal with the most common of these attributes, userAccountControl, at some point in time. In fact, we deal with it extensively in Chapter 10. This particular attribute defines many aspects of a user account's security settings.

Bitwise searches are enabled by the extensible filter type in Active Directory and ADAM. The format for these filters looks something like (attribute:extension:=value). Extensions match a rule OID, so we can express our previous format as (attribute:ruleOID:=value). For bitwise filters, there are two relevant extensions and matching rule OIDs:

Bitwise OR (called LDAP_MATCHING_RULE_BIT_OR). 1.2.840.113556.1.4.804
Bitwise AND (called LDAP_MATCHING_RULE_BIT_AND). 1.2.840.113556.1.4.803

Note that there is no bitwise NOT equivalent.

Using a bitwise filter is fairly straightforward; the developer only really needs to worry about the value portion of the filter. The bitwise filter format dictates that the value portion of our filter is in decimal format. This simply means that we must convert the hexadecimal or binary flag value to decimal and use it instead. As shown in Listing 4.6, if we want to find all disabled accounts, we should determine the relevant flag (in this case, UF_ACCOUNTDISABLE) and attach the bitwise AND rule OID to our filter, converting the flag value to decimal first.

Listing 4.6. Finding Disabled Accounts with a Bitwise Filter

string adsPath = "LDAP://dc=domain,dc=com";

//Explicitly create our SearchRoot
DirectoryEntry searchRoot = new DirectoryEntry(
    adsPath,
    null,
    null,
    AuthenticationTypes.Secure
    );

using (searchRoot)
{
    //UF_ACCOUNTDISABLE = 0x2, which is 2 decimal
    //find all disabled accounts
    string filter =
        "(userAccountControl:1.2.840.113556.1.4.803:=2)";

    DirectorySearcher ds = new DirectorySearcher(
        searchRoot,
        filter
        );

    using (SearchResultCollection src = ds.FindAll())
    {
        Console.WriteLine("Returning {0}", src.Count);

        foreach (SearchResult sr in src)
        {
            Console.WriteLine(sr.Path);
        }
    }
}

The performance on bitwise filters is not spectacular, since any indices on the attribute cannot be used. As such, it is best to use this type of filter comparison in conjunction with other indexed attributes in the filter criteria to minimize the impact.
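
Because the value portion must be decimal, it can be convenient to let the formatter do the hexadecimal-to-decimal conversion. The sketch below assumes the two matching rule OIDs given earlier; BitwiseFilter and its method names are hypothetical. Although there is no bitwise NOT rule, wrapping a bitwise comparison in the ordinary ! operator gives the inverse match.

```csharp
using System.Globalization;

// Hypothetical helper for illustration; not part of System.DirectoryServices.
public static class BitwiseFilter
{
    public const string BitAndRule = "1.2.840.113556.1.4.803";
    public const string BitOrRule = "1.2.840.113556.1.4.804";

    // Matches when every bit in flags is set on the attribute.
    public static string AllBitsSet(string attribute, int flags)
    {
        return Build(attribute, BitAndRule, flags);
    }

    // Matches when at least one bit in flags is set on the attribute.
    public static string AnyBitSet(string attribute, int flags)
    {
        return Build(attribute, BitOrRule, flags);
    }

    // Inverse of AllBitsSet, using the standard NOT operator.
    public static string NotAllBitsSet(string attribute, int flags)
    {
        return "(!" + AllBitsSet(attribute, flags) + ")";
    }

    private static string Build(string attribute, string ruleOid, int flags)
    {
        // An int formats as decimal by default, so 0x2 is written as 2.
        return string.Format(
            CultureInfo.InvariantCulture,
            "({0}:{1}:={2})", attribute, ruleOid, flags);
    }
}
```

For example, BitwiseFilter.AllBitsSet("userAccountControl", 0x2) reproduces the filter from Listing 4.6, and NotAllBitsSet with the same arguments yields a filter for accounts that are not disabled.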

Restrictions on Attributes That May Be Used in a Filter

Most attributes in Active Directory and ADAM can be used in a query filter, but a few cannot. Specifically, constructed and operational attributes are not available for use in search filters. Attributes such as canonicalName and tokenGroups are actually constructed on the fly by the directory, so they are restricted from usage in query filters.

Note that we can still return these attributes as part of a search result (with restrictions, in some cases). We just cannot use them to find the actual object in the first place.

Ambiguous Name Resolution

Ambiguous name resolution (ANR) is an Active Directory feature that makes it easier to find objects in the directory when we have only a fragment of a name and we do not know exactly to what attribute the name corresponds.

ANR is essentially a shortcut that creates a more complex filter for us under the hood, using indexed attributes to help improve performance.

For example, this simple filter expands to the much larger filter shown in Listing 4.7:

(anr=dunn)

Listing 4.7. ANR Filter Expansion

(|
 (displayName=dunn*)
 (givenName=dunn*)
 (legacyExchangeDN=dunn)
 (msDS-AdditionalSamAccountName=dunn*)
 (physicalDeliveryOfficeName=dunn*)
 (proxyAddresses=dunn*)
 (name=dunn*)
 (sAMAccountName=dunn*)
 (sn=dunn*)
)

As Listing 4.7 demonstrates, ANR is doing a lot of work for us behind the scenes. ANR is searching nine attributes for us instead of one and has changed the filter type to substring for most of them. Luckily, the wildcard is at the end of the string and all of these attributes are generally indexed in Active Directory by default, but ANR can still have an adverse effect on performance if abused. A filter containing a single ANR criterion is not much to worry about; however, multiple ANR expressions in the filter can quickly grow out of hand. Consider Listing 4.8.

Listing 4.8. Beware ANR Filter Expansion

Searching for the authors or their hometowns gets messy quickly.

(|(anr=dunn)(anr=kaplan)(anr=chicago)(anr=seattle))

expands to:

(|
 (displayName=dunn*)
 (givenName=dunn*)
 (legacyExchangeDN=dunn)
 (msDS-AdditionalSamAccountName=dunn*)
 (physicalDeliveryOfficeName=dunn*)
 (proxyAddresses=dunn*)
 (name=dunn*)
 (sAMAccountName=dunn*)
 (sn=dunn*)
 (displayName=kaplan*)
 (givenName=kaplan*)
 (legacyExchangeDN=kaplan)
 (msDS-AdditionalSamAccountName=kaplan*)
 (physicalDeliveryOfficeName=kaplan*)
 (proxyAddresses=kaplan*)
 (name=kaplan*)
 (sAMAccountName=kaplan*)
 (sn=kaplan*)
 (displayName=chicago*)
 (givenName=chicago*)
 (legacyExchangeDN=chicago)
 (msDS-AdditionalSamAccountName=chicago*)
 (physicalDeliveryOfficeName=chicago*)
 (proxyAddresses=chicago*)
 (name=chicago*)
 (sAMAccountName=chicago*)
 (sn=chicago*)
 (displayName=seattle*)
 (givenName=seattle*)
 (legacyExchangeDN=seattle)
 (msDS-AdditionalSamAccountName=seattle*)
 (physicalDeliveryOfficeName=seattle*)
 (proxyAddresses=seattle*)
 (name=seattle*)
 (sAMAccountName=seattle*)
 (sn=seattle*)
)

Chapter 7 briefly describes the schema value that determines if an attribute is included in ANR.
