LDAP distinguished name validator

Solution 1:

In Perl, with Net::LDAP, it's nearly trivial:

#!/usr/bin/perl
use strict;
use warnings;
use Net::LDAP::Util qw/canonical_dn/;
foreach my $dn (@ARGV) {
   if (!defined(canonical_dn($dn))) { print "not well formed: $dn\n"; }
   else                             { print "well formed: $dn\n"; }
}

Then:

$ perl ldapdn.pl "manager, ou=company, dc=net" "cn=manager, ou=company, dc=net"
not well formed: manager, ou=company, dc=net
well formed: cn=manager, ou=company, dc=net

There are a number of functions that validate DNs; ldap_explode_dn() (also from Net::LDAP::Util) might also be useful if you wish to normalise and further process DNs.

It's worth noting that "well-formed" and "valid" aren't the same thing, since a syntactically well-formed DN might not match the schema of a particular LDAP DIT, and hence be rejected.

If you have OpenLDAP, any recent version should come with a slapdn program. This does proper schema checking, but you must of course have a viable slapd.conf and schema set on the system you run it on (which may require running it as root or a special user due to file permissions on operational configuration files).

$ /usr/local/sbin/slapdn  -v "cn=manager, ou=company, dc=net"
DN: <cn=manager, ou=company, dc=net> check succeeded
normalized: <cn=manager,ou=company,dc=net>
pretty:     <cn=manager,ou=company,dc=net>

(If you have OpenLDAP built from source, it also comes with a dntest program built as part of its test suite. It only parses DNs, no schema checking. Sadly it doesn't have a usable error code, and appears to occasionally indicate malformed DNs with a segfault...)

And finally, the regex approach. As suggested by @voretaq7 you can use the ABNF from RFC 4514, though you'll also need the base syntaxes from RFC 4512 (§1.4). Run those through any ABNF-to-ERE converter (e.g. abnf2regex, implemented in Java), and out it pops. I'm not going to paste it here; it's approximately 4k of line noise. You can crack the whole nut with abnf2regex though:

$ java -jar abnf2regex.jar -t distinguishedName \
        "cn=manager,ou=company,dc=net" rfc4512.abnf rfc4514dn.abnf
Rule "distinguishedName" matches: cn=manager,ou=company,dc=net
Rule: [relativeDistinguishedName *(COMMA relativeDistinguishedName)]
Expanded: [(((%x41-5a / %x61-7a) *(%x41-5a / %x61-7a / %x30 / %x31-39 / %x2d))
 ... <<expanded ABNF snipped>>
Regex: (?:(?:[A-Za-z][\-01-9A-Za-z]*|(?:[01-9]|[1-9][01-9]+)(?:\.(?:[01-9]
 ... <<expanded regex snipped>>

The above tests a string against the regex generated from a specific named rule (-t distinguishedName). If you have sharp eyes you'll notice I cheated just a little: I removed the whitespace from the DN, since it's not technically part of the DN and will break the match.


And finally (really this time) a simplified and imperfect regex that you can use with pcregrep -i:

 ^([a-z][a-z0-9-]*)=(?![ #])(((?![\\="+,;<>]).)|(\\[ \\#="+,;<>])|(\\[a-f0-9][a-f0-9]))*
(,([a-z][a-z0-9-]*)=(?![ #])(((?![\\="+,;<>]).)|(\\[ \\#="+,;<>])|(\\[a-f0-9][a-f0-9]))*)*$

I have padded and wrapped it to make it legible (well, less illegible perhaps). A simplified breakdown is:

^(attributename)=(attributevalue)(,(attributename)=(attributevalue))*$

with

 attributevalue = no leading space or octothorpe, then any number of:
                   any char except specials |
                   escaped specials |
                   escaped hex-digit pair

This takes at least the following liberties:

  • it largely ignores Unicode (though you might find pcregrep --utf helps) and won't validate UTF-8
  • it does not support direct numeric OIDs in attribute types
  • it does not support multi-valued RDNs (e.g. cn=Bob+sn=Smith)
  • it doesn't handle un-escaped trailing whitespace

As per the spec, it doesn't deal with gratuitous formatting whitespace at the start, end or around ",".
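If you'd rather not shell out to pcregrep, the same simplified pattern can be assembled from its named parts. The following is a minimal Python sketch of that idea, not a full RFC 4514 parser: the names ATTR_TYPE, ATTR_VALUE, RDN, DN_RE and looks_like_dn are my own, the capturing groups are made non-capturing, and re.fullmatch stands in for the ^...$ anchors (so a trailing newline is rejected too). It inherits every liberty listed above.

```python
import re

# Attribute type: a short name such as "cn" or "ou" (numeric OIDs are not handled).
ATTR_TYPE = r'[a-z][a-z0-9-]*'

# Attribute value: no leading space or '#', then any number of
#   - a character that is not one of the specials,
#   - a backslash-escaped special, or
#   - a backslash-escaped pair of hex digits.
ATTR_VALUE = (
    r'(?![ #])'
    r'(?:(?![\\="+,;<>]).'
    r'|\\[ \\#="+,;<>]'
    r'|\\[a-f0-9][a-f0-9])*'
)

# One RDN is type=value; a DN is one or more RDNs separated by bare commas.
RDN = ATTR_TYPE + '=' + ATTR_VALUE
DN_RE = re.compile(RDN + '(?:,' + RDN + ')*', re.IGNORECASE)


def looks_like_dn(dn: str) -> bool:
    """Return True if dn matches the simplified (imperfect) DN pattern."""
    return DN_RE.fullmatch(dn) is not None


if __name__ == "__main__":
    for candidate in ("cn=manager,ou=company,dc=net",
                      "manager,ou=company,dc=net",
                      "cn=manager, ou=company, dc=net"):
        verdict = "well formed" if looks_like_dn(candidate) else "not well formed"
        print(f"{verdict}: {candidate}")
```

Note that the third candidate is rejected only because of the formatting whitespace after each comma; strip that first if your input may contain it.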

Solution 2:

Don't bother. The LDAP server you're talking to already knows what an invalid DN looks like. Just catch the error (response code) and act on it appropriately.
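How the response code reaches you depends on your client library, so the following is only a sketch of the "act on it" half. It assumes you have already extracted the numeric LDAP result code from the server's response; ResultCode and describe_dn_outcome are hypothetical names, and the three numeric values are the ones RFC 4511 assigns to success, noSuchObject and invalidDNSyntax.

```python
from enum import IntEnum


class ResultCode(IntEnum):
    """A few LDAP result codes from RFC 4511 (not the full list)."""
    SUCCESS = 0
    NO_SUCH_OBJECT = 32
    INVALID_DN_SYNTAX = 34


def describe_dn_outcome(result_code: int) -> str:
    """Map a server result code to a coarse verdict about the DN that was sent."""
    if result_code == ResultCode.SUCCESS:
        return "ok"
    if result_code == ResultCode.INVALID_DN_SYNTAX:
        return "malformed DN"
    if result_code == ResultCode.NO_SUCH_OBJECT:
        # The server parsed the DN but found no entry with that name.
        return "well-formed DN, but no such entry"
    return f"other LDAP error (result code {result_code})"
```

Anything other than these three codes is passed through untouched, since it usually says nothing about the DN's syntax.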