Domain Name Validator

I just released the domain_name_validator gem, a simple Ruby gem for validating domain names. A number of other gems provide features for dealing with domain names (and even IPv4 and IPv6 addresses), but none of them provided, in an easy-to-use form, the particular capability that I needed.

Put simply, I needed to know, for any given domain name, whether that domain name was valid. I work with clients who often pass around extensive lists of domain names encountered while analyzing malware. I needed to be able to process a list of domain names and tell my clients which ones were valid and which were not.

Accordingly, the scope of this gem is deliberately focused on validating domain names. It simply answers the question: "Is this a real domain name?" Using this gem, you can make a realistic assessment about whether you want to store a domain name (or a URL that contains that domain name) in your database. This gem will tell you:

  1. If a domain is or is not valid

  2. If it's not valid, what the errors are

Requirements

This is a Ruby gem with no run-time dependencies on anything else. It's only been tested under Ruby 1.9.3, but it should be compatible with all versions of Ruby from 1.8.6 onwards.

Installation

Installation is as easy as it gets. Simply install the gem:

      $ gem install domain_name_validator

How It Works

To validate a domain name:

    # Assumes the require name matches the gem name
    require 'domain_name_validator'

    v = DomainNameValidator.new
    if v.validate('keenertech.com')
      # Do something
    end

What about error messages? If a domain isn't valid, it's often desirable to find out why the domain wasn't valid. To do this, simply pass an array into the "validate" method as the optional second argument.

    errs = []
    v = DomainNameValidator.new
    unless v.validate('keenertech.123', errs)
      puts("Errors: #{errs.inspect}")
    end

This generates the following output:

      Errors: ["The top-level domain cannot be numerical"]

This gem should make it easy to validate domain names.
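
For the list-processing use case described earlier, a small helper can sort a batch of names into valid and invalid groups. This is only a sketch: partition_domains is a hypothetical name, not part of the gem, and it assumes nothing beyond the validate(name, errs) call shown above, so the validator is passed in as an argument.

```ruby
# Hypothetical helper, not part of the gem.
# Splits a list of names into an array of valid names and a hash that maps
# each invalid name to its error messages.
# `validator` is any object that responds to validate(name, errors_array),
# as DomainNameValidator does in the examples above.
def partition_domains(names, validator)
  valid = []
  invalid = {}
  names.each do |name|
    errs = []
    if validator.validate(name, errs)
      valid << name
    else
      invalid[name] = errs
    end
  end
  [valid, invalid]
end
```

Calling partition_domains(list, DomainNameValidator.new) would then return both groups in one pass.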

More Background on Domain Names

Domain names provide a unique, memorable name to represent numerically addressable Internet resources. They also provide a level of abstraction that allows the underlying Internet address to be changed while still referencing a resource by its domain name. The domain name space is managed by the Internet Corporation for Assigned Names and Numbers (ICANN).

The right-most label of a domain name is referred to as the top-level domain, or TLD. A limited set of top-level domain names, and two-character country codes, have been standardized. The Internet Assigned Numbers Authority (IANA) maintains an annotated list of top-level domains, as well as a list of "special use," or reserved, top-level domain names.

Additionally, some registrar authorities have further limitations on domain names, resulting in what I've seen referred to as "effective" TLDs. For example, in Japan, there is an effective set of TLDs featuring "tokyo.jp". So the following is an effective TLD:

    koto.tokyo.jp

An individual cannot buy tokyo.jp or koto.tokyo.jp. But they could buy expatriots.koto.tokyo.jp. This makes koto.tokyo.jp an effective TLD. Browsers respond to this by only allowing cookies to be set for subdomains of koto.tokyo.jp.
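
To make the idea concrete, here is a minimal sketch of how a name could be matched against a list of effective TLDs. The gem does not do this today. The suffix list below is a tiny hand-picked sample for illustration only, and the names SAMPLE_EFFECTIVE_TLDS, effective_tld, and registrable_domain are hypothetical.

```ruby
# Hypothetical, hand-picked sample of suffixes under which names are registered.
# A real list would be far larger.
SAMPLE_EFFECTIVE_TLDS = %w[com jp tokyo.jp koto.tokyo.jp].freeze

# Returns the longest sample suffix that the name ends with, or nil.
def effective_tld(name)
  labels = name.downcase.split('.')
  # Try the longest candidate suffix first, then progressively shorter ones.
  (0...labels.length).each do |i|
    candidate = labels[i..-1].join('.')
    return candidate if SAMPLE_EFFECTIVE_TLDS.include?(candidate)
  end
  nil
end

# Returns the effective TLD plus one label to its left, or nil when the name
# is itself an effective TLD or matches nothing in the sample list.
def registrable_domain(name)
  suffix = effective_tld(name)
  return nil if suffix.nil? || suffix == name.downcase
  suffix_label_count = suffix.split('.').length
  name.downcase.split('.').last(suffix_label_count + 1).join('.')
end
```

With this sample list, koto.tokyo.jp has no registrable domain of its own, while expatriots.koto.tokyo.jp and any subdomain beneath it resolve to expatriots.koto.tokyo.jp.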

For validation purposes, domain names follow some very detailed rules:

  • The maximum length of a domain name is 253 characters.

  • A domain name is divided into "labels" separated by periods. The maximum number of labels is 127.

  • The maximum length of any label within a domain name is 63 characters.

  • No label, including TLDs, can begin or end with a dash.

  • Top-level domains cannot be all numeric.

  • The right-most label must be either a recognized TLD or a 2-letter country code. The only exception is for international domain names, for which TLD checking is currently bypassed.

  • Domain names may not begin with a period.

Note that the gem does not currently include TLD and effective TLD validation, although that support is imminent on the project road map.
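
The structural rules above (everything except the TLD list check) can be sketched in a few lines of Ruby. This is an illustration written for this article, not the gem's actual source. The method name domain_rule_errors and the constants are hypothetical, and the error messages are my own wording apart from the numeric TLD message shown earlier.

```ruby
MAX_NAME_LENGTH  = 253
MAX_LABELS       = 127
MAX_LABEL_LENGTH = 63

# Returns an array of error messages; an empty array means no rule was violated.
def domain_rule_errors(name)
  errors = []
  errors << 'Name exceeds 253 characters' if name.length > MAX_NAME_LENGTH
  errors << 'Name begins with a period' if name.start_with?('.')

  # The -1 limit keeps trailing empty strings so the label count is accurate.
  labels = name.split('.', -1)
  errors << 'Name has more than 127 labels' if labels.length > MAX_LABELS

  labels.each do |label|
    errors << "Label '#{label}' exceeds 63 characters" if label.length > MAX_LABEL_LENGTH
    if label.start_with?('-') || label.end_with?('-')
      errors << "Label '#{label}' begins or ends with a dash"
    end
  end

  # labels.last is nil for an empty name, so convert to a string first.
  tld = labels.last.to_s
  errors << 'The top-level domain cannot be numerical' if tld =~ /\A\d+\z/
  errors
end
```

Collecting every violation, instead of stopping at the first one, mirrors the way the gem reports errors through the optional array argument.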

Internationalized Domain Names

What about internationalized domain names? ICANN approved the Internationalizing Domain Names in Applications (IDNA) system in 2003. This standard allows Unicode domain names to be encoded into ASCII using Punycode. Essentially, a label may contain "xn--" as a prefix, followed by the Punycode representation of a Unicode string, resulting in domain names such as xn--kbenhavn-54a.eu. Note that there are also some approved Unicode TLDs.

The process of rendering an internationalized domain name in ASCII via Punycode is called normalization. This gem will validate a normalized domain name, but not a Unicode domain name. Note, however, that it currently does not validate normalized TLDs against ICANN's list of valid TLDs, or the generally accepted set of effective TLDs.

It's also unclear whether the "xn--" prefix should count against the label size limit of 63 characters. In the absence of specific guidelines, and because I've never actually seen an overly long label, I have chosen to apply the limit regardless of the presence of the "xn--" prefix within a label.
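
As a rough sketch of that choice, the following treats the "xn--" prefix as part of the label when measuring its length. The names ACE_PREFIX, LABEL_LIMIT, punycode_label?, label_within_limit?, and punycode_labels are hypothetical and exist only for this illustration. They are not the gem's API.

```ruby
ACE_PREFIX  = 'xn--'
LABEL_LIMIT = 63

# True when the label carries the Punycode ("xn--") prefix, ignoring case.
def punycode_label?(label)
  label.downcase.start_with?(ACE_PREFIX)
end

# Applies the 63-character limit to the whole label, prefix included.
def label_within_limit?(label)
  label.length <= LABEL_LIMIT
end

# Returns the Punycode-prefixed labels found in a domain name.
def punycode_labels(name)
  name.split('.').select { |label| punycode_label?(label) }
end
```

Because the prefix counts toward the limit, a normalized label has at most 59 characters available after "xn--".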

Road Map

More types of checks will be added as they are identified. Support for validating TLDs and effective TLDs is also in the works (it's a bit more complex than you might imagine).

Alternative Gems

There are other gems and libraries that parse domain names into their various components, or parse URLs, or properly handle Unicode domain names, etc. Use them; many of them are very good at their well-defined roles. But none of the ones that I came across were very good at simply telling me whether a domain name was valid.

If this domain_name_validator gem does not suit your needs, here are a few recommended gems that may provide you with the additional power (and complexity) that is deliberately absent from this highly focused gem:

  • domain_name - For parsing/manipulating domain names.
  • ip_address - For everything you need to do with IPv4 and IPv6 addresses.
  • publicsuffix - TLD and effective TLD validation.

Conclusion

Admittedly, domain_name_validator is a highly focused gem, but I hope it is a useful one, depending on your needs. If you do find it useful, please help me if you can, whether by emailing comments or questions, or by submitting change requests through GitHub.


