Registry’s DNS platform in terms of robustness, security and reliability


Introduction

Red.es acts as the Assignment Authority for the country code top-level domain ‘.es’ (hereinafter ccTLD, ‘country code Top Level Domain’), as well as for the second level extensions ‘.com.es’, ‘.nom.es’, ‘.edu.es’, ‘.gob.es’ and ‘.org.es’.


What is the architecture and operation of the DNS protocol?


The acronym DNS stands for “Domain Name System”, one of the most-used protocols on the internet and on any IP network in general. One of its main functions is to provide the IP addresses that correspond to DNS names.

 

For example, when a user types www.dominios.es into their web browser, a DNS query is first performed, transparently to the user, to find the IP address corresponding to “www.dominios.es”. This DNS resolution process is necessary because the communication itself always takes place between IP addresses.
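As a rough illustration of the query described above, the following Python sketch builds the bytes of a DNS question for “www.dominios.es” following the message layout of RFC 1035. The function names and the fixed query ID are illustrative assumptions, and nothing is sent over the network.

```python
import struct

QTYPE_A = 1    # IPv4 address record
QCLASS_IN = 1  # "Internet" class


def encode_qname(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    wire = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"invalid label length in {name!r}")
        wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def build_query(name: str, qtype: int = QTYPE_A, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query with the RD (recursion desired) flag set."""
    flags = 0x0100  # RD bit: the client asks the resolver for the final answer
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_qname(name) + struct.pack("!HH", qtype, QCLASS_IN)
    return header + question


message = build_query("www.dominios.es")
print(message.hex())
```

A real client would send these bytes over UDP or TCP port 53 to its configured resolver and then parse the response.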

 

The public DNS is hierarchical and distributed. The DNS servers of each domain hold the detailed information of their own domain, plus the authoritative NS (Name Servers) of its subdomains. The list of NS associated with a subdomain is also referred to as a delegation. Below is a simplified representation of the internet’s DNS tree:

Simplified representation of the internet’s DNS tree
 


Specific examples:


- Root servers (first level) serve the root zone, which is managed globally by IANA (https://www.iana.org/domains/root/db), and have information about the NS of all its subdomains. These subdomains are also referred to as second level, as they “hang” directly from the first level (root). They are the ccTLDs (“.es”, “.fr”, …), the gTLDs (“.com”, …), etc.

 

- The DNS servers of the domain “.es” have information about the subdomains of “.es”. For example: “dominios.es”, “nic.es”, “rediris.es”, etc. 

 

- The DNS servers of the domain “dominios.es” have all the detailed information of their own domain. For example, they have information about the IPv4 and IPv6 addresses corresponding to the name www.dominios.es, about its NS records, etc.

 

Users generally have two DNS servers configured, a primary one and a secondary one. The latter is only used if the first does not work or is unreachable. These types of DNS servers are called “resolvers”. They give “non-authoritative” responses and store information in cache for the duration of the TTL (Time To Live) or according to defined policies. If the answer to a query received from the client is not cached, they perform an iteration process: they first query the root servers, then the second level servers obtained from that first query, and so on down the DNS tree until they get the requested answer from an authoritative DNS server of the queried domain. Once the response has been received, they store it in cache for a set amount of time and respond to the client. This process is explained in more detail in the section "What technical concepts are associated with the DNS protocol?".
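The caching behaviour described above can be sketched as a small TTL-bound cache. This is a toy model under stated assumptions: the class name is hypothetical, the clock is injectable only to make the behaviour easy to demonstrate, and 192.0.2.10 is a documentation address rather than the real address of www.dominios.es.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy resolver cache: an answer is served only while its TTL has not expired."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # (name, record type) -> (expiry timestamp, cached value)
        self._entries: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def put(self, name: str, rtype: str, value: str, ttl: float) -> None:
        """Store an answer together with the moment at which it expires."""
        self._entries[(name.lower(), rtype)] = (self._clock() + ttl, value)

    def get(self, name: str, rtype: str) -> Optional[str]:
        """Return the cached answer, or None if it is missing or expired."""
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: the resolver must iterate again
            return None
        return value


cache = TtlCache()
cache.put("www.dominios.es", "A", "192.0.2.10", ttl=300)
print(cache.get("www.dominios.es", "A"))
```

A cache miss (a `None` result) is the point at which a resolver would start the iteration process from the root servers.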


Several software implementations of DNS servers exist. The most common in TLD environments are:


-    BIND. https://www.isc.org/bind/ 
-    KNOT. https://www.knot-dns.cz/ 
-    NSD. https://nlnetlabs.nl/projects/nsd/about/ 
-    Others.


Traditionally, DNS traffic is sent in clear text (not encrypted). In order to provide data source authentication, data integrity and ‘proof of non-existence of data’, there is a set of protocols called DNSSEC, which basically consists of adding a signature to DNS responses. Resolvers can validate this signature through a chain of trust. The key requirement for this validation is that all the parent domains, up to the root, are signed. The use of DNSSEC also helps to avoid certain types of attack.


Why is DNS so important as an essential service?


Given that most (if not all) access to applications relies on DNS, the DNS service is clearly vital. For example, even if a website is operational, it will be unreachable for users if the DNS resolution does not work. This may be caused by: an incorrect or non-existent resolution, none of the authoritative DNS servers being operational, a problem in the delegations defined in the parent domains, incorrect or expired DNSSEC signatures, etc.

 

The domain “.” (root) is the most important of all. If all authoritative servers of that domain fail, then once the cached data expires no resolver will be able to obtain the NS delegations of the second level domains. A failure in all DNS servers of a second level domain (e.g. ‘.es’) could affect all subdomains of ‘.es’, and so on. The higher up in the hierarchy the failure, the greater the potential impact.


The European NIS2 (Network and Information Security) Directive categorises the DNS service offered by TLDs as an essential service of the internet, and establishes a series of mandatory and recommended compliance requirements.

Directive (EU) 2022/2555, EUR-Lex (europa.eu)


The directive covers many aspects, such as:


-    Compliance with the ISO 27001 standard
-    Regular internal audits
-    Cybersecurity management
-    Data quality
-    Multifactor authentication
-    Protection against unauthorised changes to domain data, etc.


What technical concepts are associated with the DNS protocol?


In addition to what is noted in the section “What is the architecture and operation of the DNS protocol?”, some acronyms and concepts related to DNS are briefly explained below:

 

-    FQDN (Fully Qualified Domain Name): An FQDN comprises a “hostname”, a dot, and the “domain_name”. E.g.: the hostname “www” together with the domain name “dominios.es” forms the FQDN “www.dominios.es.”, which is the name that must be used to make a DNS query.

 

-    NS (Name Server): Authoritative server for a specific DNS domain. As discussed above, authoritative servers (primary or secondary) hold, in their zone files, the local information of the domains they host and the delegations to subdomains. To resolve DNS queries, authoritative servers do not need to ask any other DNS server. In addition, for subdomains signed with DNSSEC, the zone will also contain records called DS (see next point).

 

-    RR (Resource Records): This is the generic name for the different types of record in a DNS domain, which can be found in the corresponding zone file. The zone file is a text file in which the content of the zone is defined. There are many types of records, the most common of which are:

 

o    NS. Name Server. Mandatory. Previously mentioned.
o    A. Defines the IPv4 address(es) associated with a host.
o    AAAA. Equivalent to type A but for IPv6.
o    SOA. Start of Authority. Mandatory. Defines global DNS zone parameters such as Serial (version number), Refresh and Retry (values used by secondary servers to determine the refresh rate), Expire, and negative cache time (associated with non-existent records).
o    CNAME. Canonical Name. Defines an alias. Useful to associate the resolution of one record with another.
o    RRSIG. Resource Record Signature. DNSSEC signatures associated with the different RR groups.
o    DNSKEY. Public DNSSEC keys: ZSK (Zone Signing Key) and KSK (Key Signing Key).
o    DS. Delegation Signer. A “hash” of the KSK of a subdomain signed with DNSSEC.
o    MX. Mail Exchanger. Defines the email servers of the domain.
o    PTR. Pointer. Used in reverse DNS zones to associate an IP address with a DNS name (the reverse of what is defined in A records).

 

-    Primary Server: The only server where changes can be made. Most commonly, this server is in hidden mode, i.e. not directly accessible from the internet.

 

-    Secondary Server: Obtains a copy of the zone file from the primary server, or from another secondary server, and refreshes it when it receives notifications or when the SOA ‘Refresh’ time expires. A secondary server can keep providing service without receiving zone updates for up to the ‘Expire’ time defined in the SOA record.

 

-    Notify: Notifications that a primary or secondary server sends to other DNS servers when it has obtained a new version of the zone file (the SOA Serial parameter has increased).

 

-    XFR (AXFR and IXFR): DNS zone transfers. They can be incremental (IXFR) or full (AXFR).

 

-    Recursive: Allows DNS queries to be resolved for domain names for which the server is not authoritative. In this case the server will perform iteration (using its root hints) or forward the query to other servers. On the authoritative servers of a TLD the recursive function is generally disabled, so such servers can only respond to DNS queries for the domains they have defined and for which they hold a valid zone file.

 

-    Iteration and Recursive: Going into a little more detail, clients usually have a primary and a secondary DNS server configured by their ISP. There are also public DNS services, such as those of Google or Cloudflare, that can be used instead. Clients make recursive DNS requests to their primary DNS server. This means that they expect to receive the final response from that server. If necessary, the recursive DNS server (also called resolver) will make iterative DNS queries until the desired IP resolution is achieved. It will also obtain the TTL (Time to Live) during which the response can be considered valid. Once obtained, it will reply to the client (with the IP and associated TTL) and can cache the resolution for the time indicated by the TTL received. The end client can also save the IP resolution in its local cache. Below is the DNS process to resolve the name www.dominios.es if the DNS Resolver has an empty cache.
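The iteration performed by a resolver with an empty cache can be sketched with a hand-written, in-memory DNS tree. Everything in the data below is hypothetical: the server names are placeholders and 192.0.2.10 is a documentation address. Real resolvers exchange DNS messages over the network, so this only models the referral logic.

```python
from typing import Dict, List, Tuple

# Hypothetical data: each "server" either refers to a lower level or gives the answer.
ROOT = "root-server"
SERVERS: Dict[str, Dict[str, Tuple[str, str]]] = {
    "root-server": {"es.": ("referral", "es-server")},
    "es-server": {"dominios.es.": ("referral", "dominios-server")},
    "dominios-server": {"www.dominios.es.": ("answer", "192.0.2.10")},
}


def query(server: str, fqdn: str) -> Tuple[str, str]:
    """Return the most specific entry that the given server holds for fqdn."""
    zone = SERVERS[server]
    matches = [name for name in zone if fqdn == name or fqdn.endswith("." + name)]
    if not matches:
        raise LookupError(f"{server} knows nothing about {fqdn}")
    return zone[max(matches, key=len)]


def resolve_iteratively(fqdn: str) -> Tuple[str, List[str]]:
    """Follow referrals from the root until an authoritative answer is found."""
    server, path = ROOT, []
    for _ in range(10):  # guard against referral loops
        path.append(server)
        kind, value = query(server, fqdn)
        if kind == "answer":
            return value, path
        server = value  # referral: ask the next level down
    raise RuntimeError("too many referrals")


address, path = resolve_iteratively("www.dominios.es.")
print(address, "via", " -> ".join(path))
```

The client only sees the final address, which is what makes its own request recursive while the resolver's queries are iterative.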


 


Robustness, Security and Reliability


What is the Anycast architecture of Red.es?


With the aim of providing robustness, security and reliability to the DNS environment of the ccTLD “.es”, four NS (Name Servers) are deployed. They are the ones delegated in the root zone according to https://www.iana.org/domains/root/db/es and they are the authoritative servers of the domain “.es”.


 


In turn, each one of these NS is composed of several nodes, which transparently provide service to customers. In other words, from the internet only one IPv4 address and one IPv6 address are visible for each NS, but in reality there may be several servers behind them attending to DNS queries.


For “a.nic.es” there is a load balancer which distributes requests over two servers.


The other three NS are IP anycast addresses published on the internet by 3 different DNS anycast service providers. Each one of these IP addresses is published from different regions of the planet, with several nodes in each of the regions. The total number of nodes of the three providers is over 120, covering all continents.


This type of architecture has three main advantages:

 

-    High resilience if some of the nodes fail.

-    Better response times. Traffic reaches the closest node in terms of BGP routing, which shortens the round trip.

-    Protection against denial of service (DoS) attacks. The fact that the service is so widely distributed makes it very difficult to attack it as a whole.
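The first two advantages can be illustrated with a deliberately simplified model in which a query goes to the healthy node with the shortest AS path. This is only a sketch: real BGP best-path selection weighs more attributes than path length, and the node names and path lengths below are invented for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AnycastNode:
    name: str
    as_path_length: int  # simplified stand-in for "closeness" in BGP terms
    healthy: bool = True


def pick_node(nodes: Iterable[AnycastNode]) -> AnycastNode:
    """Choose the closest node among those still announcing the anycast address."""
    reachable = [node for node in nodes if node.healthy]
    if not reachable:
        raise RuntimeError("no anycast node is announcing the address")
    return min(reachable, key=lambda node: node.as_path_length)


# Hypothetical nodes that all publish the same anycast IP address.
nodes = [
    AnycastNode("madrid", as_path_length=2),
    AnycastNode("frankfurt", as_path_length=3),
    AnycastNode("sao-paulo", as_path_length=5),
]
print(pick_node(nodes).name)

# If the closest node fails, traffic moves to the next closest one.
degraded = [AnycastNode("madrid", 2, healthy=False)] + nodes[1:]
print(pick_node(degraded).name)
```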

 

As for protection measures against DoS attacks, each of the external providers has its own protection mechanisms, such as diverting traffic to Scrubbing Centres where traffic can be cleaned, and rate-limit configurations in the DNS servers themselves. For nodes located in Red.es, Anti-DoS mechanisms have been contracted with the internet operators, rate-limits are implemented in the DNS servers themselves, and the SOC can detect traffic anomalies with the information gathered from the network probes. 
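As an illustration of the rate-limit idea, the sketch below applies a token bucket per client address. It is a generic technique shown under stated assumptions, not the configuration used by Red.es or the rate-limiting mechanism built into any particular DNS server. The class name, rates and client address are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Allow each client a sustained query rate plus a small burst."""

    def __init__(self, rate_per_second: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_second
        self.burst = burst
        self._clock = clock
        # client address -> (remaining tokens, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, client_ip: str) -> bool:
        """Return True if the query may be answered, False if it should be dropped."""
        now = self._clock()
        tokens, last = self._buckets.get(client_ip, (self.burst, now))
        # Refill in proportion to the elapsed time, never above the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[client_ip] = (tokens - 1.0, now)
            return True
        self._buckets[client_ip] = (tokens, now)
        return False


limiter = TokenBucketLimiter(rate_per_second=5, burst=10)
answered = sum(limiter.allow("198.51.100.7") for _ in range(100))
print(f"{answered} of 100 back-to-back queries answered")
```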

 

Below are a couple of diagrams from two of the anycast providers which give an idea of the global coverage of the DNS service.
 


www.pch.net/services/anycast

 


DENIC Anycast Nodes


 

The software used on Red.es’ own servers is BIND. The anycast providers run nodes on at least BIND and KNOT. The software that provides the DNS service is therefore diversified.

 

Shall we talk about DNSSEC?


As mentioned above, DNSSEC provides:

-    data source authentication

-    data integrity

-    “proof of non-existence of data”

 

The first two points are achieved with RRSIG type records, which are signatures generated with the ZSK (Zone Signing Key). The third point is achieved with NSEC or NSEC3 type records.

 

In turn, there is another key called KSK (Key Signing Key) which signs the ZSK key. To establish the chain of trust, a hash of the KSK is generated, called DS (Delegation Signer), which is published in the parent domain. In this case, it is the root zone managed by IANA.
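The relationship between a KSK and its DS record can be sketched in Python following RFC 4034: the DS digest is a hash over the owner name in wire format followed by the DNSKEY record data, and the key tag is a short checksum used to identify the key. The key bytes below are a short placeholder, not the real “.es” key, and the function names are illustrative.

```python
import hashlib
import struct


def name_to_wire(name: str) -> bytes:
    """Canonical (lower-case) wire format of a domain name."""
    wire = b""
    for label in name.lower().rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def dnskey_rdata(flags: int, protocol: int, algorithm: int, public_key: bytes) -> bytes:
    """DNSKEY record data: flags, protocol, algorithm, then the public key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata: bytes) -> int:
    """Key tag checksum as described in RFC 4034, Appendix B."""
    acc = 0
    for index, byte in enumerate(rdata):
        acc += byte if index & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def ds_digest_sha256(owner: str, rdata: bytes) -> str:
    """SHA-256 DS digest (digest type 2) over owner name plus DNSKEY data."""
    return hashlib.sha256(name_to_wire(owner) + rdata).hexdigest().upper()


# Placeholder key material: flags 257 marks a KSK, algorithm 8 is RSA/SHA-256.
rdata = dnskey_rdata(flags=257, protocol=3, algorithm=8, public_key=b"\x01\x02\x03\x04")
print("key tag:", key_tag(rdata))
print("DS digest:", ds_digest_sha256("es.", rdata))
```

The parent zone publishes the key tag, algorithm, digest type and digest as the DS record, so a resolver can check that the DNSKEY served by the child matches what the parent vouches for.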


Key rotation, also called rollover:


-    ZSK. These keys are rotated more often, as the cryptographic algorithms used are often less strong, allowing for higher signing performance. The registry is autonomous in this operation.

 

-    KSK. Stronger cryptographic algorithms are used, so the rotation period can be longer (2 years). It requires the publication of the new DS record at IANA and the subsequent deletion of the old one.

 

The orchestration of the entire DNSSEC signing process is done with the OpenDNSSEC software, which in turn is linked to high-performance cryptographic hardware (HSM Luna 7) where the keys are stored and the RRSIG signatures are generated. OpenDNSSEC is responsible for monitoring the lifetime of the signatures generated for the different RR groups, and for generating new signatures (RRSIG) before the old ones expire. An RRSIG record is valid for a set amount of time. For example, using the “dig” tool we can send an NS type query for the domain “.es” to Google's recursive DNS (resolver) (8.8.8.8). In the query we add the option ‘+dnssec’ so that it also returns the RRSIG record. In addition to the signature, this record contains the validity dates and the identifier of the key with which the signature was generated:
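The query described above corresponds to a command along the lines of `dig @8.8.8.8 es. NS +dnssec`. As a sketch of how the validity dates and key identifier can be read from the RRSIG record data in its text form, the Python snippet below parses the fields in the order defined by RFC 4034. The sample record is made up for illustration (dates, key tag and signature are placeholders, not real “.es” values).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RrsigInfo:
    type_covered: str
    algorithm: int
    original_ttl: int
    expiration: datetime
    inception: datetime
    key_tag: int
    signer: str


def _parse_time(stamp: str) -> datetime:
    """RRSIG times are shown as YYYYMMDDHHmmSS in UTC."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def parse_rrsig(rdata_text: str) -> RrsigInfo:
    """Parse: type, algorithm, labels, TTL, expiration, inception, key tag, signer, signature."""
    fields = rdata_text.split()
    if len(fields) < 9:
        raise ValueError("incomplete RRSIG record data")
    return RrsigInfo(
        type_covered=fields[0],
        algorithm=int(fields[1]),
        original_ttl=int(fields[3]),
        expiration=_parse_time(fields[4]),
        inception=_parse_time(fields[5]),
        key_tag=int(fields[6]),
        signer=fields[7],
    )


def is_valid_at(info: RrsigInfo, when: datetime) -> bool:
    """A signature is usable only between its inception and expiration times."""
    return info.inception <= when <= info.expiration


# Made-up sample in the same layout that dig prints after the "RRSIG" type.
SAMPLE = "NS 8 1 86400 20250115000000 20250101000000 12345 es. c2lnbmF0dXJl"
info = parse_rrsig(SAMPLE)
print(info.key_tag, info.inception.isoformat(), info.expiration.isoformat())
```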
 


 


 

How is the DNSSEC validated?


If a domain is signed, DNS Resolvers can validate the signatures thanks to the chain of trust. Each parent domain holds a DS record for the subdomain, signed with the parent domain's own DNSSEC keys. This goes all the way up to the root, which is trusted by all (trust anchor). In the previous figure, the “ad” (Authentic Data) flag can be seen. This means that the resolver has correctly validated the chain of trust.
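The “ad” flag mentioned above is a single bit in the 16-bit flags field of the DNS message header. The sketch below decodes that field from the first bytes of a response. The helper name and the sample header are illustrative assumptions, and the sample bytes are constructed by hand rather than captured from a real answer.

```python
import struct

# Bit masks of the header flags (RFC 1035, with AD and CD added for DNSSEC).
FLAG_BITS = {
    "qr": 0x8000,  # message is a response
    "aa": 0x0400,  # authoritative answer
    "tc": 0x0200,  # truncated
    "rd": 0x0100,  # recursion desired
    "ra": 0x0080,  # recursion available
    "ad": 0x0020,  # authentic data: the resolver validated the chain of trust
    "cd": 0x0010,  # checking disabled
}


def header_flags(message: bytes) -> dict:
    """Decode the flag bits from the 12-byte header of a DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS message has at least a 12-byte header")
    _query_id, flags = struct.unpack("!HH", message[:4])
    return {name: bool(flags & mask) for name, mask in FLAG_BITS.items()}


# Hand-built header of a validated, non-authoritative response (flags 0x81A0).
sample = struct.pack("!HHHHHH", 0x1234, 0x81A0, 1, 1, 0, 1)
flags = header_flags(sample)
print("validated by resolver:", flags["ad"])
```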


There are also graphical tools, such as “dnsviz”, which make it straightforward to check the health status and coherence of different parameters of a specific domain. See, for example, https://dnsviz.net/d/es/dnssec/ to validate the domain “.es”.
 


 


Continuous improvement of service


When do software updates occur and what is the official support like?

Linux operating system and BIND software updates are performed regularly, when new stable versions are released or when a vulnerability that requires patching is identified. Regarding OpenDNSSEC, an upgrade to a new ‘major’ version is being initiated. In order to optimise the service and to have support in case of incidents or relevant changes, BIND and OpenDNSSEC support is in the process of being contracted.


What are the improvements in the service architecture?

The SGND application is under continuous development, following a CI/CD methodology, to cover new requirements or correct detected failures. There are three environments: PRE, DEMO and PRO. Any new development must pass through PRE and DEMO before being applied in PRO.

 


 

 

In addition, several projects are underway or in planning:


-    New RDAP (Registration Data Access Protocol) service. A service based on a secure protocol with an API interface. This service will be the successor to WHOIS (port 43).
-    Optimisation of the zone generation process and increased frequency of zone generation
-    Improvements in the Secondary DNS service for dominios.es
-    Secondary DNS service for other TLDs and reciprocal collaboration agreements
-    Own anycast and future expansion of nodes thanks to agreements with other TLDs
-    Assessment of options for software diversification.


What are the improvements in monitoring?


A variety of information sources and monitoring tools are currently in place.


The main aim is to improve the visibility and proactive management of the service. To this end, we are in the process of defining how to implement the following points:


-    Collection and centralisation of different sources of information.
-    Correlation of data.
-    Generation of new DNS service metrics.
-    Use of new advanced tools such as ENTRADA