How DNS really works

The Internet is a big place. A large set of protocols and physical infrastructure is in place that lets us use it with such ease. DNS is a vast topic. In this article I will cover the basics of DNS and its components, and catch DNS resolution in action.

Purpose of DNS#

Every device connected to the internet has an IP address assigned to it. Such a device can host a plethora of services, which anyone on the internet can access by connecting to its IP address. For example, Wikipedia has made its website accessible on IP address 103.102.166.224 (the IP address might be different for you), but it is not convenient to memorize a long string of numbers just to reach it. Pffff.

That would straight away limit access to all the cool stuff on the Internet. It is simpler to use wikipedia.org instead of an IP address, so we would rather have some mechanism that remembers these mappings for us. Just like the contact list in a phone, DNS (short for Domain Name System) maintains a bunch of information about the server name (or rather, domain name), which your web browser queries invisibly whenever you open a website.

Technically, DNS is a distributed, hierarchical database spread across the internet, with mechanisms to interact with it (insert and retrieve information). Information in DNS is stored as resource records (RRs), each of which is essentially a mapping between a domain name and some data. Some resource record types are:

  • A : maps a domain name to an IPv4 address.
  • AAAA : maps a domain name to an IPv6 address.
  • CNAME : marks a domain name as an alias and maps it to its canonical (real) name.
  • NS : maps a domain name to its authoritative nameservers.
  • SOA : (Start of Authority) marks the start of a zone and holds administrative details about it.
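
To make the idea of a resource record a little more concrete, here is a minimal sketch of how such records could be modelled and looked up. The field names, the `lookup` helper and the sample data (reserved documentation names and addresses) are my own assumptions for illustration, not how any real nameserver stores its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    name: str    # owner domain name, written as an FQDN
    ttl: int     # seconds for which the record may be cached
    rclass: str  # almost always "IN" (Internet)
    rtype: str   # "A", "AAAA", "CNAME", "NS", "SOA", ...
    rdata: str   # the data the name maps to


# Hypothetical sample data using documentation-reserved names and addresses.
RECORDS = [
    ResourceRecord("example.org.", 3600, "IN", "NS", "ns1.example.org."),
    ResourceRecord("example.org.", 300, "IN", "A", "192.0.2.10"),
    ResourceRecord("example.org.", 300, "IN", "AAAA", "2001:db8::10"),
    ResourceRecord("www.example.org.", 300, "IN", "CNAME", "example.org."),
]


def lookup(name: str, rtype: str) -> list[ResourceRecord]:
    """Return every record with the given owner name and type."""
    return [r for r in RECORDS if r.name == name and r.rtype == rtype]


print(lookup("example.org.", "A"))
```

Note that asking for the A record of `www.example.org.` returns nothing in this toy model, because that name only has a CNAME record; a real resolver would follow the alias.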

Having resource records for more than just IP addresses is what brings a hierarchical structure to DNS and makes domain name management easy. DNS also keeps access consistent: the server name stays the same even when the IP address of the server changes.

DNS is a critical piece of internet infrastructure, and the internet as we know it would crumble without it.

Breakdown of a URL#

Before we begin, let's first understand a URL and figure out which part of it is resolved by DNS. Consider the URL below.

https://www.youtube.com/watch?v=dQw4w9WgXcQ
  • ‘https’ is the protocol (scheme) used to communicate with the web service once the connection is established.
  • ‘www.youtube.com’ is the domain name that represents the device hosting the web service and needs to be resolved to an IP address. It is also called a Fully Qualified Domain Name (FQDN).
  • ‘watch?v=dQw4w9WgXcQ’ is the resource or page we want to access in the web service (a path followed by a query string).

The FQDN string is made up of labels separated by single dots. Strictly speaking, an FQDN ends with a dot, which represents the root domain, so ‘www.youtube.com’ and ‘www.youtube.com.’ refer to the same name. The trailing dot can be omitted when writing the name, but internally all resolution happens with it in place.
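
The trailing dot and the labels can be illustrated with a tiny sketch. The helper names `to_fqdn` and `labels` are hypothetical, chosen only for this example.

```python
def to_fqdn(name: str) -> str:
    """Append the root dot if the name does not already end with one."""
    return name if name.endswith(".") else name + "."


def labels(name: str) -> list[str]:
    """Split a domain name into its labels, left to right."""
    stripped = to_fqdn(name).rstrip(".")
    return stripped.split(".") if stripped else []  # the root has no labels


print(to_fqdn("www.youtube.com"))   # www.youtube.com.
print(labels("www.youtube.com."))   # ['www', 'youtube', 'com']
```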

image

Hierarchy in DNS#

Domain Names#

A domain name is a realm within the internet over which the entity owning it has administrative power: it can create or update resource records in DNS for that domain name. Domain names are hierarchical and can be visualized as a tree. wikipedia.org. is an example of a domain name. The hierarchy descends from the rightmost label to the leftmost label in the FQDN: each label specifies a sub-domain of the domain to its right. The first domain in the hierarchy is the root domain (represented by a dot .).

com, org, in and io are some sub-domains under the root domain. This first level of domains under the root is called the Top Level Domains (TLDs), and they are operated independently of the root domain. Sub-domains of the TLDs are available to internet users for registration, usually for a fee. For example, wikipedia is a sub-domain of the TLD org. and must have been registered at some point in time.

image

  • com, org and in are sub-domains of the root domain.
  • youtube and duckduckgo are sub-domains of the com (or com.) domain.
  • www and music are sub-domains of the youtube.com (or youtube.com.) domain.
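
The right-to-left descent can be sketched in a few lines. The function name `ancestry` is my own; it simply lists every domain on the path from the root down to a given name.

```python
def ancestry(fqdn: str) -> list[str]:
    """List the domains from the root down to the given name."""
    parts = [p for p in fqdn.rstrip(".").split(".") if p]
    chain = ["."]  # the root domain comes first
    # Walk the labels from right to left, growing the name one label at a time.
    for i in range(len(parts) - 1, -1, -1):
        chain.append(".".join(parts[i:]) + ".")
    return chain


print(ancestry("music.youtube.com."))
# ['.', 'com.', 'youtube.com.', 'music.youtube.com.']
```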

Having authority over a domain name gives the ability to map that domain to the IP address of some network device (which may be running some network services) or to create sub-domains. A domain owner can even choose to delegate authority over a sub-domain to a different entity. For example, if Google wanted to, it could sell the domain music.youtube.com and delegate full authority over it to the buyer. This is what TLDs generally do when they sell a sub-domain.

Zones#

A zone is the set of sub-domains, including the domain itself, over which a domain owner exercises full control. The root zone covers just the root domain, along with the records that delegate each TLD to its own nameservers. ICANN, the organization that manages the root zone, has the authority to create more TLDs (which are sub-domains of the root), as it has done in the past.

Sub-domains of the root domain such as com and org are completely independent of the root domain in terms of authority. This means the root domain is not affected in any way when a new sub-domain is added under the com domain, as the com domain is outside its jurisdiction. Likewise, facebook.com is in a zone separate from com. The domains apps.facebook.com and developers.facebook.com are in the same zone as facebook.com. If facebook.com wanted to add a new service, say for live TV, it could set up tv.facebook.com in the same zone without bothering the parent domain com.

image

It is possible for a domain to include just a few sub-domains under its zone and delegate authority for the other subdomains.
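
One way to picture zones is as a set of "zone apexes" (the top name of each zone), where every name belongs to the closest apex above it. The sketch below assumes a hypothetical, hard-coded set of apexes matching the example above; in real DNS the boundaries are defined by NS delegations, not by a list like this.

```python
# Hypothetical zone apexes for illustration only.
ZONE_APEXES = {".", "com.", "facebook.com."}


def enclosing_zone(fqdn: str) -> str:
    """Return the closest zone apex that is the name itself or one of its ancestors."""
    name = fqdn if fqdn.endswith(".") else fqdn + "."
    while name != ".":
        if name in ZONE_APEXES:
            return name
        # Drop the leftmost label and try the parent domain.
        _, _, parent = name.partition(".")
        name = parent or "."
    return "."


print(enclosing_zone("apps.facebook.com"))  # facebook.com.
print(enclosing_zone("tv.facebook.com"))    # facebook.com.
print(enclosing_zone("example.com"))        # com.
```

If facebook.com delegated tv.facebook.com to someone else, that would correspond to adding `"tv.facebook.com."` to the set, and names under it would then fall into the new zone.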

Authoritative Nameserver#

Each domain has at least two authoritative nameservers (for redundancy) associated with it. These nameservers maintain and serve the resource records for the domain and its sub-domains. Any query for a domain name is ultimately answered by its authoritative nameservers.

Authoritative nameservers are important components of DNS, as they form the nodes of the distributed, hierarchical database that is queried for domain name resolution. They make DNS distributed: each domain can have its own nameservers and is not tied to a single central database.

It is up to the administrators to configure the nameservers as they see fit. Big organizations may maintain their own authoritative nameservers that manage resource records exclusively for their domains. But it is very common for a domain's authoritative nameservers to be shared with other domains. In other words, many separate, unrelated domain names could be using the same nameservers.

DNS Resolution in Action#

DNS resolution is initiated by a DNS client seeking some information (like an IP address) about a domain name. It creates a DNS query and sends it to a DNS server (often called a recursive resolver), which then resolves the query on the client's behalf. When we connect to the Internet, our network configuration holds default DNS servers, which might be provided by our ISP. We can also use DNS servers of our choice. 8.8.8.8 is a very popular, easy-to-remember DNS server provided by Google.
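
For the curious, here is roughly what "creates a DNS query" means on the wire. This is a minimal sketch of a single-question query in the standard DNS message layout (a 12-byte header followed by the question); the function name `build_query` and the fixed query ID are my own choices, and the sketch only builds the bytes without sending them anywhere.

```python
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Encode a single-question DNS query (qtype 1 = A, class 1 = IN)."""
    # Header: ID, flags (only "recursion desired" set = 0x0100),
    # then QDCOUNT=1 and ANCOUNT, NSCOUNT, ARCOUNT all 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, ended by a zero byte (the root).
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


packet = build_query("en.wikipedia.org")
print(len(packet), packet.hex())
```

Notice how the zero byte that ends the name plays the role of the root domain, the same trailing dot we saw in the FQDN.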

The image below demonstrates the flow of queries and responses in DNS resolution.

image

  1. The client, seeking Type A resource records (which hold IP addresses) for the domain en.wikipedia.org, creates a DNS query and sends it to the configured DNS server.
  2. To find the IP address of en.wikipedia.org, the DNS server must first find the nameservers for the domain. To do so, it resolves the domain name one label at a time from right to left, starting with the root domain, whose nameserver addresses are preconfigured in the DNS server. Next in line is org., so it queries one of the authoritative nameservers of the root domain, asking for the nameservers of org. Note that it looks for nameservers first because they hold all the information for the domain.
  3. The queried nameserver returns the result to the DNS server.
  4. The DNS server next queries one of the nameservers of the org. domain, asking for the nameservers of wikipedia.org.
  5. The queried nameserver returns a suitable result to the DNS server.
  6. At last, the DNS server queries a wikipedia.org nameserver, asking for the Type A records of en.wikipedia.org.
  7. The queried nameserver responds to the DNS server with the Type A resource record.
  8. The DNS server returns this result to the DNS client that raised the query.
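
The steps above can be sketched as a toy simulation. Everything here is made up for illustration: the `SERVERS` dictionary stands in for real nameservers, the address is a reserved documentation address, and the `resolve` function only mimics the referral-following logic without touching the network.

```python
# Hypothetical in-memory "nameservers", keyed by the zone they serve.
SERVERS = {
    ".": {"delegations": ["org."], "records": {}},
    "org.": {"delegations": ["wikipedia.org."], "records": {}},
    "wikipedia.org.": {
        "delegations": [],
        "records": {("en.wikipedia.org.", "A"): ["192.0.2.1"]},
    },
}


def resolve(name: str, rtype: str = "A") -> tuple[list[str], list[str]]:
    """Walk from the root zone down, following referrals until an answer appears.

    Returns the answer data and the list of zones whose servers were queried.
    """
    zone, path = ".", []
    while True:
        path.append(zone)
        server = SERVERS[zone]
        answer = server["records"].get((name, rtype))
        if answer is not None:
            return answer, path
        # Look for a delegated child zone that the queried name falls under.
        referral = next(
            (d for d in server["delegations"] if name == d or name.endswith("." + d)),
            None,
        )
        if referral is None:
            return [], path  # nobody further down to ask: no such record
        zone = referral


answer, path = resolve("en.wikipedia.org.")
print(path, answer)
# ['.', 'org.', 'wikipedia.org.'] ['192.0.2.1']
```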

To spare the nameservers the load of repetitive queries, DNS servers implement a cache and only query the nameservers in the hierarchy if the resource record requested by the client is not in their cache. To explain DNS resolution with all of its components, the above illustration doesn’t take the cache into consideration. If the DNS servers and nameservers do not have the requested information, they reply with a suitable response (for example, the NXDOMAIN status when the name does not exist).
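
A cache of this kind can be sketched as a small class that remembers an answer until its lifetime in seconds (its TTL) runs out. The class name `TTLCache` and its methods are hypothetical; real resolvers are considerably more involved.

```python
import time


class TTLCache:
    """Keep answers until their TTL expires; a simplified model of a resolver cache."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the behaviour is easy to test
        self._entries = {}    # (name, rtype) -> (expiry time, data)

    def put(self, name, rtype, data, ttl):
        self._entries[(name, rtype)] = (self._clock() + ttl, data)

    def get(self, name, rtype):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._entries[(name, rtype)]  # expired, force a fresh lookup
            return None
        return data


cache = TTLCache()
cache.put("en.wikipedia.org.", "A", ["192.0.2.1"], ttl=30)
print(cache.get("en.wikipedia.org.", "A"))  # served from cache while the TTL lasts
```

A DNS server would call `get` first and only walk the hierarchy when it returns `None`.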

Examples using DiG#

After all the reading, it’s fun time: let’s see it all happening and validate the theory. I’ve set my DNS server to 208.67.222.222 and will use the dig utility to carry out a couple of DNS queries in a bash shell.

1. Simple dig query#

dig expects a domain name as an argument and will query for Type A resource records by default. Below is the command dig www.reddit.com and its output.

neeraj@mrm:~$ dig www.reddit.com 

; <<>> DiG 9.16.1-Ubuntu <<>> www.reddit.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16713
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.reddit.com.			IN	A

;; ANSWER SECTION:
www.reddit.com.		254	IN	CNAME	reddit.map.fastly.net.
reddit.map.fastly.net.	30	IN	A	151.101.65.140
reddit.map.fastly.net.	30	IN	A	151.101.129.140
reddit.map.fastly.net.	30	IN	A	151.101.193.140
reddit.map.fastly.net.	30	IN	A	151.101.1.140

;; Query time: 84 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Tue Sep 01 00:16:04 IST 2020
;; MSG SIZE  rcvd: 142

Output explanation :

The snippet below, taken from the output above, lists the DiG version, the query issued and the global options in effect.

; <<>> DiG 9.16.1-Ubuntu <<>> www.reddit.com
;; global options: +cmd

Next is some information from the DNS header, like the status, the flags and the number of records in each section.

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16713
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1
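
To make the flags a bit more concrete: in the DNS header they live in a single 16-bit field together with the opcode and the response code. The sketch below decodes such a field. The value 0x8180 is my own reconstruction of a field that would print as `qr rd ra` with opcode QUERY (0) and status NOERROR (0); dig does not show the raw number.

```python
# Bit positions of the flags inside the 16-bit field (bit 15 is the leftmost).
FLAG_BITS = {"qr": 15, "aa": 10, "tc": 9, "rd": 8, "ra": 7, "ad": 5, "cd": 4}


def decode_flags(word: int) -> dict:
    """Split the 16-bit flags field of a DNS header into its parts."""
    return {
        "flags": [name for name, bit in FLAG_BITS.items() if (word >> bit) & 1],
        "opcode": (word >> 11) & 0xF,  # 0 means a standard QUERY
        "rcode": word & 0xF,           # 0 means NOERROR
    }


print(decode_flags(0x8180))
# {'flags': ['qr', 'rd', 'ra'], 'opcode': 0, 'rcode': 0}
```

Here qr marks the message as a response, rd means recursion was desired by the client, and ra means the server offers recursion.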

Then comes the query issued to the DNS server. The columns in the Question Section are the domain name, the DNS class and the type of resource record. Here IN means the DNS class ‘Internet’.

;; QUESTION SECTION:
;www.reddit.com.			IN	A

The Question Section is followed by the response, in the form of the Answer, Authority and Additional sections. In this example we just have the Answer Section. We received a Type CNAME resource record for the queried domain instead of Type A because www.reddit.com. is set up as an alias: a name that has a CNAME record cannot also have its own A records, so the DNS server follows the alias and returns the records of the target name as well. In other words, www.reddit.com. is an alias for the canonical name reddit.map.fastly.net. The number after the domain name is the TTL (time to live), the number of seconds for which the resource record may be cached. After the CNAME we have the Type A resource records for reddit.map.fastly.net. This gives us 4 IP addresses at which we can reach www.reddit.com.

;; ANSWER SECTION:
www.reddit.com.		254	IN	CNAME	reddit.map.fastly.net.
reddit.map.fastly.net.	30	IN	A	151.101.65.140
reddit.map.fastly.net.	30	IN	A	151.101.129.140
reddit.map.fastly.net.	30	IN	A	151.101.193.140
reddit.map.fastly.net.	30	IN	A	151.101.1.140
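
Since every line in the Answer Section follows the same column layout (name, TTL, class, type, data), it is easy to pull apart in a script. Here is a small sketch that parses the lines above; the function name `parse_records` is hypothetical, and the text is simply copied from the output shown in this article.

```python
ANSWER_SECTION = """\
www.reddit.com.        254 IN CNAME reddit.map.fastly.net.
reddit.map.fastly.net. 30  IN A     151.101.65.140
reddit.map.fastly.net. 30  IN A     151.101.129.140
reddit.map.fastly.net. 30  IN A     151.101.193.140
reddit.map.fastly.net. 30  IN A     151.101.1.140
"""


def parse_records(text):
    """Turn dig-style record lines into (name, ttl, class, type, data) tuples."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(";"):
            continue  # skip blank lines and dig's comment lines
        name, ttl, rclass, rtype, rdata = line.split(None, 4)
        records.append((name, int(ttl), rclass, rtype, rdata))
    return records


records = parse_records(ANSWER_SECTION)
addresses = [r[4] for r in records if r[3] == "A"]
print(addresses)
```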

Finally we have some stats about the whole operation: the time it took, the DNS server that was queried, etc.

;; Query time: 84 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Tue Sep 01 00:16:04 IST 2020
;; MSG SIZE  rcvd: 142

2. More on CNAME#

You may know that you can also open Facebook from www.fb.com. Let’s see what’s happening behind the scenes. I encourage you to go through the output.

neeraj@mrm:~$ dig www.fb.com 

; <<>> DiG 9.16.1-Ubuntu <<>> www.fb.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29969
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.fb.com.			IN	A

;; ANSWER SECTION:
www.fb.com.		6569	IN	CNAME	www.facebook.com.
www.facebook.com.	668	IN	CNAME	star-mini.c10r.facebook.com.
star-mini.c10r.facebook.com. 60	IN	A	157.240.198.35

;; Query time: 112 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Sun Sep 06 23:47:27 IST 2020
;; MSG SIZE  rcvd: 111

Ah! So in the Answer section we can see that www.fb.com is just an alias of www.facebook.com, which in turn is an alias of star-mini.c10r.facebook.com.
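
Following such a chain of aliases is simple enough to sketch in code. The records below are copied from the Answer Section above; the function name `follow_cnames` and the `max_hops` guard against alias loops are my own additions.

```python
RECORDS = [
    ("www.fb.com.", "CNAME", "www.facebook.com."),
    ("www.facebook.com.", "CNAME", "star-mini.c10r.facebook.com."),
    ("star-mini.c10r.facebook.com.", "A", "157.240.198.35"),
]


def follow_cnames(name, records, max_hops=8):
    """Follow CNAME records starting at name; return (chain of names, A addresses)."""
    cname = {owner: data for owner, rtype, data in records if rtype == "CNAME"}
    chain = [name]
    # Keep hopping while the current name is itself an alias.
    while chain[-1] in cname and len(chain) <= max_hops:
        chain.append(cname[chain[-1]])
    addresses = [d for owner, rtype, d in records if rtype == "A" and owner == chain[-1]]
    return chain, addresses


chain, addresses = follow_cnames("www.fb.com.", RECORDS)
print(" -> ".join(chain), addresses)
```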

We can even directly query for a certain resource record by putting the Type after the domain name.

neeraj@mrm:~$ dig www.fb.com CNAME

; <<>> DiG 9.16.1-Ubuntu <<>> www.fb.com CNAME
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37259
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.fb.com.			IN	CNAME

;; ANSWER SECTION:
www.fb.com.		570	IN	CNAME	www.facebook.com.

;; Query time: 68 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Sun Sep 06 23:48:17 IST 2020
;; MSG SIZE  rcvd: 66

3. Complete DNS resolution#

The dig utility has an interesting flag, +trace, which emulates how a DNS server resolves a query. dig iteratively resolves the domain name in the query, starting from the root domain. You can compare this with the diagram above. In the end we get our result: www.duckduckgo.com. turns out to be an alias, and we also get the Type A resource record of the name it points to. Please go through the output. With the +trace flag the output contains only the responses, and at the bottom of each block of responses is the address and name of the DNS server that replied.

Ignore the DS, NSEC3 and RRSIG resource records for now (they are part of DNSSEC).

neeraj@mrm:~$ dig www.duckduckgo.com  +trace

; <<>> DiG 9.16.1-Ubuntu <<>> www.duckduckgo.com +trace
;; global options: +cmd
.			518400	IN	NS	a.root-servers.net.
.			518400	IN	NS	b.root-servers.net.
.			518400	IN	NS	c.root-servers.net.
.			518400	IN	NS	d.root-servers.net.
.			518400	IN	NS	e.root-servers.net.
.			518400	IN	NS	f.root-servers.net.
.			518400	IN	NS	g.root-servers.net.
.			518400	IN	NS	h.root-servers.net.
.			518400	IN	NS	i.root-servers.net.
.			518400	IN	NS	j.root-servers.net.
.			518400	IN	NS	k.root-servers.net.
.			518400	IN	NS	l.root-servers.net.
.			518400	IN	NS	m.root-servers.net.
.			518400	IN	RRSIG	NS 8 0 518400 20200919050000 20200906040000 46594 . shcVsOdL/w+sH9xm8cdCgjCgu2feO/b5J7HAg8SdyHa1pzh/VSO+PL6N kLac2uYQZ//3bkPjPa1lRdBUTQvFfYWKRKz385NldCl1CSBMc5rpjyx3 qPgz21JVmV7BWzfehqduOhAQ0tk0+wahbcjEW3IfDydfpR+NXBh+DQg/ GSTZoXlfQ3UubGPdzIX9ihyRVwWe/dM5xc3ooLi/exPcNSm2exdpgHHY VsIWarQapYGFIbdrsNstevhrRp91ClfLm88ZwPEtjVjPoW3T7yffsC/O 7YNRc9q7g59srKAKaUHhjXx01HaXG/3SGKrsnQRgfTP6t8Tmdu/0fFGI erH7AQ==
;; Received 525 bytes from 208.67.222.222#53(208.67.222.222) in 59 ms

com.			172800	IN	NS	a.gtld-servers.net.
com.			172800	IN	NS	b.gtld-servers.net.
com.			172800	IN	NS	c.gtld-servers.net.
com.			172800	IN	NS	d.gtld-servers.net.
com.			172800	IN	NS	e.gtld-servers.net.
com.			172800	IN	NS	f.gtld-servers.net.
com.			172800	IN	NS	g.gtld-servers.net.
com.			172800	IN	NS	h.gtld-servers.net.
com.			172800	IN	NS	i.gtld-servers.net.
com.			172800	IN	NS	j.gtld-servers.net.
com.			172800	IN	NS	k.gtld-servers.net.
com.			172800	IN	NS	l.gtld-servers.net.
com.			172800	IN	NS	m.gtld-servers.net.
com.			86400	IN	DS	30909 8 2 E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
com.			86400	IN	RRSIG	DS 8 1 86400 20200919050000 20200906040000 46594 . RQNHtH2zX1hOpuchqw/ZFwRgDQU6oIvSNtUIWq2vnKKKmi0GL1eOJSPX zkEVq2vhSAjpfwqruMzSEL+fa4el1lA9ufC7lfOzONAIsvasPEyMxqDB qA8KxfdJNbBClA6iDiFvqP5zzNlgD2npNDIy4moxfhoM6bHqRYvBNqFC Sthsd3lA2rGcGJ0sbXYUaSSkqTABb+d8MqUifls5UHkGboWIs9hgTySZ oMnygnwolMJjE74xipQTD+FinBiUcfyRhe6BD/bO2JOkC6HyKRqfacBE 1xvGp7GGXJJ4DF8RY+rNuhWZrzx/U4yBThKHTZipaAwnLx1/MAy7wPLo 78bgug==
;; Received 1178 bytes from 198.97.190.53#53(h.root-servers.net) in 63 ms

duckduckgo.com.		172800	IN	NS	dns1.p05.nsone.net.
duckduckgo.com.		172800	IN	NS	dns2.p05.nsone.net.
duckduckgo.com.		172800	IN	NS	dns3.p05.nsone.net.
duckduckgo.com.		172800	IN	NS	dns4.p05.nsone.net.
duckduckgo.com.		172800	IN	NS	ns04.quack-dns.com.
duckduckgo.com.		172800	IN	NS	ns03.quack-dns.com.
duckduckgo.com.		172800	IN	NS	ns02.quack-dns.com.
duckduckgo.com.		172800	IN	NS	ns01.quack-dns.com.
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN NSEC3 1 1 0 - CK0Q1GIN43N1ARRC9OSM6QPQR81H5M9A NS SOA RRSIG DNSKEY NSEC3PARAM
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN RRSIG NSEC3 8 2 86400 20200910044132 20200903033132 24966 com. K6VW6C0oC+auVPTbHxy4vSc4em0hAvhlzBiLRTqiO+axNGK71dwVKNVP Kzp7ltUjiuPvNtA0FxvwR8OwN57WXO7tR7tQWaWeE7+VhqPQMYuYa6dT 3HMFHa9udTCFyG5qdOZeYCPmfOon6un4IijrJ+yyDV817BGOvRfPsmUj fpENyGNckI0m/gNJ5ZfxECSTtxEJkMOjuHlIm7ETJ+qmow==
BN1FJS0UO0RMBT477B345GNU6A9CFODA.com. 86400 IN NSEC3 1 1 0 - BN1FSPPU7UST4HCP0ADMG9U117OMTH0V NS DS RRSIG
BN1FJS0UO0RMBT477B345GNU6A9CFODA.com. 86400 IN RRSIG NSEC3 8 2 86400 20200911053325 20200904042325 24966 com. Ec2/Sko4MmcDqenrDWRbHPk1NBc2fvkqPUmjTw2YZCgUI/Okj1QBytgt TgHK3zrpMUW6hBwyCdn3ewa6lt3FgOvCSY33/t9SgQDLz5cbqaOk+kYV ZYXtv5H3OdyK22vbO5SPvXMssMHhYbKqU+2M3IM7WN8PuQJ/BdpOQ4qG sbYgG19C3KDoYM0U5oMsvFmBIMzEPJR+BJ/f+1lqYvZ9qQ==
;; Received 947 bytes from 192.48.79.30#53(j.gtld-servers.net) in 199 ms

www.duckduckgo.com.	86400	IN	CNAME	duckduckgo.com.
duckduckgo.com.		200	IN	A	40.81.94.43
;; Received 77 bytes from 148.163.196.65#53(ns02.quack-dns.com) in 187 ms

Output explanation :

  • dig starts by asking the configured DNS server (208.67.222.222) for the nameservers of the root domain.
  • One of the nameservers of the root domain (h.root-servers.net) is then queried, and it replies with the com. nameservers.
  • Then the com. nameserver j.gtld-servers.net is queried, and it replies with the nameservers of duckduckgo.com.
  • Finally, the ns02.quack-dns.com nameserver replies with the CNAME and Type A resource records. Notice the CNAME resource record here: www.duckduckgo.com. is an alias of duckduckgo.com., so there is no need to add www. in front of duckduckgo.com to access the website.
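
If you want to pull the chain of servers out of a trace automatically, the "Received ... from" footers are enough. Below is a small sketch that extracts them with a regular expression; the footer lines are copied from the output above, while the pattern and the function name `trace_hops` are my own and only assume the footer format seen in this dig version.

```python
import re

TRACE_FOOTERS = """\
;; Received 525 bytes from 208.67.222.222#53(208.67.222.222) in 59 ms
;; Received 1178 bytes from 198.97.190.53#53(h.root-servers.net) in 63 ms
;; Received 947 bytes from 192.48.79.30#53(j.gtld-servers.net) in 199 ms
;; Received 77 bytes from 148.163.196.65#53(ns02.quack-dns.com) in 187 ms
"""

# Groups: bytes received, server IP, port, server name, time in milliseconds.
FOOTER = re.compile(
    r"^;; Received (\d+) bytes from (\S+)#(\d+)\((\S+)\) in (\d+) ms$", re.M
)


def trace_hops(text):
    """Return (server name, milliseconds) for every hop in a dig +trace output."""
    return [(m.group(4), int(m.group(5))) for m in FOOTER.finditer(text)]


hops = trace_hops(TRACE_FOOTERS)
for server, ms in hops:
    print(f"{server:22} {ms:4d} ms")
print("total:", sum(ms for _, ms in hops), "ms")
```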

Lo and behold! I hope that this article helped you understand the basic workings of DNS.