Demystifying URL Structure: A Comprehensive Guide to Understanding and Interpreting URLs

Recommended basics: Articles you should know

To get the full picture of this article, you should know about these topics:

Effortless Website Hosting on a Budget with Namecheap

Discover how to effortlessly host your website on a small budget with Namecheap's shared hosting. Explore the process from selecting a plan to configuring SSL, and learn to upload your site for a seamless online presence.

You use them every day and probably don’t think about them much: URLs. https://reliable.codes is a URL that, if you open it, loads this website in your browser.

Besides this very common form, there is much more to a URL, which we will dive into in this article.

URLs: Navigate through the WEB #

URL stands for uniform resource locator. The basic idea is to have a string that identifies a particular piece of information (a resource) in some application, so it can be loaded whenever needed.

Such a resource can be basically anything: a text, an image, a video, some audio or a font to render text (just some examples, there are many more).

Deciphering the Anatomy of a URL #

A URL is “easily” understandable. It has a fixed layout. So as long as you know what options you have, you can read URLs and get information out of them without even opening them.

Here’s a full-featured example of what a URL can look like:

https://user:password@subdomain.domain.tld:1234/some/path?query=parameter#anchor

The components of the layout are separated by special characters and have a fixed order, so let’s walk through them.
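To make the layout tangible, here is a minimal sketch that takes the example URL apart. Python and its standard library module urllib.parse are my choice for illustration, the article itself is not tied to any language. Note that Python calls the protocol "scheme" and the anchor "fragment".

```python
from urllib.parse import urlsplit

url = "https://user:password@subdomain.domain.tld:1234/some/path?query=parameter#anchor"
parts = urlsplit(url)

# Map the names used in this article to the attributes Python provides.
components = {
    "protocol": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,  # an int, or None if the URL has no port
    "path": parts.path,
    "query": parts.query,
    "anchor": parts.fragment,
}

for name, value in components.items():
    print(f"{name:>8}: {value}")
```

Every entry in components matches one of the sections that follow.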

Protocol #

Every URL must specify the protocol (formally called the scheme). It is everything before the first ://, so in the example above it is https.

The protocol tells your system which application should process this URL. See below for more details / examples.

Authentication #

URLs can provide authentication information to access protected data. The authentication information is optional. If needed, this component will go directly after :// until the first @, so in the example above it is user:password.

The authentication information can consist of just a username, or of a username and a password. If both are provided, the : separates the two. In the given example both are provided: the username is user and the password is password.

If you just have a username, you will not see the : in this section (e.g. https://user@...).

Host #

Next, the URL specifies whom to talk to. This information can be read as one unit, but I’ll break it down for completeness.

The host comes directly after the protocol / authentication information and goes until the first : (see port), / (see path), ? (see query parameters) or # (see anchor). In the given example it is subdomain.domain.tld.

To get all details out of the host, you simply split it by the dot (.). For a public domain name you will always end up with at least 2 words, but there can be more. These words can be seen as “groups”: the first word is the most specific one, and every word after it has a bigger / more generic context (see top level domain for more information).

  • subdomain
  • domain
  • tld

Subdomain (optional) #

Every split word before the second-to-last is a so-called subdomain. In the given example we have “subdomain” as our one-and-only subdomain.

If you own a domain, you can create subdomains for free. This way, you can host multiple applications under the same name.

Second level domain #

The second-to-last split word is the second-level-domain. In the given example it is domain. The combination of second- and top-level-domain forms what is usually called “domain”. Each “domain” can exist only once and has one owner.

If you want to have your own website, one of the first things to think about is your domain name. The second-level-domain is a value chosen by you.

Top Level Domain #

The last split word is the top-level-domain. In the given example it is tld.

When it comes to top-level-domains, there is a fixed set of values. In the early days there were only a handful of generic TLDs (e.g. com, org, net) plus one TLD per country (e.g. de, it). Nowadays we see many newer generic top-level-domains (gTLDs) like codes. Some country TLDs, like io, are popular far beyond their country as well.

Even if there are now many more top-level-domains, you can just pick from what’s there, you cannot simply create new ones.
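The splitting of the host described above can be sketched in a few lines. This is a naive illustration in Python (my choice of language), and split_host is a made-up helper name. It assumes a plain domain name: it does not handle IP addresses, and it does not know about registries that use two words as their suffix (co.uk, for example).

```python
def split_host(host: str) -> dict:
    # Split the host by "." and read the words from right to left.
    labels = host.split(".")
    if len(labels) < 2:
        raise ValueError(f"expected at least two words in {host!r}")
    return {
        "subdomains": labels[:-2],          # everything before the second-to-last word
        "second_level_domain": labels[-2],  # the second-to-last word
        "top_level_domain": labels[-1],     # the last word
    }


print(split_host("subdomain.domain.tld"))
print(split_host("reliable.codes"))
```

For reliable.codes the list of subdomains is simply empty, which matches the "optional" in the subdomain section.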

Port #

Once it’s clear which computer to talk to, it must be clarified on which port to talk to it. Port numbers range from 0 to 65535 (port 0 is reserved, so 65535 of them are usable in practice). Every running program can take one or more of these ports, but each port can only be taken by one program at a time.

Ports are per-IP. Computers with multiple IPs have more ports as a result.

URLs can provide a port, but it is optional. If needed, this component will go directly after the host (which in this case is ended with a :) until the first / (see path), ? (see query parameters) or # (see anchor), so in the example above it is 1234.

Since defining a port in the URL is optional, there is a public agreement on default ports depending on the protocol used, see below.

Examples of default ports #

Let me showcase some frequently used default ports. There are definitely many more than these:

| Protocol | Default Port |
| --- | --- |
| HTTP | 80 |
| HTTPS | 443 |
| FTP | 21 |
| SSH | 22 |
| SFTP | 22 |
| SCP | 22 |

This basically means that for a URL like https://reliable.codes, port 443 will be used, since the protocol is https and no port is specified.
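The fallback to a default port can be sketched like this. It is an illustration in Python, not how any particular browser or client implements it. DEFAULT_PORTS only contains the protocols from the table above, and effective_port is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Only the defaults listed in this article, there are many more.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "ssh": 22, "sftp": 22, "scp": 22}


def effective_port(url: str):
    parts = urlsplit(url)
    if parts.port is not None:
        # A port given in the URL always wins.
        return parts.port
    # Otherwise fall back to the default of the protocol (None if unknown).
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("https://reliable.codes"))
print(effective_port("https://subdomain.domain.tld:1234/some/path"))
```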

Path #

URLs can provide a path which will go directly after the host or port (if specified) until the first ? (see query parameters) or # (see anchor), so in the example above it is /some/path.

The path is always defined. By default it is an empty string or a /.

This information helps the application that handles the request to respond with the expected information.
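A small sketch of the "empty string or /" default, again in Python as an illustration. request_path is a hypothetical helper that treats a missing path as /, which is what an HTTP client would request in that case.

```python
from urllib.parse import urlsplit


def request_path(url: str) -> str:
    # urlsplit reports a missing path as an empty string; fall back to "/".
    return urlsplit(url).path or "/"


print(repr(urlsplit("https://reliable.codes").path))
print(request_path("https://reliable.codes"))
print(request_path("https://subdomain.domain.tld:1234/some/path?query=parameter#anchor"))
```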

Query parameters #

URLs can provide query parameters, but they’re optional. If needed, this component will go directly after the path (introduced by a ?) until the first # (see anchor), so in the example above it is query=parameter.

You can have multiple query parameters, separated by &. Each query parameter is a combination of a key and a value, separated by =. So in the example above I have one query parameter with the key query and the value parameter.

This information helps the application that handles the request to be more precise in the response. Often query parameters are used to describe some form of filter.
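Here is a minimal sketch of reading multiple query parameters, written in Python for illustration. The two color parameters are invented for this example, to show that a key may appear more than once.

```python
from urllib.parse import parse_qsl, urlsplit

# "color" is a made-up filter that is not part of the example URL above.
url = "https://subdomain.domain.tld/some/path?query=parameter&color=red&color=blue"

# parse_qsl splits the query at "&" and each pair at "=".
pairs = parse_qsl(urlsplit(url).query)

# Group the values by key, since a key can be repeated.
filters = {}
for key, value in pairs:
    filters.setdefault(key, []).append(value)

print(pairs)
print(filters)
```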

Be aware that some values contain special characters, which need to be replaced:

How to encode and decode special characters in URLs

Enhance your web browsing experience by mastering URL encoding techniques, enabling you to manage special characters with precision and ease. Uncover the simplicity of percent-encoding, ensuring smooth navigation through complex web structures and dynamic content without compromising data integrity.

Anchor #

URLs can provide an anchor, optionally. If needed, this component will be the last one and comes after the #, so in the example above it is anchor.

The anchor is used by websites to tell your browser where to scroll to after the page was loaded. For this to work, the website needs to define these anchors in the HTML code.

Did you ever wonder how HTML works?

HTML - the hidden power of the WEB

Uncover the essential role of HTML in structuring web content. This post provides a foundational introduction to HTML, highlighting its crucial role in organizing information for browsers. Explore HTML document structure, the significance of head and body sections, and build a step-by-step "About Me" page. Delve into HTML with practical examples, laying the groundwork for further exploration in web development.

In this article, for example, every headline is such an anchor. So in the menu you can click a particular headline and your browser will scroll to that section. If you look closely, you’ll recognize that the menu is “just” anchor links. Benefit: You can share these links with your friends so they do not need to read everything, but are taken directly to what is interesting for them.

As long as “just the anchor changes”, browsers nowadays do not reload the page when a link is clicked. Instead, they scroll to the new position.
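The "just the anchor changes" check can be sketched as a comparison of the two URLs without their anchors. This is only an illustration in Python, not the actual logic of any browser. only_anchor_differs is a hypothetical helper and the /post path is made up.

```python
from urllib.parse import urldefrag


def only_anchor_differs(current: str, target: str) -> bool:
    # urldefrag returns the URL without the anchor, plus the anchor itself.
    current_base, current_anchor = urldefrag(current)
    target_base, target_anchor = urldefrag(target)
    return current_base == target_base and current_anchor != target_anchor


print(only_anchor_differs("https://reliable.codes/post#host", "https://reliable.codes/post#port"))
print(only_anchor_differs("https://reliable.codes/post#host", "https://reliable.codes/other#host"))
```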

Beyond the Browser: Application specific URLs #

URLs came up with the World Wide Web. Over time, their usage widened. Here are some more use-cases where the standard format is not fully met (these are just some examples, there are more):

E-Mails #

You can “generate” E-Mails from websites, see the @ button in the sharing section below the article. Of course such links will not directly send an email, but they will give your E-Mail client some context.

Here’s a full-featured example of what a mailto URL can look like:

mailto:some@friend.com?body=Nice%20page%3A%20https%3A%2F%2Freliable.codes&subject=I%20found%20a%20nice%20IT%20page

The protocol in this case is mailto (note that the separator is just : instead of ://), the user is some, the domain is friend.com and we have two query parameters: body and subject.

The values of the query parameters are hard to read because they are encoded. If you decode them, you end up with Nice page: https://reliable.codes for body and I found a nice IT page for subject.
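If you want to generate such a link instead of typing the encoded values by hand, a sketch in Python could look like this. build_mailto is a hypothetical helper, and it assumes the address itself contains no characters that need encoding.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_mailto(address: str, subject: str, body: str) -> str:
    # quote_via=quote encodes spaces as %20 (the default would use "+").
    query = urlencode({"body": body, "subject": subject}, quote_via=quote)
    return f"mailto:{address}?{query}"


link = build_mailto(
    "some@friend.com",
    subject="I found a nice IT page",
    body="Nice page: https://reliable.codes",
)
print(link)

# Decoding works the other way around.
decoded = parse_qs(urlsplit(link).query)
print(decoded)
```

The generated link is identical to the mailto example above.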

Phone calls #

If you want your visitors to easily call you, you can totally do that. Here’s a full-featured example of what a tel URL can look like:

tel:0123456789

This is probably the simplest type of URL. The protocol in this case is tel (again the separator is just : instead of ://), and the phone number 0123456789 takes the place of the host.

Custom applications #

Applications that you install on your device can register their own protocols.

WhatsApp URL Schema #

If you want your visitors to easily share your page via WhatsApp, here’s a full-featured example of what a WhatsApp URL can look like:

whatsapp://send?text=Nice%20page%3A%20https%3A%2F%2Freliable.codes

The protocol in this case is whatsapp, send takes the place of the host and we have one query parameter, which is text.

The value of the query parameter is hard to read because it is encoded. If you decode it, you end up with Nice page: https://reliable.codes.

Telegram URL Schema #

If you want your visitors to easily share your page via Telegram, here’s a full-featured example of what a Telegram URL can look like:

tg://msg_url?url=https%3A%2F%2Freliable.codes&text=Nice%20page

The protocol in this case is tg (Telegram), msg_url takes the place of the host and we have two query parameters: url and text.

The values of the query parameters are hard to read because they’re encoded. If you decode them, you end up with https://reliable.codes for url and Nice page for text.
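Since these application specific URLs follow the same layout, the same tooling can read them. Here is a sketch in Python (describe is a hypothetical helper) that takes the two share links from above apart and decodes their query parameters.

```python
from urllib.parse import parse_qs, urlsplit


def describe(url: str) -> dict:
    parts = urlsplit(url)
    # parse_qs decodes the values; keep the first value of every key.
    parameters = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"protocol": parts.scheme, "host": parts.netloc, "parameters": parameters}


whatsapp = describe("whatsapp://send?text=Nice%20page%3A%20https%3A%2F%2Freliable.codes")
telegram = describe("tg://msg_url?url=https%3A%2F%2Freliable.codes&text=Nice%20page")

print(whatsapp)
print(telegram)
```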

Conclusion: Navigating the World of URLs #

URLs started as a way to link between webpages and evolved into the navigator of the web. You can use them in websites and emails to improve the user experience.

It is nice that we have this tool at hand, but there’s still a long way to go until all stakeholders have understood how to use it. Many businesses still just write their phone number as text in crazy formats or show their email address as an image (to prevent spam?). What a pity.

I hope this article motivated you to dive deeper into this topic and to re-think your project and how it can benefit from this idea.

Keep pushing forward: Next articles to improve your skills

With this article in mind, you can keep on reading about these topics:

Node.js - JavaScript in Server Applications

Learn how to use Node.js to run JavaScript on the server side, from basic CLI applications to serving dynamic websites. Perfect for developers, sysadmins, and self-hosters.