Demystifying URL Structure: A Comprehensive Guide to Understanding and Interpreting URLs
You use them every day and probably don’t think about them much: URLs. https://reliable.codes is a URL that, if you open it, loads this website in your browser. Besides this very common form, there is a lot more to URLs, which we will dive into in this article.
URLs: Navigate through the WEB #
URL stands for uniform resource locator. The basic idea is to have a string that identifies a particular piece of information (a resource) in some application, so it can be loaded whenever needed.
A resource can be basically anything: a text, an image, a video, some audio or a font to render text (these are just some examples, there are many more).
Deciphering the Anatomy of a URL #
A URL is “easily” understandable. It has a fixed layout. As long as you know which components are possible, you can read a URL and get information out of it without even opening it.
Here’s a full-featured example of what a URL can look like:
https://user:password@subdomain.domain.tld:1234/some/path?query=parameter#anchor
The components of the layout are separated by special characters and have a fixed order, so let’s walk through them.
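If you prefer to see the breakdown in code first, here is a minimal sketch using Python's standard library. The URL is assembled from the component values named in the sections below; note that Python calls the protocol "scheme" and the anchor "fragment".

```python
from urllib.parse import urlsplit

# Example URL assembled from the components this article walks through.
url = "https://user:password@subdomain.domain.tld:1234/some/path?query=parameter#anchor"

parts = urlsplit(url)

print(parts.scheme)    # protocol
print(parts.username)  # authentication: username
print(parts.password)  # authentication: password
print(parts.hostname)  # host
print(parts.port)      # port (as an integer)
print(parts.path)      # path
print(parts.query)     # query parameters (still one raw string)
print(parts.fragment)  # anchor
```

Every component that is missing from a URL simply comes back empty (or as None), which matches the idea that most components are optional.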
Protocol #
Every URL must specify the protocol. It is the first word, up to the ://, so in the example above it is https.
The protocol tells your system which application should process this URL.
See below for more details and examples.
Authentication #
URLs can provide authentication information to access protected data. The authentication information is optional. If present, this component goes directly after the :// and runs until the first @, so in the example above it is user:password.
The authentication information can consist of just a username, or of a username and a password. If both are provided, the : separates the two. In the given example, both are provided: the username is user and the password is password.
If you just have a username, you will not see the : in this section (e.g. https://user@...).
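To make the three cases tangible (both values, username only, nothing at all), here is a small sketch. The helper name `credentials` and the host `domain.tld` are made up for illustration.

```python
from urllib.parse import urlsplit

def credentials(url):
    """Return (username, password) from a URL; missing parts are None."""
    parts = urlsplit(url)
    return parts.username, parts.password

print(credentials("https://user:password@domain.tld"))  # both provided
print(credentials("https://user@domain.tld"))           # username only
print(credentials("https://domain.tld"))                # no authentication
```

Keep in mind that anything written into a URL is visible to whoever sees that URL, so this sketch is about reading the format, not a recommendation to ship passwords this way.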
Host #
Next, the URL specifies whom to talk to. This information can be read as a single unit, but for completeness I’ll break it down.
The host comes directly after the protocol (or the authentication information, if present) and goes until the first : (see port), / (see path), ? (see query parameters) or # (see anchor). In the given example it is subdomain.domain.tld.
To get all details out of the host, you simply split it by the dots. For a public domain name you will end up with at least two words, but there can be more. These words can be seen as “groups”: the first word is the most specific one, and every word after it has a bigger, more generic context (see top level domain for more information).
- subdomain
- domain
- tld
Subdomain (optional) #
Every word before the second-to-last is a so-called subdomain. In the given example, “subdomain” is our one and only subdomain.
If you own a domain, you can create subdomains for free. This way, you can host multiple applications under the same name.
Second level domain #
The second-to-last word is the second-level domain. In the given example it is domain. The combination of second-level and top-level domain forms what is usually called the “domain”. A “domain” can only exist once and has one owner.
If you want to have your own website, one of the first things to think about is your domain name. The second-level domain is a value that you choose yourself.
Top Level Domain #
The last word is the top-level domain. In the given example it is tld.
Top-level domains are a fixed set of values. In the early days there were only a handful of generic ones (e.g. com, org, net) plus one TLD per country (e.g. de, it). Nowadays we also see many newer generic top-level domains (gTLDs) like codes. Some country-code TLDs, such as io, are popular far beyond the territory they were assigned to.
Even though there are many more top-level domains now, you can only pick from what exists; you cannot simply create new ones.
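The host breakdown described above can be sketched in a few lines. This is a deliberately naive version that follows the article's "split by dots" rule: the function name `split_host` is made up, and it assumes the last word is the whole top-level part, so it does not handle IP addresses or suffixes that span two words (such as co.uk).

```python
def split_host(host):
    """Naively split a host name into subdomains, second-level domain and TLD."""
    labels = host.split(".")
    if len(labels) < 2:
        raise ValueError(f"expected at least two dot-separated words: {host!r}")
    return {
        "subdomains": labels[:-2],      # everything before the second-to-last word
        "second_level": labels[-2],     # the name chosen by the owner
        "top_level": labels[-1],        # picked from the fixed set of TLDs
    }

print(split_host("subdomain.domain.tld"))
print(split_host("reliable.codes"))
```

For real-world use you would want a library that knows the list of public suffixes; the sketch only illustrates the reading order from specific to generic.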
Port #
Once it’s clear which computer to talk to, it must be clarified which port to talk to it on. Port numbers range from 0 to 65535 (port 0 is reserved, which leaves 65535 usable ports). Every running program can take one or more of these ports, but each port can only be taken by one program at a time.
Ports are per IP address. Computers with multiple IPs have more ports as a result.
URLs can provide a port, but it is optional. If present, this component goes directly after the host (which in this case is ended with a :) and runs until the first / (see path), ? (see query parameters) or # (see anchor), so in the example above it is 1234.
Since defining a port in the URL is optional, there is a public agreement on default ports depending on the protocol used, see below.
Examples of default ports #
Here are some frequently used default ports; there are many more than these:
Protocol | Default Port |
---|---|
HTTP | 80 |
HTTPS | 443 |
FTP | 21 |
SSH | 22 |
SFTP | 22 |
SCP | 22 |
This basically means: if you have a URL like https://reliable.codes, port 443 will be used, since the protocol is https and nothing else is specified.
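The fallback rule can be written down as a small lookup. The sketch below copies the table above into a dictionary; the names `DEFAULT_PORTS` and `effective_port` are made up for illustration, and the table is only as complete as the one in this article.

```python
from urllib.parse import urlsplit

# Default ports taken from the table above (not an exhaustive list).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "ssh": 22, "sftp": 22, "scp": 22}

def effective_port(url):
    """Return the explicit port of a URL, or the protocol's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port                      # port written into the URL wins
    return DEFAULT_PORTS.get(parts.scheme)     # None if the protocol is unknown here

print(effective_port("https://reliable.codes"))
print(effective_port("https://subdomain.domain.tld:1234/some/path"))
```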
Path #
URLs can provide a path, which goes directly after the host or port (if specified) and runs until the first ? (see query parameters) or # (see anchor), so in the example above it is /some/path.
The path is always defined: by default it is an empty string or a /.
This information helps the application that handles the request to respond with the expected information.
Query parameters #
URLs can provide query parameters, but they’re optional. If present, this component goes directly after the path, introduced by a ?, and runs until the first # (see anchor), so in the example above it is query=parameter.
You can have multiple query parameters; they are separated by &. Each query parameter is a combination of a key and a value, separated by =. So in the example above there is one query parameter with the key query and the value parameter.
This information helps the application that handles the request to be more precise in its response. Query parameters are often used to describe some form of filter.
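Splitting by & and = is something you rarely need to do by hand. As a minimal sketch, Python's standard library can turn the query into key/value pairs; the helper name `query_parameters` and the second URL with its color and size filters are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

def query_parameters(url):
    """Return the query parameters of a URL as a list of (key, value) pairs."""
    return parse_qsl(urlsplit(url).query)

# One parameter, as in the example above.
print(query_parameters("https://domain.tld/some/path?query=parameter#anchor"))

# Several parameters, e.g. used as filters (hypothetical keys).
print(query_parameters("https://domain.tld/list?color=red&size=m"))
```

A list of pairs is used here instead of a dictionary because the same key may legally appear more than once in a query.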
Anchor #
URLs can optionally provide an anchor. If present, this component is the last one and is introduced by a #, so in the example above it is anchor.
The anchor is used on websites to tell your browser where to scroll to after the page has loaded. For this to work, the website needs to define these anchors in its HTML code.
In this article, for example, every headline is such an anchor. So in the menu you can click a particular headline and your browser will scroll to that section. If you look closely, you’ll notice that the menu consists of “just” anchor links.
Benefit: you can share these links with your friends, so they do not need to read everything but are taken directly to what is interesting for them.
As long as only the anchor changes, browsers nowadays do not reload the page when a link is clicked. Instead, they scroll to the new position.
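The "only the anchor changes" check can be sketched like this. It is a simplification of what browsers actually do, and the function name `same_page` and the example URLs are made up for illustration.

```python
from urllib.parse import urldefrag

def same_page(current, target):
    """True if two URLs differ at most in their anchor."""
    return urldefrag(current).url == urldefrag(target).url

page = "https://domain.tld/some/path?query=parameter"

print(urldefrag(page + "#anchor").fragment)                 # the anchor itself
print(same_page(page + "#anchor", page + "#other-anchor"))  # scroll, no reload
print(same_page(page + "#anchor", "https://domain.tld/other/path#anchor"))
```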
Beyond the Browser: Application specific URLs #
URLs came up with the World Wide Web. Over time, their usage widened. Here are some more use cases where the standard format is not fully met (these are just some examples, there are more):
E-Mails #
You can “generate” e-mails from websites, see the @ button in the sharing section below the article. Of course, such links will not send an e-mail directly, but they give your e-mail client some context.
Here’s a full-featured example of what a mailto URL can look like:
|
|
The protocol in this case is mailto (even though the separator is just a : instead of ://), the user is some, the domain is friend.com and we have two query parameters, body and subject.
The values of the query parameters are hard to read because they are encoded. If you decode them, you end up with Nice page: https://reliable.codes for body and I found a nice IT page for subject.
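To see where the hard-to-read values come from, here is a sketch that builds such a link from the readable values and decodes it again. The helper name `build_mailto` and the order of the two parameters are assumptions; the address and texts are the ones described above.

```python
from urllib.parse import quote, urlsplit, parse_qs

def build_mailto(address, subject, body):
    """Build a mailto link with percent-encoded subject and body."""
    # safe="" makes quote() encode every reserved character, including / and :
    return f"mailto:{address}?subject={quote(subject, safe='')}&body={quote(body, safe='')}"

link = build_mailto(
    "some@friend.com",
    "I found a nice IT page",
    "Nice page: https://reliable.codes",
)
print(link)

# Decoding goes the other way: split the link and parse the query part.
parts = urlsplit(link)
decoded = parse_qs(parts.query)
print(parts.scheme, parts.path)
print(decoded["subject"][0])
print(decoded["body"][0])
```

Spaces become %20, the colon becomes %3A and each slash becomes %2F, which is why the encoded form looks so noisy.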
Phone calls #
If you want your visitors to be able to call you easily, you can totally do that. Here’s a full-featured example of what a tel URL can look like:
tel:0123456789
This is probably the simplest type of URL. The protocol in this case is tel (again with just a : as the separator instead of ://), and the part that takes the place of the domain is the phone number, 0123456789.
Custom applications #
Applications that you install on your device can register their own protocols.
WhatsApp URL Schema #
If you want your visitors to easily share your page via WhatsApp, here’s a full-featured example of what a WhatsApp URL can look like:
|
|
The protocol in this case is whatsapp, the domain is send and we have one query parameter, text.
The value of the query parameter is hard to read because it is encoded. If you decode it, you end up with Nice page: https://reliable.codes.
Telegram URL Schema #
If you want your visitors to easily share your page via Telegram, here’s a full-featured example of what a Telegram URL can look like:
|
|
The protocol in this case is tg (Telegram), the domain is msg_url and we have two query parameters, url and text.
The values of the query parameters are hard to read because they’re encoded. If you decode them, you end up with https://reliable.codes for url and Nice page for text.
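Both share links follow the same pattern: protocol, a target in the place of the domain, then encoded query parameters. The sketch below builds them with one helper. The function name `share_link` is made up, and the exact encoding each app expects is an assumption; the sketch only mirrors the components described above.

```python
from urllib.parse import quote, urlencode, urlsplit, parse_qs

def share_link(protocol, target, parameters):
    """Build an application-specific URL with percent-encoded query parameters."""
    # quote_via=quote encodes spaces as %20 (the default would use +).
    query = urlencode(parameters, quote_via=quote)
    return f"{protocol}://{target}?{query}"

whatsapp = share_link("whatsapp", "send", {"text": "Nice page: https://reliable.codes"})
telegram = share_link("tg", "msg_url", {"url": "https://reliable.codes", "text": "Nice page"})

print(whatsapp)
print(telegram)

# Reading such a link works exactly like reading a regular URL.
parts = urlsplit(telegram)
print(parts.scheme, parts.hostname, parse_qs(parts.query))
```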
Conclusion: Navigating the World of URLs #
URLs started as a way to link between webpages and evolved into the navigator of the web. You can use them in websites and emails to improve the user experience.
It is nice that we have this tool at hand, but there’s still a long way to go until all stakeholders have understood how to use it. Many businesses still just write their phone number as text in crazy formats or show their email address as an image (to prevent spam?). What a pity.
I hope this article motivated you to dive deeper into this topic and to rethink your project and how it can benefit from this idea.