What URLs are

The story so far

Browsers ask servers for files. Servers send them. Browsers look at the files' extensions so they know how to use the data in the files.

URLs are addresses

Every file on the Web has a URL (uniform resource locator). Whether it’s an HTML file, a photo file, whatever, it has a URL. A file's URL is its unique address on the Web.

Here's a URL:

https://webappexamples.cybercour.se/dog.html

The URL has three parts:

  • A protocol: https
  • A domain: webappexamples.cybercour.se. (It's actually a subdomain. More on that later.)
  • The path: dog.html
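
If you'd like to see the split done by a program, here's a small sketch using Python's standard urllib.parse module. Python is only a convenient choice for illustration; it isn't part of the lesson. Two things to note: the library calls the protocol the "scheme," and it reports the path with a leading slash.

```python
from urllib.parse import urlsplit

url = "https://webappexamples.cybercour.se/dog.html"
parts = urlsplit(url)

protocol = parts.scheme   # the protocol (urllib calls it the "scheme")
domain = parts.netloc     # the domain
path = parts.path         # the path, with a leading slash

print(protocol)   # https
print(domain)     # webappexamples.cybercour.se
print(path)       # /dog.html
```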

The protocol

The protocol is a standard for how clients (your browser) and servers talk to each other. When your browser wants a file, it sends the word GET to the server. When a server receives the word GET, it knows that's a request for a file.
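
To make that concrete, here's a rough Python sketch of the text a browser might send when it asks for dog.html. The function name build_get_request is made up for this example, and real browsers send more header lines than this. The first line has the word GET, the file wanted, and the protocol version. The Host line names the domain.

```python
def build_get_request(domain: str, path: str) -> str:
    """Build the text of a minimal HTTP GET request (illustration only)."""
    lines = [
        f"GET {path} HTTP/1.1",   # the word GET, then the file wanted
        f"Host: {domain}",        # which site the request is for
        "Connection: close",      # ask the server to hang up afterwards
        "",                       # a blank line marks the end of the request
        "",
    ]
    return "\r\n".join(lines)     # HTTP ends each line with CR+LF


request = build_get_request("webappexamples.cybercour.se", "/dog.html")
print(request)
```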

Why the word GET, and not SENDME, or BEKOMMEN? No good reason, really. As long as we all agree on what word to use, it doesn't matter much what the word is.

HTTPS is an extension of HTTP. The S stands for Secure. It means that data going between client and server is encrypted.

The domain

So the URL

https://webappexamples.cybercour.se/dog.html

... has three parts:

  • A protocol: https
  • A domain: webappexamples.cybercour.se.
  • The path: dog.html

The domain is like the name of a web server. To put something on the web, you need two things:

  • A server to put the files on.
  • A name for that server, that is, a domain.

There are many companies that rent server space, or web hosting, as it's called. A good one is Reclaim Hosting. Their basic plan is $30/year for shared hosting, meaning that many hosting accounts run on one physical server. That's why it's so cheap.

Once you have a server, you need to register a domain name, like risingterriers.com. The name will be yours exclusively. The Reclaim Hosting fee includes a domain for the first year. After that, it's about $10/year.

The path

Think about the computer you are using to read this text. It has thousands of files. They're grouped into directories, also called folders. Windows, macOS, Linux, Android… all of them put files in directories. Servers are the same.

Here's another URL:

https://webappexamples.cybercour.se/renata/renata.html

Its parts are:

  • A protocol: https
  • A domain: webappexamples.cybercour.se. (Again, a subdomain.)
  • The path: renata/renata.html

Before, we had:

  • The path: dog.html

Now we have:

  • The path: renata/renata.html

Notice the extra bit on the front. renata is a directory on the server. renata.html is a file in the directory.
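
Here's a small Python sketch that pulls the directory and the file name out of that URL, using the standard pathlib and urllib.parse modules. It's just an illustration of how the path breaks down, not something a server has to do this way.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

url = "https://webappexamples.cybercour.se/renata/renata.html"
path = PurePosixPath(urlsplit(url).path)   # /renata/renata.html

directory = path.parent.name   # the directory the file is in
file_name = path.name          # the file itself

print(directory)   # renata
print(file_name)   # renata.html
```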

So, just as you can use directories on your computer to organize files, you can use directories on your server to organize files.

Marcus
I don't get it.

You know how directories work on your PC? (BTW, "directories" and "folders" are the same thing.)

Marcus
Oh, sure. I use directories all the time, to keep files for different projects separate. Like, in Documents, I have a folder for games, another one for homework, other things.

Right! Well, a server is just a computer, with a hard disk, like your PC. In fact, you could use your PC as a web server, if you wanted.

Marcus
Really? I thought a server was a special super-gadget, like a geeky dragon, or something.

Not really. It's a computer, with a hard disk, broken up into directories. The only thing special about it is that some of those directories are connected to the internet. When someone types a URL, they get a file from one of those directories.

Marcus
OK, I sorta get it. Someone types a URL, they get a file that's stored in a directory on a server. Just like when I play Far Cry 4, the game gets files from a directory on my PC.

Right! Let's peek inside the server's directories.

Peeking into the server

Here are two URLs again:

https://webappexamples.cybercour.se/dog.html
https://webappexamples.cybercour.se/renata/renata.html

When a browser asks for dog.html or renata/renata.html, the server sends a file. How does it know which file? It looks in one of its directories.

I mapped https://webappexamples.cybercour.se to a directory webappexamples on the server. (You'll see how to set that up in the next lesson.) So, when a browser goes to https://webappexamples.cybercour.se, it's accessing the directory webappexamples on the server. Anything in that directory will be accessible on the web.
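
You can try the idea on your own computer. The sketch below uses Python's built-in http.server module to put one directory on a tiny local web server. This is not how my server is set up; it's only a small experiment. The function name serve_directory and the port number 8000 are my own choices for the example.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory: str, port: int = 8000) -> ThreadingHTTPServer:
    """Serve the files in `directory` at http://127.0.0.1:<port>/ (sketch)."""
    # Tell the request handler which directory to hand out files from.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Run in a background thread so the rest of the program keeps going.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

For example, if you had a directory called webappexamples with dog.html in it, serve_directory("webappexamples") would make the file available at http://127.0.0.1:8000/dog.html, but only on your own computer and only while the program is running. Calling shutdown() on the returned server stops it.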

Essentially, a URL is a path to a file on a server's disk drive.

This isn't strictly correct. There are other ways to do things. However, URL-is-a-file-path will work for this course.

Here's what the directory webappexamples looks like:

[Image: Web root]

webappexamples is called the web root. It's the top of the site, the place where the domain itself points.

I copied the file rosie1.jpg to the web root. Just by doing that, I put the file on the web. Its URL is https://webappexamples.cybercour.se/rosie1.jpg.

See the directory renata? Here's what it looks like:

[Image: renata directory]

To get to the file renata.html, the server starts at the web root, then goes down into the directory renata, then grabs the file renata.html. That path is shown in the file's URL. Its URL is https://webappexamples.cybercour.se/renata/renata.html.
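
Here's that walk as a Python sketch. The web root location /home/example/webappexamples is made up for the example (I haven't told you where the directory really lives on my server), and url_to_file_path is a name I invented. Real server software does more checking than this.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumption: a made-up location for the web root on the server's disk.
WEB_ROOT = PurePosixPath("/home/example/webappexamples")


def url_to_file_path(url: str) -> PurePosixPath:
    """Start at the web root, then follow the URL's path down to the file."""
    url_path = urlsplit(url).path             # "/renata/renata.html"
    steps = url_path.strip("/").split("/")    # ["renata", "renata.html"]
    location = WEB_ROOT
    for step in steps:
        location = location / step            # go down one level
    return location


print(url_to_file_path("https://webappexamples.cybercour.se/renata/renata.html"))
# /home/example/webappexamples/renata/renata.html
```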

Georgina
There's something I still don't get. I see how the URL http://abcd.com/llama.jpg would get the file llama.jpg from a directory on whatever computer is serving abcd.com.

But computers have lots of directories. How does the server know which directory to get files from?

That's what the next lesson is about.

Summary

  • Browsers get files from servers. What a browser does with a file depends on the file's type.
  • A domain name, like boiledunity.org, points to a directory on a server, called the web root.
  • A web root can have directories. The path in a URL, like /monkey/hippo/me.html, is a path on the server. Start in the web root, then go down to monkey, then go down to hippo, then grab the file me.html.