This is a quick demonstration of creating a Python project from scratch as an exercise to begin learning programming. In this project, we will create a web journal to which one can post new articles, entries, or links (also known as a web log or blog). However, the code and approach in this demonstration are flexible enough that you are not limited to an online journal; once finished, one can easily turn this into an online budget tracker, a note organizer (similar to Wikipedia), or many other projects. I chose Python as the language to start in because I think it is the easiest language to learn programming concepts with, and the most flexible if you wish to take your programming endeavors further. However, the ideas in this tutorial are not tied to any specific language. If you already know or wish to learn another language, feel free to follow along and implement the coding examples on your own.
If you wish to directly follow along with this tutorial, Python version 3, SQLite version 3, and git are required. Many systems already ship with most of these installed, but if your system is missing any of them, or has older versions, please install or upgrade them now. Python 3 specifically brings in some minor changes from Python 2 that make a lot of this example easier.
As a first step in my projects, I like to get them to a point where I can see them doing something as soon as possible. This both gives me an idea of which direction further changes need to go in and is one of the best avenues for displaying errors or printing out any other diagnostic information as I experiment (and often break things). As this is an internet project, simply printing text out to a terminal won't be much use to us; we want to see our pages in the browser as soon as possible. Python has a great way to deal with this through the built-in library http.server. Many other modern languages also include similar libraries these days to quickly get you set up with a way to test your program.
Let's create a new Python file in our favorite editor and name it journal.py. The following code is the bare minimum needed to talk to the internet, but be warned: that is all it does. If you run it now, you'll see what I mean. The program "runs fine", but if you point your browser at http://localhost:8000 you'll see that while it can talk to the internet just fine, it has nothing to say to your browser. All you get is an error page (a 501 "Unsupported method" response) telling you that it has nothing to say to your browser. Let's take a look at the code we have first, and then add the parts we need so we can talk to any browser, not just the internet.
import http.server

httpd = http.server.HTTPServer(
    ('', 8000),
    http.server.BaseHTTPRequestHandler
)
httpd.serve_forever()
The first line, starting with import, is what tells Python that we want to include a library and everything it brings into our program. Python could guess that we wanted to use the http functionality here, but what if our company later makes its own http.server function to handle a new standard, like HTTP/3? Then Python would have to ask us which http library to choose. It's better that we tell it now exactly which library we expect to find functions in rather than be surprised later.
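As a small illustration of that explicitness, here is a sketch of the two common import styles. Both are ordinary Python; the second style is not used elsewhere in this tutorial and is shown only for comparison.

```python
# Style used in this tutorial: import the library and spell out the full name.
import http.server

# Alternative style (shown for comparison only): import one name directly.
from http.server import HTTPServer

# Either way, both names point at the very same class.
same_class = http.server.HTTPServer is HTTPServer
print(same_class)
```

The longer spelling costs a few keystrokes but always shows where a name came from, which is the trade-off described above.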
The second statement (broken up across four lines to make it easier to read) is what configures our web server. The first piece of data we pass it, ('', 8000), is the address we want the server to listen on and the port to talk on. I left the address blank here because almost certainly you are working on a personal machine that isn't configured to be a "proper" web server with a domain name and everything set up for you. Setting a blank address with '' tells the server to listen on every network interface the machine has, so anyone who can reach this machine on that port is assumed to mean the request for us. The number is the port we wish to listen on. The two standard ports used on the internet for web traffic are port 80 (http) and port 443 (https). However, on most systems using any port below 1024 requires running the program as an administrator, so for testing purposes we pick a higher port like 8000 or 8080. The second piece of data we pass, BaseHTTPRequestHandler, is a class that handles the incoming web (http) requests for us, and will become the core of our program. Right now the base handler that we picked is just a built-in one that comes with the http.server library, and it is the piece of code responsible for telling us that it doesn't actually do much of anything. This will be where we start adding our own code.
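To make the address and port a little more concrete, here is a minimal sketch that builds a server without starting it and then asks it what it is listening on. The make_server helper is a hypothetical name for illustration, and port 0 is an assumption used only in this demo: it asks the operating system for any free port, so the sketch runs even if 8000 is already taken.

```python
import http.server

def make_server(host='', port=8000):
    """Build (but do not start) a server listening on the given address.

    Hypothetical helper for illustration; the tutorial passes the tuple inline.
    """
    return http.server.HTTPServer(
        (host, port),
        http.server.BaseHTTPRequestHandler
    )

# Port 0 is a demo-only assumption: "pick any free port for me".
httpd = make_server(port=0)

# server_address reports the address and port the server actually bound to.
bound_host, bound_port = httpd.server_address
print('listening address:', bound_host)
print('listening port:', bound_port)

# Release the port again; we never started serving.
httpd.server_close()
```

In journal.py itself we keep the fixed port 8000 so we always know which address to type into the browser.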
The final line, the serve_forever call, does exactly what it suggests. It sits there and runs the web server you specified forever until you forcibly stop it (for example by pressing Ctrl+C in the terminal). As you can see in the second statement, we store a reference to the web server (http server) we created with http.server under the name httpd so we can refer to it later. So on the final line, when we want to start the server serving, we pick the one we just named httpd. Usually we only ever have a single web server in our program, or even on our computer, but again Python would much rather force us to be explicit here than break the pattern.
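Stopping the server with Ctrl+C normally ends the program with a long error trace. If that bothers you, one possible way to tidy it up is sketched below. The function name run_until_interrupted is hypothetical and not part of http.server; it only wraps the same serve_forever call.

```python
import http.server

def run_until_interrupted(server):
    """Serve requests until the loop is stopped, then release the port.

    Hypothetical helper for illustration; it wraps serve_forever() so that
    pressing Ctrl+C prints a short note instead of an error trace.
    """
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        print('Stopping the server.')
    finally:
        # Always close the listening socket so the port is free again.
        server.server_close()
```

With this in place, the last line of journal.py could read run_until_interrupted(httpd) instead of httpd.serve_forever(). The rest of this tutorial sticks with the plain call to keep the code short.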
Since we have some working code in our program now, it's a good time to start backing up our code so we can retrace our steps if something breaks. We'll use git for this, as it's fast becoming the industry standard for document and change control (also referred to as revision control, source code control, or history management). If you are using a fancier editor or a full development environment (integrated development environment or IDE), you may have a button or menu to handle this for you. However, the following raw git commands will create a new git repository in your folder, tell git about the file you expect to be changing, and then add a checkpoint you can look up later.
git init
git add journal.py
git commit -m "This is a basic web request handler, but nothing else yet."
If you have an account (or wish to create one) on a code sharing site such as GitLab or GitHub, you may want to upload your project there now to show off your progress or to be able to continue work from multiple computers. Click the New Project button on the site and name your project something descriptive like Web_Journal or Python_Journal. I named my project python_tutorial, choosing underscores (_) instead of spaces, as spaces usually cause problems with computers. The following two git commands link your project to your code sharing account and then upload (push) your files to the website.
git remote add origin git@gitlab.com:abyxcos/python_tutorial.git
git push -u origin master
If git has already been set up on your computer, this should just work. If you have not set up git yet, then git will likely print out some instructions on how to finish the setup. This usually includes steps like setting your name and email, and often uploading the public half of an SSH key to the website (more secure than a standard password).
Now that we have our program talking to the internet, let's take the next step and set up the full pipeline so that rather than stopping once your browser makes a connection, the program continues the conversation and sends a message to the browser. This is accomplished simply by popping out that placeholder BaseHTTPRequestHandler and crafting our own class to handle requests how we want. If you haven't bumped into classes yet, classes are just a way of grouping a bunch of generic functions into a package that anyone can easily extend. Classes differ from libraries in that while both group up a set of functions, libraries give you code on a take-it-or-leave-it basis. Classes, however, give you a template of functions that you can pick, replace, or extend as you need. In Python a lot of core libraries (including http.server) provide classes rather than raw functions, allowing us this freedom. BaseHTTPRequestHandler provides a lot of code we want to take advantage of already, but it's missing one particular function that we are very interested in: the do_GET function that gets called any time a browser opens a connection to our server and wants to get a page. Right now the browser gets no further than opening the connection because we haven't yet created that function to give it a page. So let's make a JournalRequestHandler class to handle people requesting pages from our journal, base it on BaseHTTPRequestHandler, and then make a simple do_GET function to do something useful when someone tries to get our page.
import http.server

class JournalRequestHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Standard template nonsense
        self.send_response(200) # All ok
        self.end_headers()
        # Our message starts here
        self.wfile.write('Welcome to my web journal!'.encode())

httpd = http.server.HTTPServer(
    ('127.0.0.1', 8000),
    JournalRequestHandler
)
httpd.serve_forever()
As you can see, our core program is still the same; we've just added our new request handler class and the single function to chat with the browser. Of that function's lines, two are comments (lines starting with a # are just notes and not code), two are standard code that every function talking to a browser needs (the status and headers that open any message we send), and one weird-looking line actually sends our message. To break down that line, self is the state/context of the request the browser sent, which gets passed in to our function for free. We could use this data to do some fancier checking like seeing what files or folders the user is trying to get at, but for now we'll stay simple and reply with the same message to everyone. The wfile is not a misspelling; it's a writable file-like object that lets us pack all our data into the reply as though we were just writing data to a file. This might look a little funny right here, but Python has a lot of tools for working with files, so by pretending our message is a file, we get access to all of those tools for free. The last part is much simpler: just a normal write that writes any message (or data) we pass it into the file we name (self.wfile, our reply message). If you have looked at any other programming guides or tutorials before, they almost always start out by showing you how to print a simple "Hello World!" message to the screen. This is our version of the "Hello World!" message. Almost all computer chatter on the internet is actually no fancier than printing messages back and forth. And what we are doing here is no more complicated than printing a message out to the internet with the write function. That encode we sneak in at the end of the write is just our way of telling Python to take our message full of characters (a string) and convert (encode) it into a bunch of bytes so we can save it to a file (in this case our pretend file, wfile).
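If you would like to see those bytes travel back and forth without opening a browser, the following is a minimal sketch that plays both sides of the conversation in one script. It rests on a few assumptions that are not part of journal.py: the server runs in a background thread, it listens on port 0 (meaning "any free port") so it cannot clash with a running copy of the journal, and the standard http.client library stands in for the browser.

```python
import http.client
import http.server
import threading

class JournalRequestHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)  # All ok
        self.end_headers()
        # encode() turns the string into bytes before it is written out.
        self.wfile.write('Welcome to my web journal!'.encode())

# Demo-only assumptions: local address, OS-chosen port, background thread.
httpd = http.server.HTTPServer(('127.0.0.1', 0), JournalRequestHandler)
port = httpd.server_address[1]
threading.Thread(target=httpd.serve_forever, daemon=True).start()

# Play the part of the browser: ask for the front page and read the reply.
connection = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
connection.request('GET', '/')
reply = connection.getresponse()
status = reply.status
raw = reply.read()
connection.close()

# Stop the background server and free the port.
httpd.shutdown()
httpd.server_close()

print(status)
print(type(raw).__name__)  # what arrives is bytes, not a string
print(raw.decode())        # decode() reverses encode() and gives text back
```

The reply arrives as bytes, and decode turns it back into the string we started with, which is the mirror image of the encode call in do_GET.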
As we now have another working feature, it's time for another checkpoint; I make it a habit to always create one in git, no matter how simple the feature. Using git commit with the -a flag tells git to automatically add any changed files it already knows about (is tracking) to the commit group for the checkpoint. If you want to manually check on what changed, the status command is great to see what git thinks is going on so you can correct it.
git status
git commit -a -m "A simple GET request handler."
git push
git status