Python’s urllib.request for HTTP Requests

Python

If you need to make HTTP requests with Python, then you may find yourself directed to the brilliant requests library. Though it’s a great library, you may have noticed that it’s not a built-in part of Python. If you prefer, for whatever reason, to limit your dependencies and stick to standard-library Python, then you can reach for urllib.request!

In this tutorial, you’ll:

Learn how to make basic HTTP requests with urllib.request
Dive into the nuts and bolts of an HTTP message and how urllib.request represents it
Understand how to deal with character encodings of HTTP messages
Explore some common errors when using urllib.request and learn how to resolve them
Dip your toes into the world of authenticated requests with urllib.request
Understand why both urllib and the requests library exist and when to use one or the other

If you’ve heard of HTTP requests, including GET and POST, then you’re probably ready for this tutorial. Also, you should’ve already used Python to read and write to files, ideally with a context manager, at least once.

Ultimately, you’ll find that making a request doesn’t have to be a frustrating experience, although it does tend to have that reputation. Many of the issues that you tend to run into are due to the inherent complexity of this marvelous thing called the Internet. The good news is that the urllib.request module can help to demystify much of this complexity.

Learn More: Click here to join 290,000+ Python developers on the Real Python Newsletter and get new Python tutorials and news that will make you a more effective Pythonista.

Basic HTTP GET Requests With urllib.request

Before diving into the deep end of what an HTTP request is and how it works, you’re going to get your feet wet by making a basic GET request to a sample URL. You’ll also make a GET request to a mock REST API for some JSON data. In case you’re wondering about POST Requests, you’ll be covering them later in the tutorial, once you have some more knowledge of urllib.request.

Beware: Depending on your exact setup, you may find that some of these examples don’t work. If so, skip ahead to the section on common urllib.request errors for troubleshooting.

If you’re running into a problem that’s not covered there, be sure to comment below with a precise and reproducible example.

To get started, you’ll make a request to www.example.com, and the server will return an HTTP message. Ensure that you’re using Python 3 or above, and then use the urlopen() function from urllib.request:

>>>>>> from urllib.request import urlopen
>>> with urlopen(“https://www.example.com”) as response:
body = response.read()

>>> body[:15]
b'<!doctype html>’

In this example, you import urlopen() from urllib.request. Using the context manager with, you make a request and receive a response with urlopen(). Then you read the body of the response and close the response object. With that, you display the first fifteen positions of the body, noting that it looks like an HTML document.

There you are! You’ve successfully made a request, and you received a response. By inspecting the content, you can tell that it’s likely an HTML document. Note that the printed output of the body is preceded by b. This indicates a bytes literal, which you may need to decode. Later in the tutorial, you’ll learn how to turn bytes into a string, write them to a file, or parse them into a dictionary.

The process is only slightly different if you want to make calls to REST APIs to get JSON data. In the following example, you’ll make a request to {JSON}Placeholder for some fake to-do data:

>>>>>> from urllib.request import urlopen
>>> import json
>>> url = “https://jsonplaceholder.typicode.com/todos/1”
>>> with urlopen(url) as response:
body = response.read()

>>> todo_item = json.loads(body)
>>> todo_item
{‘userId’: 1, ‘id’: 1, ‘title’: ‘delectus aut autem’, ‘completed’: False}

In this example, you’re doing pretty much the same as in the previous example. But in this one, you import urllib.request and json, using the json.loads() function with body to decode and parse the returned JSON bytes into a Python dictionary. Voila!

If you’re lucky enough to be using error-free endpoints, such as the ones in these examples, then maybe the above is all that you need from urllib.request. Then again, you may find that it’s not enough.

Now, before doing some urllib.request troubleshooting, you’ll first gain an understanding of the underlying structure of HTTP messages and learn how urllib.request handles them. This understanding will provide a solid foundation for troubleshooting many different kinds of issues.

The Nuts and Bolts of HTTP Messages

To understand some of the issues that you may encounter when using urllib.request, you’ll need to examine how a response is represented by urllib.request. To do that, you’ll benefit from a high-level overview of what an HTTP message is, which is what you’ll get in this section.

Before the high-level overview, a quick note on reference sources. If you want to get into the technical weeds, the Internet Engineering Task Force (IETF) has an extensive set of Request for Comments (RFC) documents. These documents end up becoming the actual specifications for things like HTTP messages. RFC 7230, part 1: Message Syntax and Routing, for example, is all about the HTTP message.

If you’re looking for some reference material that’s a bit easier to digest than RFCs, then the Mozilla Developer Network (MDN) has a great range of reference articles. For example, their article on HTTP messages, while still technical, is a lot more digestible.

Now that you know about these essential sources of reference information, in the next section you’ll get a beginner-friendly overview of HTTP messages.

Understanding What an HTTP Message Is

In a nutshell, an HTTP message can be understood as text, transmitted as a stream of bytes, structured to follow the guidelines specified by RFC 7230. A decoded HTTP message can be as simple as two lines:

Read the full article at https://realpython.com/urllib-request/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]