How to Retrieve the HTTP Headers of a Web Page in Python using the http.client Module





In this article, we show how to retrieve the HTTP headers of a web page in Python using the http.client module.

The http.client module is part of Python's standard library. It implements the client side of the HTTP and HTTPS protocols, so we can use it to send requests to a web server and read the responses.

HTTP is the protocol that clients, such as web browsers, and web servers use to communicate with each other so that resources such as web pages can be retrieved.

The HTTP headers contain information about the response and the page being served, such as the name of the server software that serves the requested page, the date and time at which the response was generated, the content type such as text/html, the charset such as UTF-8, the transfer encoding, the connection type, and more.

This information is important if you need to know the HTTP metadata of a given page.

Using the code below, we are able to retrieve the HTTP headers of the home page of this website.
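The steps walked through in this article can be sketched as follows. This is a minimal sketch rather than a definitive listing: the function name get_headers and its port, secure, and timeout parameters are assumptions added so that the example can be reused, and whether the site is reached over HTTP or HTTPS is also an assumption (HTTPS is the default here).

```python
import http.client


def get_headers(host, path="/", port=None, secure=True, timeout=10):
    """Send a GET request for host + path and return the response headers."""
    # port, secure and timeout are illustrative parameters (assumptions).
    connection_class = (
        http.client.HTTPSConnection if secure else http.client.HTTPConnection
    )
    # h stores the connection to the server that hosts the website.
    h = connection_class(host, port, timeout=timeout)
    try:
        h.request("GET", path)   # GET request for the chosen path
        data = h.getresponse()   # response object returned by the server
        return data.headers      # header fields as an http.client.HTTPMessage
    finally:
        h.close()
```

Calling print(get_headers("www.learningaboutelectronics.com")) would request the home page, whose path is "/", and print its headers.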



Let's now go over this code.

First, we import the http.client module, so that we can use its functionality.

We then create a variable, h, which stores the connection to the server that hosts the website, www.learningaboutelectronics.com. The connection is made to the host, not to a particular page; the page is selected by the path passed in the request. In this case, we retrieve the home page, whose path is '/'. If we wanted to access another page, we would specify the path to that page instead.

On this variable, h, we perform a GET request, which asks the server to send us the page at the given path.

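We then create a variable, data, which stores the response that the server returns for the home page of www.learningaboutelectronics.com. This is the object returned by the connection's getresponse method.

Through this variable, data, we have access to the status of the response, its headers, and the contents of the page.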
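The response stored in data holds more than the headers. As a minimal sketch, the function below returns the status code, the reason phrase, the headers, and the body of a response. The function name fetch_response_parts and its port, secure, and timeout parameters are hypothetical names chosen for illustration; they are not part of the code discussed in this article.

```python
import http.client


def fetch_response_parts(host, path="/", port=None, secure=True, timeout=10):
    """Return (status, reason, headers, body) for a GET request."""
    # port, secure and timeout are illustrative parameters (assumptions).
    connection_class = (
        http.client.HTTPSConnection if secure else http.client.HTTPConnection
    )
    h = connection_class(host, port, timeout=timeout)
    try:
        h.request("GET", path)
        data = h.getresponse()
        body = data.read()  # page contents as bytes, read before closing
        return data.status, data.reason, data.headers, body
    finally:
        h.close()
```

For the rest of this article, only the headers part of the response is needed.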

We then print the headers of the web page, which are available as data.headers.

After running this code, the HTTP headers of the page are printed.

In our output, the server is Sucuri/Cloudproxy.

The date of the response was Mon, 24 Aug 2020 07:07:54 GMT.
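Individual fields such as these can also be looked up by name instead of printing all headers at once. The sketch below is an illustration under stated assumptions: server_and_date is a hypothetical helper, and sample is a hand-built stand-in for data.headers that holds only the two values quoted above, so that the example runs without a network connection.

```python
import http.client
from email.utils import parsedate_to_datetime


def server_and_date(headers):
    """Return the Server value and the Date value parsed into a datetime."""
    server = headers.get("Server")   # None if the field is absent
    raw_date = headers.get("Date")
    parsed = parsedate_to_datetime(raw_date) if raw_date else None
    return server, parsed


# Stand-in for data.headers, filled with the two values quoted above.
sample = http.client.HTTPMessage()
sample["Server"] = "Sucuri/Cloudproxy"
sample["Date"] = "Mon, 24 Aug 2020 07:07:54 GMT"

server, date = server_and_date(sample)
print(server)            # Sucuri/Cloudproxy
print(date.isoformat())  # 2020-08-24T07:07:54+00:00
```

Header names are matched case-insensitively, so headers.get("server") would return the same value.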

The output continues with further fields, such as the content type and the charset.
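The content type and the charset are sent together in a single Content-Type header, and the headers object can split them apart. In the minimal sketch below, content_type_and_charset is a hypothetical helper and sample is a hand-built stand-in for data.headers; the text/html and UTF-8 values are the examples mentioned earlier in this article, not values taken from the actual response.

```python
import http.client


def content_type_and_charset(headers):
    """Split the Content-Type header into its media type and its charset."""
    # get_content_type() falls back to 'text/plain' if the header is missing;
    # get_content_charset() returns None if no charset parameter is present.
    return headers.get_content_type(), headers.get_content_charset()


# Stand-in for data.headers with an example Content-Type value.
sample = http.client.HTTPMessage()
sample["Content-Type"] = "text/html; charset=UTF-8"

print(content_type_and_charset(sample))  # ('text/html', 'utf-8')
```

Both values are returned in lower case, which is why UTF-8 is printed as utf-8.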

So this is how we can retrieve the HTTP headers of a web page in Python using the http.client module.




