How to Retrieve the HTTP Headers of a Web Page in Python using the http.client Module
In this article, we show how to retrieve the HTTP headers of a web page in Python using the http.client module.
The http.client module is part of Python's standard library and implements the client side of the HTTP and HTTPS protocols.
HTTP is the protocol that clients, such as web browsers and scripts, use to request resources such as web pages from servers, and that servers use to send their responses.
The HTTP headers contain metadata about the response, such as the name of the server software that serves the requested page, the date and time of the response, the content type such as text/html, the charset such as UTF-8, the transfer encoding, the connection type, and more.
This information is useful if you need to know the HTTP metadata of a given page.
Using the code below, we are able to retrieve the HTTP headers of the home page of this website.
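A minimal sketch of such a script, reconstructed from the walkthrough that follows. The variable names h and data come from this article. The function name fetch_headers, the choice of HTTPSConnection over HTTPConnection, the path "/" for the home page, and the timeout are assumptions made for illustration.

```python
import http.client


def fetch_headers(host, path="/", use_https=True):
    """Send a GET request to host + path and return the response headers."""
    # Assumption: the site is served over HTTPS; pass use_https=False for plain HTTP.
    connection_class = (
        http.client.HTTPSConnection if use_https else http.client.HTTPConnection
    )
    h = connection_class(host, timeout=10)  # connection to the server
    try:
        h.request("GET", path)   # ask the server for the page at this path
        data = h.getresponse()   # the response; its headers are already parsed
        return data.headers      # an http.client.HTTPMessage object
    finally:
        h.close()                # release the socket even if the request fails
```

Calling print(fetch_headers("www.learningaboutelectronics.com")) would then print one header per line, in the form Name: value.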
Let's now go over this code.
First, we import the http.client module, so that we can use its functionality.
We then create a variable, h, which stores the connection to the server that hosts the website, www.learningaboutelectronics.com. The connection is made to the host only; the page is selected by the path given in the request. In this case, we retrieve the home page of the website. If we wanted to access another page, we would specify the path to that page instead.
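To make the split between host and path concrete, here is a small sketch that takes a full URL apart with urllib.parse from the standard library. The helper name split_host_and_path and the page path /Articles/example.php are hypothetical and used only for illustration.

```python
from urllib.parse import urlsplit


def split_host_and_path(url):
    """Return (host, path) in the form http.client expects them."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the home page
    if parts.query:
        path += "?" + parts.query     # keep any query string with the path
    return parts.netloc, path


# The home page: the host goes to the connection, "/" goes to the request.
print(split_host_and_path("https://www.learningaboutelectronics.com"))

# A hypothetical inner page on the same host.
print(split_host_and_path("https://www.learningaboutelectronics.com/Articles/example.php"))
```

The first element is what the connection is created with, and the second is what is passed to the GET request.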
Using this variable, h, we send a GET request, which asks the server for this web page.
We then create a variable, data, which stores the server's response for the home page of www.learningaboutelectronics.com.
The response headers are available on this object right away; the contents of the page are only read when data.read() is called.
We then print the headers of the web page, using the statement print(data.headers).
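Beyond printing everything at once, individual fields can be read from the headers object, which behaves like a case-insensitive mapping. The sketch below does not contact a server: it builds a headers object from a small hand-written block using http.client.parse_headers, the helper the module uses internally. The Server and Date values are the ones reported in this article; the Content-Type line is an illustrative assumption.

```python
import http.client
import io

# Hand-written header block standing in for a real response (assumption).
raw = (
    b"Server: Sucuri/Cloudproxy\r\n"
    b"Date: Mon, 24 Aug 2020 07:07:54 GMT\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"\r\n"
)
headers = http.client.parse_headers(io.BytesIO(raw))  # same type as data.headers

# Loop over every header as a (name, value) pair.
for name, value in headers.items():
    print(f"{name}: {value}")

server = headers.get("server", "unknown")   # lookups ignore case
missing = headers.get("X-Not-Sent")         # None when the header is absent
print(server, missing)
```

With a live response, the same calls work on data.headers, and data.getheader("Server") is an equivalent shortcut on the response object.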
Running this code prints the HTTP headers of the page.
In the output, the server is Sucuri/Cloudproxy.
The date of the request was Mon, 24 Aug 2020 07:07:54 GMT.
The output continues with more fields, such as the content type and the charset.
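If these values are needed in a program rather than just on screen, the headers object can parse them. The sketch below is offline and uses a hand-written header block: the Date value is the one reported above, while the Content-Type line is an assumed example built from the text/html and UTF-8 values mentioned earlier.

```python
import http.client
import io
from email.utils import parsedate_to_datetime

# Hand-written header block standing in for a real response (assumption).
raw = (
    b"Date: Mon, 24 Aug 2020 07:07:54 GMT\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"\r\n"
)
headers = http.client.parse_headers(io.BytesIO(raw))

sent_at = parsedate_to_datetime(headers["Date"])  # timezone-aware datetime
media_type = headers.get_content_type()           # type without parameters
charset = headers.get_content_charset()           # charset parameter, lowercased

print(sent_at.isoformat())
print(media_type, charset)
```

The methods get_content_type and get_content_charset come from email.message.Message, which the headers class in http.client inherits from.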
So this is how we can retrieve the HTTP headers of a web page in Python using the http.client module.