How to Retrieve the Contents of a Web Page in Python using the http.client Module

In this article, we show how to retrieve the contents of a web page in Python using the http.client module.

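The http.client module is part of Python's standard library. It implements the client side of the HTTP and HTTPS protocols, so a script can send requests to a web server and read the responses.

HTTP is the protocol that clients, such as web browsers and scripts, use to communicate with web servers so that resources such as web pages can be retrieved.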

By the contents of a web page, we mean everything the server sends back for that page, which is the page's full HTML source.

In this example, we retrieve all of the content of this website's home page.

The following code does this.

import http.client
h = http.client.HTTPConnection("www.learningaboutelectronics.com")
h.request("GET", "/")
data = h.getresponse()
text = data.readlines()
for t in text:
    print(t.decode('utf-8'))

Let's now go over this code.

First, we import the http.client module, so that we can use its functionality.

We then create a variable, h, which stores the connection to the server that hosts the website, www.learningaboutelectronics.com. The connection identifies only the host. The page is chosen by the path given in the request. To access another page, we would pass the path to that page instead of "/". In this case, we retrieve the content of the home page of the website.

On this variable, h, we perform a GET request for the path "/", the home page. A GET request asks the server to send us that page.
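To make the split between host and path concrete, here is a minimal sketch that wraps the connection and the GET request in one function. The function name fetch_page and its port and timeout parameters are assumptions made for illustration; they do not appear in the code above.

```python
import http.client


def fetch_page(host, path="/", port=80, timeout=10):
    """Send a GET request for `path` on `host` and return (status, body_bytes).

    Hypothetical helper for illustration; not part of http.client.
    """
    # The connection targets a host and port only; no page is chosen yet.
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # The path passed here selects the page, e.g. "/" or "/Articles".
        conn.request("GET", path)
        response = conn.getresponse()
        # read() returns the whole response body as bytes.
        return response.status, response.read()
    finally:
        # Close the socket even if the request fails.
        conn.close()
```

With this helper, a call such as fetch_page("www.learningaboutelectronics.com", "/Articles") would request the Articles page rather than the home page.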

We then create a variable, data, which stores the response returned by getresponse(). This response object gives us access to the status, the headers, and the body of the reply. The body holds all the content of the home page of www.learningaboutelectronics.com.
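The object returned by getresponse() carries more than the page body. The sketch below shows one way to inspect it before reading the content. The function name summarize_response and the keys of the returned dictionary are assumptions for illustration. Note that http.client does not follow redirects on its own, so a 3xx status means the content lives at the address given in the Location header.

```python
import http.client


def summarize_response(response: http.client.HTTPResponse) -> dict:
    """Collect the status details and a few headers from a response.

    Hypothetical helper for illustration; not part of http.client.
    """
    status = response.status
    return {
        # Numeric status code, e.g. 200 for success.
        "status": status,
        # Reason phrase sent by the server, e.g. "OK".
        "reason": response.reason,
        # getheader() returns None when the header is absent.
        "content_type": response.getheader("Content-Type"),
        # Only report a redirect target for 3xx responses.
        "redirect_to": response.getheader("Location") if 300 <= status < 400 else None,
    }
```

In the code above, this could be called as summarize_response(data) right after data = h.getresponse().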

We then create a variable, text, and read all the lines of the response body with the readlines() method. Each line is returned as a bytes object.

We then use a for loop to go through each line. We decode each line from bytes to a string with decode('utf-8') and print it out with the print() function.
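Each line returned by readlines() keeps its trailing line break, and print() adds one of its own, so the printed page comes out double-spaced. The sketch below is one possible way to decode the lines and strip that line break first. The function name decode_lines and its encoding parameter are assumptions for illustration.

```python
def decode_lines(raw_lines, encoding="utf-8"):
    """Decode byte lines to strings and drop each trailing line break.

    Hypothetical helper for illustration; not part of http.client.
    """
    decoded = []
    for raw in raw_lines:
        # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
        line = raw.decode(encoding, errors="replace")
        # Remove only the line ending, keeping any leading indentation.
        decoded.append(line.rstrip("\r\n"))
    return decoded
```

In the code above, the loop could then become: for t in decode_lines(text): print(t).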

Below is the output from the code above.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="content-type" content="text/html; charset=utf-8" /> <title>Learning about Electronics</title> <meta name="keywords" content="Learning about Electronics, home page" /> <meta name="description" content="This site offers many tutorials on electronics, including resistors, capacitors, inductors, voltage regulators, and many different ICs." /> <meta name="msvalidate.01" content="D97CFA4A7DEFFC0DFE3E6F2CF3A58D46" /> <meta name=viewport content="width=device-width, initial-scale=1"> <link href="default66.css" rel="stylesheet" type="text/css" media="screen" />   <script language="JavaScript"> var zflag_nid="305"; var zflag_cid="52158/52157/1"; var zflag_sid="5452"; var zflag_width="1"; var zflag_height="1"; var zflag_sz="15"; var zflag_click="[INSERT_CLICK_TRACKER_MACRO]"; var zflag_page="[INSERT_PAGE_URL]"; var zflag_ref="[INSERT_REFERER_URL]"; </script> <script language="JavaScript" src="http://c5.zedo.com/jsc/c5/fo.js"></script>  </head> <body>  <div id="header"> <div id="logo">  <h2><a href="http://www.learningaboutelectronics.com">Learning about Electronics</a></h2>  </div>   </div>   <script type='text/javascript' src='jquery.js'></script> <script type='text/javascript' src='nav.js'></script> <div id='spancolumns'> <br><hr> <span class='menu-trigger'><img src='/images/mobile.png'></span> <largetext><form id='thisinline' action='search.php' method='POST'><input type='text' name='search_entered' size='15'/> <input type='submit' name='submit' value='Search'/> </form></largetext> <br><hr> </div> <div class='nav-menu'> <ul> <li class='current_page_item'><a href='http://www.learningaboutelectronics.com'>Home</a></li> <li><a href='http://www.learningaboutelectronics.com/Articles'>Articles</a></li> <li><a href='http://www.learningaboutelectronics.com/Projects'>Projects</a></li> <li><a 
href='https://www.docpid.com/'>Forum <font size='3' color='red'>posts</font></a></li> <li><a href='http://www.learningaboutelectronics.com/Calculators'>Calculators</a></li> <li><a href='http://www.learningaboutelectronics.com/Contact'>Contact</a></li> </ul> </div>   <div id="page">  <div id="rightads"><script type="text/javascript"> </script> <script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js"> </script> </div> <div class="entry"> <p id="para1">LearningaboutElectronics is the place to come to learn electronics. Attached to the <a href="http://www.learningaboutelectronics.com/Articles">Articles</a> tab are articles that answer many questions about electronics that you may have, with more being added daily. This is not just a Q&A site as we also give real-life uses and projects of components, devices, and equipment that we explain with as many photos and videos that we can provide.</p> <br> <img style="width: 219px; height: 69px;" alt="Learning about Electronics" src="/images/Electrolytic-capacitor.PNG"> <br><br><br><br> <p id="para6">Search this site</p> <largetext> <form action="search.php" method="POST"> <input type="text" name="search_entered"/><br><br> <input type="submit" name="submit" value="Search"/><br> </form> </largetext> <br><br> <p id="para1"><a href="http://www.learningaboutelectronics.com/Articles/How-to-connect-an-adjustable-voltage-regulator">How to Connect an Adjustable Voltage Regulator</a></p> <p id="para1"><a href="http://www.learningaboutelectronics.com/Articles/What-is-a-LM7805-voltage-regulator">What is a LM7805 Voltage Regulator?</a></p> <p id="para1"><a href="http://www.learningaboutelectronics.com/Articles/How-to-test-a-capacitor">How to Test a Capacitor</a></p> <p id="para1"><a href="http://www.learningaboutelectronics.com/Articles/High-pass-filter.php">High Pass Filter</a></p> <p id="para1"><a href="http://www.learningaboutelectronics.com/Articles/Low-pass-filter.php">Low Pass Filter</a></p> <p 
id="para1">and much, much more. <p id="para1">This site is all about the learning of electronics</p> </p><br><br> <p id="para1"> <div id="newsletter" style="float:left"> <p id="para1"><legend><b>Sign up for Our Newsletter</b></legend> <form action="addcontact.php" method="post"> <label>Name:</label><input type="text" name="firstname" id="firstname"/><br><b>(optional)</b><br><br> <label>Email:</label><input type="text" name="email" id="email"/><br> <input type="submit" name="submit" id="submit" value="Subscribe"/> </form> </div> <br><br> </div> </div>  </div>   <div id='footer'> <div class='fcenter'> <a href='http://www.learningaboutelectronics.com'>Home</a> | <a href='http://www.learningaboutelectronics.com/Articles'>Articles</a> | <a href='http://www.learningaboutelectronics.com/Projects'>Projects</a> | <a href='http://www.learningaboutelectronics.com/Programming'>Programming</a> | <a href='http://www.learningaboutelectronics.com/Calculators'>Calculators</a> | <a href='http://www.learningaboutelectronics.com/Contact'>Contact</a> </div><br/><br/> <div align='center'>© 2018 All Rights Reserved</div>  </body> <script>'undefined'=== typeof _trfq || (window._trfq = []);'undefined'=== typeof _trfd && (window._trfd=[]),_trfd.push({'tccl.baseHost':'secureserver.net'}),_trfd.push({'ap':'cpsh'},{'server':'p3plcpnl0769'}) // Monitoring performance to make your website faster. If you want to opt-out, please contact web hosting support.</script><script src='https://img1.wsimg.com/tcc/tcc_l.combined.1.0.6.min.js'></script><script>'undefined'=== typeof _trfq || (window._trfq = []);'undefined'=== typeof _trfd && (window._trfd=[]),_trfd.push({'tccl.baseHost':'secureserver.net'},{'ap':'cpbh-mt'},{'server':'p3plmcpnl487010'},{'dcenter':'p3'},{'cp_id':'8437534'},{'cp_cache':''},{'cp_cl':'8'}) // Monitoring performance to make your website faster. 
If you want to opt-out, please contact web hosting support.</script><script src='https://img1.wsimg.com/traffic-assets/js/tccl.min.js'></script></html>

As you can see, the code retrieves all the content of the home page of www.learningaboutelectronics.com.

So this is how we can retrieve the contents of a web page in Python using the http.client module.

