How to Get the Contents of a Web Page in Python Using the Requests and BeautifulSoup Modules





In this article, we show how to get the contents of a web page in Python using the requests and BeautifulSoup modules.

Let's go through the steps necessary to get the contents of a web page in Python.

We can get the contents of a web page using just the requests module.

We do this by calling requests.get() with the page's URL; the text attribute of the response it returns holds the page's HTML as a string.
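A minimal sketch of this step might look like the following. The helper name fetch_page, the 10-second timeout, and the raise_for_status() check are illustrative choices made for this example, not requirements of the requests module.

```python
import requests


def fetch_page(url, timeout=10):
    """Return the HTML of the page at `url` as a string.

    `fetch_page` and the default timeout are illustrative assumptions.
    """
    # Send an HTTP GET request for the page.
    response = requests.get(url, timeout=timeout)
    # Raise an exception if the server answered with a 4xx or 5xx status.
    response.raise_for_status()
    # The decoded body of the response, i.e. the raw HTML text.
    return response.text
```

Calling print(fetch_page(url)) with the URL of any page (for example the placeholder address https://example.com) prints that page's raw HTML.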



We have gotten the contents of the web page using just the requests module (no BeautifulSoup).

However, if you print the result, the raw HTML comes out as a dense, unstructured block of markup that is hard to read.

This is where BeautifulSoup comes in. BeautifulSoup can make the HTML more presentable and readable, and it can also parse the data so that we can extract the parts we want.

Let's start with how to prettify the text so that it is more structured and readable.

To do so, we pass the page text to BeautifulSoup along with the html.parser parser and call the prettify() method on the resulting object.
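A minimal sketch of the prettify step is shown below. So that it runs without a network connection, it uses a short hard-coded HTML string, which is an assumption made for illustration; in practice the string would be the text of a requests response. The helper name prettify_html is also illustrative.

```python
from bs4 import BeautifulSoup


def prettify_html(html):
    """Parse an HTML string and return an indented, readable version.

    `prettify_html` is an illustrative helper name.
    """
    # Build a parse tree from the HTML text using Python's built-in parser.
    soup = BeautifulSoup(html, "html.parser")
    # Re-serialize the tree with one tag or string per line, indented by depth.
    return soup.prettify()


# Stand-in for the text of a real response (hard-coded for illustration).
sample = "<html><body><h1>Title</h1><p>Some text</p></body></html>"
pretty = prettify_html(sample)
print(pretty)
```

The single-line input is printed across several indented lines, which makes the nesting of the tags easy to follow.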



So the requests module is able to get the text from a web page, and BeautifulSoup is able to structure and prettify the text, making it much more human readable.

html.parser is the parser that BeautifulSoup uses here to parse the HTML text; it is included in Python's standard library.

The prettify() method in BeautifulSoup structures the data in a human-readable way, placing each tag and string on its own line with indentation that reflects the nesting.

So this is how we can get the contents of a web page using the requests module and use BeautifulSoup to structure the data, making it cleaner and better formatted.
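Beyond prettifying, the parsed object can also be queried to extract specific pieces of data, as mentioned earlier. The following is a small sketch of that idea; the sample HTML and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, hard-coded so the example runs offline.
html = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <a href="/first">First link</a>
    <a href="/second">Second link</a>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# The text inside the <title> tag.
page_title = soup.title.string
# The href attribute of every <a> tag, in document order.
link_targets = [a["href"] for a in soup.find_all("a")]

print(page_title)
print(link_targets)
```

Here soup.title.string gives the page title and find_all("a") returns every link tag, so the same parsed object serves both for display and for extraction.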




