Latest commit: ed5172a · Mar 6, 2023

Static Web Content Scraping with Requests and Beautiful Soup

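For scraping static websites, Requests and Beautiful Soup are my go-to libraries.

Note that this method won't work if the data you're trying to scrape is loaded dynamically through JavaScript or API calls. Requests only fetches the HTML the server returns and does not execute any scripts, so content injected in the browser never shows up in the response.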
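One quick way to tell whether a page is a good fit is to check if the element you want is already present in the raw HTML. The sketch below is only an illustration: the helper name `is_in_static_html`, the sample markup, and the `.feed__item` selector are all made up for this example, and the HTML is an inline string rather than a real response.

```python
from bs4 import BeautifulSoup


def is_in_static_html(html: str, selector: str) -> bool:
    """Return True if the CSS selector matches an element in the raw HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select_one(selector) is not None


# Hypothetical markup: the title is part of the served HTML,
# while the feed items would only be added by a script in the browser.
sample_html = """
<div class="avatar"><h1 class="avatar__title">Joshua</h1></div>
<div id="feed"></div>
<script>/* would inject .feed__item elements at runtime */</script>
"""

print(is_in_static_html(sample_html, ".avatar__title"))  # True
print(is_in_static_html(sample_html, ".feed__item"))     # False
```

If the selector matches in the browser's dev tools but not in the fetched HTML, the content is most likely rendered client-side and this approach alone will not reach it.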

```python
import requests
from bs4 import BeautifulSoup

url = "https://konekoya.github.io"

# Fetch the raw HTML and parse it with the built-in parser
html_content = requests.get(url).text
soup = BeautifulSoup(html_content, "html.parser")

# We can use CSS selectors
el = soup.select_one(".avatar__title")
print(el.get_text())  # Joshua

# Or look an element up by its attributes (passed as a dict)
img = soup.find("img", {"class": "avatar__img"})
print(img["alt"])  # Joshua's Picture
```
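For anything beyond a quick script, it may help to separate fetching from parsing and to handle failures explicitly. The following is a minimal sketch under a few assumptions: `fetch_html` and `extract_images` are hypothetical helper names, the 10-second timeout is an arbitrary choice, and the sample markup is invented so the parsing step can run without a network call.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page, raising an exception on HTTP errors (4xx/5xx)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_images(html: str) -> list:
    """Collect the src and alt of every <img>, tolerating missing attributes."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"src": img.get("src"), "alt": img.get("alt", "")}
        for img in soup.find_all("img")
    ]


# Hypothetical markup standing in for a downloaded page
sample_page = """
<img class="avatar__img" src="/me.png" alt="Joshua's Picture">
<img src="/banner.png">
"""

images = extract_images(sample_page)
print(images)
```

With a real site this would be called as `extract_images(fetch_html(url))`. Using `img.get(...)` instead of `img[...]` avoids a `KeyError` when an attribute is absent, and `find_all` returns every match rather than only the first.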

More example code can be found in the official docs