Book 5 — Web Scraping with Python

Python for All

Chapter Four — find() and find_all()

Thanasis Troboukis


find() and find_all() are the two most important BeautifulSoup methods. One finds the first match; the other finds every match. Together they let you locate any element on any page — and searching inside a found element is just as simple.

find() — The First Match

find() searches the document and returns the first element that matches. Pass a tag name and BeautifulSoup returns the first tag with that name. If nothing matches, it returns None.

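A minimal sketch of `find()` in action. The sample HTML below is invented for illustration; any page works the same way:

```python
from bs4 import BeautifulSoup

# Hypothetical sample page with two <h3> headings.
html = """
<html><body>
  <h3>First heading</h3>
  <h3>Second heading</h3>
  <p>Some text.</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the FIRST matching tag...
heading = soup.find("h3")
print(heading.get_text())  # First heading

# ...and None when nothing matches.
missing = soup.find("h2")
print(missing)  # None

# Safe pattern: check before calling methods on the result.
if missing:
    print(missing.get_text())
```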

Always check for None before calling methods on a find() result. If you call .get_text() on None, Python raises an AttributeError that crashes your script. The safe pattern is if heading: print(heading.get_text()).

find() vs dot notation: soup.h3 and soup.find("h3") do the same thing — both return the first <h3>. Use dot notation for quick exploration. Use find() in real scraping code because it is more flexible: you can pass a tag name, a class, an id, or any attribute.
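A quick illustration of the equivalence, with a made-up snippet:

```python
from bs4 import BeautifulSoup

# Invented snippet: two headings, dot notation grabs the first.
soup = BeautifulSoup("<div><h3>Badge</h3><h3>Other</h3></div>", "html.parser")

print(soup.h3.get_text())          # Badge
print(soup.find("h3").get_text())  # Badge
print(soup.h3 is soup.find("h3"))  # True — both return the same first node
```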

find_all() — Every Match

find_all() returns a list of every element that matches. Even if there is only one match you get a list with one item. If there are no matches you get an empty list — never None. This predictability makes it easy to loop over results.

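A sketch of the tab layout this example uses — an outer container holding three panels (the exact markup of the demo page is an assumption):

```python
from bs4 import BeautifulSoup

# Hypothetical tab layout: one tabcontent wrapper, three panels.
html = """
<div class="tabcontent">
  <div class="panel">Panel 1</div>
  <div class="panel">Panel 2</div>
  <div class="panel">Panel 3</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

divs = soup.find_all("div")
print(len(divs))  # 4 — the outer tabcontent plus the three panels

# No matches yields an empty list, never None.
print(soup.find_all("table"))  # []

for div in divs:
    print(div.get("class"))
```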

Notice that find_all("div") returns 4 divs — the outer tabcontent plus the three panels. find_all() searches the entire document tree, including nested elements. In the next chapter you will learn how to filter by class name to get only the panels you want.

find() vs find_all(): Use find() when you expect exactly one result — the page title, a unique badge. Use find_all() when you are collecting a list of things — all incident cards, all table rows. If find_all() returns an empty list when you expected results, the page layout has probably changed.

Searching Inside a Tag

You can call find() and find_all() on any tag, not just the root soup object. Calling them on a tag limits the search to that tag's contents. This is the key pattern for structured data: first find the container, then search inside it.

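The container-then-inside pattern might look like this; the panel-and-table markup below is invented for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical page: each panel holds its own table of cells.
html = """
<div class="panel"><table><tr><td>A1</td><td>A2</td></tr></table></div>
<div class="panel"><table><tr><td>B1</td><td>B2</td></tr></table></div>
"""
soup = BeautifulSoup(html, "html.parser")

panels = soup.find_all("div")
panel = panels[0]  # first panel, selected by list index

# Scoped search: only the cells inside THIS panel.
cells = panel.find_all("td")
print([td.get_text() for td in cells])  # ['A1', 'A2']

# Unscoped search: every cell on the page.
print(len(soup.find_all("td")))  # 4
```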

By calling panel.find_all("td") instead of soup.find_all("td"), you only get the cells inside that specific panel — not cells from other panels. This scoped search is essential whenever you want to keep related data together. In the next chapter you will also learn how to find panels by their class name, which is cleaner than using list indices.

The core scraping pattern: find_all() to get a list of containers → loop over the list → find() or find_all() inside each container to get its contents. This pattern repeats in almost every scraper ever written.
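The pattern above can be sketched in a few lines; the incident-card markup here is hypothetical:

```python
from bs4 import BeautifulSoup

# Invented incident-card markup to illustrate the pattern.
html = """
<div class="card"><h3>Fire</h3><p>Downtown</p></div>
<div class="card"><h3>Flood</h3><p>Riverside</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

records = []
for card in soup.find_all("div"):  # 1. get the list of containers
    title = card.find("h3")        # 2. search inside each container
    place = card.find("p")
    records.append((title.get_text(), place.get_text()))

print(records)  # [('Fire', 'Downtown'), ('Flood', 'Riverside')]
```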

Searching for Multiple Tag Types

Sometimes the element you want might be tagged as <h3> on one page and <h4> on another — sites are not always consistent. You can pass a list of tag names to find() or find_all() and it will match any of them:

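For example, here is a sketch with two hypothetical pages that tag the same heading differently:

```python
from bs4 import BeautifulSoup

# Two invented pages: same heading, different tag names.
page_a = BeautifulSoup("<h3>Report</h3>", "html.parser")
page_b = BeautifulSoup("<h4>Report</h4>", "html.parser")

# A list of tag names matches ANY of them.
for page in (page_a, page_b):
    heading = page.find(["h3", "h4"])
    print(heading.name, heading.get_text())  # h3 Report, then h4 Report
```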

This technique is especially useful for scrapers that need to stay working as a site evolves. If a site changes a heading from <h3> to <h4>, code that uses find(["h3", "h4"]) keeps working without any extra changes.

Defensive scraping: Real pages change. Writing find(["h3", "h4"]) instead of find("h3") costs nothing and makes your scraper resilient to minor layout changes. Any time you know two variations exist, handle both.

Your Turn — Count and Loop

The page below has multiple sections, each with a heading and several panels. Use find_all() to count all the panels, then loop over them and print each one's text content.

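One possible solution sketch, using an invented stand-in for the exercise page (here each panel is a <div> inside a <section>; the real page's markup may differ):

```python
from bs4 import BeautifulSoup

# Hypothetical exercise page: sections with a heading and several panels.
html = """
<section><h3>Alpha</h3>
  <div>Panel A1</div><div>Panel A2</div>
</section>
<section><h3>Beta</h3>
  <div>Panel B1</div>
</section>
"""
soup = BeautifulSoup(html, "html.parser")

# Count all the panels...
panels = soup.find_all("div")
print(len(panels))  # 3

# ...then loop over them and print each one's text content.
for panel in panels:
    print(panel.get_text())
```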
What you learned in this chapter: find() returns the first match or None; find_all() returns a list of all matches; you can call either on any tag to search only inside it; and passing a list of tag names lets you match multiple alternatives. In the next chapter you will add one more powerful tool: searching by CSS class name, which lets you target exactly the elements you want.
