Book 5 — Web Scraping with Python

Python for All

Chapter Four — find() and find_all()

Thanasis Troboukis


find() and find_all() are the two most important BeautifulSoup methods. One finds the first match; the other finds every match. Together they let you locate any element on any page — and searching inside a found element is just as simple.

find() — The First Match

find() searches the document and returns the first element that matches. Pass a tag name and BeautifulSoup returns the first tag with that name. If nothing matches, it returns None.

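A minimal sketch of `find()` in action. The sample HTML below is invented for illustration; any page works the same way:

```python
from bs4 import BeautifulSoup

# Hypothetical sample page with two <h3> headings.
html = """
<html><body>
  <h3>First heading</h3>
  <h3>Second heading</h3>
  <p>Some text.</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the FIRST matching tag...
heading = soup.find("h3")
print(heading.get_text())  # First heading

# ...and None when nothing matches.
missing = soup.find("h2")
print(missing)  # None

# Safe pattern: check before calling methods on the result.
if missing:
    print(missing.get_text())
```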

Always check for None before calling methods on a find() result. If you call .get_text() on None, Python raises an AttributeError that crashes your script. The safe pattern is if heading: print(heading.get_text()).

find() vs dot notation: soup.h3 and soup.find("h3") do the same thing — both return the first <h3>. Use dot notation for quick exploration. Use find() in real scraping code because it is more flexible: you can pass a tag name, a class, an id, or any attribute.
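A quick illustration of the equivalence, with a made-up snippet:

```python
from bs4 import BeautifulSoup

# Invented snippet: two headings, dot notation grabs the first.
soup = BeautifulSoup("<div><h3>Badge</h3><h3>Other</h3></div>", "html.parser")

print(soup.h3.get_text())          # Badge
print(soup.find("h3").get_text())  # Badge
print(soup.h3 is soup.find("h3"))  # True — both return the same first node
```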

find_all() — Every Match

find_all() returns a list of every element that matches. Even if there is only one match you get a list with one item. If there are no matches you get an empty list — never None. This predictability makes it easy to loop over results.

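A sketch of the tab layout this example uses — an outer container holding three panels (the exact markup of the demo page is an assumption):

```python
from bs4 import BeautifulSoup

# Hypothetical tab layout: one tabcontent wrapper, three panels.
html = """
<div class="tabcontent">
  <div class="panel">Panel 1</div>
  <div class="panel">Panel 2</div>
  <div class="panel">Panel 3</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

divs = soup.find_all("div")
print(len(divs))  # 4 — the outer tabcontent plus the three panels

# No matches yields an empty list, never None.
print(soup.find_all("table"))  # []

for div in divs:
    print(div.get("class"))
```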

Notice that find_all("div") returns 4 divs — the outer tabcontent plus the three panels. find_all() searches the entire document tree, including nested elements. In the next chapter you will learn how to filter by class name to get only the panels you want.

find() vs find_all(): Use find() when you expect exactly one result — the page title, a unique badge. Use find_all() when you are collecting a list of things — all incident cards, all table rows. If find_all() returns an empty list when you expected results, the page layout has probably changed.

Searching Inside a Tag

You can call find() and find_all() on any tag, not just the root soup object. Calling them on a tag limits the search to that tag's contents. This is the key pattern for structured data: first find the container, then search inside it.

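The container-then-inside pattern might look like this; the panel-and-table markup below is invented for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical page: each panel holds its own table of cells.
html = """
<div class="panel"><table><tr><td>A1</td><td>A2</td></tr></table></div>
<div class="panel"><table><tr><td>B1</td><td>B2</td></tr></table></div>
"""
soup = BeautifulSoup(html, "html.parser")

panels = soup.find_all("div")
panel = panels[0]  # first panel, selected by list index

# Scoped search: only the cells inside THIS panel.
cells = panel.find_all("td")
print([td.get_text() for td in cells])  # ['A1', 'A2']

# Unscoped search: every cell on the page.
print(len(soup.find_all("td")))  # 4
```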

By calling panel.find_all("td") instead of soup.find_all("td"), you only get the cells inside that specific panel — not cells from other panels. This scoped search is essential whenever you want to keep related data together. In the next chapter you will also learn how to find panels by their class name, which is cleaner than using list indices.

The core scraping pattern: find_all() to get a list of containers → loop over the list → find() or find_all() inside each container to get its contents. This pattern repeats in almost every scraper ever written.
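The pattern above can be sketched in a few lines; the incident-card markup here is hypothetical:

```python
from bs4 import BeautifulSoup

# Invented incident-card markup to illustrate the pattern.
html = """
<div class="card"><h3>Fire</h3><p>Downtown</p></div>
<div class="card"><h3>Flood</h3><p>Riverside</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

records = []
for card in soup.find_all("div"):  # 1. get the list of containers
    title = card.find("h3")        # 2. search inside each container
    place = card.find("p")
    records.append((title.get_text(), place.get_text()))

print(records)  # [('Fire', 'Downtown'), ('Flood', 'Riverside')]
```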

Searching for Multiple Tag Types

Sometimes the element you want might be tagged as <h3> on one page and <h4> on another — sites are not always consistent. You can pass a list of tag names to find() or find_all() and it will match any of them:

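For example, here is a sketch with two hypothetical pages that tag the same heading differently:

```python
from bs4 import BeautifulSoup

# Two invented pages: same heading, different tag names.
page_a = BeautifulSoup("<h3>Report</h3>", "html.parser")
page_b = BeautifulSoup("<h4>Report</h4>", "html.parser")

# A list of tag names matches ANY of them.
for page in (page_a, page_b):
    heading = page.find(["h3", "h4"])
    print(heading.name, heading.get_text())  # h3 Report, then h4 Report
```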

This technique is especially useful for scrapers that need to stay working as a site evolves. If a site changes a heading from <h3> to <h4>, code that uses find(["h3", "h4"]) keeps working without any extra changes.

Defensive scraping: Real pages change. Writing find(["h3", "h4"]) instead of find("h3") costs nothing and makes your scraper resilient to minor layout changes. Any time you know two variations exist, handle both.

Your Turn — Count and Loop

The page below has multiple sections, each with a heading and several panels. Use find_all() to count all the panels, then loop over them and print each one's text content.

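One possible solution sketch, using an invented stand-in for the exercise page (here each panel is a <div> inside a <section>; the real page's markup may differ):

```python
from bs4 import BeautifulSoup

# Hypothetical exercise page: sections with a heading and several panels.
html = """
<section><h3>Alpha</h3>
  <div>Panel A1</div><div>Panel A2</div>
</section>
<section><h3>Beta</h3>
  <div>Panel B1</div>
</section>
"""
soup = BeautifulSoup(html, "html.parser")

# Count all the panels...
panels = soup.find_all("div")
print(len(panels))  # 3

# ...then loop over them and print each one's text content.
for panel in panels:
    print(panel.get_text())
```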
What you learned in this chapter: find() returns the first match or None; find_all() returns a list of all matches; you can call either on any tag to search only inside it; and passing a list of tag names lets you match multiple alternatives. In the next chapter you will add one more powerful tool: searching by CSS class name, which lets you target exactly the elements you want.
