Part One
Cleaning Titles and Links
Raw text and raw URLs often need one small cleaning step before you store them. Titles may contain extra whitespace. Links may be relative paths that need the site's base URL.
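A minimal sketch of that cleaning step, using only the standard library. The site address `https://example.com` and the sample title are hypothetical stand-ins:

```python
from urllib.parse import urljoin

BASE_URL = "https://example.com"  # hypothetical site base URL

def clean_title(raw_title):
    # Collapse runs of whitespace (newlines, tabs, doubled spaces)
    # into single spaces and trim the ends.
    return " ".join(raw_title.split())

def clean_link(raw_link):
    # Turn a relative path like "/news/article-1/" into a full URL;
    # already-absolute links pass through unchanged.
    return urljoin(BASE_URL + "/", raw_link)

print(clean_title("  Breaking   News\n Today "))  # → Breaking News Today
print(clean_link("/news/article-1/"))             # → https://example.com/news/article-1/
```

`urljoin` handles both relative and absolute inputs, which saves an explicit `if` when you are not sure what kind of link the page gives you.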
Part Two
Building the Logic Inline
You do not need extra abstractions to scrape multiple pages. You can build the page URL, clean the link, and create the article dictionaries directly inside the loop.
This version uses only the basic tools from the earlier books: variables, if, for, enumerate(), dictionaries, and list append().
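A sketch of that inline style, restricted to exactly those basic tools. The raw titles and links are made-up sample data standing in for what the parser would return:

```python
# Simulated extracted data: parallel lists of raw titles and raw links.
raw_titles = ["  First Story  ", "Second\nStory", "Third Story"]
raw_links = ["/a/", "https://example.com/b/", "/c/"]

base_url = "https://example.com"  # hypothetical site base URL
articles = []

for index, raw_title in enumerate(raw_titles):
    # Clean the title: collapse extra whitespace into single spaces.
    title = " ".join(raw_title.split())
    # Clean the link: prepend the base URL only if the path is relative.
    link = raw_links[index]
    if not link.startswith("http"):
        link = base_url + link
    # Build the article dictionary and collect it.
    article = {"title": title, "link": link}
    articles.append(article)

print(articles)
```

Everything happens in one loop body: no helper functions, just `enumerate()` to walk both lists in step, an `if` for the link fix-up, and `append()` to collect the dictionaries.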
Part Three
The Complete One-Page Scraper
Before looping through many pages, make sure page 1 works cleanly from end to end.
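One way to test that end to end is to run the full extract-clean-collect flow against a static copy of page 1, so it works without a network call. The `h2 > a` tag structure and the URLs here are hypothetical; substitute `requests.get(url).text` and your page's real selectors when you go live:

```python
from bs4 import BeautifulSoup

# Static page-1 HTML stands in for requests.get(url).text.
page_1_html = """
<html><body>
  <h2><a href="/news/one/">  Story   One </a></h2>
  <h2><a href="https://example.com/news/two/">Story Two</a></h2>
</body></html>
"""

base_url = "https://example.com"  # hypothetical site base URL
soup = BeautifulSoup(page_1_html, "html.parser")

articles = []
for anchor in soup.find_all("a"):
    # Clean the title and fix up relative links inline, as before.
    title = " ".join(anchor.get_text().split())
    link = anchor["href"]
    if not link.startswith("http"):
        link = base_url + link
    articles.append({"title": title, "link": link})

print(articles)
```

If this prints two clean dictionaries, page 1 works, and the same body can be dropped into a pagination loop unchanged.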
Part Four
Looping Through Page 1, Page 2, Page 3
Pagination is just a loop. Use range() to visit page 1, then page 2, then page 3, and collect all results into one big list.
That is the full pagination pattern: an if/else block that builds the correct page URL for each iteration, and a for page_number in range(...) loop that repeats the same extraction logic across multiple pages, with the special-case URL logic sitting directly inside the loop.
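A sketch of that URL-building loop on its own. The base URL and the `/page/N/` suffix convention are assumptions; many blog platforms use this layout, but yours may differ. The fetch-and-parse step is left as a comment so the skeleton runs without a network:

```python
base_url = "https://example.com/news"  # hypothetical paginated feed

all_articles = []
page_urls = []  # collected here just to show what the loop builds

for page_number in range(1, 4):  # visits pages 1, 2, 3
    # Special case: page 1 usually lives at the base URL itself,
    # while later pages use a /page/N/ suffix.
    if page_number == 1:
        page_url = base_url + "/"
    else:
        page_url = base_url + "/page/" + str(page_number) + "/"
    page_urls.append(page_url)
    # Here you would fetch page_url with requests, parse it with
    # BeautifulSoup, and append each article dict to all_articles.

print(page_urls)
```

Note that `range(1, 4)` stops before 4, so the loop visits exactly pages 1, 2, and 3.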
Part Five
Your Turn — Simulate Pagination
The cell below simulates page 1 and page 2 with static HTML, so you can run the multi-page logic directly in the browser.
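A self-contained version of that simulation. Two static HTML snippets stand in for the live pages (the tag structure and titles are invented), and a dictionary keyed by page number plays the role of `requests.get(...)`:

```python
from bs4 import BeautifulSoup

# Static HTML stands in for live pages; structure is a hypothetical example.
pages = {
    1: '<h2><a href="/a/">Alpha</a></h2>',
    2: '<h2><a href="/b/">Beta</a></h2><h2><a href="/c/">Gamma</a></h2>',
}

base_url = "https://example.com"  # hypothetical site base URL
all_articles = []

for page_number in range(1, 3):  # "visits" page 1, then page 2
    html = pages[page_number]  # instead of requests.get(page_url).text
    soup = BeautifulSoup(html, "html.parser")
    for anchor in soup.find_all("a"):
        all_articles.append({
            "title": anchor.get_text(),
            "link": base_url + anchor["href"],
        })

print(len(all_articles))  # → 3 articles collected across 2 pages
```

Because the loop body never changes, swapping the dictionary lookup for a real `requests.get(page_url)` call is the only edit needed to run this against a live site.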
In this chapter you learned how to fetch pages with requests; how to search the page with BeautifulSoup; how to extract titles and links; how to build dictionaries and DataFrames; how to write CSV files; and how to use loops plus range() to scrape multiple pages like .../page/2/ and .../page/3/. You now have a complete template for scraping any paginated news feed.