Part One
The requests Library
Python's standard library can make HTTP requests, but the third-party requests library is simpler and is what most Python developers use in practice. It handles encoding, redirects, headers, and timeouts cleanly, without the verbosity of the built-in alternatives.
If you are working in a Jupyter notebook or a local Python environment, install it once:
pip install requests
Or inside a notebook cell:
!pip install requests
Then import it at the top of your script:
import requests
Part Two
The Request, Line by Line
Every call to the Diavgeia API follows the same four-line pattern. Let's go through each line.
The URL
The search endpoint never changes. You store it in a variable so you don't have to type the full address every time:
url = "https://diavgeia.gov.gr/luminapi/api/search"
The params dictionary
Instead of building a long URL string by hand, you pass a dictionary. The requests library URL-encodes the values automatically and appends them after the ?:
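As a rough illustration of that encoding step, the sketch below builds a params dictionary from the four parameters covered in this part and encodes it with the standard library's urllib.parse.urlencode. The standard-library call is a stand-in so the example runs without a network request; requests performs its own equivalent encoding when you pass params=. The names query_string and full_url are illustrative.

```python
from urllib.parse import urlencode, parse_qs

url = "https://diavgeia.gov.gr/luminapi/api/search"

# The query parameters as a plain dictionary.
params = {
    "q": 'subject:"κλιματισμός"',  # search expression: field, colon, quoted term
    "size": 10,                     # results per page
    "page": 0,                      # zero-based page index
    "sort": "recent",               # newest first
}

# Stand-in for what requests does with params=: percent-encode each value
# and join the key=value pairs with "&".
query_string = urlencode(params)
full_url = f"{url}?{query_string}"
print(full_url)

# Decoding the query string gives back the original search expression.
decoded = parse_qs(query_string)
print(decoded["q"][0])
```

The Greek characters, the colon, and the double quotes all come out percent-encoded, which is exactly the part that is tedious and error-prone to do by hand.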
The four most common params in a Diavgeia search:
| Parameter | Example value | What it does |
|---|---|---|
| q | subject:"κλιματισμός" | The search expression. Always includes a field name before the colon. |
| size | 10 | How many results to return per page. Maximum is 100. |
| page | 0 | Which page to return. Zero-based: first page is 0, second is 1. |
| sort | recent | Newest first. Use relative when you care about text relevance. |
The q parameter mini-syntax
The value of q is not a plain search term — it uses a small query language. You always name the field you want to search, followed by a colon and the term in quotes:
| Pattern | Searches in |
|---|---|
| subject:"κλιματισμός" | The decision title / subject field |
| q:"κλιματισμός" | Full text of the document |
| ada:"6ΛΩΖ7ΛΞ-ΦΨΥ" | A specific ADA code |
| q:["κλιματισμός", "ψύξη"] | Multiple terms (any match) |
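Typing these expressions by hand makes it easy to drop a quote or a colon. One possible approach is a pair of small helpers that assemble the string for you. The names field_query and any_of_query are hypothetical, not part of requests or the Diavgeia API, and the sketch assumes the terms themselves contain no double quotes.

```python
def field_query(field, term):
    """Build a single-term expression such as subject:"κλιματισμός"."""
    return f'{field}:"{term}"'


def any_of_query(field, terms):
    """Build a multi-term expression such as q:["κλιματισμός", "ψύξη"]."""
    quoted = ", ".join(f'"{term}"' for term in terms)
    return f"{field}:[{quoted}]"


print(field_query("subject", "κλιματισμός"))
print(any_of_query("q", ["κλιματισμός", "ψύξη"]))
```

The returned string is what you would store under the "q" key of the params dictionary.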
The get call
resp = requests.get(url, params=params, headers={"Accept": "application/json"}, timeout=30)
requests.get() sends the HTTP GET request and returns a response object. The two keyword arguments you should always include:
headers={"Accept": "application/json"} tells the server you want JSON back, not HTML.
timeout=30 makes requests raise a Timeout exception if the server has not responded within 30 seconds, instead of waiting forever.
raise_for_status and .json()
resp.raise_for_status()
data = resp.json()
raise_for_status() checks the HTTP status code. If the server returned an error — 404 Not Found, 500 Server Error — it raises a Python exception immediately, so your code fails loudly rather than silently processing an empty or broken response.
resp.json() parses the response body as JSON and returns a Python dictionary. It is equivalent to writing json.loads(resp.text), but shorter.
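Put together, the four lines fit naturally into one small function. The sketch below is one way to package them; fetch_page and SEARCH_URL are hypothetical names chosen for illustration, and the function is only defined here, not called, so nothing is sent over the network.

```python
import requests

SEARCH_URL = "https://diavgeia.gov.gr/luminapi/api/search"


def fetch_page(params, timeout=30):
    """Send one search request and return the parsed JSON body.

    fetch_page is a hypothetical helper name, not part of requests
    or of the Diavgeia API.
    """
    resp = requests.get(
        SEARCH_URL,
        params=params,
        headers={"Accept": "application/json"},  # ask for JSON, not HTML
        timeout=timeout,  # give up instead of waiting forever
    )
    resp.raise_for_status()  # raise an exception on 4xx / 5xx status codes
    return resp.json()  # parse the JSON body into Python objects
```

A caller could wrap fetch_page(params) in try/except requests.exceptions.RequestException, which is the base class for both the Timeout error and the HTTPError raised by raise_for_status().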
Part Three
Working with the Response
The cell below simulates a realistic Diavgeia response — the same structure your code would receive after the resp.json() call. Run it and examine the output, then try changing the print statements.
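A minimal stand-in for such a cell might look like the following. Every value is a made-up placeholder, and the per-decision keys "ada" and "subject" are assumptions borrowed from the search field names; only "info", "total", "decisions", "organization", and "label" are named in this chapter.

```python
# Simulated response: all values are placeholders, not real Diavgeia data.
data = {
    "info": {"total": 312},
    "decisions": [
        {
            "ada": "PLACEHOLDER-ADA-1",
            "subject": "Example subject one",
            "organization": {"label": "Example Organisation A"},
        },
        {
            "ada": "PLACEHOLDER-ADA-2",
            "subject": "Example subject two",
            "organization": {"label": "Example Organisation B"},
        },
        {
            "ada": "PLACEHOLDER-ADA-3",
            "subject": "Example subject three",
            "organization": {"label": "Example Organisation A"},
        },
    ],
}

print(data["info"]["total"])   # 312
print(len(data["decisions"]))  # 3

for decision in data["decisions"]:
    print(decision["ada"], "|", decision["subject"])
```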
Notice that data["info"]["total"] is 312 — that is the full count of matching decisions in the database — while len(data["decisions"]) is only 3, because that is what one page returns. Pagination is how you collect all 312, and you will do exactly that in a later chapter.
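The arithmetic behind that is short enough to sketch now. Assuming the maximum page size of 100 mentioned earlier, the number of requests needed is the total divided by the page size, rounded up:

```python
import math

total = 312      # the value of data["info"]["total"] quoted above
page_size = 100  # the maximum value of the size parameter

pages_needed = math.ceil(total / page_size)
page_numbers = list(range(pages_needed))  # page is zero-based

print(pages_needed)  # 4
print(page_numbers)  # [0, 1, 2, 3]
```

With three results per page, as in the simulated response, the same calculation gives 104 pages, which is why a larger size is usually the better choice.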
Part Four
The Safe Extraction Pattern
Real API responses are not always complete. A field that exists for one decision may be null — Python's None — for another. If you access decision["organization"]["label"] and organization is None, Python raises a TypeError and your script stops.
The standard defence is two steps. First, use .get() with a fallback. Second, guard the nested access with or {}:
The expression decision.get("organization") or {} reads as: "get the value of organization, and if it is None or missing, use an empty dictionary instead." Then {}.get("label", "") safely returns an empty string. Your loop never crashes, and every row in your dataset has consistent keys.
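A minimal sketch of the pattern applied in a loop, using made-up placeholder decisions. The keys "ada" and "subject" are assumed field names for illustration; "organization" and "label" are the ones discussed above.

```python
# Placeholder decisions: one complete, one with a null organization,
# and one where the keys are missing altogether.
decisions = [
    {
        "ada": "PLACEHOLDER-ADA-1",
        "subject": "Example subject one",
        "organization": {"label": "Example Organisation A"},
    },
    {
        "ada": "PLACEHOLDER-ADA-2",
        "subject": "Example subject two",
        "organization": None,  # null in the JSON
    },
    {"ada": "PLACEHOLDER-ADA-3"},  # subject and organization missing
]

rows = []
for decision in decisions:
    # Step 2: replace None (or a missing key) with an empty dict.
    organization = decision.get("organization") or {}
    rows.append(
        {
            # Step 1: .get() with a fallback for top-level fields.
            "ada": decision.get("ada", ""),
            "subject": decision.get("subject", ""),
            "organization": organization.get("label", ""),
        }
    )

for row in rows:
    print(row)
```

One detail worth knowing: the fallback in .get("subject", "") only applies when the key is missing. If a top-level field can be present but null, writing decision.get("subject") or "" covers both cases.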
The resulting list of dictionaries can be passed directly to pandas.DataFrame(). That step (API → list of dicts → DataFrame) is the full pipeline you will build in the next chapter.
Part Five
Filtering with fq
The q parameter searches across all decisions. The fq parameter filters the results — it narrows the search to a specific organisation, decision type, or date range without affecting the relevance ranking.
To filter by organisation, you need the organisation's numeric UID. You can find it by searching Diavgeia's web interface and reading the URL, or by looking it up in the discovery endpoints listed in the cheat sheet. Once you have it, add it as a second key in your params dict:
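As a sketch, the params dictionary with an organisation filter could look like this. The UID 6167 is the example value used in this part, and urllib.parse.urlencode again stands in for the encoding that requests would apply, so the example runs offline.

```python
from urllib.parse import urlencode

params = {
    "q": 'subject:"κλιματισμός"',
    "fq": 'organizationUid:"6167"',  # filter: only this organisation's decisions
    "size": 10,
    "page": 0,
    "sort": "recent",
}

# The filter travels as its own fq=... parameter next to q=...
query_string = urlencode(params)
print(query_string)
```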
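You can also filter by date range. Diavgeia uses its own timestamp syntax inside the fq value: issueDate:[DT(2024-01-01T00:00:00) TO DT(2024-12-31T23:59:59)].
To apply the organisation and date filters together, pass a list as the fq value: "fq": ['organizationUid:"6167"', "issueDate:[DT(2024-01-01T00:00:00) TO DT(2024-12-31T23:59:59)]"]. The requests library will send both as separate fq parameters.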
Part Six
Your Turn — Extract and Organise
The cell below contains a response with four decisions. Your task is to build a list of dictionaries using the safe extraction pattern, then print the total number of decisions and each row. Use .get() everywhere and guard nested access with or {}.
In this chapter you learned how to send a request with requests.get(), how to use the params dictionary to build the query, what raise_for_status() and resp.json() do, and how to extract nested fields safely with .get() and or {}. In the next chapter you will pass this list of rows to pandas and turn it into a DataFrame.