May 2026 · 16 min

Automation & Data Retrieval: A Practical Intro for Ops Engineers

Automation · Python · APIs · HTTP · Interactive

If you’re an ops, networking, or infrastructure engineer, much of your day is a loop you could have written: pulling logs, checking dashboards, gathering reports, poking a switch to see if it’s alive. This post is the shortest path from “I run these commands by hand” to “a script runs them while I sleep.”

Every panel below is live. The blue ones send a real HTTP request when you click SEND. The green ones boot a real CPython interpreter (Pyodide, ~10MB lazy-loaded on first run) and execute the Python you see — requests, loops, try/except, files, SQLite, the works.

1. What is automation?

Automation is using software to do work that you’d otherwise do by hand. In ops, the candidates are obvious:

Checking logs
Gathering reports
Monitoring systems
Fetching alerts
Configuring devices
Backing up files
Sending summaries
Generating dashboards

Without automation: a human repeats the work every day.
With automation: you write the logic once, the machine repeats it forever.

That second line is the whole pitch.

2. Why it matters

Problem	Automation benefit
Repetitive work	Saves time
Human error	Improves consistency
Slow operations	Faster execution
Manual monitoring	Continuous monitoring
Hard to scale	Repeatable systems

The real goal isn’t “writing scripts” — it’s connecting systems together automatically. Pulling data from one place, processing it, sending the result somewhere else. Everything in this post is in service of that.

3. How modern systems work

Before automating, understand the shape of what you’re automating. Almost every modern system is the same three layers:

fig 1 — modern system architecture

Frontend — what humans look at. Dashboards, mobile apps, the Grafana panel you stare at on-call.
Backend — the brain. Receives requests, runs logic, talks to the database, returns answers.
Database — long-term storage. Logs, metrics, configs, user records.

The backend is the layer you’ll interact with most. It exposes its capabilities through APIs.

4. APIs — the part that actually matters

API = Application Programming Interface. It’s how one piece of software talks to another. If a system has an API, you can automate it.

The waiter analogy:

You don’t walk into the kitchen and start cooking your own food. You tell the waiter what you want, the waiter takes the request to the kitchen, and the kitchen sends food back through the waiter. The waiter is the API.

Mechanically, it looks like this:

fig 2 — http request / response flow

You send a request (with a method, a URL, sometimes a body). The server sends back a response (a status code, headers, and a body). That’s the entire conversation.
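To make the request half of that conversation concrete, here is a minimal sketch that builds a request with the requests library (installed in §7) but never sends it, so nothing touches the network. The URL, query parameter, and payload are placeholders chosen for illustration.

```python
import json

import requests

# Build a request without sending it. The URL and payload are placeholders.
request = requests.Request(
    method="POST",
    url="https://api.example.com/users",
    params={"notify": "true"},
    json={"name": "alice"},
).prepare()

print(request.method)                   # the method
print(request.url)                      # the URL, with the query string appended
print(request.headers["Content-Type"])  # a header requests filled in for us
print(json.loads(request.body))         # the body, decoded back into a dict
```

Everything printed here is what would go over the wire if the request were sent; the response coming back has the same three-part shape (status code, headers, body).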

Almost every modern platform — cloud providers, monitoring tools, network gear, security systems, CI/CD — exposes APIs. Master HTTP and you can automate all of them.

5. HTTP methods

The method tells the server what kind of action you want. HTTP defines nine standard methods (eight in the core spec, RFC 9110, plus PATCH from RFC 5789), but four cover the vast majority of automation:

Example	Purpose	Safe	Idempotent
GET /users/42	Retrieve data	✓	✓
POST /users	Create / send new	✗	✗
PUT /users/42	Replace / update	✗	✓
DELETE /users/42	Remove data	✗	✓

safe: doesn’t change server state
idempotent: calling twice has the same effect as calling once

fig 3 — common http methods

For data retrieval — what most automation scripts spend their time doing — GET is the only one you need.
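The idempotent column is easier to see in action. The sketch below is a toy in-memory model, not a real server: the UserStore class and its method names are invented for illustration and only mimic how a typical REST backend treats each method.

```python
import itertools


class UserStore:
    """Toy in-memory stand-in for a server's /users collection."""

    def __init__(self):
        self.users = {}
        self._next_id = itertools.count(1)

    def get(self, user_id):
        # Safe: reads state, never changes it.
        return self.users.get(user_id)

    def post(self, data):
        # Not idempotent: every call creates a new record.
        user_id = next(self._next_id)
        self.users[user_id] = dict(data)
        return user_id

    def put(self, user_id, data):
        # Idempotent: the record ends up the same however often you repeat it.
        self.users[user_id] = dict(data)

    def delete(self, user_id):
        # Idempotent: deleting an already-deleted record changes nothing.
        self.users.pop(user_id, None)


store = UserStore()

store.post({"name": "alice"})
store.post({"name": "alice"})
print(len(store.users))   # 2: the repeated POST created a duplicate

store.put(42, {"name": "bob"})
store.put(42, {"name": "bob"})
print(len(store.users))   # 3: the repeated PUT added nothing new

store.delete(42)
store.delete(42)
print(len(store.users))   # 2: the repeated DELETE was harmless
```

This is why the distinction matters for automation: a script can usually retry a failed GET, PUT, or DELETE without side effects, while blindly retrying a POST risks creating duplicates.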

6. JSON — the universal data format

Once you’ve made an HTTP request, you have a response body. For most modern web APIs that body is JSON (“JavaScript Object Notation”), a small text format that became the de facto language APIs speak. It won out over XML in the late 2000s because it’s smaller, faster to parse, and maps cleanly to data structures every programming language already has.

The shape

JSON has exactly two compound structures and four primitives. That’s it. Every JSON document is built from these six things:

Type	What it looks like	Python equivalent
Object	`{ "key": value }`	`dict`
Array	`[ value, value, ... ]`	`list`
String	`"hello"`	`str`
Number	`120` or `3.14`	`int` / `float`
Boolean	`true` / `false`	`True` / `False`
Null	`null`	`None`

A real API response is just these six things nested:

{
  "title": "Example Article",
  "score": 120,
  "author": "john",
  "published": true,
  "edited_at": null,
  "tags": ["api", "automation"],
  "metadata": {
    "word_count": 1840,
    "language": "en"
  },
  "comments": [
    { "id": 1, "by": "alice", "text": "Nice!" },
    { "id": 2, "by": "bob",   "text": "+1" }
  ]
}

Three things to notice:

Strings are always in double quotes (single quotes aren’t valid JSON).
No trailing commas. {"a": 1,} is invalid — many editors auto-add them, then your API call breaks.
Keys are always strings, even when they look like numbers.
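Those three rules are easy to check with Python's standard json module. The helper name is_valid_json below is made up for this sketch.

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json('{"a": 1}'))    # True
print(is_valid_json("{'a': 1}"))    # False: single quotes
print(is_valid_json('{"a": 1,}'))   # False: trailing comma

# Non-string keys are converted to strings on the way out...
encoded = json.dumps({42: "answer"})
print(encoded)                      # {"42": "answer"}

# ...and stay strings on the way back in.
print(json.loads(encoded))          # {'42': 'answer'}
```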

Parsing JSON in Python

In Python, the json module handles the conversion. Most of the time you don’t even need to touch it directly — requests exposes .json() as a shortcut:

import json

import requests

response = requests.get("https://api.example.com/article", timeout=10)

# Shortcut: parses the response body as JSON
data = response.json()

# Roughly equivalent long form:
data = json.loads(response.text)

After parsing, you access fields with normal Python indexing. Nesting just chains:

print(data["title"])              # "Example Article"
print(data["tags"][0])            # "api"
print(data["metadata"]["language"])  # "en"
print(data["comments"][0]["by"])  # "alice"

Going the other way — Python → JSON

To send JSON in a POST body, the Python object has to be serialized with json.dumps(). With requests you rarely call it yourself:

payload = {"title": "Hello", "tags": ["new"]}

requests.post(
    "https://api.example.com/articles",
    json=payload,  # requests serializes the dict for you
)

The json= argument is the clean way — requests calls json.dumps() and sets the Content-Type: application/json header automatically.

Common gotchas

Missing keys raise KeyError. data["nope"] crashes. Use data.get("nope", default) when a field might be absent.
Strings vs numbers. Some APIs return numeric IDs as strings ("42") — surprising in arithmetic. Cast explicitly with int().
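null becomes None. Calling len() on it raises TypeError, so check for None first.
Unicode is fine. JSON is UTF-8 by default and Python decodes it into ordinary str values. When writing, json.dumps() escapes non-ASCII characters as \uXXXX unless you pass ensure_ascii=False.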
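A short sketch of how these gotchas look in practice. The response body is invented for illustration; the field names are assumptions, not taken from any particular API.

```python
import json

# Hypothetical response body illustrating the gotchas above.
raw = '{"id": "42", "title": "Café status", "edited_at": null}'
data = json.loads(raw)

# Missing key: .get() returns a default instead of raising KeyError.
score = data.get("score", 0)

# Numeric ID delivered as a string: cast before doing arithmetic.
next_id = int(data["id"]) + 1

# null arrives as None: check before treating it like a string.
edited = data["edited_at"] if data["edited_at"] is not None else "never"

print(score, next_id, edited)   # 0 43 never

# Non-ASCII text parses without fuss. On output, json.dumps() escapes it
# by default; ensure_ascii=False keeps it readable.
print(json.dumps({"title": data["title"]}))
print(json.dumps({"title": data["title"]}, ensure_ascii=False))
```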

Where to learn more

json.org — the formal spec, one page, surprisingly readable.
MDN — Working with JSON — JS-focused but the structural intuitions transfer to any language.
Python json module — every function and parameter.
jq — command-line JSON wrangler; absolute lifesaver for poking at APIs from a shell.

7. Your first API request

Install the requests library — the de facto standard for HTTP in Python:

pip install requests

We’ll use the Hacker News API — public, no auth, well-documented. Below is your first real request. Click SEND and watch the actual 200 OK and JSON come back.

Three lines of Python, one HTTP request, and you have the IDs of the top 500 stories on Hacker News right now. That’s the entire pattern.

Understanding what just happened

requests.get(url): sends an HTTP GET request to the URL and blocks until the response comes back.
response.status_code: the integer code the server returned. 200 means success; 404 means not found; 500 means the server broke. (MDN’s HTTP status codes reference has the full list.)
response.json(): parses the response body as JSON and gives you back native Python data (dict / list).
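Outside the interactive panel, the same request can be sketched in plain Python. The URL is the public Hacker News top-stories endpoint; the function name and the get parameter are illustrative choices (the parameter exists only so the function can be tried with a stand-in when no network is available), and the timeout value is an assumption.

```python
import requests

# Public Hacker News endpoint that lists the current top story IDs.
TOP_STORIES_URL = "https://hacker-news.firebaseio.com/v0/topstories.json"


def fetch_top_story_ids(get=requests.get):
    """Return the list of top story IDs.

    `get` defaults to requests.get; it is a parameter only so the function
    can be exercised with a stand-in when no network is available.
    """
    response = get(TOP_STORIES_URL, timeout=10)
    response.raise_for_status()   # turn 4xx/5xx answers into exceptions
    return response.json()        # JSON array -> Python list of ints
```

Calling fetch_top_story_ids() with no arguments performs the real request; printing the first few items of the returned list is a quick way to confirm it worked.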

8. Real data retrieval

The endpoint above only returned IDs — useful but not enough. To get actual story content, you need to call a second endpoint per ID. Try one:

Now we have real data — title, score, author. The URL pattern is /v0/item/{id}.json, and you can plug in any ID from the previous request.

9. Loops — the workhorse of automation

The pattern from §7 + §8 is the universal automation shape:

get a list of things
for each thing:
    fetch its details
    do something with the result

In Python, that’s just a for loop. Hit RUN — this actually executes in your browser. The first run downloads the Python runtime; subsequent runs are instant:

Real HTTP requests, real titles printed. The slice [:3] keeps it polite — Hacker News doesn’t mind a few requests, but loop over the full 500 and you’ve written a rate-limit problem.
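As a rough sketch of that loop outside the browser panel: the code below fetches the ID list, then one item per ID, pausing briefly between requests. The function names, the half-second pause, and the get parameter (there so the loop can be tried without a network) are assumptions made for this example.

```python
import time

import requests

BASE_URL = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url, get=requests.get):
    """GET a URL and return the parsed JSON body."""
    response = get(url, timeout=10)
    response.raise_for_status()
    return response.json()


def print_top_titles(limit=3, pause=0.5, get=requests.get):
    """Print score and title for the first `limit` top stories."""
    story_ids = fetch_json(f"{BASE_URL}/topstories.json", get=get)
    titles = []
    for story_id in story_ids[:limit]:   # the slice keeps the loop short
        # An item can come back as null; fall back to an empty dict.
        story = fetch_json(f"{BASE_URL}/item/{story_id}.json", get=get) or {}
        print(f"{story.get('score', 0):>5}  {story.get('title', '(untitled)')}")
        titles.append(story.get("title"))
        time.sleep(pause)                # small pause between requests
    return titles
```

Calling print_top_titles() runs the real thing; raising limit makes one extra request per story, which is exactly the cost the slice is there to control.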

10. Functions — reuse the same logic

Once a script does the same thing more than once, wrap it in a function. Functions are how you turn “a script” into “a tool you reuse next week”:

Same logic, now it’s a tiny library. You can import get_top_stories from another script next time. Try editing the limit argument or the print format — the next RUN picks up your changes instantly.

11. Error handling — networks fail

The internet is unreliable. Servers go offline. DNS gets misconfigured. Rate limits kick in. A real automation script handles failure instead of crashing or, worse, failing silently.

The Python pattern is try / except:

import requests

try:
    # Without timeout=, requests can wait indefinitely for a reply
    response = requests.get("https://api.example.com/data", timeout=10)
    response.raise_for_status()   # raises HTTPError for 4xx/5xx responses
    data = response.json()
    print(data)

except requests.exceptions.RequestException as e:
    # Catches network errors, timeouts, HTTP errors
    print(f"Request failed: {e}")

Click RUN below. The URL is intentionally broken — watch how the script catches the failure and keeps running instead of crashing:

The last line prints. That’s the whole point. Without the try/except, a requests.exceptions.ConnectionError would propagate up, kill the script, and stop everything downstream — the cron job log fills with red, the rest of your automation never runs.

Rule of thumb: wrap every external call (HTTP, file, database, subprocess) in error handling. Anything that touches the outside world will eventually fail.
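One way to apply that rule of thumb to HTTP calls is a small helper that retries transient failures and returns None when it gives up. This goes a step beyond the pattern above and is only a sketch: the function name, the retry count, the linear backoff, and the decision not to retry HTTP error statuses are all assumptions, not a recommendation from the requests library.

```python
import time

import requests


def get_json_with_retries(url, attempts=3, backoff=1.0, get=requests.get):
    """Return parsed JSON from `url`, or None if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            response = get(url, timeout=5)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.HTTPError as error:
            # The server answered, but with a 4xx/5xx status.
            # This sketch does not retry those.
            print(f"HTTP error, giving up: {error}")
            return None
        except requests.exceptions.RequestException as error:
            # Connection failures, DNS errors, timeouts: worth another try.
            print(f"Attempt {attempt}/{attempts} failed: {error}")
            if attempt < attempts:
                time.sleep(backoff * attempt)
    return None
```

The caller then checks for None and carries on with the rest of the job. HTTPError is caught first because it is a subclass of RequestException, and the more specific handler has to come before the general one.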

12. File handling — storing results locally

Most automation scripts produce output — reports, logs, summaries. The simplest output is a text file. Python’s open() builtin is all you need:

Reading

with open("data.txt", "r") as f:
    content = f.read()

print(content)

Writing

with open("output.txt", "w") as f:
    f.write("Hello, world\n")

File modes worth knowing

Mode	Meaning
`"r"`	Read (the default)
`"w"`	Write — truncates the file first
`"a"`	Append — adds to the end
`"x"`	Create — fails if the file exists

Why `with`?

The with block automatically closes the file when you’re done — even if your code raises an exception inside the block. Without with, you have to remember to call f.close() yourself, and forgetting causes resource leaks. Always use with.
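The four modes and the closing behaviour of with can be seen together in one short run. This sketch writes to a temporary directory so it leaves nothing behind; the file name is arbitrary, and passing encoding="utf-8" explicitly is my own habit rather than something the examples above require.

```python
import os
import tempfile

# Work in a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "report.txt")

    with open(path, "w", encoding="utf-8") as f:   # "w" creates or truncates
        f.write("first line\n")

    with open(path, "a", encoding="utf-8") as f:   # "a" appends to the end
        f.write("second line\n")

    try:
        with open(path, "x", encoding="utf-8") as f:   # "x" refuses to overwrite
            f.write("never written\n")
    except FileExistsError:
        print("report.txt already exists, leaving it alone")

    with open(path, "r", encoding="utf-8") as f:   # "r" reads
        lines = f.read().splitlines()

    print(lines)      # ['first line', 'second line']
    print(f.closed)   # True: the with block closed the file for us
```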

13. Putting it together — a real automation script

The classic “ops daily report” script. Four things in sequence:

Fetch the top story IDs
Loop through and pull each story’s title + score
Save to a text file
Read the file back to confirm it looks right

Click RUN. This isn’t a mock — it really fetches from Hacker News, really writes a file (to Pyodide’s virtual filesystem), and really reads it back:

Run that on a cron schedule and you have a working “morning news digest” automation in 11 lines. Same shape for log summarizers, alert digests, daily reports — swap the API and the formatting, keep everything else.
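For running the same idea outside the browser, here is one possible standalone layout of the four steps. It is a sketch, not the panel's exact code: the function names, the tab-separated line format, the report.txt default, and the get parameter (present so the functions can be tried offline) are all assumptions.

```python
import requests

BASE_URL = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url, get=requests.get):
    """GET a URL and return the parsed JSON body."""
    response = get(url, timeout=10)
    response.raise_for_status()
    return response.json()


def build_report(limit=5, get=requests.get):
    """Steps 1 and 2: fetch the top IDs, then each story's title and score."""
    story_ids = fetch_json(f"{BASE_URL}/topstories.json", get=get)
    lines = []
    for story_id in story_ids[:limit]:
        story = fetch_json(f"{BASE_URL}/item/{story_id}.json", get=get) or {}
        lines.append(f"{story.get('score', 0)}\t{story.get('title', '(untitled)')}")
    return lines


def save_and_verify(lines, path="report.txt"):
    """Steps 3 and 4: write the report, then read it back."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def main():
    """Run all four steps, reporting a failed fetch instead of crashing."""
    try:
        lines = build_report()
    except requests.exceptions.RequestException as error:
        print(f"Could not build the report: {error}")
        return
    print(save_and_verify(lines))
```

To use it as a script, call main() at the bottom of the file (typically under an `if __name__ == "__main__":` guard) and point your scheduler at it.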

14. When files aren’t enough — databases

Text files are great for small, one-shot scripts. They fall apart when you want to:

Query historical data (“what was the avg score last Tuesday?”)
Update records in place
Share data between scripts
Build a dashboard on top of the data

That’s what databases are for. Plenty of flavors exist — MongoDB and DynamoDB for JSON-like documents with no fixed schema, InfluxDB and Prometheus for time-series metrics — but for ops automation, SQLite is the perfect starting point. It’s built into Python’s standard library, the entire database is a single file, and there’s no server to set up.

SQLite example

The sqlite3 module ships with Pyodide too, so the panel below runs a real database, in your browser, with no server:

A database gives you persistence + queryability + analysis for almost zero extra work. Write the data once; query it from dashboards, alerts, and reports forever after.
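A minimal sketch of that write-once, query-later idea with the standard sqlite3 module. The rows are made-up sample data standing in for fetched stories, and the table layout is an assumption for this example.

```python
import sqlite3

# Hypothetical rows standing in for stories fetched from the API.
stories = [
    (101, "Example story A", 120),
    (102, "Example story B", 45),
    (103, "Example story C", 310),
]

# ":memory:" keeps the database in RAM; pass a filename to persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS stories (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        score INTEGER NOT NULL
    )
    """
)

with conn:   # commits on success, rolls back if an exception escapes
    # "?" placeholders let sqlite3 handle quoting safely.
    conn.executemany("INSERT OR REPLACE INTO stories VALUES (?, ?, ?)", stories)

top = conn.execute(
    "SELECT title, score FROM stories ORDER BY score DESC LIMIT 2"
).fetchall()
average = conn.execute("SELECT AVG(score) FROM stories").fetchone()[0]

print(top)       # [('Example story C', 310), ('Example story A', 120)]
print(average)
conn.close()
```

Sorting and averaging are each a single query here; with a text file, both would need hand-written parsing code.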

15. The automation pipeline

Zoom out. Almost every automation script you’ll ever write is a variation of the same four-stage flow:

fig 4 — automation pipeline

Retrieve — fetch data from somewhere (API, file, device, log).
Process — parse, filter, transform, loop.
Store — write to a file, database, or both.
Send — deliver the result (Slack message, email, dashboard update, alert).
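The four stages map naturally onto four small functions. The sketch below runs entirely offline: the sample data, host names, threshold, and function bodies are invented for illustration, with comments marking where a real script would call an API, a database, or a messaging service.

```python
import json

# Hypothetical input: what a metrics endpoint might return.
SAMPLE_RESPONSE = (
    '[{"host": "sw1", "cpu": 91}, {"host": "sw2", "cpu": 35}, '
    '{"host": "sw3", "cpu": 88}]'
)


def retrieve():
    """Stage 1: in a real script this would be an API call or a file read."""
    return json.loads(SAMPLE_RESPONSE)


def process(records, threshold=80):
    """Stage 2: keep only the hosts above the CPU threshold."""
    return [r for r in records if r["cpu"] > threshold]


def store(records, sink):
    """Stage 3: append to a list here; a file or database in practice."""
    sink.extend(records)
    return records


def send(records):
    """Stage 4: build the message; delivery (Slack, email) is left out."""
    hosts = ", ".join(r["host"] for r in records)
    return f"{len(records)} host(s) above threshold: {hosts}"


history = []
message = send(store(process(retrieve()), history))
print(message)   # 2 host(s) above threshold: sw1, sw3
```

Keeping each stage in its own function means you can swap the source or the sink without touching the other three.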

Recognize this shape and the next script you write almost designs itself.

16. Real-world examples

Task	Pipeline
Daily news report	Fetch API → format → write file → email it
Monitoring system	Read metrics → detect anomalies → alert
Network automation	Pull device configs → diff → apply updates
Log processing	Read logs → filter errors → write summary
Alerting system	Poll thresholds → if breached → send Slack

Every one of those is the same shape from §15, with different sources and sinks.

17. Try it yourself

Exercise

Write a script that:

Retrieves the top 10 Hacker News stories.
Pulls each story’s title and score.
Saves the result to a text file.

(You already have all the pieces in §13 — just bump [:5] to [:10].)

Bonus

Store the top 5 stories in a SQLite database instead of a text file. Bonus to the bonus: run it daily and add a created_at timestamp so you can graph score-over-time later.

18. The takeaway

Automation is not about writing random scripts. It’s about connecting systems together reliably.

Once you can:

retrieve data
process it
store it
send the result somewhere

…you can automate almost anything. The Python you’ve seen in this post — requests.get, a for loop, a with open, a try/except — is the entire foundation. Every “advanced” ops script you’ll ever read is a remix of those four primitives.

19. Further reading

Python & HTTP

Python requests documentation — the library, the API reference, the Quickstart.
Python open() builtin — every mode, every parameter.
MDN — HTTP overview — request/response, methods, status codes, headers. Read this once and HTTP stops being a mystery.
MDN — HTTP methods — GET, POST, PUT, DELETE, and the rarer ones.
MDN — HTTP status codes — what every code actually means.

JSON

json.org — the formal spec, one page. Bookmark it.
Python json module — loads, dumps, encoders, and edge cases.
MDN — Working with JSON — JS-focused tutorial; structural intuitions transfer to any language.
jq — command-line JSON wrangler. Pipe curl output through it and life gets better immediately.

APIs used in this post

Hacker News API — the JSON API powering every playground above. Public, no auth, great to practice on.

Databases

SQLite documentation — the database, the SQL dialect, the C library.
Python sqlite3 module — Python’s standard-library bindings to SQLite.

Going further

Real Python — API Integration — fuller tutorial on production-grade API consumption.
12 Factor App — Config — once your scripts handle credentials, do it right.
