Mainframe Connectivity for AI Agents

Mainframes still run the world's banks, airlines, and payrolls, and most of them still talk to you through a 1970s green screen. I wanted to know what it would take to let an AI agent operate one safely: not screen-scraping bolted onto a model, but a proper layer underneath what I'll call "mainframe connectivity for AI agents."

Build, embed, or wrap

The fastest way to get a mainframe connection wrong is to write the protocol engine yourself. TN3270 is stable and documented, but the long tail of real-world quirks is weeks of work. Three options were on the table:

Embed a commercial library — closed-source, a JVM dependency, and a license that explicitly forbids using it to build a general 3270 library. Out.
Build a native stack — only worth it if the protocol itself becomes core IP. It isn't.
Wrap x3270 — the BSD-licensed, most battle-tested 3270 stack there is: full TN3270E + TLS, driven by a trivial line protocol. This one.

The insight that made the call easy: the value isn't the protocol engine — it's the agent layer on top. That layer is engine-agnostic, so the engine stays a swappable detail.

The shape of it

LLM agent
   │  MCP tools: connect · screen · fill · type · press · disconnect
   ▼
mcp_server      — agent-facing surface; redacts hidden (password) fields
   ▼
Session         — screen identity · field map · wait-discipline   ← the IP
   ▼
S3270 driver    — subprocess + stdio line protocol + status parsing
   ▼
ws3270 / s3270  — BSD x3270 engine: TN3270E + TLS   (swappable)
   ▼
TN3270E / TLS  ───────────────────────────────►  mainframe
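
The driver's "status parsing" step might look roughly like the sketch below. It assumes the reply shape described in the x3270 scripting documentation as I read it: zero or more lines prefixed with `data: `, one status line of space-separated fields, then `ok` or `error`. The field positions used here (keyboard state first, cursor row and column ninth and tenth) should be checked against the engine version in use, and `Response` and `parse_response` are hypothetical names, not the project's actual code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Response:
    data: List[str]           # payload lines with the "data: " prefix removed
    keyboard_locked: bool     # True unless the status line starts with "U"
    cursor: Tuple[int, int]   # (row, col) exactly as reported by the engine
    ok: bool                  # True if the reply ended with "ok"


def parse_response(lines: List[str]) -> Response:
    """Parse one engine reply: data lines, a status line, then ok/error."""
    if len(lines) < 2:
        raise ValueError("reply needs a status line and a result line")
    *body, status_line, result = lines
    if result not in ("ok", "error"):
        raise ValueError(f"unexpected result line: {result!r}")

    data = []
    for line in body:
        if not line.startswith("data:"):
            raise ValueError(f"unexpected line before status: {line!r}")
        text = line[len("data:"):]
        # Drop the single separator space, keep any further indentation.
        data.append(text[1:] if text.startswith(" ") else text)

    status = status_line.split()
    if len(status) < 10:
        raise ValueError(f"short status line: {status_line!r}")
    return Response(
        data=data,
        keyboard_locked=(status[0] != "U"),  # assumed: "U" means unlocked
        cursor=(int(status[8]), int(status[9])),
        ok=(result == "ok"),
    )
```

Keeping the parser a pure function of the reply lines means it can be tested without spawning the engine subprocess at all.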

The agent never touches the engine directly. It gets six tools over MCP:

Tool	What it does
`mainframe_connect`	open a session, return the first screen
`mainframe_screen`	re-read the current screen
`mainframe_fill`	type into the field after a label
`mainframe_type`	type at the cursor
`mainframe_press`	send Enter / PFn / PAn / Clear, then wait for unlock
`mainframe_disconnect`	close the session
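
"The field after a label" is the least obvious of these, so here is one way it could be resolved. This is a minimal sketch, assuming the session keeps a field map with a position, text, and protected flag per field; `Field` and `field_after_label` are hypothetical names, and the real lookup may well be stricter (for example, requiring the label and field to share a row).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    row: int
    col: int
    text: str
    protected: bool


def field_after_label(fields: List[Field], label: str) -> Optional[Field]:
    """Return the first unprotected field that follows a protected
    field whose text contains `label` (case-insensitive)."""
    seen_label = False
    # Walk the fields in screen order: top to bottom, left to right.
    for f in sorted(fields, key=lambda f: (f.row, f.col)):
        if f.protected and label.lower() in f.text.lower():
            seen_label = True
        elif seen_label and not f.protected:
            return f
    return None
```

Returning `None` rather than guessing lets the tool report "no such label on this screen" back to the agent instead of typing into the wrong place.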

Why the agent layer matters

A green screen breaks naive automation in ways a model won't anticipate. The session layer encodes the rules so the agent can't get them wrong:

Wait-discipline. After an Enter or PF key, the host locks the keyboard until it repaints. The layer waits for unlock before re-reading — so the agent never acts on a half-painted screen.
Screen identity. Every screen gets a fingerprint hashed from its protected text only, so a panel gets the same ID no matter what's been typed into it. The agent can confirm "I'm on the logon screen" before doing anything.
Field semantics & redaction. Fields carry protected / numeric / hidden flags. Password fields are write-only from the agent's side — their contents are never returned.
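
The first two rules can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: `is_locked` stands in for whatever the driver exposes to read the keyboard state, fields are plain `(protected, row, col, text)` tuples, and the choice of SHA-256 truncated to 12 hex characters is arbitrary.

```python
import hashlib
import time
from typing import Callable, Iterable, Tuple


def wait_for_unlock(is_locked: Callable[[], bool],
                    timeout: float = 10.0, poll: float = 0.05) -> None:
    """Block until the keyboard unlocks; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while is_locked():
        if time.monotonic() >= deadline:
            raise TimeoutError("keyboard still locked")
        time.sleep(poll)


def screen_id(fields: Iterable[Tuple[bool, int, int, str]]) -> str:
    """Fingerprint a screen from its protected text only.

    Unprotected (input) fields are skipped, so typing into a panel
    does not change its identity.
    """
    h = hashlib.sha256()
    for protected, row, col, text in fields:
        if protected:
            h.update(f"{row},{col}:{text.rstrip()}\n".encode("utf-8"))
    return h.hexdigest()[:12]
```

Including the row and column in the hash is a design choice: two panels with the same words in different places count as different screens.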

So a tool result is a compact, safe view of the screen:

{
  "screen_id": "a1b2c3…",        // stable per panel (hashes protected text only)
  "cursor": [3, 18],
  "keyboard_locked": false,
  "fields": [
    { "label": "Userid",   "writable": true,  "hidden": false },
    { "label": "Password", "writable": true,  "hidden": true }
  ]
}
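
Producing that view from the raw field map could look like the sketch below. `RawField` and `to_view` are hypothetical names. The `value` key for visible fields is also an assumption that goes beyond the example above; the point being illustrated is only that a hidden field's contents never reach the result.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class RawField:
    label: str
    text: str        # raw contents as held by the session
    writable: bool
    hidden: bool     # non-display field, e.g. a password


def to_view(screen_id: str, cursor: Tuple[int, int], keyboard_locked: bool,
            fields: List[RawField]) -> Dict[str, Any]:
    """Build the agent-facing result, dropping hidden field contents."""
    out = []
    for f in fields:
        entry = {"label": f.label, "writable": f.writable, "hidden": f.hidden}
        if not f.hidden:
            entry["value"] = f.text  # assumed key, visible fields only
        out.append(entry)
    return {
        "screen_id": screen_id,
        "cursor": list(cursor),
        "keyboard_locked": keyboard_locked,
        "fields": out,
    }


view = to_view("a1b2c3", (3, 18), False, [
    RawField("Userid", "ALICE", True, False),
    RawField("Password", "s3cret", True, True),
])
print(json.dumps(view, indent=2))
```

Redacting by omission, rather than masking with asterisks, avoids leaking even the length of the secret.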

Mainframes live in regulated shops, and because every screen and keystroke crosses that single boundary, logging them is trivial. Auditability falls out of the design rather than being bolted on.
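
Logging at one seam can be as small as a wrapper around each tool function. A minimal sketch, assuming tools take keyword arguments and return a dict containing `screen_id`; `audited` and the list used as a sink are hypothetical, and a real deployment would write to an append-only store instead.

```python
import json
import time
from typing import Any, Callable, Dict, List


def audited(tool_name: str,
            tool: Callable[..., Dict[str, Any]],
            sink: List[str]) -> Callable[..., Dict[str, Any]]:
    """Wrap a tool so every call appends one JSON line to `sink`."""
    def wrapper(**kwargs: Any) -> Dict[str, Any]:
        result = tool(**kwargs)
        # Assumption: text typed into hidden fields would need masking
        # before it is logged; that step is not shown here.
        sink.append(json.dumps({
            "ts": time.time(),
            "tool": tool_name,
            "args": kwargs,
            "screen_id": result.get("screen_id"),
        }, sort_keys=True))
        return result
    return wrapper
```

Because the agent can only reach the host through these tools, wrapping them once covers every action it can take.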

What transfers

The interesting part of "AI meets a legacy system" is almost never the model — it's the layer that makes an agent's actions legible and reversible: wait until the system is ready, prove you're where you think you are, never exfiltrate a secret, log everything at one seam. The protocol engine underneath is swappable. That safety layer is the product.
