LangChain (Python)

Add policy enforcement, kill switch, and observability to LangChain and LangGraph chains.

LangChain / LangGraph (Python)

Checkrd ships a BaseCallbackHandler that hooks into every LLM call, tool call, retriever call, and chain invocation in LangChain and LangGraph. Denials surface as CheckrdPolicyDenied exceptions on the calling stack; LangChain propagates them naturally, so your existing error handling Just Works.

Install

bash
pip install 'checkrd[langchain]'

This installs langchain-core>=0.3,<1 as a dependency. The handler is compatible with LangChain itself (langchain), LangGraph, and any third-party Runnable.

Quickstart

python
from checkrd import Checkrd
from checkrd.integrations.langchain import CheckrdCallbackHandler
from langchain_openai import ChatOpenAI

with Checkrd(agent_id="01234567-89ab-cdef-0123-456789abcdef", api_key="ck_live_...") as client:
    handler = CheckrdCallbackHandler.from_checkrd(client)

    llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
    print(llm.invoke("Tell me a joke"))

CheckrdCallbackHandler.from_checkrd() pulls the engine, agent_id, sink, and enforce mode from the client's runtime so the handler matches every other Checkrd instrumentor in the same process. Checkrd(api_key=...) fetches the agent's published policy bundle from the dashboard and installs it before returning — no policy= argument needed in app code.

Per-call attach

If you don't want to register the handler on the LLM itself, attach it per-call via RunnableConfig:

python
chain.invoke(
    {"question": "x"},
    config={"callbacks": [handler]},
)

This pattern is preferred when one process serves multiple agents; each invocation gets its own handler bound to the right agent_id.
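One way to organize per-call attachment in a multi-agent process is a small factory that returns the config dict for a given agent. This is only a sketch: make_config_factory and build_handler are hypothetical names, and caching one handler per agent_id is a design choice of the example rather than something Checkrd requires. In real code, build_handler would construct a CheckrdCallbackHandler for that agent.

```python
from typing import Any, Callable, Dict


def make_config_factory(
    build_handler: Callable[[str], Any],
) -> Callable[[str], Dict[str, Any]]:
    """Return a function that maps an agent_id to a per-call config dict."""
    handlers: Dict[str, Any] = {}

    def config_for(agent_id: str) -> Dict[str, Any]:
        # Build the handler the first time an agent is seen, then reuse it.
        if agent_id not in handlers:
            handlers[agent_id] = build_handler(agent_id)
        # RunnableConfig is a TypedDict, so a plain dict is accepted.
        return {"callbacks": [handlers[agent_id]]}

    return config_for


# Stand-in builder so the sketch runs without Checkrd installed.
config_for = make_config_factory(lambda agent_id: {"agent_id": agent_id})
config_a = config_for("agent-a")
config_b = config_for("agent-b")
```

A call site would then look like chain.invoke(inputs, config=config_for(agent_id)).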

Async chains

The handler subclasses BaseCallbackHandler, so its callbacks are synchronous. On .ainvoke() paths, LangChain's async callback manager runs synchronous callbacks in a thread-pool executor instead of on the event loop, so the same handler instance works for both sync and async chains. The WASM engine's evaluate() is a sub-millisecond synchronous call; the thread-pool overhead is negligible.

python
# Run inside an async function; the payload is a placeholder.
result = await chain.ainvoke({"question": "x"}, config={"callbacks": [handler]})
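The hand-off of a synchronous callback to a worker thread can be illustrated with the standard library alone. The sketch below is not LangChain's dispatcher or Checkrd's handler; Denied, sync_callback, and dispatch are hypothetical stand-ins. It shows the two properties that matter here: the callback runs off the event loop, and an exception it raises still reaches the awaiting caller.

```python
import asyncio
import threading


class Denied(Exception):
    """Stand-in for a policy denial; not the real CheckrdPolicyDenied."""


def sync_callback(tool_name: str) -> str:
    # Runs on a worker thread, so it does not block the event loop.
    if tool_name.startswith("shell"):
        raise Denied(tool_name)
    return threading.current_thread().name


async def dispatch(tool_name: str) -> str:
    loop = asyncio.get_running_loop()
    # An exception raised in the worker thread is re-raised at this await.
    return await loop.run_in_executor(None, sync_callback, tool_name)


async def main() -> tuple:
    worker = await dispatch("search_database")
    try:
        await dispatch("shell_exec")
        denied = False
    except Denied:
        denied = True
    return worker, denied


worker_name, was_denied = asyncio.run(main())
```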

What gets enforced

Every event LangChain emits is policy-evaluated through a synthetic URL the policy engine matches against:

| LangChain event | Synthetic URL | Body |
|---|---|---|
| on_llm_start | https://langchain.local/llm/{model} | {"prompts": [...]} |
| on_chat_model_start | https://langchain.local/chat_model/{model} | {"messages": [[...]]} |
| on_tool_start | https://langchain.local/tool/{tool_name} | {"input_str": ..., "inputs": ...} |
| on_retriever_start | https://langchain.local/retriever/{name} | {"query": ...} |
| on_chain_start | https://langchain.local/chain/{name} | {"inputs": ...} |
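To predict which URL a rule needs to match, it can help to build the synthetic URL by hand. The helper below is a hypothetical illustration of the mapping in the table, not a function from the Checkrd API; percent-encoding the target is an assumption of this sketch.

```python
from urllib.parse import quote

# Host used by every synthetic URL in the table above.
SYNTHETIC_HOST = "langchain.local"


def synthetic_url(kind: str, target: str) -> str:
    """Build the URL for an event kind (llm, tool, ...) and its target."""
    # Assumption: characters that are not URL-safe are percent-encoded.
    return f"https://{SYNTHETIC_HOST}/{kind}/{quote(target, safe='')}"


llm_url = synthetic_url("llm", "gpt-4o")
tool_url = synthetic_url("tool", "search_database")
print(llm_url)
print(tool_url)
```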

Write rules against these URLs the same way you write any other Checkrd rule:

yaml
default: allow

rules:
  - name: deny-shell-tools
    deny:
      url: "langchain.local/tool/shell*"

  - name: limit-llm-calls
    limit:
      per: endpoint
      calls_per_minute: 100
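Before publishing a rule, you can approximate which synthetic URLs a pattern would catch. The sketch below assumes that the rule's url field is a glob matched against host plus path with the scheme ignored; that is an assumption about the matcher, not documented behavior, and would_match is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Pattern copied from the deny-shell-tools rule above.
PATTERN = "langchain.local/tool/shell*"


def would_match(url: str, pattern: str = PATTERN) -> bool:
    # Assumption: the scheme is dropped and "*" is a glob wildcard.
    host_and_path = url.split("://", 1)[-1]
    return fnmatchcase(host_and_path, pattern)


for candidate in (
    "https://langchain.local/tool/shell_exec",
    "https://langchain.local/tool/search_database",
):
    print(candidate, would_match(candidate))
```

Confirm the real matching semantics in observation mode before relying on a pattern to block a tool.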

Observation mode

Set enforce=False (or pass enforce_override=False to Checkrd(...)) to log denials without aborting the call:

python
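from checkrd.integrations.langchain import CheckrdCallbackHandler

# engine and sink are objects your application has already constructed.
handler = CheckrdCallbackHandler(
    engine=engine,
    agent_id="01234567-89ab-cdef-0123-456789abcdef",
    sink=sink,
    enforce=False,    # observation mode - log only
)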

Useful for rolling out a new policy in shadow mode before flipping enforcement on.
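The difference between the two modes comes down to what happens after a deny decision. The following is a minimal sketch of that branch, assuming the behavior described above; PolicyDenied and apply_decision are hypothetical stand-ins, not Checkrd internals.

```python
import logging

logger = logging.getLogger("checkrd.sketch")


class PolicyDenied(Exception):
    """Stand-in for CheckrdPolicyDenied."""


def apply_decision(url: str, allowed: bool, enforce: bool) -> str:
    """Return the policy_result; raise only when enforcing a denial."""
    if allowed:
        return "allowed"
    if enforce:
        # Enforcement mode: abort the call on the caller's stack.
        raise PolicyDenied(url)
    # Observation mode: record the denial and let the call proceed.
    logger.warning("would deny %s", url)
    return "denied"


shadow_result = apply_decision(
    "https://langchain.local/tool/shell_exec", allowed=False, enforce=False
)
```

In both modes the decision is still recorded as denied, which is what makes a shadow rollout comparable to the enforced policy.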

Constructing without Checkrd

If your application has its own engine lifecycle, initialize the global runtime once and pull the handler from it:

python
import checkrd
from checkrd.integrations.langchain import CheckrdCallbackHandler

checkrd.init(agent_id="01234567-89ab-cdef-0123-456789abcdef", api_key="ck_live_...")
handler = CheckrdCallbackHandler.from_global()
# ... use with LangChain ...

Or fully explicit (e.g. for tests where you supply your own engine):

python
handler = CheckrdCallbackHandler(
    engine=my_engine,
    agent_id="01234567-89ab-cdef-0123-456789abcdef",
    sink=my_sink,
    enforce=True,
    dashboard_url="https://app.checkrd.io",
)

Telemetry

When a TelemetrySink is configured, every event emits a structured record after completion (or on error). The record follows Checkrd's standard TelemetryEventInput wire schema — same shape every other Checkrd integration produces — so dashboards, alerts, and rollups query LangChain calls the same way they query vendor SDK calls:

  • request_id matches LangChain's run_id (so a chain run is one trace).
  • url_host is always langchain.local; url_path is /{kind}/{target} (e.g. /llm/gpt-4o, /tool/search_database).
  • method is POST; status_code is 200 on success, 500 on chain errors.
  • policy_result is allowed or denied.
  • For LLM events with usage data, gen_ai_model, gen_ai_input_tokens, and gen_ai_output_tokens are populated using OpenTelemetry GenAI semantic-convention names — so a single ClickHouse query rolls up tokens across vendor SDKs and LangChain steps.
  • latency_ms is the wall-clock latency from on_*_start to on_*_end.

trace_id and span_id are emitted as W3C Trace Context hex strings derived from LangChain's per-event UUID, so a single LangGraph workflow shows up in the dashboard as a single trace.
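For orientation, here is what a record with the fields listed above might look like. This is a sketch under stated assumptions: the exact derivation of trace_id and span_id from the UUID is not specified here, so the example uses the full 128-bit value for trace_id and the low 64 bits for span_id, which yields the 32-character and 16-character hex lengths that W3C Trace Context requires. The field values and the latency_ms type are illustrative, and the real TelemetryEventInput schema may carry more fields.

```python
import uuid


def trace_ids(run_id: uuid.UUID) -> tuple:
    # Assumption: trace_id is the whole UUID, span_id its low 64 bits.
    trace_id = run_id.hex        # 32 lowercase hex characters
    span_id = run_id.hex[16:]    # last 16 lowercase hex characters
    return trace_id, span_id


run_id = uuid.UUID("01234567-89ab-cdef-0123-456789abcdef")
trace_id, span_id = trace_ids(run_id)

# Illustrative record for a successful tool call.
record = {
    "request_id": str(run_id),
    "url_host": "langchain.local",
    "url_path": "/tool/search_database",
    "method": "POST",
    "status_code": 200,
    "policy_result": "allowed",
    "latency_ms": 12,
    "trace_id": trace_id,
    "span_id": span_id,
}
```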

Caveats

  • raise_error=True is required. The handler sets this on itself; do not override. Without it, LangChain swallows handler exceptions and the deny decision is lost.
  • Token counts depend on the LLM provider. Anthropic and OpenAI populate them reliably; some local models do not.
  • Streaming: on_llm_new_token is not currently policy-evaluated, because per-token gating would raise the evaluation call rate by roughly 100x. A streamed call is gated only at its boundaries, via on_llm_start and on_llm_end.
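The raise_error caveat is easier to see with a simplified model of a callback dispatcher. The sketch below is a stand-in written for illustration, not LangChain's actual code; Denied, Handler, and handle_event are hypothetical names. It logs a handler exception and re-raises it only when the handler's raise_error flag is set.

```python
import logging

logger = logging.getLogger("dispatch.sketch")


class Denied(Exception):
    """Stand-in for a policy denial."""


class Handler:
    def __init__(self, raise_error: bool) -> None:
        self.raise_error = raise_error

    def on_tool_start(self, name: str) -> None:
        # This handler denies every tool call.
        raise Denied(name)


def handle_event(handler: Handler, name: str) -> bool:
    """Return True if the tool call would proceed after the callback."""
    try:
        handler.on_tool_start(name)
    except Exception as exc:
        logger.warning("callback error: %r", exc)
        if handler.raise_error:
            raise  # the denial reaches the caller and aborts the call
    return True  # no exception, or exception swallowed: the call proceeds


proceeds_when_swallowed = handle_event(Handler(raise_error=False), "shell_exec")
```

With raise_error=False the denied tool call goes ahead, which is why the flag must stay enabled.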