feat: in-memory test runtime for deterministic E2E tests #1434

@rgbkrk

Summary

Add a synthetic test runtime that runs entirely in-memory inside the agent subprocess, with no real kernel process, no ZMQ, no environment pool, and no Python/conda dependency.

Motivation

E2E tests currently require:

  • Pool warming (90s sleep on cold CI)
  • Real Python/Deno kernel processes
  • Environment resolution (uv, conda, pixi)
  • Network (ZMQ sockets for kernel protocol)

This makes E2E tests slow, flaky, and coupled to environment infrastructure. With the agent-as-peer architecture (#1431, #1433), the agent subprocess owns the full CRDT machinery. A test runtime would replace the real kernel with an in-memory implementation that writes to RuntimeStateDoc directly.

Design

A fixture notebook declares runtime: test in its metadata. The agent detects this and uses an in-memory backend instead of spawning ipykernel:

  • Instant launch: Writes kernel_status: "idle" to RuntimeStateDoc immediately
  • Echo execution: On execute, produces text/plain output echoing the source (or canned responses)
  • Status transitions: queued → running → done with realistic timing (configurable delay)
  • No external dependencies: No Python, no ZMQ, no pool, no env resolution
  • Full CRDT path: Outputs flow through RuntimeStateDoc sync like real execution
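The behaviors listed above can be sketched with the standard library alone. This is a minimal illustration, not the proposed implementation: `FakeStateDoc` is a hypothetical stand-in for RuntimeStateDoc, and `EchoRuntime`, `ExecStatus`, and the `delay` field are names invented for this example.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Execution lifecycle from the design list: queued → running → done.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExecStatus {
    Queued,
    Running,
    Done,
}

/// Hypothetical stand-in for RuntimeStateDoc (not the real type).
#[derive(Default)]
struct FakeStateDoc {
    kernel_status: String,
    /// Every status an execution has passed through, in order.
    history: HashMap<String, Vec<ExecStatus>>,
    /// text/plain outputs per execution.
    outputs: HashMap<String, Vec<String>>,
}

struct EchoRuntime {
    doc: FakeStateDoc,
    /// Configurable pause between "running" and "done".
    delay: Duration,
}

impl EchoRuntime {
    /// Instant launch: the kernel is reported idle without spawning anything.
    fn launch(delay: Duration) -> Self {
        let doc = FakeStateDoc {
            kernel_status: "idle".to_string(),
            ..Default::default()
        };
        Self { doc, delay }
    }

    /// Echo execution: the output is the cell source itself.
    fn execute(&mut self, execution_id: &str, source: &str) {
        self.record(execution_id, ExecStatus::Queued);
        self.record(execution_id, ExecStatus::Running);
        std::thread::sleep(self.delay);
        self.doc
            .outputs
            .entry(execution_id.to_string())
            .or_default()
            .push(source.to_string());
        self.record(execution_id, ExecStatus::Done);
    }

    fn record(&mut self, execution_id: &str, status: ExecStatus) {
        self.doc
            .history
            .entry(execution_id.to_string())
            .or_default()
            .push(status);
    }
}
```

A test can launch with `Duration::ZERO` for speed, or with a longer delay when the UI needs time to show the running state.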

What it tests

  • Frontend rendering pipeline (cells, outputs, toolbar status)
  • Automerge sync (NotebookDoc + RuntimeStateDoc)
  • Agent subprocess lifecycle (spawn, connect, handshake, sync)
  • Execution queue ordering (seq numbers, idempotency)
  • UI interactions (run button, run all, interrupt)
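As one reading of "seq numbers, idempotency", the sketch below assigns a monotonically increasing sequence number to each new execution and ignores a repeated request for the same execution ID. The issue does not specify these semantics, so `ExecutionQueue`, `enqueue`, and `pop_next` are hypothetical names and the behavior is an assumption.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical queue: seq numbers fix the order, execution IDs deduplicate.
#[derive(Default)]
struct ExecutionQueue {
    next_seq: u64,
    seen: HashSet<String>,
    pending: BTreeMap<u64, String>,
}

impl ExecutionQueue {
    /// Returns the assigned seq for a new execution, or None if this
    /// execution_id was already enqueued (assumed idempotency rule).
    fn enqueue(&mut self, execution_id: &str) -> Option<u64> {
        if !self.seen.insert(execution_id.to_string()) {
            return None;
        }
        let seq = self.next_seq;
        self.next_seq += 1;
        self.pending.insert(seq, execution_id.to_string());
        Some(seq)
    }

    /// Pops the pending execution with the lowest seq number.
    fn pop_next(&mut self) -> Option<String> {
        self.pending.pop_first().map(|(_, id)| id)
    }
}
```

With a deterministic runtime behind it, an E2E test can assert the exact pop order after a "run all" without waiting on a real kernel.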

What it doesn't test

  • Real kernel behavior (Python semantics, package installation)
  • Environment resolution (uv, conda, pool lifecycle)
  • ZMQ kernel protocol

Implementation sketch

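// In agent.rs or a new test_runtime.rs
// Assumed imports: `.write()` without unwrap suggests a parking_lot-style lock,
// and `broadcast` suggests tokio. Neither is confirmed by this issue.
use std::sync::Arc;
use parking_lot::RwLock;
use tokio::sync::broadcast;

struct TestRuntime {
    state_doc: Arc<RwLock<RuntimeStateDoc>>,
    state_changed_tx: broadcast::Sender<()>,
}

impl TestRuntime {
    fn execute(&self, _cell_id: &str, source: &str, execution_id: &str) {
        {
            // Write running → done with the echoed output (the queued write is
            // not shown here). The scope releases the lock before notifying.
            let mut sd = self.state_doc.write();
            sd.set_execution_running(execution_id);
            sd.append_output(execution_id, &create_text_output(source));
            sd.set_execution_done(execution_id, true);
        }
        // A send error only means no receiver is subscribed, so it is ignored.
        let _ = self.state_changed_tx.send(());
    }
}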

The agent's LaunchKernel handler checks kernel_type == "test" and creates a TestRuntime instead of a RoomKernel.
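The dispatch described above might look like the following. This is a sketch under stated assumptions: the `RuntimeBackend` trait, the `select_backend` function, and the `name` method are hypothetical, and the unit structs `TestRuntime` and `RoomKernel` are empty placeholders rather than the real types.

```rust
/// Hypothetical common interface for whatever the agent launches.
trait RuntimeBackend {
    fn name(&self) -> &'static str;
}

/// Placeholder for the in-memory test runtime.
struct TestRuntime;

/// Placeholder for the real kernel wrapper.
struct RoomKernel;

impl RuntimeBackend for TestRuntime {
    fn name(&self) -> &'static str {
        "test-runtime"
    }
}

impl RuntimeBackend for RoomKernel {
    fn name(&self) -> &'static str {
        "room-kernel"
    }
}

/// Chooses the backend from the kernel type, as the LaunchKernel handler would.
fn select_backend(kernel_type: &str) -> Box<dyn RuntimeBackend> {
    match kernel_type {
        "test" => Box::new(TestRuntime),
        // Every other kernel type keeps the existing real-kernel path.
        _ => Box::new(RoomKernel),
    }
}
```

Keeping the branch in one place means the rest of the agent does not need to know which backend is running.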
