LlamaIndex

Trace LlamaIndex pipelines (LLM calls, embeddings, retriever queries, and end-to-end query engine runs) by calling AxonPushLlamaIndexHandler from your pipeline code.

Installation

npm install @axonpush/sdk llamaindex

Setup

import { AxonPush } from "@axonpush/sdk";
import { AxonPushLlamaIndexHandler } from "@axonpush/sdk/integrations/llamaindex";

const client = new AxonPush({
  apiKey: process.env.AXONPUSH_API_KEY!,
  tenantId: process.env.AXONPUSH_TENANT_ID!,
});

const handler = new AxonPushLlamaIndexHandler({
  client,
  channelId: 1,
  agentId: "my-agent",
});
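
The non-null assertions (`!`) in the client config satisfy the compiler but check nothing at runtime, so a missing variable only surfaces later as an authentication failure. A minimal sketch of a fail-fast lookup follows. `requireEnv` and the `Env` type are hypothetical names introduced here for illustration and are not part of the AxonPush SDK:

```typescript
// Hypothetical helper, NOT part of @axonpush/sdk: read a required setting
// from an environment-like record and fail fast with a clear message.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  // Treat both unset and blank values as missing.
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

With this in place, the client config could read `apiKey: requireEnv(process.env, "AXONPUSH_API_KEY")` instead of using `!`.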

Usage

The handler exposes plain instance methods that you call at the boundaries you care about. It does not hook into LlamaIndex’s callbackManager automatically, so call it wherever you already orchestrate retrieval and LLM calls:

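import { Document, VectorStoreIndex } from "llamaindex";

// Build a small index and a retriever to trace (placeholder text; use your own documents).
const index = await VectorStoreIndex.fromDocuments([
  new Document({ text: "Replace this with your own content." }),
]);
const retriever = index.asRetriever();

// Retrieval: report the query, then the number of nodes returned.
handler.onRetrieverStart("what is axonpush?");
const nodes = await retriever.retrieve("what is axonpush?");
handler.onRetrieverEnd(nodes.length);

// LLM call: bracket the query engine run with the start and end hooks.
handler.onLLMStart("gpt-4o", 1);
const response = await index.asQueryEngine().query("What is AxonPush?");
handler.onLLMEnd(response);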
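
Because the hooks are called manually, it can help to keep each start/end pair next to the operation it brackets. The sketch below wraps any retrieval function between `onRetrieverStart` and `onRetrieverEnd`. `RetrieverHooks` and `tracedRetrieve` are hypothetical names, and the hook return types are an assumption, since only the call shapes shown in the snippet are known:

```typescript
// Structural type covering only the two handler methods used in the snippet.
// The return types are an assumption; the real handler may differ.
interface RetrieverHooks {
  onRetrieverStart(query: string): unknown;
  onRetrieverEnd(resultCount: number): unknown;
}

// Hypothetical helper: run a retrieval function between the start/end hooks.
// If `retrieve` rejects, the end hook is not called and the error propagates.
async function tracedRetrieve<T>(
  hooks: RetrieverHooks,
  retrieve: (query: string) => Promise<T[]>,
  query: string,
): Promise<T[]> {
  hooks.onRetrieverStart(query);
  const nodes = await retrieve(query);
  hooks.onRetrieverEnd(nodes.length);
  return nodes;
}
```

A call might then look like `await tracedRetrieve(handler, (q) => retriever.retrieve(q), "what is axonpush?")`, assuming the handler's methods match the shapes above.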

For streaming, call handler.onLLMStream(token) per token.
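
The per-token call can be sketched as a small loop over a token stream. `TokenHook` and `collectStream` are hypothetical names, and the stream is modeled as a generic `AsyncIterable<string>`; how you obtain tokens from your LLM client depends on the client and is not shown here:

```typescript
// Structural type for the single streaming method named above.
// The return type is an assumption.
interface TokenHook {
  onLLMStream(token: string): unknown;
}

// Hypothetical helper: forward each token to the handler while
// accumulating the full response text.
async function collectStream(
  hook: TokenHook,
  tokens: AsyncIterable<string>,
): Promise<string> {
  let text = "";
  for await (const token of tokens) {
    hook.onLLMStream(token); // one call per token, in arrival order
    text += token;
  }
  return text;
}
```

Returning the accumulated text keeps the complete response available once the stream ends, for example to pass on to `handler.onLLMEnd`.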

Events Traced

Event	When
`llm.start`	An LLM call begins
`llm.end`	An LLM call completes
`llm.token`	A streaming token is received
`embedding.start`	An embedding request begins
`embedding.end`	An embedding request completes
`retriever.query`	A retriever query is issued
`retriever.result`	A retriever returns results
`query.start`	A query engine run begins
`query.end`	A query engine run completes