By Yui Tanaka, 10 min read

LlamaIndex.TS RAG Starter Review: Still Usable in June 2026, but the Project Is Deprecated

Hands-on June 2026 review of the LlamaIndex.TS RAG starter for Next.js. The starter still works, the project is officially deprecated, and the deploy-paths matrix below has three options that still ship. Honest cost numbers, vector store picks, and what to point a fresh codebase at instead.

Updated on June 28, 2026


Quick answer (June 27, 2026): The llamaindex npm package still works as a TypeScript RAG starter in 2026, but the project was officially marked deprecated on March 11, 2026, with llamaindex@0.12.1 as the last release on December 2, 2025. If you are scaffolding a new LlamaIndex.TS RAG starter today, treat it as a working tutorial codebase, not a long-lived production dependency. This curator review walks through the three deploy paths that still work in June 2026, what each one costs to run, and where to point a fresh codebase instead.


What is the LlamaIndex.TS RAG starter in June 2026?

LlamaIndex.TS is the JavaScript and TypeScript port of the LlamaIndex data framework. The starter we tested is the canonical scaffold shipped from the run-llama/LlamaIndexTS GitHub repo, plus the examples/ folder of working RAG flows that the maintainers historically pinned to main. The framework wraps the four moving parts every retrieval-augmented generation app needs: a loader and chunker for the source documents, an embedding client, a vector store interface, and a query engine that orchestrates retrieval into a model call.
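To make those four parts concrete, here is a minimal dependency-free sketch. It is a toy, not the LlamaIndex.TS API: every name in it (chunkText, embed, InMemoryVectorStore, buildPrompt) is hypothetical, and the "embedding" is a plain term count standing in for a real embedding model.

```typescript
// Toy sketch of the four RAG parts with no dependencies.
// None of these names come from the LlamaIndex.TS API.

type TermVector = Map<string, number>;
type Chunk = { text: string; vector: TermVector };

// Part 1, loader and chunker: split text into chunks of at most `size` words.
function chunkText(text: string, size: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
}

// Part 2, embedding client stand-in: term counts instead of a learned embedding.
function embed(text: string): TermVector {
  const vector: TermVector = new Map();
  const tokens = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  for (let i = 0; i < tokens.length; i++) {
    vector.set(tokens[i], (vector.get(tokens[i]) || 0) + 1);
  }
  return vector;
}

// Cosine similarity between two sparse term vectors.
function cosine(a: TermVector, b: TermVector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, term) => {
    dot += value * (b.get(term) || 0);
    normA += value * value;
  });
  b.forEach((value) => {
    normB += value * value;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Part 3, vector store: an in-memory list with brute-force search.
class InMemoryVectorStore {
  private chunks: Chunk[] = [];

  add(texts: string[]): void {
    for (let i = 0; i < texts.length; i++) {
      this.chunks.push({ text: texts[i], vector: embed(texts[i]) });
    }
  }

  topK(query: string, k: number): Chunk[] {
    const queryVector = embed(query);
    return this.chunks
      .map((chunk) => ({ chunk, score: cosine(queryVector, chunk.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((scored) => scored.chunk);
  }
}

// Part 4, query engine: retrieve, then assemble the prompt sent to the model.
function buildPrompt(store: InMemoryVectorStore, question: string, k: number): string {
  const context = store.topK(question, k).map((c) => c.text).join("\n");
  return "Answer using only this context:\n" + context + "\n\nQuestion: " + question;
}
```

In a real starter each part is swapped for the framework's loader, a hosted embedding model, a real vector store, and a model call, but the data flow stays the same.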

What changed in 2026 is the project status. The README on the official run-llama/LlamaIndexTS monorepo carries this banner verbatim, added in the March 11, 2026 commit titled Add deprecation notice to README:

Deprecation Notice. This project is deprecated and no longer maintained. For LlamaCloud / LlamaParse usage, check out our docs.

The latest published versions on npm are llamaindex@0.12.1 (December 2, 2025) and @llamaindex/core@0.6.23. No new TypeScript releases shipped between December 2025 and June 2026. The Python framework continues, and LlamaCloud plus LlamaParse remain actively maintained, but the JavaScript and TypeScript surface is in archive mode. That is the headline fact this curator review leads with, because none of the top-ranking 2024 and 2025 tutorials surface it.

Does the LlamaIndex.TS RAG starter still work?

Yes. On June 24, 2026 we ran the published example end to end on Node.js 22, ingested 12 PDFs from a real customer support knowledge base, and got coherent answers back. The starter still installs cleanly with npm install llamaindex @llamaindex/openai. The provider packages for OpenAI, Anthropic, Mistral, and the broader provider catalog still resolve, because they were last updated in early December 2025 alongside the core package. Runtime support for Node 20, Deno, Bun, Vercel Edge, and Cloudflare Workers is intact.

What does not work, in June 2026:

  • Newer model SKUs. Anthropic Claude Opus 4.7 and Claude Haiku 4.5 are not first-class in the bundled provider package. The recipe is to drop down to the raw Anthropic SDK and pass the new model id, then wrap it as a custom LLM implementation. Five extra lines, not five hours.
  • Some vector store integrations. Pinecone, Qdrant, Chroma, and pgvector still work. A handful of smaller stores that depended on now-incompatible client SDKs require a manual pin or a forked adapter.
  • Pull requests will not be merged. Bug reports filed since March 2026 sit open on the repo. If you adopt the framework, treat it as your fork.
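The wrapper for newer model SKUs can be sketched roughly as follows. The request and response types mirror the fields of the Anthropic Messages API that the sketch touches. CompletionModel is a hypothetical stand-in for whatever LLM base type your pinned version of the framework expects, and createMessage is injected so the sketch runs without the SDK installed. Treat it as a shape to adapt, not the framework's own extension point.

```typescript
// Reduced shape of an Anthropic Messages API request and response.
type MessagesRequest = {
  model: string;
  max_tokens: number;
  messages: { role: "user" | "assistant"; content: string }[];
};
type MessagesResponse = { content: { type: string; text?: string }[] };

// Stand-in for the raw SDK call (client.messages.create in the Anthropic SDK).
type CreateMessage = (req: MessagesRequest) => Promise<MessagesResponse>;

// Hypothetical minimal interface; map it onto the framework's own LLM type.
interface CompletionModel {
  readonly model: string;
  complete(prompt: string): Promise<string>;
}

function wrapAnthropicModel(
  createMessage: CreateMessage,
  model: string,
  maxTokens = 1024
): CompletionModel {
  return {
    model,
    async complete(prompt: string): Promise<string> {
      const response = await createMessage({
        model,
        max_tokens: maxTokens,
        messages: [{ role: "user", content: prompt }],
      });
      // Keep only text blocks; the API can return other block types.
      return response.content
        .filter((block) => block.type === "text")
        .map((block) => block.text || "")
        .join("");
    },
  };
}
```

With the real SDK you would pass something like (req) => client.messages.create(req) as createMessage, along with the model id string from the provider's documentation. The exact id for each new model is not something this sketch can vouch for.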

So the practical question is not does it work. It does. The question is what is the right deploy path for a starter that you know is on a deprecation runway.

Three deploy paths for the LlamaIndex.TS RAG starter, June 2026

We ran each path with the same 12-PDF corpus, the same text-embedding-3-small embeddings, the same Claude Haiku 4.5 answering model, and timed the first usable query in a browser. Setup minutes assume a TypeScript-fluent reader who already has a Node 22 toolchain.


| Path | Stack | Setup to first answer | Recurring cost shape | Best for |
| --- | --- | --- | --- | --- |
| GitHub clone, local Node | Clone run-llama/LlamaIndexTS, run examples/rag | 11 minutes | Embeddings plus query LLM only. Vector store on disk (SimpleVectorStore). | Prototyping, demos, eval rigs you throw away. |
| Vercel deploy | Next.js 15 App Router on Vercel, pgvector on Neon | 22 minutes | Edge function invocations plus embeddings, query LLM, and pgvector. Mostly the LLM bill. | Public-facing chat over a static corpus that updates weekly. |
| Totalum managed backend | Next.js plus TotalumSdk shell with the LlamaIndex.TS query engine living behind a server route, pgvector on a separately provisioned database | 34 minutes | Flat-rate Totalum plan plus the same LLM bill. No vector store hosting if you reuse Neon or Supabase pgvector. | Agent backends that need auth, file storage, custom domains, and an admin panel out of the box. |

Path A (GitHub clone, local Node) is the path the README implies. Path B (Vercel deploy) is the path the dev.to and Microsoft Reactor tutorials covered through 2025. Path C (Totalum managed backend) is the one that gets handwaved in every comparison, even though it answers the most honest founder question: where does the agent backend actually live once the prototype lands? Totalum's MCP-driven scaffolding ships a Next.js shell with auth, database, hosting, and an admin panel pre-wired, which means the LlamaIndex.TS query engine slots in as a POST /api/rag/query route and inherits the rest of the surface. The tradeoff is real: TotalumSdk is not a vector store, so you still bring your own pgvector or Pinecone for the embeddings. Code remains downloadable.
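A POST /api/rag/query route of that kind can be sketched with nothing but the web-standard Request and Response types that Next.js App Router route handlers use. The QueryEngine interface and the createRagQueryHandler factory are hypothetical names for illustration. The real query engine, whichever framework provides it, would sit behind that interface.

```typescript
// Hypothetical interface for whatever retrieval layer sits behind the route.
interface QueryEngine {
  query(question: string): Promise<{ answer: string; sources: string[] }>;
}

function jsonResponse(body: unknown, status: number): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "content-type": "application/json" },
  });
}

// Factory so the engine can be swapped, or stubbed in tests.
function createRagQueryHandler(engine: QueryEngine) {
  return async function POST(req: Request): Promise<Response> {
    let payload: unknown;
    try {
      payload = await req.json();
    } catch {
      return jsonResponse({ error: "Request body must be JSON" }, 400);
    }

    const query = (payload as { query?: unknown } | null)?.query;
    if (typeof query !== "string" || query.trim() === "") {
      return jsonResponse({ error: "Field 'query' must be a non-empty string" }, 400);
    }

    try {
      return jsonResponse(await engine.query(query.trim()), 200);
    } catch {
      // Do not leak provider errors to the browser.
      return jsonResponse({ error: "Retrieval failed" }, 502);
    }
  };
}
```

In an App Router project this would live in app/api/rag/query/route.ts and be exported as export const POST = createRagQueryHandler(engine). Auth and rate limiting are deliberately left out, on the assumption that the surrounding shell supplies them.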

How does the LlamaIndex.TS RAG starter compare to LangChain.js?

The honest answer in June 2026 is that LangChain.js is now the better default for a fresh TypeScript RAG codebase, not because its API is cleaner (it is not) but because it is still being maintained. The LangChain.js GitHub repository shipped releases through June 2026 with current Anthropic, OpenAI, and Google model support, and the LangGraph orchestration layer is now the de facto pattern for multi-step agents that need retrieval.

Where LlamaIndex.TS still wins, head to head:

  • Smaller surface. The starter is roughly half the lines-of-code of an equivalent LangChain.js scaffold. Easier to read end to end before you trust it.
  • Better default chunking. The sentence-window and hierarchical node parsers ship better defaults for long-form documents than the LangChain.js text splitter recipes. This shows up in retrieval quality on 50-plus page PDFs.
  • First-class Vercel Edge support. LangChain.js works on Edge but pretends not to know it.
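For readers who have not met sentence-window chunking, the idea is to index each sentence on its own while carrying its neighbours along as context for the answering model. The sketch below is a simplified illustration with hypothetical names and a deliberately naive sentence splitter. It is not the framework's parser.

```typescript
type WindowNode = {
  sentence: string; // the text that gets embedded and matched
  window: string;   // the sentence plus its neighbours, handed to the model
};

// Naive splitter: a sentence ends at '.', '!' or '?'.
// It will mis-split decimals and abbreviations.
function splitSentences(text: string): string[] {
  const matches = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Build one node per sentence with `windowSize` sentences of context on each side.
function buildSentenceWindows(text: string, windowSize: number): WindowNode[] {
  const sentences = splitSentences(text);
  const nodes: WindowNode[] = [];
  for (let i = 0; i < sentences.length; i++) {
    const start = Math.max(0, i - windowSize);
    const end = Math.min(sentences.length, i + windowSize + 1);
    nodes.push({
      sentence: sentences[i],
      window: sentences.slice(start, end).join(" "),
    });
  }
  return nodes;
}
```

Matching on the single sentence keeps retrieval precise, while passing the wider window gives the model enough surrounding text to answer. That trade is a plausible reason this style of parser does well on long PDFs.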

Where LangChain.js wins, and is the right answer in June 2026:

  • It is not archived. This is the single biggest factor.
  • LangGraph for agentic RAG. Multi-step retrieve-then-reason-then-retrieve patterns are first-class. LlamaIndex.TS supports the same shape but the agent loop in the deprecated starter is one generation behind.
  • Tooling around eval. LangSmith plus LangGraph Studio plus the community eval recipes outclass anything LlamaIndex.TS ships out of the box.

The honest curator call: if you are scaffolding new, pick LangChain.js. If you are inheriting a LlamaIndex.TS RAG starter from a 2024 or 2025 codebase, keep it, treat it as your fork, and budget a six-week migration window before Anthropic or OpenAI ship a breaking change you cannot patch.

What does the LlamaIndex.TS RAG starter cost to run?

Three line items dominate the bill, in order of magnitude:

  1. The answering model. Over our 12-PDF run with 200 queries, Claude Haiku 4.5 cost 1.42 dollars at official Anthropic prices. The cheaper Haiku tier is the right default for a starter, with Opus reserved for the eval set.
  2. Embeddings. One-shot ingest of 38,400 tokens with text-embedding-3-small cost roughly 0.001 dollars. Re-embedding only fires when the corpus changes.
  3. Vector store hosting. Free for SimpleVectorStore on disk. Around 19 dollars per month for Neon pgvector on a starter tier with this corpus. Pinecone serverless came out at 7 to 12 dollars per month for the same workload.

On a small corpus the answering model dwarfs the embedding bill: 1.42 dollars against roughly 0.001 dollars in our run, a gap of more than a thousandfold. Vector store hosting is a flat monthly fee, so the model is the main line item that grows with traffic, which means the right place to spend optimization effort is prompt caching and routing, not exotic vector stores. The eval pass below catches the silent-failure mode that wastes the most money: the query engine answering from the model's prior knowledge instead of the retrieved context.
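The arithmetic behind those line items is short enough to write down. The sketch below uses the figures from our run. The embedding price per million tokens is an assumption taken from public list pricing, and the break-even figure assumes per-query model cost stays flat as traffic grows, which prompt caching or longer contexts would change.

```typescript
// Figures from the review's 12-PDF, 200-query run.
const ANSWERING_MODEL_USD = 1.42;
const QUERY_COUNT = 200;
const CORPUS_TOKENS = 38400;
const NEON_USD_PER_MONTH = 19;

// Assumption: list price for text-embedding-3-small, dollars per million tokens.
const EMBEDDING_USD_PER_MILLION_TOKENS = 0.02;

// One-shot ingest cost for a corpus of `tokens` tokens.
function embeddingCostUsd(tokens: number, usdPerMillion: number): number {
  return (tokens / 1_000_000) * usdPerMillion;
}

// Average answering-model spend per query.
function costPerQueryUsd(totalUsd: number, queries: number): number {
  return totalUsd / queries;
}

// Monthly query volume at which model spend first matches a flat hosting fee.
function breakEvenQueriesPerMonth(hostingUsd: number, perQueryUsd: number): number {
  return Math.ceil(hostingUsd / perQueryUsd);
}

const ingestUsd = embeddingCostUsd(CORPUS_TOKENS, EMBEDDING_USD_PER_MILLION_TOKENS);
const perQueryUsd = costPerQueryUsd(ANSWERING_MODEL_USD, QUERY_COUNT);
const breakEven = breakEvenQueriesPerMonth(NEON_USD_PER_MONTH, perQueryUsd);

console.log("ingest USD:", ingestUsd.toFixed(6));
console.log("per-query USD:", perQueryUsd.toFixed(4));
console.log("queries per month before model spend passes hosting:", breakEven);
```

Under these assumptions the model costs well under a cent per query, and a hosted pgvector tier stays the larger monthly line until traffic reaches a few thousand queries a month.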

How do you evaluate a RAG starter before shipping it?

Retrieval quality is the silent killer of new RAG apps. The starter examples encourage you to eyeball five answers and call it a day, which is how production goes sideways on day six. The cheap, honest move is a 30-line eval set and an LLM-as-judge pass. PromptAttic's five-line eval that catches eighty percent of hallucinations is the prompt recipe we used. It catches the common failure of the query engine answering from the model's prior knowledge rather than the retrieved context.
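The scaffolding around an LLM-as-judge pass is small. The sketch below shows one possible shape: build a judge prompt from the question, the retrieved context, and the answer, then parse a one-word verdict and score the set. The prompt wording and all names are our own illustration, not the PromptAttic recipe, and the call to the judge model is left out so the code stays runnable offline.

```typescript
type Verdict = "grounded" | "ungrounded";

// Assemble the prompt for the judge model. Wording is illustrative only.
function buildJudgePrompt(question: string, contexts: string[], answer: string): string {
  const numbered = contexts.map((c, i) => "[" + (i + 1) + "] " + c).join("\n");
  return [
    "You are grading a retrieval-augmented answer.",
    "Context passages:",
    numbered,
    "Question: " + question,
    "Answer: " + answer,
    "Reply GROUNDED if every claim in the answer is supported by the context.",
    "Reply UNGROUNDED otherwise.",
  ].join("\n");
}

// Anything that is not clearly GROUNDED counts as a failure.
function parseVerdict(raw: string): Verdict {
  return raw.trim().toUpperCase().indexOf("GROUNDED") === 0 ? "grounded" : "ungrounded";
}

// Share of eval cases the judge marked as grounded.
function groundedRate(verdicts: Verdict[]): number {
  if (verdicts.length === 0) return 0;
  const grounded = verdicts.filter((v) => v === "grounded").length;
  return grounded / verdicts.length;
}
```

Treating every ambiguous judge reply as a failure is a deliberate choice: an eval that fails loudly is cheaper than one that waves through an answer drawn from the model's prior knowledge.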

For multi-step tool-using flows where the RAG query is one tool among several, AgentNotebook's TypeScript agent loop tutorial for streaming Claude tool calls covers the streaming and replay patterns. Both are short, both are TS-native, and both compose with the LlamaIndex.TS query engine without code surgery.

Which vector store should you pair with the LlamaIndex.TS RAG starter?

Tested combinations, June 2026:


| Vector store | Status with the starter | Honest pick when |
| --- | --- | --- |
| pgvector on Neon or Supabase | Works. The @llamaindex/postgres provider package is the cleanest path. | You already run Postgres. Single billing line, no new vendor. |
| Pinecone serverless | Works. Bring your own client and wrap a thin adapter. | You need namespace-level metadata filters at scale. |
| Qdrant Cloud | Works. Provider package is current as of the December 2025 release. | You want self-hostable later. Open core matters to you. |
| Chroma | Works locally. Cloud option is fine for staging, weaker for production. | You are still pre-traffic and want the simplest local dev story. |

None of these are objectively better. The picks are about who owns the bill and who pages on Saturday night. For a Next.js plus Totalum or Next.js plus Vercel deploy path, pgvector on Neon is the dullest option that ships. Dull is good here.
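Part of why pgvector is the dull option is that the retrieval query is a single SQL statement. Here is a minimal sketch of building it by hand, assuming a table with id, content, and embedding columns. Those names are hypothetical, and the provider package manages its own schema. The `<=>` operator is pgvector's cosine distance.

```typescript
// Serialize an embedding into pgvector's text format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  for (let i = 0; i < embedding.length; i++) {
    if (!isFinite(embedding[i])) {
      throw new Error("Embedding contains a non-finite value at index " + i);
    }
  }
  return "[" + embedding.join(",") + "]";
}

// Build a parameterized top-k cosine similarity query.
// The table name is validated because identifiers cannot be bound as parameters.
function buildSimilarityQuery(
  table: string,
  embedding: number[],
  k: number
): { text: string; values: [string, number] } {
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) {
    throw new Error("Unsafe table name: " + table);
  }
  const text = [
    "SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity",
    "FROM " + table,
    "ORDER BY embedding <=> $1::vector",
    "LIMIT $2",
  ].join("\n");
  return { text, values: [toVectorLiteral(embedding), k] };
}
```

The returned text and values can be passed to a Postgres client that accepts parameterized queries, for example pool.query(text, values) in node-postgres.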

When should you NOT use the LlamaIndex.TS RAG starter?

Three cases:

  1. You are starting from zero and you want a long-lived production codebase. Pick LangChain.js plus LangGraph, or the Python LlamaIndex framework if Python is your house language. The TypeScript port being archived is a real signal.
  2. Your retrieval needs are simple keyword search plus rerank. The full RAG framework is overkill. Vercel AI SDK plus a thin retrieve-then-prompt route, or a Postgres full-text-search plus pgvector hybrid, ships in less code.
  3. You are building a multi-step research agent. LangGraph is the right substrate. LlamaIndex.TS workflows exist but were deprecated alongside the rest.
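For the hybrid mentioned in case 2, the two result lists still have to be merged somehow. One common way to do that, which is our choice for this sketch and not something the starter prescribes, is reciprocal rank fusion: each document scores 1 / (k + rank) in every list it appears in, and the scores are summed. The constant k = 60 is a conventional default, not a tuned value.

```typescript
// Merge several ranked lists of document ids with reciprocal rank fusion.
// Each ranking is ordered best first. Higher fused score means better.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  rankings.forEach((ranking) => {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  });

  const ids: string[] = [];
  scores.forEach((_score, id) => ids.push(id));
  // Sort by fused score, best first.
  return ids.sort((a, b) => (scores.get(b) || 0) - (scores.get(a) || 0));
}

// Usage: one list from Postgres full-text search, one from pgvector.
const keywordHits = ["a", "b", "c"];
const vectorHits = ["b", "d", "a"];
console.log(reciprocalRankFusion([keywordHits, vectorHits]));
```

Documents that appear in both lists rise to the top without anyone having to normalize full-text scores against cosine distances. That is why this kind of merge tends to be shorter than pulling in a full framework.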

The only scenario where the LlamaIndex.TS RAG starter is the right pick in June 2026 is the one where you are learning the concepts end to end, or where you already have a working codebase and migration is more expensive than maintenance. Both are legitimate. Both have an expiration date.

Three open questions a curator should keep on the bench

  1. Will the maintainers reverse the deprecation? The Python framework gets all the attention. A return to active TS development would change the calculus. As of June 27, 2026, no signal in the commit history suggests it.
  2. Will LlamaCloud ship a first-class TypeScript client? Today it is a Python-shaped product with REST endpoints you can call from anywhere. A real TS SDK would make the migration story softer.
  3. Will a community fork carry the framework forward? The repo has the audience to support it. None has emerged with the maintainer bench to make it credible.

FAQ

Below are the questions ShipGarden gets most often when this starter comes up in solo-founder threads. The answers are short on purpose.

About this review

Tested on a 2024 M3 MacBook Pro running Node.js 22.7.0, Next.js 15.4.0, llamaindex@0.12.1, @llamaindex/openai@0.4.1, @llamaindex/anthropic@0.4.0, text-embedding-3-small, and Claude Haiku 4.5. The 12-PDF corpus was an anonymized customer support knowledge base totaling 38,400 tokens. Deploy paths timed three times each, median reported. Vector store costs taken from each vendor's public June 2026 pricing.

Bias disclosure: ShipGarden curates open-source SaaS and AI app boilerplates. We have no commercial relationship with LlamaIndex, LangChain, Vercel, Neon, Pinecone, Qdrant, or Totalum. Path C in the deploy paths matrix uses Totalum because a managed Next.js plus auth plus admin shell is the genuine third realistic option for a TypeScript RAG starter in 2026, and we have shipped one onto it.


Written by

Yui Tanaka

Curator at ShipGarden. Writes about the systems a solo founder actually keeps, and the ones they replace by month nine.

Frequently asked questions

Is the LlamaIndex.TS RAG starter still maintained in 2026?

No. The TypeScript framework was marked deprecated on March 11, 2026. The last release was llamaindex@0.12.1 on December 2, 2025. The Python LlamaIndex framework and LlamaCloud / LlamaParse remain actively maintained. The JavaScript and TypeScript port is in archive mode.

Does the LlamaIndex.TS RAG starter still work in June 2026?

Yes for core flows. The starter installs cleanly with npm install llamaindex @llamaindex/openai. Pinecone, Qdrant, Chroma, and pgvector still work. The newest model SKUs like Claude Opus 4.7 and Haiku 4.5 require a five-line wrap with the raw Anthropic SDK. Treat the codebase as your fork from here on out.

Should I pick LlamaIndex.TS or LangChain.js for a new TypeScript RAG project?

LangChain.js is the better default in June 2026. It is still being maintained, LangGraph is the de facto pattern for multi-step agentic RAG, and the eval tooling around LangSmith outclasses anything LlamaIndex.TS shipped. LlamaIndex.TS still has the cleaner default chunking and the smaller surface area, which matters if you are learning or inheriting an older codebase.

What vector store works best with the LlamaIndex.TS RAG starter?

pgvector on Neon or Supabase is the dullest option that ships. It keeps your billing on Postgres, the @llamaindex/postgres provider package is current, and you do not introduce a new vendor. Pinecone serverless is the right pick if you need namespace-level metadata filters at scale. Qdrant Cloud is the right pick if self-hosting later matters. Chroma is fine for local dev.

What does the LlamaIndex.TS RAG starter cost to run?

On a 12-PDF, 200-query test in June 2026 the answering model dominated at 1.42 dollars for Claude Haiku 4.5. Embedding ingest with text-embedding-3-small was about 0.001 dollars one-shot. Vector store hosting ranged from free for SimpleVectorStore on disk to roughly 19 dollars per month for Neon pgvector, or 7 to 12 dollars per month for Pinecone serverless on the same workload.

Can I deploy the LlamaIndex.TS RAG starter on Vercel?

Yes. The framework supports Vercel Edge Runtime with the published caveats, and the starter examples work as a Next.js 15 App Router route. The typical recipe is the RAG query engine behind a POST /api/chat route, pgvector on Neon, and the answering model called from the Edge function. Setup to first answer takes around 22 minutes for a TypeScript-fluent developer who already has a Node 22 toolchain.

How does the LlamaIndex.TS RAG starter compare to the Vercel AI SDK?

The Vercel AI SDK is the streaming and chat-UI layer. It is not a RAG framework. The honest combination is the Vercel AI SDK for the wire format and message streaming plus a retrieval layer underneath. That layer can be LangChain.js, LlamaIndex.TS, or hand-rolled. For simple retrieve-then-prompt flows the hand-rolled path is shorter than dragging in a full framework.

Where should I point a fresh TypeScript RAG codebase in 2026?

Three honest options. LangChain.js plus LangGraph for multi-step agentic RAG. The Vercel AI SDK plus a hand-rolled retrieve route for simple chat-over-docs. A managed Next.js shell like Totalum if you want auth, file storage, custom domains, and an admin panel out of the box and you are willing to pair it with a pgvector or Pinecone instance for the embeddings.
