12/15/2025 11:32 AM

Building AI-Native Applications with Payload CMS and the Vercel AI SDK

5 minutes

A technical breakdown of how we use Payload CMS and the Vercel AI SDK to build AI-native FinSureTech applications at InnoPeak—from prompt management and structured outputs to embeddings, background AI workflows, and production-grade introspection for advisory tools.

Ravi

CTO

Building AI features that hold up in production is mostly an architecture problem.

Once you move beyond demos, you need a way to manage prompts outside of code, run long-running AI jobs reliably, store and query embeddings, enforce structured outputs, and observe what your agents are actually doing over time. This becomes even more critical in FinSureTech, where AI systems support consultants and decision-making workflows—not just end users chatting with a model.

In this post, I’ll break down how we use Payload CMS and the Vercel AI SDK to build AI-native applications at InnoPeak. The focus is on concrete patterns: prompt and schema management in Payload, background AI workflows, embeddings stored directly in the database, structured generation and tools via the Vercel AI SDK, and full introspection of AI usage and messages to support testing and iteration.

Why Payload CMS

Payload is commonly seen as a CMS for Next.js, but it’s more accurate to think of it as a backend framework with a built-in admin UI.

It integrates directly with the Next.js App Router and gives us a single place to manage prompts, agent configuration, outputs, and long-running AI jobs via tasks and workflows. That admin layer is critical for operating and testing AI features outside of code.

Because Payload is schema-first, our data models power the admin UI, APIs, and database. We can access the same data through the local API or directly via Drizzle using generated, fully typed schemas—no duplication, no glue code.

For AI-native apps, that combination of structure, observability, and flexibility is hard to beat.

Prompt & Model Management

Prompts are a core part of modern AI applications. Most real-world use cases rely on general-purpose LLMs, and the quality of the output depends heavily on how well they’re instructed—both technically (output format, constraints) and from a business perspective (rules, decision logic).

Hard-coding prompts and model choices quickly becomes a bottleneck. Business logic changes, models evolve, and the right trade-off between cost, speed, and quality usually only becomes clear after real usage.

This is where Payload globals work well. They let us centralize prompt and model configuration and manage it through an admin UI, without redeploying code.

As an example, we’re building a tool that helps advisors generate recommendations for additional health insurance packages (Zusatzversicherungen) based on client preferences. For this, we define an Ai global in Payload that stores:

  • System and user prompts as Handlebars templates
  • The model selection via a controlled select field, limited to supported models
[Screenshot: the system and user prompt fields of the Ai global in the Payload admin]

This gives non-developers a safe way to update prompts and switch models while keeping the application logic stable.

ai.ts
import type { GlobalConfig } from "payload";

export const Ai: GlobalConfig = {
  slug: "ai",
  fields: [
    {
      name: "modelId",
      type: "select",
      options: [
        "ministral-3b-latest",
        "ministral-8b-latest",
        "mistral-large-latest",
        "mistral-medium-latest",
        "mistral-medium-2508",
        "mistral-medium-2505",
        "mistral-small-latest",
        "pixtral-large-latest",
        "magistral-small-2507",
        "magistral-medium-2507",
        "magistral-small-2506",
        "magistral-medium-2506",
        "pixtral-12b-2409",
        "open-mistral-7b",
        "open-mixtral-8x7b",
        "open-mixtral-8x22b",
      ],
      enumName: "mistral_model_id",
      defaultValue: "mistral-medium-latest",
      required: true,
    },
    {
      name: "healthInsuranceRecommendations",
      type: "group",
      fields: [
        {
          name: "systemPrompt",
          type: "textarea",
          defaultValue: ``,
          required: true,
        },
        {
          name: "userPrompt",
          type: "textarea",
          defaultValue: ``,
          required: true,
        },
      ],
    },
  ],
};
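
At request time we read this global through Payload's local API and render the Handlebars templates before calling the model. Here's a minimal sketch; the helper name, the template variables, and the @payload-config import alias are ours, not something Payload provides:

prompts.ts
import { getPayload } from "payload";
import Handlebars from "handlebars";
import config from "@payload-config";

// Build the system/user prompts for the recommendation flow from the "ai" global
export async function buildRecommendationPrompts(preferences: string) {
  const payload = await getPayload({ config });
  const ai = await payload.findGlobal({ slug: "ai" });

  // Editors maintain these templates in the admin UI; we only fill in the variables
  const system = Handlebars.compile(ai.healthInsuranceRecommendations.systemPrompt)({});
  const user = Handlebars.compile(ai.healthInsuranceRecommendations.userPrompt)({
    preferences,
  });

  return { modelId: ai.modelId, system, user };
}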

Visualizing JSON Schemas

Testing prompts and ensuring reliable outputs is a big part of building AI apps—especially when using structured output modes like the Vercel AI SDK’s JSON schemas.

By visualizing the schema in Payload’s admin interface, developers can copy it directly into test chats or local LLM environments to validate outputs before running full workflows.

[Screenshot: the JSON Schema field rendered in the Payload admin]

We do this with a custom Payload UI field that renders our component and passes the schema as props:

{
  name: "jsonSchema",
  type: "ui",
  label: "JSON Schema",
  admin: {
    components: {
      Field: "@/components/admin/ai/json-schema",
    },
    custom: {
      jsonSchema: z.toJSONSchema(comparisonTableSchema),
    },
  },
}
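
The component itself just pretty-prints the schema. Here's a rough sketch: the exact props Payload passes to custom UI field components (and where admin.custom ends up) depend on your Payload version, so treat the prop shape as an assumption:

json-schema.tsx
import React from "react";

type Props = {
  // Assumption: Payload forwards the field config, including admin.custom, to the component
  field?: { admin?: { custom?: { jsonSchema?: unknown } } };
};

export default function JsonSchemaField({ field }: Props) {
  const schema = field?.admin?.custom?.jsonSchema;
  if (!schema) return null;

  // Render the schema so it can be copied into test chats or local LLM tools
  return (
    <pre style={{ whiteSpace: "pre-wrap", fontSize: "0.8rem" }}>
      {JSON.stringify(schema, null, 2)}
    </pre>
  );
}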

Background Jobs

AI apps often need background jobs—tasks that take time or might require retries due to missing data or unexpected outputs. Running these in serverless environments like Vercel can be tricky, since there’s no long-running instance to queue and execute workers.

Payload’s Jobs Queue solves this neatly. You can define tasks—or compose multiple tasks into workflows—and Payload handles orchestration, retries, and scheduling. This is useful for embedding generation, document scanning, or any workflow where instant feedback isn’t required.
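
As a rough illustration, a task definition looks something like this; the task slug, input shape, collection slug, and generateEmbedding helper below are placeholders, not from our codebase:

payload.config.ts
// Sketch of a jobs-queue task (names and input shape are illustrative)
jobs: {
  tasks: [
    {
      slug: "generate-package-embedding",
      retries: 2,
      inputSchema: [{ name: "packageId", type: "number", required: true }],
      handler: async ({ input, req }) => {
        const doc = await req.payload.findByID({
          collection: "additional-health-insurance-packages",
          id: input.packageId,
        });
        // Generate and persist the embedding (see the embeddings section below)
        await generateEmbedding(req.payload, doc);
        return { output: {} };
      },
    },
  ],
},

Queueing a job from anywhere in the app is then a single call:

await payload.jobs.queue({
  task: "generate-package-embedding",
  input: { packageId: 42 },
});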

For serverless setups, Payload can be triggered via Vercel CRON jobs or any external orchestrator (like a Kubernetes CronJob). For Docker deployments using Next.js standalone mode, Payload’s autoRun with a CRON schedule works out of the box.

This is just a quick shout-out: Payload makes managing AI background tasks straightforward, so you can focus on your workflows rather than building queueing infrastructure from scratch.

Store Embeddings in Payload DB

AI-native apps often rely on retrieval-augmented generation (RAG), which requires storing embeddings alongside your data. With pgvector, this enables semantic search (optionally combined with Postgres full-text search) to provide relevant context to your agents.

Payload doesn’t provide built-in support for embedding columns or vector indexes, so we add them using beforeSchemaInit and afterSchemaInit hooks:

payload.config.ts
db: postgresAdapter({
  beforeSchemaInit: [
    ({ schema, adapter }) => {
      // Add a raw pgvector column so it ends up in the generated Drizzle schema
      adapter.rawTables.additional_health_insurance_packages.columns.embedding = {
        name: "embedding",
        type: "vector",
        dimensions: 1024,
      };
      return schema;
    },
  ],
  afterSchemaInit: [
    ({ schema, extendTable }) => {
      // `index` and `sql` come from Payload's re-exported Drizzle packages (see the note below)
      extendTable({
        table: schema.tables.additional_health_insurance_packages,
        extraConfig: (table) => ({
          l2_index: index("l2_index").using("hnsw", table.embedding.op("vector_l2_ops")),
          ip_index: index("ip_index").using("hnsw", table.embedding.op("vector_ip_ops")),
          cosine_index: index("cosine_index").using("hnsw", table.embedding.op("vector_cosine_ops")),
          ts_index: index("ts_index").using("gin", sql`to_tsvector('english', ${table.embeddingText})`),
        }),
      });
      return schema;
    },
  ],
})

We add the embedding column in beforeSchemaInit so that payload generate:db-schema includes it in the generated Drizzle schema. The indexes don't need to appear in that schema, so we create them in afterSchemaInit, where extendTable gives us direct access to the table config.
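
Writing the embeddings is just as direct: we embed the relevant text with the AI SDK and update the column through Drizzle. Here's a sketch, assuming the generated schema is exported from a payload-generated-schema module, an embedPackage helper of our own, and embeddingText as the text we index per package:

embed-package.ts
import type { Payload } from "payload";
import { embed } from "ai";
import { mistral } from "@ai-sdk/mistral";
import { eq } from "@payloadcms/db-postgres/drizzle";
import { additional_health_insurance_packages } from "@/payload-generated-schema";

export async function embedPackage(payload: Payload, packageId: number, embeddingText: string) {
  // 1. Embed the text that describes this package
  const { embedding } = await embed({
    model: mistral.embedding("mistral-embed"),
    value: embeddingText,
  });

  // 2. Write the vector (and the indexed text) back to the custom columns
  await payload.db.drizzle
    .update(additional_health_insurance_packages)
    .set({ embedding, embeddingText })
    .where(eq(additional_health_insurance_packages.id, packageId));
}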

Once set up, we can perform vector similarity searches in Drizzle queries, API routes, server actions, or Payload tasks/workflows:

route.ts
// 1. Embed the query string
const { embedding } = await embed({
  model: mistral.embedding("mistral-embed"),
  value: preferences,
});

// 2. Query top K most similar packages
const base = payload.db.drizzle
  .select({
    id: additional_health_insurance_packages.id,
    similarity: cosineDistance(
      additional_health_insurance_packages.embedding,
      embedding
    ).as("similarity"),
    // The regconfig ('english') must match the ts_index expression for the GIN index to be used
    ts_rank: sql<number>`
      ts_rank(
        to_tsvector('english', ${additional_health_insurance_packages.embeddingText}),
        plainto_tsquery('english', ${preferences})
      )
    `.as("ts_rank"),
  })
  .from(additional_health_insurance_packages)
  .as("base");

const results = await payload.db.drizzle
  .select({
    id: base.id,
    similarity: base.similarity,
    ts_rank: base.ts_rank,
    score: sql<number>`${base.ts_rank}`,
  })
  .from(base)
  .orderBy((t) => desc(t.score))
  .limit(100);

Note: import cosineDistance, desc, and sql from @payloadcms/db-postgres/drizzle rather than from drizzle-orm to avoid version mismatches with Payload's generated schema.

This approach keeps embeddings, indexes, and queries fully typed and integrated, letting us run semantic search seamlessly inside Payload while using Drizzle for low-level control.

Tracking Token Usage & Messages

For both SaaS and internal apps, tracking token usage is important for cost control. Tracking messages exchanged with AI agents is equally valuable for debugging and optimization—especially when prompts are dynamically generated by user-driven templates. Seeing the final prompts and outputs helps refine workflows and ensure reliability.

In our projects with the Vercel AI SDK, we created a TokenUsage collection modeled on the SDK’s types. It stores both token usage and the full messages exchanged:

ai.ts
import type { CollectionBeforeChangeHook, CollectionConfig, Field } from "payload";
// isAdmin, belongsToUser, modelId, messageFields, and ITokenUsage are project-level helpers/types

const usageFields: Field[] = [
  {
    name: "inputTokens",
    type: "number",
  },
  {
    name: "outputTokens",
    type: "number",
  },
  {
    name: "totalTokens",
    type: "number",
  },
  {
    name: "reasoningTokens",
    type: "number",
  },
  {
    name: "cachedInputTokens",
    type: "number",
  },
];

export const TokenUsage: CollectionConfig = {
  slug: "token-usages",
  access: {
    ...isAdmin,
    create: () => true,
    read: belongsToUser.read,
  },
  fields: [
    {
      name: "type",
      type: "select",
      options: ["completions", "embedding"],
      required: true,
      defaultValue: "completions",
    },
    {
      ...modelId,
      options: [...modelId.options, "mistral-embed"],
      required: true,
      enumName: "enum_token_usages_model_id",
    },
    {
      name: "usage",
      type: "group",
      fields: [
        {
          type: "row",
          fields: usageFields,
        },
      ],
    },
    {
      name: "totalUsage",
      type: "group",
      fields: [
        {
          type: "row",
          fields: usageFields,
        },
      ],
    },
    {
      name: "embeddingUsage",
      type: "group",
      fields: [
        {
          name: "tokens",
          type: "number",
        },
      ],
    },
    {
      name: "owner",
      type: "relationship",
      relationTo: [
        "chats",
        "comparisons",
        "insurance-policy-offers",
        "insurance-providers",
        "recommendations",
      ],
    },
    {
      name: "messages",
      type: "array",
      fields: messageFields,
    },
    {
      name: "createdBy",
      type: "relationship",
      relationTo: "users",
      admin: {
        readOnly: true,
      },
    },
  ],
  hooks: {
    beforeChange: [
      (async ({ data, operation, req: { user } }) => {
        if (operation === "create") {
          if (user) {
            data.createdBy = user.id;
          }
        }
        return data;
      }) satisfies CollectionBeforeChangeHook<ITokenUsage>,
    ],
  },
};

To track usage, including messages, we wrote a small helper. It takes the user, the owning document, and the AI SDK's response, reads all exchanged messages from it, and saves everything to the TokenUsage collection. That lets us inspect not only system and user prompts and AI responses, but also tool calls, directly in the admin panel:

[Screenshot: token usage and the full message exchange for a request in the Payload admin]
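
The helper itself is roughly shaped like this. Treat it as a sketch: the argument order, the loose typing of the onFinish result, and the field mapping are illustrative rather than a definitive implementation:

track-usage.ts
import type { Payload } from "payload";

// Minimal shape of what the AI SDK hands to onFinish; only the parts we persist
type UsageResult = {
  usage: unknown;
  totalUsage?: unknown;
  response: { messages: unknown[] };
};

export const trackUsage =
  (
    user: { id: number | string },
    payload: Payload,
    modelId: string,
    owner: { relationTo: string; value: number | string } | null = null,
    extraMessages: unknown[] = []
  ) =>
  async (result: UsageResult) => {
    // Persist token counts plus the full message history for later inspection
    await payload.create({
      collection: "token-usages",
      data: {
        type: "completions",
        modelId,
        usage: result.usage,
        totalUsage: result.totalUsage ?? result.usage,
        owner: owner ?? undefined,
        messages: [...extraMessages, ...result.response.messages],
        createdBy: user.id,
      },
    });
  };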

We integrate this helper with the AI SDK via the onFinish callback, or call it manually if additional tasks need to run after a request. With Vercel AI SDK v6, tracking usage for streaming and structured-output calls looks like this:

route.ts
// Track usage for streamText
const result = streamText({
  model: mistral(ai.chat.modelId),
  messages: convertToModelMessages(validatedMessages),
  onFinish: trackUsage(
    data!.user.payloadUser as User,
    payload,
    ai.chat.modelId,
    { relationTo: "chats", value: chat }
  ),
});

// Track usage for generateText with structured output
await generateText({
  model: getModel(ai, ai.healthInsurance.recommendations.classificationModelId),
  messages: classificationMessages,
  output: Output.object({ schema: preferenceOrPackagesSchema }),
  onFinish: trackUsage(
    user,
    payload,
    ai.healthInsurance.recommendations.classificationModelId,
    null,
    classificationMessages
  ),
});

This ensures that every generateText or streamText call is reliably logged, giving us actionable insights into AI behavior in production workflows.

Conclusion

This post highlighted some of the more powerful but less obvious features of Payload CMS—from the jobs queue for background workflows, to Drizzle schema hooks for custom fields and embeddings, and structured data management that extends beyond the admin panel.

At InnoPeak, we’re exploring how to leverage these capabilities to build smarter AI workflows for financial advisors, consultants, insurance brokers, and insurers. If you’re interested in collaborating or learning more about what we’re building, we’d love to hear from you.
