# Batch Onboarding: 5,000 Customers from Salesforce
End-to-end workflow for creating thousands of customer characters from a CRM export.
## Prerequisites

- A HippoDid account on Starter tier or above (batch creation requires a paid plan)
- An API key (`hd_key_...`)
- A CSV export from your CRM
- The Python SDK (`pip install hippodid`) or TypeScript SDK (`npm install @hippodid/sdk`)
## Overview

The batch onboarding workflow has six steps:

1. Export your customer data as CSV
2. Create a template that maps CSV columns to character fields
3. Dry run to validate before committing
4. Batch create to run the real import
5. Monitor the job until it completes
6. Verify the results
Each customer becomes a HippoDid character with its own memory namespace, ready to store and recall facts across AI interactions.
## Step 1: Export your data

Export a CSV from Salesforce, HubSpot, or any CRM. The file should have one row per customer.

Example `customers.csv`:

```csv
sf_id,name,email,company,plan,signup_date,notes
SF-001,Alice Chen,alice@acme.com,Acme Corp,enterprise,2024-01-15,Prefers email communication
SF-002,Bob Martinez,bob@widgets.io,Widgets Inc,starter,2024-03-22,Technical user - API integration
SF-003,Carol Davis,carol@bigco.com,BigCo,enterprise,2023-11-01,Key account - quarterly reviews
```
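Before uploading anything, it is worth checking that the export actually contains every column the template will map. A minimal sketch (pure Python, not part of the SDK; `check_columns` and the `REQUIRED` set are illustrative names):

```python
import csv
import io

# Columns the template in Step 2 expects to find in the export.
REQUIRED = {"sf_id", "name", "email", "company"}

sample = """sf_id,name,email,company,plan
SF-001,Alice Chen,alice@acme.com,Acme Corp,enterprise
SF-002,Bob Martinez,bob@widgets.io,Widgets Inc,starter
"""

def check_columns(csv_text: str) -> int:
    """Raise if required columns are missing; otherwise return the row count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"CSV is missing columns: {sorted(missing)}")
    return sum(1 for _ in reader)

print(check_columns(sample))  # 2
```

Running this against the real file (read it into a string first) catches a bad export before you spend a dry run on it.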
## Step 2: Create a template

A template defines how CSV columns map to character fields.

**Python**

```python
from hippodid import HippoDid

hd = HippoDid(api_key="hd_key_...")

template = hd.create_character_template(
    name="Salesforce Customers",
    description="Maps Salesforce contact export to customer characters",
    field_mappings=[
        {"sourceColumn": "sf_id", "targetField": "externalId"},
        {"sourceColumn": "name", "targetField": "name"},
        {"sourceColumn": "email", "targetField": "alias"},
        {"sourceColumn": "company", "targetField": "tag"},
    ],
)
print(f"Template created: {template.id}")
```
**TypeScript**

```typescript
import { HippoDid } from "@hippodid/sdk";

const hd = new HippoDid({ apiKey: "hd_key_..." });

const template = await hd.createCharacterTemplate({
  name: "Salesforce Customers",
  description: "Maps Salesforce contact export to customer characters",
  fieldMappings: [
    { sourceColumn: "sf_id", targetField: "externalId" },
    { sourceColumn: "name", targetField: "name" },
    { sourceColumn: "email", targetField: "alias" },
    { sourceColumn: "company", targetField: "tag" },
  ],
});
console.log(`Template created: ${template.id}`);
```
### Field mapping reference

| Target | Description |
|---|---|
| `externalId` | Unique external identifier (e.g., Salesforce ID). Used for conflict detection. |
| `name` | Character display name. |
| `alias` | Searchable alias for the character. |
| `tag` | Character tag for filtering and grouping. |
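To make the mapping concrete, here is a pure-Python sketch of what a template does to one CSV row. This is an illustration, not the SDK or server implementation; `apply_mappings` is a hypothetical helper:

```python
# The same mappings as the template above, as plain dicts.
FIELD_MAPPINGS = [
    {"sourceColumn": "sf_id", "targetField": "externalId"},
    {"sourceColumn": "name", "targetField": "name"},
    {"sourceColumn": "email", "targetField": "alias"},
    {"sourceColumn": "company", "targetField": "tag"},
]

def apply_mappings(row: dict, mappings: list) -> dict:
    """Copy each mapped source column into its target character field."""
    return {m["targetField"]: row[m["sourceColumn"]] for m in mappings}

row = {"sf_id": "SF-001", "name": "Alice Chen",
       "email": "alice@acme.com", "company": "Acme Corp",
       "plan": "enterprise", "signup_date": "2024-01-15",
       "notes": "Prefers email communication"}

print(apply_mappings(row, FIELD_MAPPINGS))
# {'externalId': 'SF-001', 'name': 'Alice Chen',
#  'alias': 'alice@acme.com', 'tag': 'Acme Corp'}
```

Note that unmapped columns (`plan`, `signup_date`, `notes`) do not become character fields.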
## Step 3: Dry run

Before creating real characters, validate your data with a dry run. This checks for conflicts, missing fields, and template errors without persisting anything.

The batch endpoint requires `multipart/form-data` with a CSV file upload. The SDKs handle this automatically: pass rows as a list of dicts (Python) or an array of objects (TypeScript), and the SDK converts them to CSV and uploads via multipart internally.
**Python**

```python
import csv

# Read the CSV into rows
with open("customers.csv") as f:
    rows = list(csv.DictReader(f))
print(f"Loaded {len(rows)} rows")

# Dry run — validates without creating
job = hd.batch_create_characters(
    template_id=template.id,
    data=rows,
    external_id_column="sf_id",
    on_conflict="SKIP",
    dry_run=True,
)

print(f"Status: {job.status}")
print(f"Total rows: {job.total_rows}")
print(f"Succeeded: {job.succeeded}")
print(f"Failed: {job.failed}")

if job.errors:
    for err in job.errors:
        print(f"  Row {err.row_index}: {err.message}")
```
**TypeScript**

```typescript
import { readFileSync } from "fs";
import { parse } from "csv-parse/sync";

const csv = readFileSync("customers.csv", "utf-8");
const rows = parse(csv, { columns: true });
console.log(`Loaded ${rows.length} rows`);

// Dry run — validates without creating
const job = await hd.batchCreateCharacters({
  templateId: template.id,
  rows,
  externalIdColumn: "sf_id",
  onConflict: "SKIP",
  dryRun: true,
});

console.log(`Status: ${job.status}`);
console.log(`Total rows: ${job.totalRows}`);
console.log(`Succeeded: ${job.succeeded}`);
console.log(`Failed: ${job.failed}`);

if (job.errors.length > 0) {
  for (const err of job.errors) {
    console.log(`  Row ${err.rowIndex}: ${err.message}`);
  }
}
```
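Since the point of the dry run is to gate the real import, a small check like the following keeps the script from proceeding on dirty data. The job result is modeled here as a plain dict so the sketch is self-contained; with the SDK you would read the same fields off the returned job object:

```python
def dry_run_is_clean(job: dict) -> bool:
    """True only if every row validated and no errors were reported."""
    return job["failed"] == 0 and not job["errors"]

clean = {"status": "COMPLETED", "total_rows": 3, "succeeded": 3,
         "failed": 0, "errors": []}
dirty = {"status": "COMPLETED", "total_rows": 3, "succeeded": 2,
         "failed": 1,
         "errors": [{"row_index": 2, "message": "missing email"}]}

print(dry_run_is_clean(clean))  # True
print(dry_run_is_clean(dirty))  # False
```

In a real script, raise or exit when the check fails, fix the CSV, and re-run the dry run before moving on to Step 4.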
## Step 4: Batch create

When the dry run looks clean, run the real import. Batch creation is asynchronous: the API returns a job ID immediately, and you poll for progress.

**Python**

```python
# Start the real batch job
job = hd.batch_create_characters(
    template_id=template.id,
    data=rows,
    external_id_column="sf_id",
    on_conflict="SKIP",
)
print(f"Job started: {job.job_id}, status: {job.status}")
```
**TypeScript**

```typescript
// Start the real batch job
const job = await hd.batchCreateCharacters({
  templateId: template.id,
  rows,
  externalIdColumn: "sf_id",
  onConflict: "SKIP",
});
console.log(`Job started: ${job.jobId}, status: ${job.status}`);
```
### Conflict strategies

| Strategy | Behavior |
|---|---|
| `ERROR` | Fail the row if a character with the same external ID exists. This is the default. |
| `SKIP` | Silently skip rows whose external ID already exists. Best for re-running imports. |
| `UPDATE` | Update the existing character with new data from the row. |
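The three strategies can be illustrated with a pure-Python merge over an in-memory index of existing characters. This is an illustration of the behavior, not the server's logic; `resolve` is a hypothetical helper:

```python
def resolve(existing: dict, row: dict, strategy: str) -> dict:
    """Apply one on_conflict strategy for a single incoming row."""
    ext_id = row["sf_id"]
    if ext_id not in existing:
        existing[ext_id] = row          # no conflict: always create
    elif strategy == "SKIP":
        pass                            # leave the existing character alone
    elif strategy == "UPDATE":
        existing[ext_id].update(row)    # overwrite with the new row's data
    else:                               # "ERROR" (the default)
        raise ValueError(f"duplicate external ID: {ext_id}")
    return existing

chars = {"SF-001": {"sf_id": "SF-001", "name": "Alice Chen"}}

resolve(chars, {"sf_id": "SF-001", "name": "Alice C."}, "SKIP")
print(chars["SF-001"]["name"])   # Alice Chen (unchanged)

resolve(chars, {"sf_id": "SF-001", "name": "Alice C."}, "UPDATE")
print(chars["SF-001"]["name"])   # Alice C.
```

`SKIP` is what makes re-runs idempotent: rows that already landed are ignored, so a second pass only fills in the gaps.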
## Step 5: Monitor job status

Poll the job endpoint until the status is `COMPLETED` or `FAILED`.

**Python**

```python
import time

while True:
    status = hd.get_batch_job_status(job.job_id)
    print(f"Status: {status.status} — "
          f"{status.succeeded + status.failed}/{status.total_rows}")
    if status.status in ("COMPLETED", "FAILED"):
        break
    time.sleep(2)

# Final summary
print(f"\nSucceeded: {status.succeeded}")
print(f"Skipped: {status.skipped}")
print(f"Failed: {status.failed}")
```
**TypeScript**

```typescript
let status;
do {
  await new Promise((r) => setTimeout(r, 2000));
  status = await hd.getBatchJobStatus(job.jobId);
  console.log(
    `Status: ${status.status} — ${status.succeeded + status.failed}/${status.totalRows}`
  );
} while (status.status !== "COMPLETED" && status.status !== "FAILED");

console.log(`\nSucceeded: ${status.succeeded}`);
console.log(`Skipped: ${status.skipped}`);
console.log(`Failed: ${status.failed}`);
```
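The loops above poll forever; for an unattended script it is safer to bound the wait. A sketch of a polling helper with a timeout, where `fetch_status` stands in for a call like `hd.get_batch_job_status(job.job_id)` and is injected so the example is self-contained:

```python
import time

TERMINAL = {"COMPLETED", "FAILED"}

def wait_for_job(fetch_status, interval=2.0, timeout=600.0):
    """Poll until the job reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status["status"] in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("batch job did not finish in time")
        time.sleep(interval)

# Simulated job: RUNNING twice, then COMPLETED.
states = iter([{"status": "RUNNING"}, {"status": "RUNNING"},
               {"status": "COMPLETED", "succeeded": 3, "failed": 0}])
final = wait_for_job(lambda: next(states), interval=0.01)
print(final["status"])  # COMPLETED
```

For very large batches, consider raising `interval` (or backing off exponentially) to stay well clear of any API rate limits.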
## Step 6: Verify

Spot-check that characters were created correctly.

**Python**

```python
# List characters by tag
characters = hd.list_characters(tag="company:Acme Corp")
for c in characters:
    print(f"{c.name} (id={c.id}, externalId={c.external_id})")

# Look up by external ID
alice = hd.get_character_by_external_id("SF-001")
print(f"\nResolved Alice: {alice.id}")
```
**TypeScript**

```typescript
// List characters by tag
const characters = await hd.listCharacters({ tag: "company:Acme Corp" });
for (const c of characters.characters) {
  console.log(`${c.name} (id=${c.id}, externalId=${c.externalId})`);
}

// Look up by external ID
const alice = await hd.getCharacterByExternalId("SF-001");
console.log(`\nResolved Alice: ${alice.id}`);
```
## Handling partial failures

Batch jobs can partially succeed. If some rows fail (invalid data, or duplicate external IDs with the `ERROR` strategy), the job continues processing the remaining rows.

To handle failures:

1. Check `status.failed` after the job completes
2. Retrieve error details from `status.errors`
3. Fix the source data and re-run with `onConflict: "SKIP"` to create only the missing rows

```python
if status.failed > 0:
    print("\nFailed rows:")
    for err in status.errors:
        print(f"  Row {err.row_index}: {err.message}")
    # Fix the CSV and re-run with SKIP to fill in gaps
```
## Tier limits
| Tier | Max rows per batch |
|---|---|
| Free | Not available |
| Starter | 500 |
| Developer | 5,000 |
| Enterprise | 50,000 |
For imports larger than your tier limit, split the CSV into chunks and run multiple batch jobs sequentially.
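A minimal sketch of that chunking, assuming a Starter-tier limit of 500 rows per batch. `chunked` is an illustrative helper, not part of the SDK:

```python
def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

MAX_ROWS = 500  # your tier's per-batch limit
rows = [{"sf_id": f"SF-{i:04d}"} for i in range(1200)]

batches = list(chunked(rows, MAX_ROWS))
print([len(b) for b in batches])  # [500, 500, 200]

# With the SDK, each chunk becomes one batch job run back to back:
# for batch in chunked(rows, MAX_ROWS):
#     hd.batch_create_characters(template_id=template.id, data=batch,
#                                external_id_column="sf_id",
#                                on_conflict="SKIP")
```

Using `on_conflict="SKIP"` here means an interrupted multi-chunk import can simply be restarted from the top: chunks that already landed are skipped.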
## Next steps
- Memory Modes — choose how memories are processed for your characters
- Assembly Strategies — build context blocks for each customer
- CrewAI Integration — build customer service crews with per-customer memory
- API Reference — full endpoint documentation