# Batch Onboarding: 5,000 Customers from Salesforce
End-to-end workflow for creating thousands of customer characters from a CRM export.
## Prerequisites

- A HippoDid account on Starter tier or above (batch creation requires a paid plan)
- An API key (`hd_key_...`)
- A CSV export from your CRM
- The Python SDK (`pip install hippodid`) or TypeScript SDK (`npm install @hippodid/sdk`)
## Overview

The batch onboarding workflow has six steps:

1. Export your customer data as CSV
2. Create a template that maps CSV columns to character fields
3. Dry run to validate before committing
4. Batch create to run the real import
5. Monitor the job until it completes
6. Verify the results
Each customer becomes a HippoDid character with its own memory namespace, ready to store and recall facts across AI interactions.
## Step 1: Export your data

Export a CSV from Salesforce, HubSpot, or any CRM. The file should have one row per customer.

Example `customers.csv`:

```csv
sf_id,name,email,company,plan,signup_date,notes
SF-001,Alice Chen,alice@acme.com,Acme Corp,enterprise,2024-01-15,Prefers email communication
SF-002,Bob Martinez,bob@widgets.io,Widgets Inc,starter,2024-03-22,Technical user - API integration
SF-003,Carol Davis,carol@bigco.com,BigCo,enterprise,2023-11-01,Key account - quarterly reviews
```
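Before uploading anything, it is worth checking that the export actually contains every column the template will map. A minimal sketch (pure Python, not part of the SDK; `check_columns` and the `REQUIRED` set are illustrative names):

```python
import csv
import io

# Columns the template in Step 2 expects to find in the export.
REQUIRED = {"sf_id", "name", "email", "company"}

sample = """sf_id,name,email,company,plan
SF-001,Alice Chen,alice@acme.com,Acme Corp,enterprise
SF-002,Bob Martinez,bob@widgets.io,Widgets Inc,starter
"""

def check_columns(csv_text: str) -> int:
    """Raise if required columns are missing; otherwise return the row count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"CSV is missing columns: {sorted(missing)}")
    return sum(1 for _ in reader)

print(check_columns(sample))  # 2
```

Running this against the real file (read it into a string first) catches a bad export before you spend a dry run on it.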
## Step 2: Create a template

A template defines how CSV columns map to character fields.

**Python**

```python
from hippodid import HippoDid

hd = HippoDid(api_key="hd_key_...")

template = hd.create_character_template(
    name="Salesforce Customers",
    description="Maps Salesforce contact export to customer characters",
    field_mappings=[
        {"sourceColumn": "sf_id", "targetField": "externalId"},
        {"sourceColumn": "name", "targetField": "name"},
        {"sourceColumn": "email", "targetField": "alias"},
        {"sourceColumn": "company", "targetField": "tag"},
    ],
)
print(f"Template created: {template.id}")
```
**TypeScript**

```typescript
import { HippoDid } from "@hippodid/sdk";

const hd = new HippoDid({ apiKey: "hd_key_..." });

const template = await hd.createCharacterTemplate({
  name: "Salesforce Customers",
  description: "Maps Salesforce contact export to customer characters",
  fieldMappings: [
    { sourceColumn: "sf_id", targetField: "externalId" },
    { sourceColumn: "name", targetField: "name" },
    { sourceColumn: "email", targetField: "alias" },
    { sourceColumn: "company", targetField: "tag" },
  ],
});
console.log(`Template created: ${template.id}`);
```
### Field mapping reference

| Target | Description |
|---|---|
| `externalId` | Unique external identifier (e.g., Salesforce ID). Used for conflict detection. |
| `name` | Character display name. |
| `alias` | Searchable alias for the character. |
| `tag` | Character tag for filtering and grouping. |
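To make the mapping concrete, here is a pure-Python sketch of what a template does to one CSV row. This is an illustration, not the SDK or server implementation; `apply_mappings` is a hypothetical helper:

```python
# The same mappings as the template above, as plain dicts.
FIELD_MAPPINGS = [
    {"sourceColumn": "sf_id", "targetField": "externalId"},
    {"sourceColumn": "name", "targetField": "name"},
    {"sourceColumn": "email", "targetField": "alias"},
    {"sourceColumn": "company", "targetField": "tag"},
]

def apply_mappings(row: dict, mappings: list) -> dict:
    """Copy each mapped source column into its target character field."""
    return {m["targetField"]: row[m["sourceColumn"]] for m in mappings}

row = {"sf_id": "SF-001", "name": "Alice Chen",
       "email": "alice@acme.com", "company": "Acme Corp",
       "plan": "enterprise", "signup_date": "2024-01-15",
       "notes": "Prefers email communication"}

print(apply_mappings(row, FIELD_MAPPINGS))
# {'externalId': 'SF-001', 'name': 'Alice Chen',
#  'alias': 'alice@acme.com', 'tag': 'Acme Corp'}
```

Note that unmapped columns (`plan`, `signup_date`, `notes`) do not become character fields.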
## Step 3: Dry run

Before creating real characters, validate your data with a dry run. This checks for conflicts, missing fields, and template errors without persisting anything.

The batch endpoint requires `multipart/form-data` with a CSV file upload. The SDKs handle this automatically: pass rows as a list of dicts (Python) or an array of objects (TypeScript), and the SDK converts them to CSV and uploads via multipart internally.
**Python**

```python
import csv

# Read the CSV into rows
with open("customers.csv") as f:
    rows = list(csv.DictReader(f))
print(f"Loaded {len(rows)} rows")

# Dry run — validates without creating
job = hd.batch_create_characters(
    template_id=template.id,
    data=rows,
    external_id_column="sf_id",
    on_conflict="SKIP",
    dry_run=True,
)

print(f"Status: {job.status}")
print(f"Total rows: {job.total_rows}")
print(f"Succeeded: {job.succeeded}")
print(f"Failed: {job.failed}")

if job.errors:
    for err in job.errors:
        print(f"  Row {err.row_index}: {err.message}")
```
**TypeScript**

```typescript
import { readFileSync } from "fs";
import { parse } from "csv-parse/sync";

const csv = readFileSync("customers.csv", "utf-8");
const rows = parse(csv, { columns: true });
console.log(`Loaded ${rows.length} rows`);

// Dry run — validates without creating
const job = await hd.batchCreateCharacters({
  templateId: template.id,
  rows,
  externalIdColumn: "sf_id",
  onConflict: "SKIP",
  dryRun: true,
});

console.log(`Status: ${job.status}`);
console.log(`Total rows: ${job.totalRows}`);
console.log(`Succeeded: ${job.succeeded}`);
console.log(`Failed: ${job.failed}`);

if (job.errors.length > 0) {
  for (const err of job.errors) {
    console.log(`  Row ${err.rowIndex}: ${err.message}`);
  }
}
```
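Since the point of the dry run is to gate the real import, a small check like the following keeps the script from proceeding on dirty data. The job result is modeled here as a plain dict so the sketch is self-contained; with the SDK you would read the same fields off the returned job object:

```python
def dry_run_is_clean(job: dict) -> bool:
    """True only if every row validated and no errors were reported."""
    return job["failed"] == 0 and not job["errors"]

clean = {"status": "COMPLETED", "total_rows": 3, "succeeded": 3,
         "failed": 0, "errors": []}
dirty = {"status": "COMPLETED", "total_rows": 3, "succeeded": 2,
         "failed": 1,
         "errors": [{"row_index": 2, "message": "missing email"}]}

print(dry_run_is_clean(clean))  # True
print(dry_run_is_clean(dirty))  # False
```

In a real script, raise or exit when the check fails, fix the CSV, and re-run the dry run before moving on to Step 4.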
## Step 4: Batch create

When the dry run looks clean, run the real import. Batch creation is asynchronous: the API returns a job ID immediately, and you poll for progress.

**Python**

```python
# Start the real batch job
job = hd.batch_create_characters(
    template_id=template.id,
    data=rows,
    external_id_column="sf_id",
    on_conflict="SKIP",
)
print(f"Job started: {job.job_id}, status: {job.status}")
```
**TypeScript**

```typescript
// Start the real batch job
const job = await hd.batchCreateCharacters({
  templateId: template.id,
  rows,
  externalIdColumn: "sf_id",
  onConflict: "SKIP",
});
console.log(`Job started: ${job.jobId}, status: ${job.status}`);
```
### Conflict strategies

| Strategy | Behavior |
|---|---|
| `ERROR` | Fail the row if a character with the same external ID exists. This is the default. |
| `SKIP` | Silently skip rows whose external ID already exists. Best for re-running imports. |
| `UPDATE` | Update the existing character with new data from the row. |
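The three strategies can be illustrated with a pure-Python merge over an in-memory index of existing characters. This is an illustration of the behavior, not the server's logic; `resolve` is a hypothetical helper:

```python
def resolve(existing: dict, row: dict, strategy: str) -> dict:
    """Apply one on_conflict strategy for a single incoming row."""
    ext_id = row["sf_id"]
    if ext_id not in existing:
        existing[ext_id] = row          # no conflict: always create
    elif strategy == "SKIP":
        pass                            # leave the existing character alone
    elif strategy == "UPDATE":
        existing[ext_id].update(row)    # overwrite with the new row's data
    else:                               # "ERROR" (the default)
        raise ValueError(f"duplicate external ID: {ext_id}")
    return existing

chars = {"SF-001": {"sf_id": "SF-001", "name": "Alice Chen"}}

resolve(chars, {"sf_id": "SF-001", "name": "Alice C."}, "SKIP")
print(chars["SF-001"]["name"])   # Alice Chen (unchanged)

resolve(chars, {"sf_id": "SF-001", "name": "Alice C."}, "UPDATE")
print(chars["SF-001"]["name"])   # Alice C.
```

`SKIP` is what makes re-runs idempotent: rows that already landed are ignored, so a second pass only fills in the gaps.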
## Step 5: Monitor job status

Poll the job endpoint until the status is `COMPLETED` or `FAILED`.

**Python**

```python
import time

while True:
    status = hd.get_batch_job_status(job.job_id)
    print(f"Status: {status.status} — "
          f"{status.succeeded + status.failed}/{status.total_rows}")
    if status.status in ("COMPLETED", "FAILED"):
        break
    time.sleep(2)

# Final summary
print(f"\nSucceeded: {status.succeeded}")
print(f"Skipped: {status.skipped}")
print(f"Failed: {status.failed}")
```
**TypeScript**

```typescript
let status;
do {
  await new Promise((r) => setTimeout(r, 2000));
  status = await hd.getBatchJobStatus(job.jobId);
  console.log(
    `Status: ${status.status} — ${status.succeeded + status.failed}/${status.totalRows}`
  );
} while (status.status !== "COMPLETED" && status.status !== "FAILED");

console.log(`\nSucceeded: ${status.succeeded}`);
console.log(`Skipped: ${status.skipped}`);
console.log(`Failed: ${status.failed}`);
```
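The loops above poll forever; for an unattended script it is safer to bound the wait. A sketch of a polling helper with a timeout, where `fetch_status` stands in for a call like `hd.get_batch_job_status(job.job_id)` and is injected so the example is self-contained:

```python
import time

TERMINAL = {"COMPLETED", "FAILED"}

def wait_for_job(fetch_status, interval=2.0, timeout=600.0):
    """Poll until the job reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status["status"] in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("batch job did not finish in time")
        time.sleep(interval)

# Simulated job: RUNNING twice, then COMPLETED.
states = iter([{"status": "RUNNING"}, {"status": "RUNNING"},
               {"status": "COMPLETED", "succeeded": 3, "failed": 0}])
final = wait_for_job(lambda: next(states), interval=0.01)
print(final["status"])  # COMPLETED
```

For very large batches, consider raising `interval` (or backing off exponentially) to stay well clear of any API rate limits.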
## Step 6: Verify

Spot-check that characters were created correctly.

**Python**

```python
# List characters by tag
characters = hd.list_characters(tag="company:Acme Corp")
for c in characters:
    print(f"{c.name} (id={c.id}, externalId={c.external_id})")

# Look up by external ID
alice = hd.get_character_by_external_id("SF-001")
print(f"\nResolved Alice: {alice.id}")
```
**TypeScript**

```typescript
// List characters by tag
const characters = await hd.listCharacters({ tag: "company:Acme Corp" });
for (const c of characters.characters) {
  console.log(`${c.name} (id=${c.id}, externalId=${c.externalId})`);
}

// Look up by external ID
const alice = await hd.getCharacterByExternalId("SF-001");
console.log(`\nResolved Alice: ${alice.id}`);
```
## Handling partial failures

Batch jobs can partially succeed. If some rows fail (invalid data, or duplicate external IDs with the `ERROR` strategy), the job continues processing the remaining rows.

To handle failures:

1. Check `status.failed` after the job completes
2. Retrieve error details from `status.errors`
3. Fix the source data and re-run with `onConflict: "SKIP"` to create only the missing rows

```python
if status.failed > 0:
    print("\nFailed rows:")
    for err in status.errors:
        print(f"  Row {err.row_index}: {err.message}")
    # Fix the CSV and re-run with SKIP to fill in gaps
```
## Tier limits
| Tier | Max rows per batch |
|---|---|
| Free | Not available |
| Starter | 500 |
| Developer | 5,000 |
| Enterprise | 50,000 |
For imports larger than your tier limit, split the CSV into chunks and run multiple batch jobs sequentially.
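A minimal sketch of that chunking, assuming a Starter-tier limit of 500 rows per batch. `chunked` is an illustrative helper, not part of the SDK:

```python
def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

MAX_ROWS = 500  # your tier's per-batch limit
rows = [{"sf_id": f"SF-{i:04d}"} for i in range(1200)]

batches = list(chunked(rows, MAX_ROWS))
print([len(b) for b in batches])  # [500, 500, 200]

# With the SDK, each chunk becomes one batch job run back to back:
# for batch in chunked(rows, MAX_ROWS):
#     hd.batch_create_characters(template_id=template.id, data=batch,
#                                external_id_column="sf_id",
#                                on_conflict="SKIP")
```

Using `on_conflict="SKIP"` here means an interrupted multi-chunk import can simply be restarted from the top: chunks that already landed are skipped.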
## Next steps
- Memory Modes — choose how memories are processed for your characters
- Assembly Strategies — build context blocks for each customer
- CrewAI Integration — build customer service crews with per-customer memory
- API Reference — full endpoint documentation