Batch Onboarding: 5,000 Customers from Salesforce

End-to-end workflow for creating thousands of customer characters from a CRM export.

Table of contents
  1. Prerequisites
  2. Overview
  3. Step 1: Export your data
  4. Step 2: Create a template
    1. Python
    2. TypeScript
    3. Field mapping reference
  5. Step 3: Dry run
    1. Python
    2. TypeScript
  6. Step 4: Batch create
    1. Python
    2. TypeScript
    3. Conflict strategies
  7. Step 5: Monitor job status
    1. Python
    2. TypeScript
  8. Step 6: Verify
    1. Python
    2. TypeScript
  9. Handling partial failures
  10. Tier limits
  11. Next steps

Prerequisites

  • A HippoDid account on the Starter tier or higher (batch creation requires a paid plan)
  • An API key (hd_key_...)
  • A CSV export from your CRM
  • Python SDK (pip install hippodid) or TypeScript SDK (npm install @hippodid/sdk)

Overview

The batch onboarding workflow has six steps:

  1. Export your customer data as CSV
  2. Create a template that maps CSV columns to character fields
  3. Dry run to validate before committing
  4. Batch create to run the real import
  5. Monitor the job until it completes
  6. Verify the results

Each customer becomes a HippoDid character with its own memory namespace, ready to store and recall facts across AI interactions.


Step 1: Export your data

Export a CSV from Salesforce, HubSpot, or any CRM. The file should have one row per customer.

Example customers.csv:

sf_id,name,email,company,plan,signup_date,notes
SF-001,Alice Chen,alice@acme.com,Acme Corp,enterprise,2024-01-15,Prefers email communication
SF-002,Bob Martinez,bob@widgets.io,Widgets Inc,starter,2024-03-22,Technical user - API integration
SF-003,Carol Davis,carol@bigco.com,BigCo,enterprise,2023-11-01,Key account - quarterly reviews
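Before uploading, it is worth a quick sanity check that the export actually contains the columns your template will map. A minimal sketch using only the standard library (the required column names here are taken from the example above):

```python
import csv
import io

# Columns the template in Step 2 expects to find in the export
REQUIRED_COLUMNS = {"sf_id", "name", "email", "company"}

def check_csv_columns(csv_text: str) -> list:
    """Return a sorted list of required columns missing from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return sorted(REQUIRED_COLUMNS - header)

sample = (
    "sf_id,name,email,company,plan\n"
    "SF-001,Alice Chen,alice@acme.com,Acme Corp,enterprise\n"
)
print(check_csv_columns(sample))  # → [] (all mapped columns present)
```

An empty result means the export is safe to feed into the template; any missing column will otherwise surface only as per-row errors in the dry run.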

Step 2: Create a template

A template defines how CSV columns map to character fields.

Python

from hippodid import HippoDid

hd = HippoDid(api_key="hd_key_...")

template = hd.create_character_template(
    name="Salesforce Customers",
    description="Maps Salesforce contact export to customer characters",
    field_mappings=[
        {"sourceColumn": "sf_id", "targetField": "externalId"},
        {"sourceColumn": "name", "targetField": "name"},
        {"sourceColumn": "email", "targetField": "alias"},
        {"sourceColumn": "company", "targetField": "tag"},
    ],
)

print(f"Template created: {template.id}")

TypeScript

import { HippoDid } from "@hippodid/sdk";

const hd = new HippoDid({ apiKey: "hd_key_..." });

const template = await hd.createCharacterTemplate({
  name: "Salesforce Customers",
  description: "Maps Salesforce contact export to customer characters",
  fieldMappings: [
    { sourceColumn: "sf_id", targetField: "externalId" },
    { sourceColumn: "name", targetField: "name" },
    { sourceColumn: "email", targetField: "alias" },
    { sourceColumn: "company", targetField: "tag" },
  ],
});

console.log(`Template created: ${template.id}`);

Field mapping reference

Target      Description
externalId  Unique external identifier (e.g., Salesforce ID). Used for conflict detection.
name        Character display name.
alias       Searchable alias for the character.
tag         Character tag for filtering and grouping.
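To make the mapping concrete, here is a sketch of roughly what a template does to one CSV row: each sourceColumn value is copied to its targetField. This is inferred from the table above, not the actual server implementation:

```python
def apply_mappings(row: dict, field_mappings: list) -> dict:
    """Apply sourceColumn -> targetField mappings to a single CSV row."""
    character = {}
    for m in field_mappings:
        value = row.get(m["sourceColumn"])
        if value is not None:
            character[m["targetField"]] = value
    return character

mappings = [
    {"sourceColumn": "sf_id", "targetField": "externalId"},
    {"sourceColumn": "name", "targetField": "name"},
    {"sourceColumn": "email", "targetField": "alias"},
    {"sourceColumn": "company", "targetField": "tag"},
]
row = {"sf_id": "SF-001", "name": "Alice Chen", "email": "alice@acme.com",
       "company": "Acme Corp", "plan": "enterprise"}
print(apply_mappings(row, mappings))
# → {'externalId': 'SF-001', 'name': 'Alice Chen',
#    'alias': 'alice@acme.com', 'tag': 'Acme Corp'}
```

Columns not listed in the template (like plan and notes here) are simply not mapped.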

Step 3: Dry run

Before creating real characters, validate your data with a dry run. This checks for conflicts, missing fields, and template errors without persisting anything.

The batch endpoint requires multipart/form-data with a CSV file upload. The SDKs handle this automatically — pass rows as a list of dicts (Python) or array of objects (TypeScript), and the SDK converts to CSV and uploads via multipart internally.
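The rows-to-CSV conversion the SDKs perform internally can be sketched with the standard library. The exact wire format is an SDK implementation detail; this only illustrates the serialization step:

```python
import csv
import io

def rows_to_csv(rows: list) -> str:
    """Serialize a list of dicts to CSV text, roughly as the SDK does before upload."""
    if not rows:
        return ""
    buf = io.StringIO()
    # Column order is taken from the first row's keys
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

rows = [
    {"sf_id": "SF-001", "name": "Alice Chen"},
    {"sf_id": "SF-002", "name": "Bob Martinez"},
]
print(rows_to_csv(rows))
```

In practice you never need to do this yourself; pass the parsed rows to the SDK and it handles the upload.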

Python

import csv

# Read the CSV into rows
with open("customers.csv") as f:
    rows = list(csv.DictReader(f))

print(f"Loaded {len(rows)} rows")

# Dry run — validates without creating
job = hd.batch_create_characters(
    template_id=template.id,
    data=rows,
    external_id_column="sf_id",
    on_conflict="SKIP",
    dry_run=True,
)

print(f"Status: {job.status}")
print(f"Total rows: {job.total_rows}")
print(f"Succeeded: {job.succeeded}")
print(f"Failed: {job.failed}")

if job.errors:
    for err in job.errors:
        print(f"  Row {err.row_index}: {err.message}")

TypeScript

import { readFileSync } from "fs";
import { parse } from "csv-parse/sync";

const csv = readFileSync("customers.csv", "utf-8");
const rows = parse(csv, { columns: true });
console.log(`Loaded ${rows.length} rows`);

// Dry run — validates without creating
const job = await hd.batchCreateCharacters({
  templateId: template.id,
  rows,
  externalIdColumn: "sf_id",
  onConflict: "SKIP",
  dryRun: true,
});

console.log(`Status: ${job.status}`);
console.log(`Total rows: ${job.totalRows}`);
console.log(`Succeeded: ${job.succeeded}`);
console.log(`Failed: ${job.failed}`);

if (job.errors.length > 0) {
  for (const err of job.errors) {
    console.log(`  Row ${err.rowIndex}: ${err.message}`);
  }
}

Step 4: Batch create

When the dry run looks clean, run the real import. Batch creation is asynchronous: the API returns a job ID immediately, and you poll for progress.

Python

# Start the real batch job
job = hd.batch_create_characters(
    template_id=template.id,
    data=rows,
    external_id_column="sf_id",
    on_conflict="SKIP",
)

print(f"Job started: {job.job_id}, status: {job.status}")

TypeScript

const job = await hd.batchCreateCharacters({
  templateId: template.id,
  rows,
  externalIdColumn: "sf_id",
  onConflict: "SKIP",
});

console.log(`Job started: ${job.jobId}, status: ${job.status}`);

Conflict strategies

Strategy  Behavior
ERROR     Mark the row as failed if a character with the same external ID exists. Default.
SKIP      Silently skip rows whose external ID already exists. Best for re-running imports.
UPDATE    Update the existing character with new data from the row.

Step 5: Monitor job status

Poll the job endpoint until the status is COMPLETED or FAILED.

Python

import time

while True:
    status = hd.get_batch_job_status(job.job_id)

    print(f"Status: {status.status}"
          f"{status.succeeded + status.failed}/{status.total_rows}")

    if status.status in ("COMPLETED", "FAILED"):
        break
    time.sleep(2)

# Final summary
print(f"\nSucceeded: {status.succeeded}")
print(f"Skipped:   {status.skipped}")
print(f"Failed:    {status.failed}")

TypeScript

let status;
do {
  await new Promise((r) => setTimeout(r, 2000));
  status = await hd.getBatchJobStatus(job.jobId);
  console.log(
    `Status: ${status.status} (${status.succeeded + status.failed}/${status.totalRows})`
  );
} while (status.status !== "COMPLETED" && status.status !== "FAILED");

console.log(`\nSucceeded: ${status.succeeded}`);
console.log(`Skipped:   ${status.skipped}`);
console.log(`Failed:    ${status.failed}`);

Step 6: Verify

Spot-check that characters were created correctly.

Python

# List characters
characters = hd.list_characters(tag="company:Acme Corp")
for c in characters:
    print(f"{c.name} (id={c.id}, externalId={c.external_id})")

# Look up by external ID
alice = hd.get_character_by_external_id("SF-001")
print(f"\nResolved Alice: {alice.id}")

TypeScript

const characters = await hd.listCharacters({ tag: "company:Acme Corp" });
for (const c of characters.characters) {
  console.log(`${c.name} (id=${c.id}, externalId=${c.externalId})`);
}

const alice = await hd.getCharacterByExternalId("SF-001");
console.log(`\nResolved Alice: ${alice.id}`);

Handling partial failures

Batch jobs can partially succeed. If some rows fail (invalid data, duplicate external IDs with ERROR strategy), the job continues processing the remaining rows.

To handle failures:

  1. Check status.failed after the job completes
  2. Retrieve error details from status.errors
  3. Fix the source data and re-run with onConflict: "SKIP" to only create the missing rows

Python

if status.failed > 0:
    print("\nFailed rows:")
    for err in status.errors:
        print(f"  Row {err.row_index}: {err.message}")
    # Fix the CSV and re-run with SKIP to fill in gaps
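Rather than re-submitting the whole file, you can also retry only the rows that failed. A sketch, assuming each error's row_index is a zero-based index into the original rows list (verify this against your SDK's error schema):

```python
def failed_rows(rows: list, error_indices: list) -> list:
    """Select only the rows that failed, for a targeted retry."""
    return [rows[i] for i in error_indices if 0 <= i < len(rows)]

rows = [{"sf_id": "SF-001"}, {"sf_id": "SF-002"}, {"sf_id": "SF-003"}]
retry = failed_rows(rows, [1])
print(retry)  # → [{'sf_id': 'SF-002'}]

# After correcting the data, resubmit just these rows (SDK call from Step 4):
# hd.batch_create_characters(template_id=template.id, data=retry,
#                            external_id_column="sf_id", on_conflict="SKIP")
```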

Tier limits

Tier        Max rows per batch
Free        Not available
Starter     500
Developer   5,000
Enterprise  50,000

For imports larger than your tier limit, split the CSV into chunks and run multiple batch jobs sequentially.
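Chunking is a one-liner: slice the row list to your tier's limit and submit the chunks one after another, waiting for each job to complete before starting the next. The chunked helper below is runnable; the commented submission loop assumes the hd client, template, and a polling helper like the loop from Step 5:

```python
def chunked(rows: list, size: int) -> list:
    """Split rows into consecutive chunks of at most `size` rows each."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]

rows = list(range(12))
print([len(c) for c in chunked(rows, 5)])  # → [5, 5, 2]

# Sequential submission (sketch; wait_for_completion is a hypothetical
# wrapper around the Step 5 polling loop):
# for batch in chunked(all_rows, 5000):
#     job = hd.batch_create_characters(template_id=template.id, data=batch,
#                                      external_id_column="sf_id",
#                                      on_conflict="SKIP")
#     wait_for_completion(job.job_id)
```

Running chunks sequentially with SKIP also makes the whole import safely resumable if one chunk fails partway.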


Next steps


Copyright © 2026 SameThoughts. HippoDid is proprietary software. Open-source components (Spring Boot Starter, MCP Server) are Apache 2.0.
