Examples

Sanitized examples showing how teams structure datasets for fine-tuning runs.

These examples are sanitized. Validation runs before payment; if validation fails, you won't be charged.

Customer Support Tone Adapter

Use case:
Consistent, empathetic replies across support channels with clear escalation boundaries.
Dataset format:
JSONL with chat-format records ({"messages": [...]}), each containing a customer message and approved response, tagged by scenario.
Validation checks:
JSONL validity, required fields and roles, dataset size, token cap, record count, and max line length.
Expected outcome:
More consistent tone and fewer unnecessary escalations.

Product Q&A Bot

Use case:
Accurate answers to repetitive product questions with technical precision.
Dataset format:
JSONL prompt/answer records, each grounded to canonical product documentation sources.
Validation checks:
JSONL validity, required fields, dataset size and token caps per tier, record count, and max line length.
Expected outcome:
Higher first-response accuracy in self-serve channels.

Structured Data Extraction

Use case:
Reliable extraction from semi-structured text into a stable schema for automation.
Dataset format:
JSONL with instruction-format records ({"instruction": ..., "output": ...}) aligned to a single extraction schema.
Validation checks:
JSONL validity, supported instruction/output structure, dataset size and token caps per tier, record count, and per-record line-length limits.
Expected outcome:
Cleaner downstream automation with less manual cleanup.
Start Run