This page covers tier caps, retention windows, and what can cause eligibility to fail before payment. Use it to confirm your run is ready and avoid retries. For dollar pricing, use the Pricing page.
Tier caps and retention
Eligibility is checked before payment against file format, record structure, dataset size, token cap, record count, and max line length. Record count above 200,000 and line length above 20,000 characters are global hard stops before any tier is assigned.
| Tier | Dataset Limit | Token cap | Records/lines | Max line length | Artifact Retention |
|---|---|---|---|---|---|
| Launch S | Up to 50 MB | 5.5M | 200,000 | 20,000 chars | 7 days |
| Launch M | Over 50 MB to 150 MB | 7.5M | 200,000 | 20,000 chars | 7 days |
| Launch L | Over 150 MB to 300 MB | 10.0M | 200,000 | 20,000 chars | 7 days |
| Orbit S | Over 300 MB to 400 MB | 16.5M | 200,000 | 20,000 chars | 7 days |
| Orbit M | Over 400 MB to 500 MB | 22.5M | 200,000 | 20,000 chars | 7 days |
| Orbit L | Over 500 MB to 600 MB | 28.0M | 200,000 | 20,000 chars | 7 days |
Displayed dataset ranges are planning bands. Final tier assignment happens automatically after upload and validation, and the higher required tier wins when dataset size and token estimate point to different tiers.
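The "higher tier wins" rule can be sketched locally. This is a hypothetical planning helper, not the service's actual assignment logic: it mirrors the published bands and picks the lowest tier whose caps cover both the dataset size and the token estimate. The function name and structure are assumptions for illustration.

```python
# Published planning bands: (tier name, max dataset MB, token cap).
TIERS = [
    ("Launch S", 50, 5_500_000),
    ("Launch M", 150, 7_500_000),
    ("Launch L", 300, 10_000_000),
    ("Orbit S", 400, 16_500_000),
    ("Orbit M", 500, 22_500_000),
    ("Orbit L", 600, 28_000_000),
]

def estimate_tier(size_mb: float, token_estimate: int) -> str:
    """Return the lowest tier whose caps cover both size and tokens."""
    by_size = next((i for i, (_, mb, _) in enumerate(TIERS) if size_mb <= mb), None)
    by_tokens = next((i for i, (_, _, cap) in enumerate(TIERS) if token_estimate <= cap), None)
    if by_size is None or by_tokens is None:
        raise ValueError("dataset exceeds the maximum published caps")
    # When size and tokens point to different tiers, the higher required tier wins.
    return TIERS[max(by_size, by_tokens)][0]

# 120 MB alone fits Launch M, but a 12M-token estimate requires Orbit S.
print(estimate_tier(120, 12_000_000))  # → Orbit S
```

Remember that the final assignment happens server-side after upload and validation; treat a local estimate like this only as a planning aid.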
Common eligibility failures
- File is not a valid .jsonl dataset or is not UTF-8 encoded.
- One or more lines are not valid JSON objects or do not use a supported record structure.
- Dataset size exceeds BeaverYard's maximum published size cap.
- Token estimate exceeds BeaverYard's maximum published token cap.
- Record count exceeds the global 200,000 records/lines cap.
- One or more lines exceed the global 20,000-character line-length cap.
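Several of these failures can be caught before upload with a local pre-check. The sketch below is an assumption-laden, minimal line-by-line validator (the service's real validator may check more, including the supported record structure, which is not reproduced here): it verifies UTF-8 encoding, JSON-object lines, the 200,000-record cap, and the 20,000-character line cap.

```python
import json

MAX_RECORDS = 200_000      # global records/lines cap
MAX_LINE_CHARS = 20_000    # global line-length cap

def precheck_jsonl(path: str) -> list[str]:
    """Return a list of eligibility problems; an empty list means these checks pass."""
    try:
        with open(path, encoding="utf-8") as f:
            lines = f.read().splitlines()
    except UnicodeDecodeError:
        return ["file is not UTF-8 encoded"]

    problems = []
    if len(lines) > MAX_RECORDS:
        problems.append(f"{len(lines):,} records exceed the {MAX_RECORDS:,} cap")
    for i, line in enumerate(lines, start=1):
        if len(line) > MAX_LINE_CHARS:
            problems.append(f"line {i} exceeds {MAX_LINE_CHARS:,} characters")
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {i} is not valid JSON")
            continue
        if not isinstance(obj, dict):
            problems.append(f"line {i} is not a JSON object")
    return problems
```

A clean pre-check does not guarantee eligibility (size and token caps still apply), but it avoids the most common retries.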
Priority Processing
- Priority Processing is available as an optional scheduling add-on at checkout.
- Priority Processing is included automatically for BeaverYard Plus and Pro members at no extra cost.
- It does not bypass dataset format or published tier size limits.
What we do not check
- We do not score dataset quality or predict model performance.
- We do not rewrite, deduplicate, or otherwise improve your dataset automatically.
- Training quality remains your responsibility and depends on the data you upload.