Documentation

Behavior Packages, end to end.

Everything you need to know to author, install, and govern a Behavior Package on Joseki WrapperHub.

Concept

What is a Behavior Package?

A Behavior Package is a versioned bundle that captures everything about how an LLM should behave. Instead of copying prompts between repos and wiring up evals from scratch each time, you install a Behavior Package the same way you would an npm or PyPI dependency.

A package can be made up of any combination of these typed specs:

  • PromptSpec — system prompts, templates, examples
  • FineTuneSpec — training recipes and configs
  • EvalSpec — eval suites and benchmarks
  • SafetySpec — safety policies and guardrails
  • LicenseSpec — usage terms and attribution
  • InstallSpec — runtime setup and dependencies
  • EvidencePack — provenance and audit artifacts

Specs

What is PromptSpec?

A PromptSpec defines the system prompt, message templates, and example dialogues a model uses to produce consistent output. It is the smallest reusable unit on WrapperHub.

A PromptSpec includes:

  • System prompt with placeholders for runtime variables
  • Few-shot examples used at inference time
  • Compatible models and provider-specific overrides
  • Optional formatting and tool-call schemas

Specs

What is FineTuneSpec?

A FineTuneSpec is a portable training recipe. It encodes the dataset reference, hyperparameters, target base model(s), and the loss/eval signals used during training.

Because the recipe is portable, the same package can run on multiple providers (OpenAI fine-tuning API, Anthropic, your own cluster) and produce reproducible results.

Specs

What is EvalSpec?

An EvalSpec is a runnable bundle of evaluation cases. It captures inputs, expected behaviors, judging rubrics (heuristic or model-as-judge), and the reproducibility seed.

Eval results help consumers understand how a package performs against the listed compatible models before they adopt it in production.

Trust & data

Data packs: hosted vs self-hosted

A data pack is the dataset, schema, eval examples, citations, and version metadata bundled with a Behavior Package so others can reproduce the same workflow—not just the prompt text.

Joseki supports two complementary modes:

  • Hosted on Joseki — you upload a file per wrapper version. We store a sha256 digest, size, declared license, optional provenance JSON, optional eval-set reference, and a malware scan status (today: synchronous heuristics; production path is async vendor scanning). Downloads respect visibility: inherit follows the wrapper's public/private flag; public or private override who may fetch bytes. Version rows are immutable, so the pack is pinned to that release.
  • External (self-hosted) — for enterprises, regulated teams, or large blobs, you keep custody in s3, r2, gcs, github, a plain url, or a private_api. The registry stores the URI, optional checksum, access_policy hint (signed URL, OAuth, API key, or user-managed), license string, and a logical version for the data artifact. Consumers resolve access under their own credentials—Joseki never becomes the default custodian of sensitive data unless you opt into hosted mode.

Runs can optionally carry data_pack_sha256 (must match the hosted blob when one exists) plus a reproducibility_context JSON object (eval case ids, input fingerprints, schema versions) so audits and liveness reports line up with exact inputs.

API: list versions GET /wrappers/{id}/versions, trust view GET /wrappers/{id}/versions/{vid}/data-pack, set external manifest PUT .../data-pack/manifest, upload hosted bytes POST .../data-pack/hosted (multipart), download GET .../data-pack/download.

External data pack (conceptual YAML)yaml
data_pack:
  mode: external
  source_type: s3
  uri: s3://my-org-bucket/prompt-packs/tax-2026/v3/dataset.parquet
  checksum: sha256:8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc09aa8ad
  checksum_algorithm: sha256
  access_policy: signed_url
  license: CC-BY-4.0
  version: "2026.03.1"
  provenance:
    schema_uri: https://example.org/schemas/tax-v1.json
    citations:
      - https://www.irs.gov/publications/p17
  eval_set_uri: https://example.org/evals/tax-pack-v3.jsonl

Reference

Example manifest

Every Behavior Package ships with a top-level manifest.yaml. The minimal form looks like:

manifest.yamlyaml
name: customer-support-classifier
version: 0.1.0
type: fine_tune_recipe
base_model: gpt-4.1-mini
evals:
  - label_accuracy
  - format_following
  - pii_leakage
safety:
  package_signing: required
  artifact_hashes: required
license: private-commercial

See the customer-support-classifier package for a working example.

Not every package is a fine-tune. A prompt_template package, for instance, exposes runtime variables and a battery of evals that prove the generated artifact actually works. A toy "build me Pac-Man" package looks like:

manifest.yamlyaml
name: pacman-builder
version: 0.2.1
type: prompt_template
description: >-
  Generates a runnable, single-file Pac-Man clone in the requested
  language. Includes maze layout, ghost AI behaviors, scoring, and
  a level-completion check.
base_model:
  - gpt-4.1
  - claude-4.6-sonnet
inputs:
  - target_language     # python | javascript | typescript
  - canvas_size         # e.g. "28x31" (classic arcade dimensions)
  - difficulty          # easy | classic | nightmare
  - ghost_ai            # chase | scatter | mixed
outputs:
  - source_files        # map of filename -> file body
  - run_instructions    # how to launch the game locally
evals:
  - render_smoke        # generated code starts without runtime errors
  - ghost_ai_correctness  # ghosts respect their assigned mode
  - score_logic         # dot/pellet/ghost points match arcade rules
  - level_completion    # clearing all dots ends the level
safety:
  package_signing: required
  artifact_hashes: required
  no_external_calls: required   # generated game must run fully offline
license: apache-2.0

The shape of the manifest is the same — only the type, the declared inputs/outputs, and the eval suite change to match what this package actually produces.

Joseki | Docs — How LLM Behavior Packages Work