Model Routing
A scientific working group does not need every role to use the same model route. In this container, the PI Liaison and Scientific Director should stay on the most reliable human-facing route available, while narrower specialist agents can be evaluated against open-model APIs.
The practical pattern is conservative: keep the user-facing and decision-setting roles on OpenAI/Codex OAuth or another approved high-reliability route, then experiment with hosted open models for bounded work such as data inventories, literature triage, first-pass technical drafts, and skeptic checklists. This follows the same bounded-role idea used throughout the working group: agents get specific jobs, expected outputs, and explicit limits.
The workspace seed includes /workspace/MODEL_ASSIGNMENTS.md as the role-level routing register. Update that file when a role changes model route, and record meaningful changes or reversions in /workspace/DECISIONS.md.
Suggested Defaults
| Role group | Recommended route | Why |
|---|---|---|
| PI Liaison / User Interview Agent | OpenAI/Codex OAuth or another approved high-reliability route | This role talks to the user, manages Slack intake, and batches decisions. |
| Scientific Director | OpenAI/Codex OAuth or another approved high-reliability route | This role sets scientific direction and decides when claims can advance. |
| Deputy Director / Integrator | High-reliability route first, open-model experiments after evaluation | This role maintains coherence across the team. |
| Data, modeling, citation, communication, and skeptic roles | Approved open-model API route allowed for bounded tasks | These roles can be tested with narrower prompts and reviewable outputs. |
| Societal Impact / Translation Agent | High-reliability route preferred | Sensitive claims about communities, policy, public health, legal rules, or sovereignty require careful review. |
Verde-Style API Experiments
For Verde/CyVerse experiments, use the OpenAI-compatible base URL https://llm-api.cyverse.ai/v1 and keep credentials local in .env. The AI-VERDE API documentation explains that keys are obtained from the AI-VERDE course or team details and that available models can be listed from the API. Do not commit keys, paste them into chats, or store them in markdown:
VERDE_LLM_BASE_URL=https://llm-api.cyverse.ai/v1
VERDE_LLM_API_KEY=
VERDE_LLM_DEFAULT_MODEL=
VERDE_LLM_PROVIDER_NAME=verde
After setting a local key, inspect the models your team can access:
curl -s -L "${VERDE_LLM_BASE_URL}/models" \
-H "Authorization: Bearer ${VERDE_LLM_API_KEY}" \
-H "Content-Type: application/json"
Current AI-VERDE Candidate Models
The following model IDs were reported as available for the current API key context on 2026-05-18. Availability can vary by course, team, quota, and provider changes, so confirm with the /models endpoint before assigning a role:
| Model ID | Candidate use |
|---|---|
nrp/glm-4.7 |
General specialist-agent evaluation |
nrp/glm-5 |
General specialist-agent evaluation |
nrp/kimi |
Literature triage, summaries, and long-context checks |
nrp/gemma |
Lightweight drafts and comparison tests |
nrp/qwen3 |
General specialist-agent evaluation |
nrp/qwen3-small |
Fast, low-risk smoke tests and simple routing checks |
nrp/minimax-m2 |
General specialist-agent evaluation |
nrp/gpt-oss |
Open-model comparison tests |
Meta-Llama-3.1-70B-Instruct-quantized |
Larger-model baseline comparisons |
llama-3.3-70b-instruct-quantized |
Larger-model baseline comparisons |
phi-4 |
Lightweight technical drafts and checks |
Llama-3.3-70B-Instruct-quantized |
Larger-model baseline comparisons |
phi-4-multimodal-instruct |
Multimodal experiments only when inputs and privacy are reviewed |
gemma-3-12b-it |
Lightweight drafts and comparison tests |
Llama-3.2-11B-Vision-Instruct |
Vision experiments only when images are appropriate to share |
esiil/GLM-4.7-Flash |
Fast ESIIL-context experiments and smoke tests |
js2/llama-4-scout |
Jetstream2-hosted model experiments |
js2/gpt-oss-120b |
Larger open-model comparison tests |
Do not treat this table as approval to use a model for sensitive claims or external actions. It is an inventory for bounded experiments.
These variables are intentionally generic placeholders. They make the container ready for local experimentation, but they do not guarantee that OpenClaw has registered a provider. Provider registration should be added only after checking the installed OpenClaw version and documenting the exact provider name, model ID, validation command, and fallback route.
Evaluation Workflow
- Pick one bounded role and one bounded task.
- Run the task with the current reliable route.
- Run the same task with the open-model candidate.
- Compare evidence handling, citation behavior, role compliance, and uncertainty language.
- Record the result in
/workspace/agent_reports/model_evaluations.mdor/workspace/DECISIONS.md. - Promote the route only after human approval when the role is user-facing, Slack-facing, director-level, or involved in sensitive claims.
Scaling Pattern
The scalable unit is not "one key per agent." The scalable unit is a reviewed route assignment: role, provider, model, allowed task types, blocked actions, evaluation date, and fallback. That keeps the system reproducible as users add local keys, hosted endpoints, or future OpenClaw provider integrations.