
Run controlled exploratory executions

Plans and summarizes authorized exploratory tests, with clear limits, fair comparison, and evidence organized in explore_outputs/.


OVERVIEW

What this skill does

Use this skill only when the researcher explicitly authorizes exploration. It covers subset tests, short cycles, sweeps, and quick transfer learning, treating results as candidates and avoiding overclaiming.

USE CASES

When to use

Validate a hypothesis on a small subset
Run a short guess-and-check cycle
Plan a batch sweep
Use an idle GPU for a bounded search
Quickly test transfer learning

PRACTICAL GUIDE

How to use

Obtain explicit authorization to explore
Review the repository and scope the candidate
Define the budget, comparison, and criteria for the round
Hand off execution to the appropriate resource
Summarize evidence and caveats in explore_outputs/

LIMITATIONS

What to review before installing

Does not certify trusted success
Must not be triggered implicitly
Does not prove SOTA
Does not replace exploratory research orchestration

ORIGINAL CONTENT

SKILL.md

---
name: explore-run
description: Rigor Improve / Rigor Explore run leaf skill for bounded exploratory evidence in deep learning research repositories. Use when the researcher explicitly authorizes exploratory runs such as small-subset validation, short-cycle guess-and-check, batch sweeps, idle-GPU search, or quick transfer-learning trials, with fair-comparison caveats and no-overclaim summaries in `explore_outputs/`. Do not use for end-to-end exploration orchestration on top of `current_research`, trusted baseline execution, conservative training verification, default routing, verified SOTA claims, or implicit experimentation.
---

# explore-run

Use this as the Rigor Improve / Rigor Explore run leaf skill. The installed slug
remains `explore-run` for compatibility.

Use the shared operating principles in
`../../references/agent-operating-principles.md`; this skill should guide
candidate run planning while preserving model judgment about the active repo.

## When to apply

- When the researcher explicitly authorizes exploratory runs.
- When the task is a small-subset validation, short-cycle training probe, batch sweep, idle-GPU search, or quick transfer-learning trial.
- When the output should rank candidate runs rather than certify trusted success.

## When not to apply

- When the user wants trusted training execution or conservative verification.
- When there is no explicit exploratory authorization.
- When the task is repository setup, intake, or debugging.

## Clear boundaries

- This skill owns exploratory execution planning and summary only.
- Use `ai-research-explore` instead when the task spans both `current_research` coordination and exploratory code changes.
- It may hand off actual command execution to `minimal-run-and-audit` or `run-train`.
- It should keep experiment state isolated from the trusted baseline.
- It should prefer small-subset and short-cycle checks before heavier exploratory runs.
- It should label run results as bounded evidence and explain when a comparison is not directly fair.

## Ranking Semantics

- Pre-execution candidate selection uses three factors: `cost`, `success_rate`, and `expected_gain`.
- Default weights should stay conservative unless the researcher explicitly provides `selection_weights`.
- Budget pruning still applies after scoring through `max_variants` and `max_short_cycle_runs`.
- If runs are executed later, downstream ranking should switch to real execution evidence, not stay purely heuristic.
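
The pre-execution scoring and budget pruning described above can be sketched as follows. This is a minimal illustration, not the behavior of `scripts/plan_variants.py`: the `Candidate` class, the `score` and `select` functions, the equal default weights, and the assumption that all three factors are normalized to the range 0 to 1 are hypothetical. Only the factor names and `max_variants` come from the skill text.

```python
from dataclasses import dataclass

# Hypothetical defaults: the skill only says defaults should stay conservative.
DEFAULT_WEIGHTS = {"cost": 1.0, "success_rate": 1.0, "expected_gain": 1.0}


@dataclass
class Candidate:
    name: str
    cost: float           # assumed normalized to [0, 1]; lower is better
    success_rate: float   # assumed estimated probability in [0, 1]
    expected_gain: float  # assumed normalized to [0, 1]


def score(candidate, selection_weights=None):
    """Weighted average of the three factors; cost is inverted so cheap runs score higher."""
    weights = {**DEFAULT_WEIGHTS, **(selection_weights or {})}
    total = sum(weights.values())
    weighted = (
        weights["cost"] * (1.0 - candidate.cost)
        + weights["success_rate"] * candidate.success_rate
        + weights["expected_gain"] * candidate.expected_gain
    )
    return weighted / total


def select(candidates, max_variants, selection_weights=None):
    """Rank candidates by heuristic score, then prune to the budget."""
    ranked = sorted(
        candidates, key=lambda c: score(c, selection_weights), reverse=True
    )
    return ranked[:max_variants]


candidates = [
    Candidate("small-subset-lr", cost=0.2, success_rate=0.9, expected_gain=0.3),
    Candidate("full-sweep", cost=0.9, success_rate=0.5, expected_gain=0.9),
    Candidate("long-finetune", cost=0.8, success_rate=0.4, expected_gain=0.4),
]
shortlist = select(candidates, max_variants=2)
```

With equal weights the cheap, likely-to-succeed candidate ranks first; passing `selection_weights={"expected_gain": 5.0}` would instead favor the high-gain sweep. Once runs have actually executed, this heuristic score should give way to measured results.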

## Variant Spec Hints

- Use `variant_axes` to define the candidate dimension grid.
- Use `subset_sizes` and `short_run_steps` to express exploratory run scale.
- Use `selection_weights` to rebalance `cost`, `success_rate`, and `expected_gain`.
- Use `primary_metric` and `metric_goal` so downstream ranking can order executed candidates consistently.
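
To make the hints above concrete, here is a hedged sketch of a variant spec and how it might expand into runs. The field names come from the hints, but the dictionary layout, the example values, the `"maximize"` / `"minimize"` vocabulary for `metric_goal`, and the helper functions `expand_variants` and `rank_executed` are assumptions for illustration; the authoritative schema lives in `../../references/explore-variant-spec.md`.

```python
from itertools import product

# Illustrative spec: keys follow the hints above, values are made up.
spec = {
    "variant_axes": {"lr": [1e-3, 3e-4], "batch_size": [32, 64]},
    "subset_sizes": [1000],
    "short_run_steps": [200],
    "primary_metric": "val_accuracy",
    "metric_goal": "maximize",  # assumed vocabulary: "maximize" or "minimize"
}


def expand_variants(spec):
    """Cross every axis value with every subset size and step count."""
    axes = spec["variant_axes"]
    names = list(axes)
    variants = []
    for values in product(*(axes[name] for name in names)):
        for subset_size in spec["subset_sizes"]:
            for steps in spec["short_run_steps"]:
                variant = dict(zip(names, values))
                variant["subset_size"] = subset_size
                variant["short_run_steps"] = steps
                variants.append(variant)
    return variants


def rank_executed(results, primary_metric, metric_goal):
    """Order executed runs by the primary metric, best first."""
    return sorted(
        results,
        key=lambda run: run[primary_metric],
        reverse=(metric_goal == "maximize"),
    )


variants = expand_variants(spec)
```

A grid like this grows multiplicatively with each axis, which is why budget limits such as `max_variants` matter before any run is launched.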

## Output expectations

- `explore_outputs/CHANGESET.md`
- `explore_outputs/SCIENTIFIC_CHANGELOG.md`
- `explore_outputs/COMPARABILITY_REPORT.md`
- `explore_outputs/TOP_RUNS.md`
- `explore_outputs/status.json`
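
A quick way to confirm that a round produced every expected file is sketched below. The `missing_outputs` helper is hypothetical and is not part of the skill's scripts; it only checks that the files listed above exist and says nothing about their contents.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_OUTPUTS = [
    "CHANGESET.md",
    "SCIENTIFIC_CHANGELOG.md",
    "COMPARABILITY_REPORT.md",
    "TOP_RUNS.md",
    "status.json",
]


def missing_outputs(root="."):
    """Return the expected files that are absent from <root>/explore_outputs/."""
    out_dir = Path(root) / "explore_outputs"
    return [name for name in EXPECTED_OUTPUTS if not (out_dir / name).is_file()]
```

Calling `missing_outputs()` from the repository root returns an empty list when all five files are present.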

## Notes

Refer to `references/execution-policy.md`, `../../references/explore-variant-spec.md`, and `../../references/deep-learning-experiment-principles.md`, and use the scripts `scripts/plan_variants.py` and `scripts/write_outputs.py`.