# Tool Call Benchmark Report

**Model:** kimi-k2.5-turbo  
**Date:** 2026-04-03 16:13:16  
**Suite:** full  
**Score:** 16/16 (100.0%)  
**First-attempt accuracy:** 16/16 (100.0%)

*4 tests skipped; not counted in score.*

## Key Metrics

| Metric | Value |
|--------|-------|
| Hits | 16 |
| Misses | 0 |
| Skips | 4 |
| Misfires | 0 |
| Total attempts | 20 |
| Clean attempts | 20 |
| Total duration | 10.0s |
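
The headline score appears to be hits divided by scored tests (hits plus misses), with skips left out of the denominator. The sketch below reproduces that arithmetic from the figures in this table. The function name `score_percent` and the choice to return `None` when nothing was scored are illustrative assumptions, not the harness's actual implementation.

```python
from typing import Optional


def score_percent(hits: int, misses: int) -> Optional[float]:
    """Return hits / (hits + misses) as a percentage.

    Skipped tests are never passed in, so they cannot affect the result.
    Returns None when no test was scored, because 0/0 has no meaningful value.
    """
    scored = hits + misses
    if scored == 0:
        return None
    return round(100.0 * hits / scored, 1)


# Figures taken from the Key Metrics table.
hits, misses, skips = 16, 0, 4

print(f"Score: {hits}/{hits + misses} ({score_percent(hits, misses)}%)")
print(f"Attempts including skips: {hits + misses + skips}")
```

Under this reading, the 20 total attempts are the 16 scored tests plus the 4 skipped ones, each recorded with a single attempt.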

## Summary

| Category | Passed | Total | Misfires | Duration | Score |
|----------|--------|-------|----------|----------|-------|
| Bash Execution | 4 | 4 | 0 | 2.7s | 100% |
| File Operations | 6 | 6 | 0 | 1.2s | 100% |
| MCP Tool Calls | 0 | 0 | 0 | N/A | N/A (4 skipped) |
| Skill Invocations | 3 | 3 | 0 | 3.3s | 100% |
| Generation | 3 | 3 | 0 | 2.9s | 100% |
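
The Duration column is consistent with summing the per-test times listed under Detailed Results and rounding to one decimal place. The sketch below checks that, assuming this is how the values were derived. The dictionary layout and the helper name `seconds` are illustrative. The rounded category durations add up to 10.1s, while the reported total of 10.0s matches the sum of the raw millisecond values (10,048 ms).

```python
# Per-test times in milliseconds, copied from the Detailed Results tables.
times_ms = {
    "Bash Execution": [1619, 312, 445, 287],
    "File Operations": [156, 203, 178, 195, 234, 267],
    "MCP Tool Calls": [0, 0, 0, 0],  # all four tests were skipped
    "Skill Invocations": [1245, 987, 1043],
    "Generation": [234, 156, 2487],
}


def seconds(ms_values):
    """Sum a list of millisecond timings and convert the result to seconds."""
    return sum(ms_values) / 1000.0


category_seconds = {name: seconds(values) for name, values in times_ms.items()}
total_seconds = seconds([t for values in times_ms.values() for t in values])

for name, secs in category_seconds.items():
    print(f"{name}: {secs:.1f}s")
print(f"Total: {total_seconds:.1f}s")
```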

## Detailed Results

### Bash Execution

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-B01 | Echo exact string | ✓ PASS | 1619ms | 1 |  |
| TC-B02 | Python arithmetic | ✓ PASS | 312ms | 1 |  |
| TC-B03 | Node JSON output | ✓ PASS | 445ms | 1 |  |
| TC-B04 | Pipeline command | ✓ PASS | 287ms | 1 |  |

### File Operations

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-F01 | Write file | ✓ PASS | 156ms | 1 |  |
| TC-F02 | Read file back | ✓ PASS | 203ms | 1 |  |
| TC-F03 | Edit file | ✓ PASS | 178ms | 1 |  |
| TC-F04 | Verify edit | ✓ PASS | 195ms | 1 |  |
| TC-F05 | Glob find | ✓ PASS | 234ms | 1 |  |
| TC-F06 | Grep search | ✓ PASS | 267ms | 1 |  |

### MCP Tool Calls

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-M01 | ToolSearch - fetch deferred schema | ⊘ SKIP | 0ms | 1 | No direct ToolSearch primitive - harness uses native tool discovery |
| TC-M02 | Context7 - resolve library | ⊘ SKIP | 0ms | 1 | MCP context7_resolve-library-id not available in harness |
| TC-M03 | Context7 - query docs | ⊘ SKIP | 0ms | 1 | Skipped - prerequisite TC-M02 skipped |
| TC-M04 | ToolSearch - keyword search | ⊘ SKIP | 0ms | 1 | No direct ToolSearch primitive - harness uses native tool discovery |

### Skill Invocations

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-S01 | Invoke current-datetime skill | ✓ PASS | 1245ms | 1 |  |
| TC-S02 | Invoke brand-guidelines skill | ✓ PASS | 987ms | 1 |  |
| TC-S03 | Invoke chart-taste skill | ✓ PASS | 1043ms | 1 |  |

### Generation

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-G01 | Create PDF via Python | ✓ PASS | 234ms | 1 |  |
| TC-G02 | Verify PDF exists | ✓ PASS | 156ms | 1 |  |
| TC-G03 | SVG to PNG generation | ✓ PASS | 2487ms | 1 |  |

---

*Generated by `/oneshot-tool-call` benchmark*
