# skperf — Pipeline Performance Metrics & Benchmarking
Collects, stores, and analyzes performance metrics from the hammerTime document ingestion pipeline. Enables throughput prediction for different hardware configurations and tracks improvements over time.
## Architecture

```
hammerTime Scripts          skperf Collector                SQLite Store

  ingest_files.py
  pdf_convert.py            Collector(pipeline=X)           ~/.skperf/metrics.db
  vision_ocr.py      -->      .track(operation, ...)  -->     runs
  decompose.py                .record(metric)                 metrics
  embed.py                    .finish()                       hardware_profiles
  graph_import.py                                             benchmarks
                                                                 |
                                                                 v
                                                      Reports / Predictions
                                                        skperf report
                                                        skperf predict --pipeline pdf-convert --pages 500
                                                        skperf benchmark compare baseline v2
```
## Quick Start

### 1. Install

```bash
cd /path/to/skperf
pip install -e .
```

### 2. Scan Hardware

Profile all cluster nodes (requires SSH access):

```bash
skperf hardware --scan --nodes chiap01,chiap02,chiap03,chiap04,chiap08
```

### 3. Import an Existing Log

```bash
skperf import-log ~/.hammerTime/logs/ingestion.log
```

### 4. Create a Baseline Benchmark

Snapshot the current state of all stored runs:

```bash
skperf benchmark create 2026-04-07-baseline --description "Initial corpus ingestion"
```

### 5. View Reports

```bash
skperf report
skperf report --pipeline pdf-convert
skperf report --since 2026-04-01
```
## CLI Reference

| Command | Description | Example |
|---|---|---|
| `skperf hardware --scan` | SSH into cluster nodes and profile CPU/GPU/RAM/NFS | `skperf hardware --scan --nodes chiap01,chiap02` |
| `skperf hardware --list` | List all stored hardware profiles | `skperf hardware --list` |
| `skperf import-log <path>` | Parse and import a hammerTime `ingestion.log` | `skperf import-log logs/ingestion.log` |
| `skperf benchmark create <name>` | Snapshot current run data as a named benchmark | `skperf benchmark create baseline` |
| `skperf benchmark list` | List all stored benchmarks | `skperf benchmark list` |
| `skperf benchmark compare <a> <b>` | Diff two benchmarks side by side | `skperf benchmark compare baseline v2` |
| `skperf benchmark export <name>` | Export a benchmark to JSON | `skperf benchmark export baseline > b.json` |
| `skperf report` | Full throughput report across all pipelines | `skperf report` |
| `skperf report --pipeline <name>` | Report for a single pipeline | `skperf report --pipeline vision-ocr` |
| `skperf report --since <date>` | Filter report to runs after a date | `skperf report --since 2026-04-01` |
| `skperf predict` | Predict time to process N pages/files | `skperf predict --pipeline pdf-convert --pages 1000` |
| `skperf predict --node <name>` | Prediction scoped to a specific node | `skperf predict --pipeline vision-ocr --pages 500 --node chiap01` |
| `skperf runs list` | List all recorded runs | `skperf runs list` |
| `skperf runs show <run_id>` | Show detail for one run | `skperf runs show abc-123` |
| `skperf db path` | Print the path to the SQLite database | `skperf db path` |
| `skperf db reset` | Wipe the database and start fresh | `skperf db reset` |
## Integration with hammerTime

All hammerTime ingestion scripts use a soft-import pattern so skperf is optional:

```python
try:
    from skperf import Collector
    perf = Collector(pipeline="pdf-convert")
except ImportError:
    perf = None

# Inside a processing loop:
if perf:
    with perf.track("convert", target=filename, node="chiap04"):
        result = convert_pdf(filename)
        perf.record(chars_out=len(result), status="ok")

# At the end of the script:
if perf:
    perf.finish(totals={"files": file_count, "chars": total_chars})
```
The `Collector` class handles:

- Generating a unique `run_id` per invocation
- Timing each `track()` context manager block
- Writing metrics to `~/.skperf/metrics.db` via the `Store` backend
- Gracefully no-op-ing if the DB is unavailable
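The soft-import snippet above only relies on a small surface area from `Collector`. The following is a minimal in-memory sketch of that surface, illustrative only: the real class persists to `~/.skperf/metrics.db` through the `Store` backend, which this sketch replaces with a plain list.

```python
import time
import uuid
from contextlib import contextmanager

class Collector:
    """In-memory sketch of the Collector surface (not the real implementation)."""

    def __init__(self, pipeline: str):
        self.run_id = str(uuid.uuid4())  # unique per invocation
        self.pipeline = pipeline
        self.metrics = []                # stand-in for the SQLite Store
        self.totals = {}
        self._current = None

    @contextmanager
    def track(self, operation: str, **fields):
        # Time one operation; record() calls inside the block attach here.
        self._current = {"operation": operation, **fields}
        start = time.perf_counter()
        try:
            yield
        finally:
            self._current["elapsed_ms"] = (time.perf_counter() - start) * 1000
            self.metrics.append(self._current)
            self._current = None

    def record(self, **fields):
        if self._current is not None:    # no-op outside a track() block
            self._current.update(fields)

    def finish(self, totals=None):
        self.totals = totals or {}

perf = Collector(pipeline="pdf-convert")
with perf.track("convert", target="a.pdf", node="chiap04"):
    perf.record(chars_out=120, status="ok")
perf.finish(totals={"files": 1})
print(len(perf.metrics), perf.metrics[0]["status"])  # 1 ok
```

Keeping `record()` a no-op outside a `track()` block mirrors the graceful-degradation behavior listed above.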
## Data Model

### `runs` table

| Column | Type | Description |
|---|---|---|
| `run_id` | TEXT PK | UUID for this pipeline invocation |
| `pipeline` | TEXT | Pipeline name (e.g. `pdf-convert`) |
| `started_at` | TEXT | ISO 8601 UTC |
| `finished_at` | TEXT | ISO 8601 UTC, set on `finish()` |
| `elapsed_ms` | REAL | Total wall-clock milliseconds |
| `status` | TEXT | `running`, `ok`, `error` |
| `config` | TEXT | JSON: worker counts, flags |
| `hardware` | TEXT | Node name where the run originated |
| `totals` | TEXT | JSON: files, pages, chars, etc. |
### `metrics` table

| Column | Type | Description |
|---|---|---|
| `metric_id` | TEXT PK | UUID |
| `run_id` | TEXT FK | Parent run |
| `operation` | TEXT | Operation name (`convert`, `ocr`, `embed`) |
| `target` | TEXT | Filename or identifier |
| `node` | TEXT | Cluster node |
| `elapsed_ms` | REAL | Duration |
| `chars_in` | INTEGER | Input character count |
| `chars_out` | INTEGER | Output character count |
| `status` | TEXT | `ok`, `error`, `retry` |
| `retries` | INTEGER | Retry count |
| `metadata` | TEXT | JSON: method, model, etc. |
### `hardware_profiles` table

Stores one row per node per scan. Columns: `profile_id`, `node_name`, `cpu_model`, `cpu_cores`, `ram_gb`, `gpu_model`, `gpu_vram_gb`, `gpu_backend`, `nfs_read_mbps`, `nfs_write_mbps`, `ollama_models`, `captured_at`.
### `benchmarks` table

Named snapshots linking to sets of runs. Columns: `benchmark_id`, `name`, `description`, `runs` (JSON array of `run_id`s), `hardware_profiles` (JSON), `created_at`, `summary` (JSON throughput summary).
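The column lists above imply roughly the following SQLite schema. This DDL is reconstructed from the tables in this section for illustration; skperf's actual migrations may differ in constraints and defaults.

```python
import sqlite3

# Illustrative DDL reconstructed from the documented column lists.
SCHEMA = """
CREATE TABLE runs (
    run_id      TEXT PRIMARY KEY,
    pipeline    TEXT,
    started_at  TEXT,   -- ISO 8601 UTC
    finished_at TEXT,
    elapsed_ms  REAL,
    status      TEXT,
    config      TEXT,   -- JSON
    hardware    TEXT,
    totals      TEXT    -- JSON
);
CREATE TABLE metrics (
    metric_id  TEXT PRIMARY KEY,
    run_id     TEXT REFERENCES runs(run_id),
    operation  TEXT,
    target     TEXT,
    node       TEXT,
    elapsed_ms REAL,
    chars_in   INTEGER,
    chars_out  INTEGER,
    status     TEXT,
    retries    INTEGER,
    metadata   TEXT     -- JSON
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
cols = [row[1] for row in conn.execute("PRAGMA table_info(runs)")]
print(cols)
```

Storing `config`, `totals`, and `metadata` as JSON text keeps the schema stable as pipelines add fields.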
## Hardware Cluster
| Node | CPU | Cores | RAM | GPU | VRAM | Backend |
|---|---|---|---|---|---|---|
| chiap01 | AMD Ryzen 9 7950X (est) | 32 | 61 GB | NVIDIA RTX 4080 | 16 GB | CUDA |
| chiap02 | AMD Ryzen 9 5900X (est) | 24 | 31 GB | AMD RX 7600 | 8 GB | ROCm |
| chiap03 | AMD Ryzen 9 5900X (est) | 24 | 31 GB | AMD RX 7600 | 8 GB | ROCm |
| chiap04 | Intel Core (est) | 16 | 16 GB | NVIDIA RTX 3060 | 6 GB | CUDA |
| chiap08 | AMD (est) | 24 | 31 GB | NVIDIA RTX 2080 Super | 8 GB | CUDA |
Totals: 120 CPU cores, 170 GB RAM, 46 GB VRAM across 5 nodes.
Vision OCR (minicpm-v) runs on chiap01, chiap02, chiap03. Embedding (bge-large 1024-dim) runs on all nodes. Storage is fiber-speed NFS shared across the cluster.
## Key Throughput Baselines (2026-04-07)
| Pipeline | Metric | Value |
|---|---|---|
| pdf-convert (text) | Pages/sec | 4.3 |
| pdf-convert (scanned OCR) | Pages/min | 19 |
| vision-ocr (3 nodes) | Pages/min | 19 |
| distributed-decompose (5 nodes) | Files/min | 22 |
| sip-embed (bge-large) | Points in Qdrant | 103,306 |
| sip-graph (FalkorDB) | Nodes / Relationships | 663 / 10,778 |
Full baseline detail: `benchmarks/2026-04-07-baseline.json`
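The baseline table above is what `skperf predict` draws on: dividing a requested workload by a measured rate yields a wall-clock estimate. A back-of-the-envelope sketch, where the rates are copied from the table and `predict_seconds` is an illustrative helper rather than the skperf API (real predictions may also weight by node):

```python
# Rates from the 2026-04-07 baseline table above.
BASELINE_PAGES_PER_SEC = {
    "pdf-convert": 4.3,      # text-layer extraction, pages/sec
    "vision-ocr": 19 / 60,   # 19 pages/min across 3 OCR nodes
}

def predict_seconds(pipeline: str, pages: int) -> float:
    """Estimated wall-clock seconds = pages / measured throughput."""
    return pages / BASELINE_PAGES_PER_SEC[pipeline]

print(round(predict_seconds("pdf-convert", 1000), 1))  # 232.6 (~3.9 minutes)
```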