Fix: Ensure all skills are tracked as files, not submodules

2026-01-14 18:48:48 +01:00
parent 7f46ed8ca1
commit 8bd204708b
1113 changed files with 82065 additions and 2 deletions
--- a/skills/loki-mode/benchmarks/results/2026-01-05-00-23-56/SUMMARY.md
+++ b/skills/loki-mode/benchmarks/results/2026-01-05-00-23-56/SUMMARY.md
@@ -0,0 +1,48 @@
+# Loki Mode Benchmark Results
+
+## Overview
+
+This directory contains benchmark results for Loki Mode multi-agent system.
+
+## Benchmarks Available
+
+### HumanEval
+- **Problems:** 164 Python programming problems
+- **Metric:** Pass@1 (percentage of problems solved on first attempt)
+- **Competitor Baseline:** MetaGPT achieves 85.9-87.7%
+
+### SWE-bench Lite
+- **Problems:** 300 real-world GitHub issues
+- **Metric:** Resolution rate
+- **Competitor Baseline:** Top agents achieve 45-77%
+
+## Running Benchmarks
+
+```bash
+# Run all benchmarks
+./benchmarks/run-benchmarks.sh all
+
+# Run specific benchmark
+./benchmarks/run-benchmarks.sh humaneval --execute
+./benchmarks/run-benchmarks.sh swebench --execute
+```
+
+## Results Format
+
+Results are saved as JSON files with:
+- Timestamp
+- Problem count
+- Pass rate
+- Individual problem results
+- Token usage
+- Execution time
+
+## Methodology
+
+Loki Mode uses its multi-agent architecture to solve each problem:
+1. **Architect Agent** analyzes the problem
+2. **Engineer Agent** implements the solution
+3. **QA Agent** validates with test cases
+4. **Review Agent** checks code quality
+
+This mirrors real-world software development more accurately than single-agent approaches.