Production AI Institute · PSF v1.1 open standard
PAI Lab public benchmark

Public agent repositories, measured against visible PSF evidence.

PAI scans public GitHub metadata and file paths for signs of production AI discipline: evals, output schemas, observability, deployment gates, human oversight, security policy, and provider resilience. This is evidence coverage, not certification.

Repositories: 24
Eval evidence: 2
Human oversight: 7
Observability: 15
Evidence coverage table

Recently active public AI agent repositories

Projects are discovered through GitHub repository search, then scanned for visible PSF-aligned evidence in their public file tree. Higher coverage means more evidence was visible to the scanner, not that PAI has certified or endorsed the project.
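A path-based scan of this kind can be sketched as follows. This is a minimal illustration, not the PAI scanner: the regular expressions and the `scan_paths` helper are assumptions made for this example, and only the evidence labels are taken from the table.

```python
import re

# Hypothetical patterns: the scanner's actual rules are not published on this page.
# Only the labels (dictionary keys) come from the benchmark table.
EVIDENCE_PATTERNS = {
    "AI observability instrumentation": re.compile(r"observab|telemetry|trac(e|ing)", re.I),
    "Human approval gates": re.compile(r"approval|review[_-]queue", re.I),
    "Security policy and secret hygiene": re.compile(r"(^|/)SECURITY\.md$|secret", re.I),
    "Release and deployment gates": re.compile(r"^\.github/workflows/.*(release|pypi|deploy)", re.I),
}

def scan_paths(paths, max_examples=3):
    """Return {label: [matching paths]} for every label with at least one match."""
    found = {}
    for label, pattern in EVIDENCE_PATTERNS.items():
        hits = [p for p in paths if pattern.search(p)]
        if hits:
            # Keep only a few example paths per label, as the table does.
            found[label] = hits[:max_examples]
    return found

if __name__ == "__main__":
    tree = [
        "SECURITY.md",
        "docs/observability/README.md",
        "src/main.py",
        "tests/gateway/test_approval_prompt_redaction.py",
    ]
    for label, hits in scan_paths(tree).items():
        print(f"{label}: {' | '.join(hits)}")
```

Path matching is coarse by design: any file whose name contains "trace" matches the observability pattern here, whether or not it relates to observability. That is why the result is best read as evidence coverage rather than as an assessment of quality.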

GitHub public repository search
Repository | Coverage | Grade | Visible evidence
NousResearch/hermes-agent

The agent that grows with you

202,076 stars · Python · Updated Jun 25, 2026 · Topics: ai, ai-agent, ai-agents
Coverage: 95% · Grade: A
D1 2/2 · D2 2/2 · D3 2/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 2/2 · D8 2/2
AI observability instrumentation: docs/observability | docs/observability/README.md | optional-skills/creative/kanban-video-orchestrator/references/monitoring.md
Human approval gates: tests/gateway/test_approval_prompt_redaction.py | tests/gateway/test_feishu_approval_buttons.py | tests/gateway/test_matrix_approval_reaction_fail_closed.py
omnigent-ai/omnigent

Omnigent is an open-source AI agent framework and meta-harness: orchestrate Claude Code, Codex, Cursor, Pi, and custom agents — swap harnesses without rewriting, enforce policies and sandboxing, and collaborate in real time from any device.

4,727 stars · Python · Updated Jun 25, 2026 · Topics: agent-framework, agent-governance, agent-orchestration
Coverage: 76% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 2/2
AI observability instrumentation: ap-web/src/components/ai-elements/stack-trace.tsx | omnigent/inner/tracing.py | omnigent/runtime/telemetry.py
Human approval gates: .github/workflows/auto-assign-reviewer-test.yml | .github/workflows/auto-assign-reviewer.js | .github/workflows/auto-assign-reviewer.test.js
mateaix/mateclaw

🤖 MateClaw — Your second brain with Multi-Agent Orchestration, MCP Protocol, Skills & Memory, Dream, and Multi-Channel Support. Built on Spring AI Alibaba.

660 stars · Java · Updated Jun 25, 2026 · Topics: agent, agent-harness, ai-agent
Coverage: 72% · Grade: A
D1 2/2 · D2 2/2 · D3 1/2 · D4 1/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: mateclaw-server/src/main/java/vip/mate/agent/progress/ProgressEntry.java | mateclaw-server/src/main/resources/skills/popular-web-designs/templates/sentry.md | mateclaw-ui/src/assets/icons/mcp/sentry.svg
Human approval gates: mateclaw-server/src/main/java/vip/mate/approval/ApprovalWorkflowService.java | mateclaw-server/src/main/java/vip/mate/approval/event/WorkflowApprovalResolvedEvent.java | mateclaw-server/src/main/java/vip/mate/approval/grant/AutoApproveAuditLogger.java
vivekchand/clawmetry

See your agent think. Real-time observability for 12 AI agent runtimes - OpenClaw, NVIDIA NemoClaw, Claude Code, Codex & 8 more.

379 stars · Python · Updated Jun 25, 2026 · Topics: ai-agent, clawmetry, dashboard
Coverage: 68% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 2/2 · D5 1/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: .github/workflows/harness-observability-audit.yml | OBSERVABILITY.md | PRD-tracing.md
Human approval gates: tests/test_review_queue.py
HankHuang0516/EClaw

E-Claw - OpenClaw Channel for agent-to-agent communication

6 stars · JavaScript · Updated Jun 25, 2026 · Topics: ai-agent, android, electronic-pet
Coverage: 68% · Grade: A
D1 2/2 · D2 0/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: app/src/main/java/com/hank/clawlive/data/remote/TelemetryHelper.kt | app/src/main/java/com/hank/clawlive/data/remote/TelemetryInterceptor.kt | backend/device-telemetry.js
Human approval gates: .github/workflows/railway-preview-cleanup.yml
EKKOLearnAI/hermes-studio

Web dashboard for Hermes Agent — multi-platform AI chat, session management, scheduled jobs, usage analytics

8,417 stars · TypeScript · Updated Jun 25, 2026 · Topics: agent, ai-agent, chat-ui
Coverage: 65% · Grade: A
D1 2/2 · D2 0/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: packages/client/src/composables/useToolTraceVisibility.ts | tests/client/tool-trace-visibility.test.ts
Human approval gates: docs/chat-chain-changes/2026-06-11-pending-write-gate-session-approval.md
respawn-llc/kent

Agentic Coding platform for Professional SWE work focusing on output quality. Self-review, supervision, dynamic workflows, dynamic agent loops.

33 stars · Go · Updated Jun 25, 2026 · Topics: agent, ai, ai-agent
Coverage: 65% · Grade: A
D1 2/2 · D2 2/2 · D3 1/2 · D4 0/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 1/2
Human approval gates: apps/desktop/src/features/workflow-editor/WorkflowGroupDragPreview.tsx | prompts/workflow/human_only_task_action_denied.md | prompts/workflow/task_complete_human_safety_warning.md
Security policy and secret hygiene: SECURITY.md
phasespace-labs/palinode

The memory substrate for AI agents and developer tools. Git-versioned, file-native, MCP-first.

26 stars · Python · Updated Jun 25, 2026 · Topics: agent-memory, ai-agent, git-versioned
Coverage: 58% · Grade: A
D1 2/2 · D2 1/2 · D3 0/2 · D4 2/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 1/2
AI observability instrumentation: tests/test_telemetry_recall_exclusion.py
Security policy and secret hygiene: SECURITY.md
Gitlawb/openclaude

runs anywhere. uses anything

29,336 stars · TypeScript · Updated Jun 25, 2026 · Topics: ai, ai-agent, ai-tools
Coverage: 54% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
AI observability instrumentation: scripts/no-telemetry-growthbook-stub.test.ts | scripts/no-telemetry-plugin.ts | src/commands/ant-trace
Security policy and secret hygiene: SECURITY.md | src/bridge/bridgePermissionCallbacks.ts | src/bridge/workSecret.test.ts
NotASithLord/peerd

The first AI agent harness native to the browser. A Chrome/Firefox extension that runs the agent loop in your browser — drives your tabs, spins up sandboxed compute (JS notebooks, WASM Linux VMs, client-side apps), and shares what it builds peer-to-peer. BYOK · no backend · no telemetry.

67 stars · JavaScript · Updated Jun 25, 2026 · Topics: agentic, ai-agent, browser-extension
Coverage: 53% · Grade: A
D1 2/2 · D2 1/2 · D3 2/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
Security policy and secret hygiene: SECURITY.md | docs/hooks/examples/block-secret-typing.md | docs/store/PERMISSION-JUSTIFICATIONS.md
Provider fallback or degraded mode: extension/peerd-provider/adapters/anthropic.js | extension/peerd-provider/failover.js | extension/peerd-provider/format/from-anthropic.js
mobius-os/mobius

Self-hosted AI agent that builds apps. Chat with a coding agent (Claude Code or Codex) that builds mini-apps, modifies its own UI, and gets sharper the more you use it. Your personal AI operating system, on your own server.

8 stars · Python · Updated Jun 25, 2026 · Topics: agentic, ai-agent, ai-app-builder
Coverage: 51% · Grade: A
D1 0/2 · D2 1/2 · D3 1/2 · D4 1/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 2/2
AI observability instrumentation: backend/app/memory_trace.py | backend/tests/test_memory_trace.py
Security policy and secret hygiene: SECURITY.md | backend/tests/test_secret_key_flag.py
zebbern/claude-code-guide

Claude Code Guide - Setup, Commands, workflows, agents, skills & tips-n-tricks go from beginner to power user!

4,333 stars · Python · Updated Jun 25, 2026 · Topics: ai, ai-agent, ai-agent-tools
Coverage: 45% · Grade: B
D1 2/2 · D2 0/2 · D3 0/2 · D4 2/2 · D5 1/2 · D6 0/2 · D7 2/2 · D8 0/2
AI observability instrumentation: skills/database-optimizer/references/monitoring-analysis.md | skills/playwright/playwright-cli/tracing-and-debugging.md
Security policy and secret hygiene: skills/r2-upload/SECURITY.md | skills/repo-audit/scripts/secret-scan.sh
mhawthorne/gza

AI coding assistant task and workflow manager

11 stars · Python · Updated Jun 25, 2026 · Topics: ai-agent, ai-coding-tools, claude-code
Coverage: 43% · Grade: B
D1 2/2 · D2 1/2 · D3 0/2 · D4 1/2 · D5 2/2 · D6 0/2 · D7 0/2 · D8 1/2
Provider fallback or degraded mode: src/gza/providers/gemini.py
Release and deployment gates: .github/workflows/pypi.yml | .github/workflows/test.yml | .github/workflows/testpypi.yml
staticroostermedia-arch/engram

Geometric agent memory for Agents: 8-tool lean contract, harness injection at wake, session handoff. 79 MCP tools. Not a vector DB. Rust + MCP.

12 stars · Rust · Updated Jun 25, 2026 · Topics: agent-memory, ai, ai-agent
Coverage: 41% · Grade: A
D1 1/2 · D2 1/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 0/2
AI observability instrumentation: crates/engram-gpu/kernels/int8_raytracer.wgsl | grok-plugin-engram/commands/engram-trace.md
Security policy and secret hygiene: .github/dependabot.yml | SECURITY.md
ozgurcd/gograph

Local-only Go static analysis engine with a built-in MCP server. Gives AI coding agents deterministic structural awareness: call graphs, impact analysis, symbol search, and more.

189 stars · Go · Updated Jun 25, 2026 · Topics: agentic-coding, ai-agent, ai-coding-assistant
Coverage: 36% · Grade: B
D1 1/2 · D2 1/2 · D3 0/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 0/2
AI observability instrumentation: internal/search/trace.go
Security policy and secret hygiene: SECURITY.md
1ay1/agentty

AI pair programming in your terminal — one static binary, sub-ms startup, any model

19 stars · C++ · Updated Jun 25, 2026 · Topics: acp, agentic-coding, ai-agent
Coverage: 32% · Grade: B
D1 0/2 · D2 0/2 · D3 1/2 · D4 0/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 1/2
Security policy and secret hygiene: docs/agent_panel/09_permissions.md | include/agentty/runtime/view/thread/turn/permission.hpp | src/runtime/view/thread/turn/permission.cpp
Provider fallback or degraded mode: include/agentty/provider/anthropic | include/agentty/provider/anthropic/oauth.hpp | include/agentty/provider/anthropic/provider.hpp
mm7894215/TokenTracker

Track token usage across 25 AI coding tools — Claude Code, Codex, Cursor, Gemini, Kiro, OpenCode, Antigravity, Copilot, Kimi, CodeBuddy, WorkBuddy, Grok, Kilo, Roo, Zed, Goose, Mimo, ZCode & more — local-first, zero-config, with a dashboard, macOS menu bar app, and desktop widgets.

764 stars · JavaScript · Updated Jun 25, 2026 · Topics: ai, ai-agent, ai-tools
Coverage: 29% · Grade: B
D1 1/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
Security policy and secret hygiene: .github/dependabot.yml | SECURITY.md
Provider fallback or degraded mode: src/lib/pricing/litellm-fetcher.js | test/kiro-fallback.test.js
dtzp555-max/ocp

Turn your Claude Pro/Max subscription into an OpenAI-compatible API. One proxy, multiple IDEs, LAN sharing for the whole family. $0 extra cost.

79 stars · JavaScript · Updated Jun 25, 2026 · Topics: ai, ai-agent, ai-agents
Coverage: 24% · Grade: B
D1 1/2 · D2 0/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 0/2
Release and deployment gates: .github/workflows/alignment.yml | .github/workflows/gitleaks.yml | .github/workflows/release.yml
Agent operating instructions: AGENTS.md | CLAUDE.md
voidly-ai/voidly-pay

Off-chain credit ledger + hire marketplace for AI agents. Ed25519-signed envelopes, atomic settlement, hire-and-release escrow. https://voidly.ai/pay

10 stars · JavaScript · Updated Jun 25, 2026 · Topics: a2a, agent-payments, agent-to-agent
Coverage: 22% · Grade: C
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
Security policy and secret hygiene: SECURITY.md
Provider fallback or degraded mode: adapters/openai-compat | adapters/openai-compat/README.md | adapters/openai-compat/package.json
JokerJohn/openclaw-autotrader

A 30-day public U.S. stock challenge: follow a 5000 HKD 🦞 claw through live market days.

38 stars · JavaScript · Updated Jun 25, 2026 · Topics: ai-agent, algorithmic-trading, autotrader
Coverage: 12% · Grade: C
D1 0/2 · D2 1/2 · D3 0/2 · D4 1/2 · D5 0/2 · D6 0/2 · D7 0/2 · D8 0/2
Schema or contract validation: xhs-agent/schemas/public-snapshot.schema.json | xhs-agent/schemas/xhs-post-package.schema.json
Incident or drift evidence: docs/incidents | docs/incidents/.gitkeep
xbtlin/ai-berkshire

AI 时代的伯克希尔:基于 Claude Code 的价值投资研究框架。巴菲特·芒格·段永平·李录四大师方法论 + 多Agent并行研究。| AI-era Berkshire: a value investing research framework built on Claude Code. 4 masters' methodologies + multi-agent adversarial analysis.

1,333 stars · Python · Updated Jun 25, 2026 · Topics: ai, ai-agent, anthropic
Coverage: 7% · Grade: D
D1 1/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 0/2 · D6 0/2 · D7 0/2 · D8 0/2
Agent operating instructions: CLAUDE.md
stormzhang/token-tracker

Track token usage across local AI agents (Claude Code, Codex) — Custom StatusLine, CLI Dashboard with cost analysis, rate limit monitoring, and session tracking

315 stars · Python · Updated Jun 25, 2026 · Topics: ai-agent, claude-code, cli
Coverage: 7% · Grade: D
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 0/2
Release and deployment gates: .github/workflows/ci.yml
Armaan29-09-2005/AI-OSINT-Security-Analyzer

AI OSINT Security Analyzer is an intelligent platform that leverages AI to perform autonomous investigations across various intelligence sources. With features like multi-source integration and real-time threat intelligence, it ensures comprehensive security assessments. 🛡️🔍

12 stars · Python · Updated Jun 25, 2026 · Topics: ai, ai-agent, analysis
Coverage: 0% · Grade: U
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 0/2 · D6 0/2 · D7 0/2 · D8 0/2
No PSF evidence paths detected: public scan did not find matching path evidence.
MunnaXbadmash/ai-dev-assistant-framework

A plug-and-play framework for AI-assisted software development, enhancing context-aware collaboration in complex codebases. Perfect for tools like GitHub Copilot. 🐙✨

5 stars · Updated Jun 25, 2026 · Topics: ai, ai-agent, ai-agents-framework
Coverage: 0% · Grade: U
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 0/2 · D6 0/2 · D7 0/2 · D8 0/2
No PSF evidence paths detected: public scan did not find matching path evidence.

Where the repository list comes from

The benchmark uses GitHub's public repository search endpoint and rotates focused queries for AI agent, agentic AI, LLM agent, and MCP server repositories. The run de-duplicates repositories, excludes archived projects and forks when GitHub returns those flags, and sorts the published table by visible PSF evidence coverage.

  • topic:ai-agent archived:false fork:false stars:>=5
  • topic:agentic-ai archived:false fork:false stars:>=5
  • topic:llm-agent archived:false fork:false stars:>=5
  • topic:mcp-server archived:false fork:false stars:>=5
  • "ai agent" in:name,description,readme archived:false fork:false stars:>=5
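The discovery steps above (rotate queries, de-duplicate, drop archived projects and forks, sort by coverage) can be sketched as below. This is an illustration under stated assumptions rather than the benchmark's own code: the helper names and the `coverage` mapping are hypothetical, and no network request is made. The endpoint is GitHub's REST repository search, and `full_name`, `archived`, and `fork` are fields of its result items.

```python
from urllib.parse import urlencode

# GitHub REST API endpoint for repository search.
SEARCH_URL = "https://api.github.com/search/repositories"

# The focused queries listed above.
QUERIES = [
    "topic:ai-agent archived:false fork:false stars:>=5",
    "topic:agentic-ai archived:false fork:false stars:>=5",
    "topic:llm-agent archived:false fork:false stars:>=5",
    "topic:mcp-server archived:false fork:false stars:>=5",
    '"ai agent" in:name,description,readme archived:false fork:false stars:>=5',
]

def search_url(query, per_page=30):
    """Build the request URL for one query (hypothetical helper)."""
    return f"{SEARCH_URL}?{urlencode({'q': query, 'per_page': per_page})}"

def merge_results(result_pages):
    """De-duplicate repositories by full_name, skipping archived repos and forks."""
    seen = {}
    for items in result_pages:
        for repo in items:
            # Respect the flags only when GitHub returns them.
            if repo.get("archived") or repo.get("fork"):
                continue
            seen.setdefault(repo["full_name"], repo)
    return list(seen.values())

def sort_by_coverage(repos, coverage):
    """Order repositories by visible evidence coverage, highest first.

    `coverage` is an assumed mapping of full_name to a percentage.
    """
    return sorted(repos, key=lambda r: coverage.get(r["full_name"], 0), reverse=True)

if __name__ == "__main__":
    for q in QUERIES:
        print(search_url(q))
```

Keying the de-duplication on `full_name` means a repository returned by several queries appears only once in the published table.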

How teams use this

The benchmark gives maintainers and production AI teams a concrete way to improve visible evidence. A project can publish the missing artifacts, run its own Agent Readiness report, and link to a stable monthly edition when citing broader ecosystem findings.

  • Use the live table to inspect current public evidence patterns.
  • Use immutable editions for citations, journalism, and longitudinal comparison.
  • Use the issue generator to turn a gap into a constructive maintainer task.
  • Use the evidence pack and control templates to publish the missing artifacts.