Org Architecture & Community Health
Enterprise-grade governance for 116 repositories
Org Architecture & Community Health Concept Sketch
The Problem: Standards at Scale
Every multi-repository organization faces the same entropy: standards drift, security configurations diverge, and contributor onboarding becomes a per-repo archaeology exercise. Conway's law predicts this — organizations that design systems are constrained to produce designs which mirror their communication structures — and when those structures are fragmented across dozens of repositories, fragmentation is the inevitable output.[1] The open-source world has known this for decades. Raymond's analysis of the bazaar model revealed that even decentralized projects require institutional infrastructure — mailing list conventions, patch formats, release protocols — to avoid collapsing under their own coordination costs.[2] GitHub's special .github repository mechanism offers a solution: a single repository whose contents — workflows, templates, governance documents, AI configurations — are inherited by every other repository in the organization. This project takes that mechanism and builds a comprehensive organizational operating system on top of it, governing 116 repositories across 8 organizations from a single source of truth.
Community Health Layer
The foundation of any collaborative software project is not its code but its social contracts. GitHub's community health file inheritance (standardized governance documents that GitHub automatically surfaces across all repositories in an organization) means that a CODE_OF_CONDUCT.md, CONTRIBUTING.md, SECURITY.md, and SUPPORT.md placed in the .github repository appear on every repository in the organization that has not defined its own — creating a baseline of contributor experience without requiring any per-repo configuration.[3] Eghbal's research into the maintenance burden of open-source projects demonstrates that community health files are not bureaucratic overhead but essential load-bearing infrastructure: projects without clear contribution guidelines see higher friction in first-time contributions and higher maintainer burnout. Fogel extends this argument to the operational level, showing that explicit governance documents reduce "bike-shedding" (endless procedural debates) by providing authoritative answers to common questions before they are asked.[4] In this system, the four health files are not boilerplate templates but carefully authored documents reflecting the specific governance philosophy of the eight-organ architecture: the Code of Conduct references the system's values, the Contributing guide maps onto the promotion state machine (a model describing all possible states a contribution can occupy and the transitions between them), and the Security policy defines responsible disclosure procedures tailored to the project's threat model.
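The inheritance rule is mechanically auditable: a repository is covered by its own copy of a health file if one exists, and falls back to the org-level .github repository otherwise. A minimal sketch of such a check, assuming local checkouts of both repositories (the function and directory layout are illustrative, not part of this repository's actual automation):

```python
from pathlib import Path

# The four community health files resolved via org-level inheritance.
HEALTH_FILES = ["CODE_OF_CONDUCT.md", "CONTRIBUTING.md", "SECURITY.md", "SUPPORT.md"]

def effective_health_files(repo_dir: Path, org_github_dir: Path) -> dict[str, str]:
    """For each health file, report whether the repo defines its own copy,
    inherits the org-level one, or has no coverage at all."""
    status = {}
    for name in HEALTH_FILES:
        if (repo_dir / name).is_file():
            status[name] = "local"        # repo-level file takes precedence
        elif (org_github_dir / name).is_file():
            status[name] = "inherited"    # falls back to the .github repo
        else:
            status[name] = "missing"
    return status
```

An audit like this is what makes the "baseline without per-repo configuration" claim verifiable rather than aspirational.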
CI/CD Infrastructure

```yaml
name: CI Pipeline
on:
  workflow_call:
    inputs:
      language:
        required: true
        type: string
jobs:
  lint-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
        with:
          fetch-depth: 0
      - name: Setup environment
        uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d1f8 # v4.0.2
        if: inputs.language == 'typescript'
        with:
          node-version: '22'
      - name: Security scan — Gitleaks
        uses: gitleaks/gitleaks-action@1f2d10fb689bc07a5f56f48e6c6b8ee4a47c1dab # v2.3.6
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      - name: Security scan — CodeQL
        uses: github/codeql-action/analyze@c4fb451437765abf5018c6571834e2c3c6a21745 # v3.24.6
        with:
          languages: ${{ inputs.language }}
      - name: Run tests
        run: |
          if [ "${{ inputs.language }}" = "typescript" ]; then
            npm ci && npm test
          elif [ "${{ inputs.language }}" = "python" ]; then
            pip install -r requirements.txt && pytest
          fi
```
The second layer transforms GitHub Actions from per-repository configuration files into an organizational platform. Every workflow uses workflow_call to expose itself as a reusable component — individual repositories call these shared workflows with parameters rather than duplicating pipeline logic. Humble and Farley's core insight — that the deployment pipeline should be a first-class artifact, versioned and tested like application code — applies doubly at the organizational level, where pipeline drift across repositories creates silent security gaps and inconsistent quality gates.[5] The critical security decision is SHA-pinning: every GitHub Action reference uses a full, immutable commit SHA rather than a mutable tag, preventing supply-chain attacks where a compromised action tag is silently replaced with malicious code.[6] Bass et al. identify supply-chain integrity as a DevOps architectural concern — this repository implements that concern at the organizational boundary, ensuring that no repository in the system can accidentally reference an unvetted action version.
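The pinning discipline is itself mechanically checkable: a uses: reference passes only if the ref after the @ is a full 40-character commit SHA. A hedged sketch of such a linter (the function name, regexes, and pass/fail policy are illustrative assumptions, not the repository's actual pin-verification code):

```python
import re

# A full commit SHA is exactly 40 hex characters; refs like `v4` or `main` are mutable.
USES_RE = re.compile(r"^\s*(?:-\s+)?uses:\s*([^\s#@]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")

def unpinned_actions(workflow_yaml: str) -> list[str]:
    """Return every `uses:` reference whose ref is not a full commit SHA."""
    violations = []
    for action, ref in USES_RE.findall(workflow_yaml):
        if not FULL_SHA_RE.match(ref):
            violations.append(f"{action}@{ref}")
    return violations
```

Run as a CI gate, an empty return value means every action in the workflow is pinned; anything else is a reviewable violation.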
| Tool | Purpose | Integration Method | Scope |
|---|---|---|---|
| CodeQL | Static analysis & vulnerability detection | GitHub-native, SHA-pinned action | Language-aware semantic queries |
| Gitleaks | Secrets detection in git history | Pre-commit hook + CI action | Full repository history scan |
| TruffleHog | High-entropy string & credential scanning | CI action with custom regexes | Commit diffs + full scans |
The three scanning tools form a defense-in-depth strategy. CodeQL operates at the semantic level, understanding language-specific vulnerability patterns — SQL injection, cross-site scripting, insecure deserialization — by modeling code as a queryable database. Gitleaks works at the git layer, scanning every commit in repository history for patterns matching API keys, tokens, and credentials, catching secrets that were committed and later "deleted" but remain in git history. TruffleHog complements both with high-entropy string detection and custom regex patterns, catching credential formats that Gitleaks' built-in patterns miss.[5] Together, they implement what Humble and Farley call the "quality gate" pattern: no code reaches production without passing all three scans. The organizational inheritance model means this gate is not opt-in — every repository inherits it by default, and opting out requires an explicit, reviewable override.
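High-entropy detection of the kind TruffleHog performs can be illustrated with Shannon entropy over a string's character distribution: random secrets score high, ordinary identifiers score low. A simplified sketch (the 4.5-bit threshold, length cutoff, and function names are illustrative assumptions, not TruffleHog's actual implementation):

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character, from the string's own character frequencies."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_secret(token: str, threshold: float = 4.5) -> bool:
    """Flag long, high-entropy tokens as candidate credentials."""
    return len(token) >= 20 and shannon_entropy(token) >= threshold
```

Production scanners layer regex patterns and contextual checks on top of the raw entropy score to keep false positives manageable, which is why the three tools are complementary rather than redundant.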
AI Agent Framework
The third layer is the most forward-looking: a framework of production AI agents, each specialized for a domain within the organizational workflow. Security agents review pull requests for vulnerability patterns, infrastructure agents validate workflow configurations and dependency graphs, development agents assist with code generation within the established patterns, and documentation agents enforce consistency across the 116 repositories' README files and changelogs.[7] Wooldridge's taxonomy of agent architectures — reactive, deliberative, and hybrid — maps onto the framework's design: security agents are reactive (trigger on PR events), documentation agents are deliberative (analyze repository state and generate updates), and development agents are hybrid (respond to immediate requests while maintaining context across sessions). The agents are implemented as GitHub Copilot-compatible configurations, meaning they integrate with the developer's existing IDE workflow rather than requiring a separate toolchain.[8] Russell and Norvig's framework for rational agents — perceive, reason, act — is realized concretely: each agent perceives through GitHub event webhooks, reasons through its specialized prompt context, and acts through the GitHub API. The chatmodes layer adds specialized personas — a security auditor, a documentation editor, a dependency analyst — each with tailored system prompts and tool access policies.
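The perceive-reason-act loop for the reactive agents can be sketched as an event router that matches webhook event types against each agent's declared triggers. This is a conceptual sketch under assumed names; the actual agents are declarative Copilot-compatible configurations, not this Python code:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """A reactive agent: subscribes to event types, acts on matching payloads."""
    name: str
    triggers: set[str]
    handler: Callable[[dict], str]

@dataclass
class EventRouter:
    """Perceive (receive a webhook event), then dispatch to every agent whose
    trigger set matches; each handler reasons over the payload and returns an action."""
    agents: list[Agent] = field(default_factory=list)

    def dispatch(self, event_type: str, payload: dict) -> list[str]:
        return [
            agent.handler(payload)
            for agent in self.agents
            if event_type in agent.triggers
        ]
```

The same structure generalizes to the deliberative and hybrid agents: only the trigger source (scheduled audit versus webhook) and the amount of retained context change.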
```yaml
name: security-reviewer
description: Reviews PRs for security vulnerabilities and compliance
triggers:
  - pull_request.opened
  - pull_request.synchronize
context:
  files:
    - SECURITY.md
    - .github/workflows/security-scan.yml
instructions: |
  You are a security reviewer for the organvm ecosystem.
  Check for: hardcoded secrets, unsafe dependencies,
  missing input validation, SQL injection vectors,
  and OWASP Top 10 patterns.
permissions:
  contents: read
  pull-requests: write
  security-events: read
copilot:
  compatible: true
  slash-commands:
    - /security-review
    - /threat-model
  model-preference: claude-sonnet
```

Design Philosophy: Infrastructure as Culture
The deeper claim of this project is that organizational culture can be encoded as versionable, reviewable, inheritable artifacts. This is infrastructure-as-culture thinking: infrastructure-as-code principles applied not to servers or networks but to governance, contributor experience, and development methodology, making organizational norms versionable, testable, and auditable.[9] Morris argues that treating infrastructure as code provides four benefits — reproducibility, consistency, auditability, and recoverability — and every one of these applies to organizational standards. When the Code of Conduct is a file in a Git repository, it has a commit history, a blame log, a review trail. When CI policy changes, the diff is visible and the rollback path is clear. This transforms governance from an implicit social agreement into an explicit, testable system.[10] Ostrom's Nobel Prize-winning research on institutional governance demonstrated that successful commons management requires clear boundaries, proportional rules, collective-choice arrangements, and monitoring — all of which this repository implements through GitHub's access controls, inherited workflow rules, PR-based governance changes, and automated health monitoring respectively.
| Ostrom Principle | Repository Implementation |
|---|---|
| Clearly defined boundaries | Organization membership + CODEOWNERS files |
| Proportional rules | Tiered CI templates (minimal, standard, full) |
| Collective-choice arrangements | PR-based governance changes with review requirements |
| Monitoring | Automated health audits + monthly metrics workflows |
| Graduated sanctions | Warning comments → required reviews → branch protection |
| Conflict resolution | Code of Conduct with escalation procedures |
The decision to make AI agents part of this governance layer — rather than a separate system — reflects a deliberate architectural position: AI tooling should be subject to the same review, versioning, and access control processes as any other piece of organizational infrastructure. An agent's prompt is as consequential as a CI pipeline's configuration, and should be governed accordingly.[7] Wooldridge identifies trust and delegation as the central challenges of multi-agent systems; by placing agent configurations within the same governance framework as security policies and contribution guidelines, the system makes agent capabilities visible, auditable, and revocable through familiar Git-based workflows rather than opaque administrative interfaces.
Automation Pipeline
Beyond static governance files, the repository includes a Python automation layer — an src/ directory containing two subsystems: ai_framework/ and automation/. The AI framework implements the agent, chatmode, and collection inventory generators that produce machine-readable manifests of every AI configuration deployed across the organization. The automation subsystem handles operational concerns: health score calculation, workflow analysis, repository evaluation, label synchronization, action pin updates, batch onboarding, and proactive maintenance scheduling. These are not ad-hoc scripts but production Python modules with typed interfaces, structured configuration, and comprehensive test coverage.[5] Humble and Farley's core principle — that everything needed to build, test, and deploy should be version-controlled and automated — extends here beyond code and pipelines to the organizational infrastructure itself: health audits, label taxonomies, agent inventories, and maintenance schedules are all generated from code, tested against assertions, and deployed through the same CI pipelines that govern application software.
```python
from dataclasses import dataclass
from enum import Enum


class GovernanceDimension(Enum):
    COMMUNITY_HEALTH = "community_health"    # weight: 0.25
    CI_COVERAGE = "ci_coverage"              # weight: 0.30
    SECURITY_SCANNING = "security_scanning"  # weight: 0.25
    AI_FRAMEWORK = "ai_framework"            # weight: 0.20


@dataclass
class HealthReport:
    repo: str
    scores: dict[GovernanceDimension, float]
    overrides: list[str]
    last_audit: str

    @property
    def weighted_score(self) -> float:
        weights = {
            GovernanceDimension.COMMUNITY_HEALTH: 0.25,
            GovernanceDimension.CI_COVERAGE: 0.30,
            GovernanceDimension.SECURITY_SCANNING: 0.25,
            GovernanceDimension.AI_FRAMEWORK: 0.20,
        }
        return sum(
            self.scores.get(dim, 0.0) * w
            for dim, w in weights.items()
        )

    @property
    def compliance_status(self) -> str:
        score = self.weighted_score
        if score >= 0.9:
            return "compliant"
        elif score >= 0.7:
            return "partial — review overrides"
        return "non-compliant — action required"
```

Testing and Verification
The repository maintains over sixty unit tests covering every automation module and AI framework component. Test categories include workflow health analysis, repository health scoring, agent and chatmode frontmatter validation, label schema validation, action pin security verification, batch onboarding processes, quota management, secret rotation, SLA monitoring, notification integration, incident response automation, and ML-based workflow failure prediction. The test suite also includes security-specific tests: SSRF (Server-Side Request Forgery, an attack where a server is tricked into making requests to unintended internal resources on the attacker's behalf) protection logic, web crawler security boundaries, and secret manager access controls.[6] Bass et al.'s treatment of security as an architectural concern — not a feature to be bolted on but a quality attribute to be designed for — manifests in the test suite's structure: security tests are not isolated in a separate directory but woven throughout the automation modules, ensuring that every component that touches external resources or processes untrusted input is tested against adversarial scenarios.
| Domain | Test Count | Modules Covered |
|---|---|---|
| Automation | ~30 | Health scoring, workflow analysis, label sync, batch onboarding, maintenance |
| AI Framework | ~15 | Agent inventory, chatmode validation, collection frontmatter, prompt generation |
| Security | ~10 | SSRF protection, web crawler boundaries, secret management, token validation |
| Integration | ~8 | Notification pipelines, SLA monitoring, incident response, deployment checklists |
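A frontmatter-validation test, for instance, reduces to assertions over required keys. A hedged sketch of what such a check might look like (the key set is inferred from the agent configuration shown earlier; the helper and error messages are illustrative, not copied from the actual suite):

```python
# Keys every agent configuration is assumed to declare (illustrative set).
REQUIRED_AGENT_KEYS = {"name", "description", "triggers", "permissions"}

def validate_agent_frontmatter(config: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the config passes."""
    errors = [f"missing key: {k}" for k in sorted(REQUIRED_AGENT_KEYS - config.keys())]
    if not config.get("triggers"):
        errors.append("at least one trigger is required")
    return errors
```

Returning errors as data rather than raising on the first failure lets a batch audit report every defect across all 116 repositories in one pass.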
Tradeoffs and Limitations
The inheritance model has a fundamental tension: uniformity versus autonomy. A repository that needs a custom CI pipeline must explicitly override the inherited one, creating a maintenance burden at the repository level and a coordination cost at the organizational level — every override is a potential drift point that the monthly health audit must track.[2] Raymond's observation that open-source governance must balance centralized standards with decentralized innovation applies directly: too much inheritance and repositories become constrained; too little and the organizational operating system loses its coherence. The current design errs toward inheritance, with escape hatches for repositories that genuinely need custom behavior — a choice that reflects the system's scale (116 repositories) and its need for auditable consistency over per-repo flexibility.
A second tradeoff concerns the Python automation layer's relationship to the rest of the system. The .github repository is a GitHub-specific mechanism, and the automation scripts are tightly coupled to GitHub's API, webhook model, and inheritance semantics. This coupling delivers significant value within the GitHub ecosystem but would require substantial rearchitecting to support alternative forges like GitLab or Gitea. The decision to accept this coupling — rather than abstracting behind a forge-agnostic interface — reflects a pragmatic assessment: the system operates entirely on GitHub, and the organizational infrastructure's value comes from deep integration with GitHub-specific features (organization-level inheritance, Copilot agent configurations, CodeQL integration) that a generic abstraction would dilute.[9]
References
1. Conway, Melvin E. "How Do Committees Invent?" Datamation, 1968.
2. Raymond, Eric S. The Cathedral and the Bazaar. O'Reilly Media, 1999.
3. Eghbal, Nadia. Working in Public: The Making and Maintenance of Open Source Software. Stripe Press, 2020.
4. Fogel, Karl. Producing Open Source Software: How to Run a Successful Free Software Project. O'Reilly Media, 2005.
5. Humble, Jez and David Farley. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation. Addison-Wesley, 2010.
6. Bass, Len, Ingo Weber, and Liming Zhu. DevOps: A Software Architect's Perspective. Addison-Wesley, 2015.
7. Wooldridge, Michael. An Introduction to MultiAgent Systems. John Wiley & Sons, 2009.
8. Russell, Stuart and Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson, 2020.
9. Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud. O'Reilly Media, 2016.
10. Ostrom, Elinor. Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge University Press, 1990.