PRD-001: GitHub Integration

Author: Product Team
Date: 2024-01-20
Status: Launched

Summary

Enable customers to connect their GitHub organizations to Tenki Cloud and automatically provision runners for their repositories without any configuration or infrastructure management.

Problem Statement

Development teams waste significant time and money managing GitHub Actions infrastructure:

  • Setting up self-hosted runners requires DevOps expertise
  • Maintaining runner infrastructure distracts from product development
  • GitHub’s hosted runners are expensive and have limited customization
  • Scaling runners up/down based on demand is complex

Who experiences this: Engineering teams using GitHub Actions for CI/CD
Impact: Teams spend 10-20 hours/month on runner management instead of shipping features

Goals & Success Metrics

Primary Goal: Zero-config GitHub Actions runners that just work

Success Metrics:

  • Time to first runner: < 5 minutes from signup
  • Runner startup time: < 30 seconds
  • Platform uptime: 99.9%
  • Customer runner cost: 50% less than GitHub-hosted runners
  • Monthly active organizations: 100 by Q2

User Stories

  1. As a developer, I want to connect my GitHub org so that runners are automatically available for all my repos
  2. As a team lead, I want to set spending limits so that we don’t exceed our CI/CD budget
  3. As a DevOps engineer, I want to customize runner specs so that our builds run efficiently
  4. As a finance manager, I want to see detailed usage reports so that I can allocate costs to teams

Requirements

Must Have (MVP)

  • GitHub App for OAuth authentication
  • Automatic runner provisioning for workflow_job events
  • Support for Linux runners (Ubuntu 22.04)
  • Basic usage dashboard showing minutes used
  • Automatic runner cleanup after job completion
  • Support for public and private repositories

Should Have

  • Multiple runner sizes (2-16 vCPU)
  • Usage alerts and spending limits
  • Windows and macOS runners
  • Runner caching between jobs
  • Team-based access controls
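
The usage alerts and spending limits item could be approached along these lines. This is only a sketch under stated assumptions: the function name `spending_status`, the three states, and the 80% alert threshold are all hypothetical and do not come from this PRD.

```python
from decimal import Decimal


def spending_status(spent: Decimal, limit: Decimal,
                    alert_ratio: Decimal = Decimal("0.8")) -> str:
    """Classify an organization's spend against its configured limit.

    The 0.8 alert ratio is an assumed default, not a product decision.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if spent >= limit:
        return "blocked"  # limit reached: stop provisioning new runners
    if spent >= limit * alert_ratio:
        return "alert"    # approaching the limit: notify the team lead
    return "ok"
```

Decimal is used rather than float so that currency comparisons at the threshold do not suffer from binary rounding.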

Nice to Have

  • Custom runner images
  • Dedicated runner pools
  • GitHub Enterprise Server support
  • API for programmatic management

Technical Approach

  1. GitHub App handles authentication and webhook events
  2. Webhook handler processes workflow_job events
  3. Temporal workflows orchestrate runner lifecycle
  4. Kubernetes operators manage runner pods
  5. Usage tracking via TigerBeetle for accurate billing
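
Steps 1 and 2 can be illustrated with a minimal sketch. It assumes the webhook secret and raw request body are already available. `plan_runner_action` and the returned `op` values are hypothetical names for illustration, not the actual handler. The `X-Hub-Signature-256` header format and the `action` / `workflow_job` payload fields follow GitHub's webhook documentation.

```python
import hashlib
import hmac
import json
from typing import Optional


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to avoid leaking timing information
    return hmac.compare_digest(expected, signature_header)


def plan_runner_action(event_name: str, body: bytes) -> Optional[dict]:
    """Map a workflow_job delivery to a (hypothetical) runner lifecycle action."""
    if event_name != "workflow_job":
        return None
    payload = json.loads(body)
    job = payload["workflow_job"]
    action = payload["action"]
    if action == "queued":
        # A job is waiting for a runner: request one matching its labels
        return {"op": "provision", "job_id": job["id"], "labels": job["labels"]}
    if action == "completed":
        # The job finished: the runner can be torn down
        return {"op": "cleanup", "job_id": job["id"]}
    return None  # other actions (e.g. in_progress) need no runner change here
```

In a design like the one listed above, the returned action would presumably be handed to a Temporal workflow rather than executed inline, so the webhook endpoint can respond quickly.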

Risks & Mitigations

Risk                       | Impact | Likelihood | Mitigation
---------------------------|--------|------------|------------------------------------------
GitHub API rate limits     | High   | Medium     | Implement caching and exponential backoff
Runner startup time > 30s  | High   | Medium     | Pre-warm runner pools, optimize images
Security vulnerabilities   | High   | Low        | Regular security audits, isolated runners
Cost overruns              | Medium | Medium     | Real-time usage tracking and limits
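
The exponential backoff mitigation for GitHub API rate limits might look like the sketch below. It uses full jitter, which is one common variant and not necessarily what was implemented. The `RateLimited` exception, the 1-second base, the 60-second cap, and the retry count are all assumptions for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by a caller-supplied GitHub client wrapper."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Upper bound in seconds of the wait after failed attempt `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimited with exponentially growing, jittered waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Full jitter: wait a random time up to the exponential bound
            sleep(random.uniform(0, backoff_delay(attempt)))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real delays.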

Timeline

  • Week 1-2: GitHub App development and authentication
  • Week 3-4: Webhook handling and runner provisioning
  • Week 5-6: Usage tracking and billing integration
  • Week 7: Beta testing with friendly customers
  • Week 8: Public launch

Open Questions

  • Should we support GitHub Enterprise? → Not in MVP
  • How do we handle runner caching? → Post-MVP feature
  • What’s our runner retention policy? → 7 days for logs
  • How do we handle abuse/crypto mining? → Usage anomaly detection

Post-Launch Results

Launched: 2024-04-15

Actual Metrics (as of 2024-06-01):

  • Time to first runner:
  • Runner startup time:
  • Platform uptime:
  • Cost savings:
  • Monthly active orgs:

Key Learnings:

  1. Pre-warming runner pools was critical for startup time
  2. Customers wanted custom images more than expected
  3. Windows runner demand was higher than anticipated