
Goodhart's & Gilb's Laws: The Science of Software Metrics

Master the art of software metrics with Goodhart's and Gilb's Laws. Learn how to avoid measurement traps and build actionable monitoring systems for engineering teams.

Islam Neddar
7 min read
metrics
monitoring
engineering-management
team-performance
DORA-metrics
software-engineering

Every engineering organization wrestles with the same fundamental challenge: How do you measure what matters without destroying what you're trying to improve?

This question sits at the heart of modern software development. We need metrics to understand our systems, track progress, and make informed decisions. Yet poorly implemented metrics can create perverse incentives that actually harm the very outcomes we're trying to achieve.

Two powerful laws provide the framework for navigating this challenge: Goodhart's Law and Gilb's Law. Understanding these principles will transform how you approach metrics in your engineering organization, helping you build monitoring systems that drive genuine improvement rather than gaming behaviors.

Goodhart's Law: When Metrics Become Targets

"When a measure becomes a target, it ceases to be a good measure."

This is perhaps the most important principle in organizational measurement. Goodhart's Law reveals why so many well-intentioned metric programs backfire spectacularly in engineering teams.

The Mechanics of Metric Corruption

The moment you make a metric a target for reward or evaluation, human behavior adapts to optimize for that specific number—often at the expense of the actual desired outcome.

Real-world examples:

  • Lines of code as productivity measure → Verbose, bloated codebases
  • Story points for velocity tracking → Inflated estimates and artificially small tasks
  • Code coverage targets → Meaningless tests that inflate percentages without improving quality
  • Bug closure rates → Tickets marked "resolved" without actually fixing underlying issues

Why Gaming Always Wins

There are three fundamental reasons why Goodhart's Law is so pervasive:

1. The Path of Least Resistance

It's almost always easier to manipulate a proxy metric than achieve the real goal it represents. Hitting a numeric target becomes the primary job, displacing the complex work of delivering genuine value.

2. Malicious Compliance

When pressured to hit specific numbers, people comply with the letter of the law while violating its spirit. They'll deploy code that meets technical specifications but fails to solve user problems.

3. Output vs. Outcome Confusion

Simple, countable metrics (outputs) like commits or deployments are easy to measure and target. The real goals (outcomes) like improved user satisfaction or reduced system downtime are harder to quantify, so organizations default to measuring what's convenient rather than what's valuable.

Defending Against Goodhart's Law

Use Process Metrics, Not People Metrics

Focus on understanding system health rather than evaluating individual performance:

```yaml
# Good: System Health Metrics
Cycle Time: "How long from idea to production?"
Deployment Success Rate: "What percentage of deployments succeed?"
Mean Time to Recovery: "How quickly do we resolve incidents?"

# Avoid: Individual Performance Metrics
Developer Lines of Code: "How productive is Sarah?"
Commits per Day: "Is John working hard enough?"
Tickets Closed: "Who's the most efficient?"
```

Build Balanced Dashboards

Never rely on a single metric. Create dashboards with counter-balancing measures that make gaming difficult:

DORA Metrics (Excellent Example):

  • Deployment Frequency (Speed) ↔ Change Fail Rate (Quality)
  • Lead Time for Changes (Efficiency) ↔ Mean Time to Recovery (Reliability)
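The counter-balancing idea can be made concrete in a few lines of code. This is only a sketch, not a real DORA tooling API; the deployment log and its field names are assumed for illustration:

```javascript
// Hypothetical deployment log; `failed` marks deployments that caused a change failure.
const deployments = [
  { date: '2024-06-03', failed: false },
  { date: '2024-06-05', failed: true },
  { date: '2024-06-10', failed: false },
  { date: '2024-06-12', failed: false },
];

// Speed: deployments per week over the observed window.
function deploymentFrequency(deploys, weeks) {
  return deploys.length / weeks;
}

// Quality: share of deployments that caused a failure in production.
function changeFailRate(deploys) {
  const failures = deploys.filter((d) => d.failed).length;
  return failures / deploys.length;
}

console.log(deploymentFrequency(deployments, 2)); // 2 per week
console.log(changeFailRate(deployments));         // 0.25
```

Reported together, the pair resists gaming: pushing deployment frequency up by shipping recklessly shows up immediately as a rising change fail rate.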

Tie Targets to Business Outcomes

When you must set targets, connect them directly to customer or business value:

```diff
- Target: "Ship 10 features this quarter"
+ Target: "Reduce customer churn by 5% through improved UX"

- Target: "Achieve 90% test coverage"
+ Target: "Reduce production bugs by 50% while maintaining deployment velocity"
```

Gilb's Law: The Measurement Imperative

"Anything you need to quantify can be measured in some way that is superior to not measuring it at all."

Gilb's Law directly challenges the common excuse: "But that's too subjective to measure!" It argues that an imperfect metric is vastly superior to no metric at all.

Why Measurement Matters

Vagueness Prevents Action

Without measurement, goals remain wishful thinking:

  • "Improve user experience" → Vague wish
  • "Reduce core workflow clicks from 7 to 4" → Actionable goal

Measurement Forces Clarity

Attempting to measure fuzzy concepts like "code quality" forces definition. Is it test coverage? Cyclomatic complexity? Bug rates? This definitional process creates shared team understanding.
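One way to force that definition is to write it down as an explicit composite score. The weights and inputs below are purely illustrative; the point is that choosing them makes the team's definition of "quality" visible and debatable:

```javascript
// Illustrative composite: the weights encode what this team means by "quality".
const weights = { coverage: 0.4, complexity: 0.3, bugRate: 0.3 };

// Each input is normalized to 0..1; higher score is better.
function qualityScore({ coverage, complexity, bugRate }) {
  return weights.coverage * coverage
       + weights.complexity * (1 - complexity)  // lower complexity is better
       + weights.bugRate * (1 - bugRate);       // lower bug rate is better
}

// 0.4*0.8 + 0.3*0.7 + 0.3*0.9 ≈ 0.8
console.log(qualityScore({ coverage: 0.8, complexity: 0.3, bugRate: 0.1 }));
```

Whether these are the right weights matters less than the conversation required to pick them.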

Progress Over Perfection

Teams get stuck in analysis paralysis, endlessly debating metric flaws. Gilb's Law encourages pragmatism: start with "good enough" now, learn and iterate.

Practical Implementation Strategies

Start with Proxy Metrics

For seemingly unmeasurable qualities, find reasonable proxies:

```yaml
Developer Morale:
  Proxies:
    - Voluntary team event attendance
    - Weekly happiness poll responses
    - Internal tool usage rates
    - Code review participation

Platform Stability:
  Proxies:
    - Mean Time Between Failures (MTBF)
    - Mean Time to Recovery (MTTR)
    - P1 incident count per month
    - System uptime percentage
```
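MTTR, one of the proxies above, is straightforward to compute once incidents carry open and resolve timestamps. A minimal sketch, assuming a simple incident record shape (the field names are illustrative):

```javascript
// Illustrative incident records with open/resolve timestamps.
const incidents = [
  { openedAt: new Date('2024-06-01T10:00:00Z'), resolvedAt: new Date('2024-06-01T10:30:00Z') },
  { openedAt: new Date('2024-06-08T22:00:00Z'), resolvedAt: new Date('2024-08T23:30:00Z'.replace('2024-08', '2024-06-08')) },
];

// Mean Time to Recovery in minutes: average of (resolve - open).
function mttrMinutes(records) {
  const totalMs = records.reduce(
    (sum, i) => sum + (i.resolvedAt - i.openedAt), 0);
  return totalMs / records.length / 60000;
}

console.log(mttrMinutes(incidents)); // 60
```

Even this crude average is enough to spot trends, which is exactly Gilb's point: an imperfect number you can track beats a stability goal you can only assert.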

Decompose Abstract Goals

Break large concepts into measurable components:

"Improve Development Velocity" becomes:

  • Reduce average pull request review time
  • Decrease build pipeline duration
  • Increase deployment success rate
  • Minimize rollback frequency
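The first component above, pull request review time, could be tracked with a small script over PR data. The record shape here is assumed for illustration, not the GitHub API:

```javascript
// Assumed PR records: when review was requested and when the PR was approved.
const pullRequests = [
  { requestedAt: new Date('2024-06-01T09:00:00Z'), approvedAt: new Date('2024-06-01T13:00:00Z') },
  { requestedAt: new Date('2024-06-02T09:00:00Z'), approvedAt: new Date('2024-06-02T11:00:00Z') },
];

// Average review turnaround in hours.
function avgReviewHours(prs) {
  const totalMs = prs.reduce(
    (sum, pr) => sum + (pr.approvedAt - pr.requestedAt), 0);
  return totalMs / prs.length / 3600000;
}

console.log(avgReviewHours(pullRequests)); // 3
```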

Iterate Your Metrics

Treat metrics like code—your first version will have bugs:

  1. Implement the initial metric
  2. Observe its behavior and shortcomings
  3. Learn from unintended consequences
  4. Refactor or replace as needed

Here's how these principles apply to common engineering tools:

Datadog/New Relic Implementation

```javascript
// Good: Balanced metric collection
const metrics = {
  performance: ['response_time', 'throughput'],
  quality: ['error_rate', 'success_rate'],
  user_experience: ['page_load_time', 'bounce_rate']
};

// Avoid: Single-metric focus
const badMetrics = {
  only_speed: ['requests_per_second'] // Missing quality context
};
```

GitHub Analytics Approach

Focus on team process health, not individual ranking:

```yaml
Team Health Metrics:
  - Pull request review time distribution
  - Deployment frequency trends
  - Incident response effectiveness
  - Technical debt tracking

Individual Growth Metrics (Private):
  - Skill development progress
  - Mentoring contributions
  - Learning goal achievement
```

Synthesis: Building Effective Measurement Systems

The combination of Goodhart's and Gilb's Laws provides a powerful framework:

The Four-Step Approach

  1. Identify What Matters (Gilb): Define the business outcomes you actually care about
  2. Find Proxy Metrics (Gilb): Create measurable approximations of those outcomes
  3. Balance Your Dashboard (Goodhart): Use multiple, counter-balancing metrics
  4. Focus on Learning (Both): Use metrics for insight, not punishment

Real-World Example: Developer Productivity

Instead of measuring individual output, focus on system effectiveness:

```yaml
Productivity System Health:
  Flow Metrics:
    - Work in Progress limits adherence
    - Cycle time variability
    - Queue time analysis

  Quality Metrics:
    - Defect escape rate
    - Technical debt trends
    - Code review effectiveness

  Learning Metrics:
    - Experiment success rate
    - Knowledge sharing frequency
    - Cross-team collaboration index
```

Key Takeaways for Engineering Leaders

Goodhart's Law teaches us:

  • Metrics become corrupted when used as targets
  • Focus on process improvement, not people evaluation
  • Balance speed metrics with quality metrics
  • Connect measurements to genuine business value

Gilb's Law reminds us:

  • An imperfect metric beats no metric
  • Measurement forces clarity of thought
  • Start simple and iterate
  • Use metrics for insight, not weapons

Take Action: Start Measuring What Matters

The next time you're designing metrics for your team, ask yourself:

  1. What business outcome am I trying to achieve?
  2. How might this metric be gamed?
  3. What counter-balancing metrics should I include?
  4. Am I measuring the system or the people?

Remember: The goal isn't perfect measurement—it's actionable insight that drives genuine improvement.

Ready to implement better metrics in your organization? Start with one balanced pair of metrics this week. Measure both speed and quality, both efficiency and effectiveness. Your future self (and your team) will thank you.

Want more insights on engineering leadership and building high-performing teams? Subscribe to my newsletter for practical advice delivered weekly, or explore my other articles on software engineering management.
