Implementing Feature Flags for Safer Deployments

deploymentfeature-flagsrisk-managementtooling

Our deployment process was all-or-nothing. New features went live to all users immediately upon deployment, making rollbacks disruptive and limiting our ability to test in production.

Implement a feature flag system using LaunchDarkly for gradual rollouts and instant kill switches

Build custom feature flag system

Pros
  • Full control over implementation
  • No external dependency
  • No per-seat licensing costs
Cons
  • Significant development effort
  • Need to build targeting, analytics, UI
  • Maintenance burden on the team

Use LaunchDarkly

Pros
  • Battle-tested at scale
  • Rich targeting capabilities
  • Built-in analytics and experimentation
  • Good SDK support
Cons
  • Monthly cost (~$500/month for our scale)
  • External dependency
  • Data leaves our infrastructure

Use environment variables

Pros
  • Simple to implement
  • No external dependencies
Cons
  • Requires redeployment to change
  • No gradual rollout capability
  • No user targeting

The cost of LaunchDarkly is justified by the development time saved and the risk reduction from gradual rollouts. Building a comparable system in-house would take months and require ongoing maintenance. The ability to instantly disable problematic features without redeployment is invaluable.

The Problem

Our deployment anxiety was high:

  • Every deploy was a potential incident
  • Rollbacks required full redeployment (5-10 minutes)
  • No way to test features with subset of users
  • Product couldn’t run A/B tests

Feature Flag Strategy

We established patterns for flag usage:

Release Flags: Temporary flags for new features

if (flags.isEnabled('new-checkout-flow', user)) {
  return newCheckoutFlow();
}
return legacyCheckoutFlow();

Ops Flags: Permanent flags for operational control

if (flags.isEnabled('enable-cache', { service: 'api' })) {
  return cachedResponse();
}

Experiment Flags: For A/B testing

const variant = flags.getVariant('pricing-test', user);
return pricingPages[variant];

Rollout Process

New features now follow this process:

  1. Deploy with flag disabled (0%)
  2. Enable for internal users (dogfooding)
  3. Enable for 1% of users, monitor
  4. Gradually increase: 5% → 25% → 50% → 100%
  5. Remove flag after feature is stable

Results

  • Deployment frequency: 3x increase (less fear)
  • Incident recovery time: 90% reduction (instant kill switch)
  • A/B tests run: 12 in first quarter (previously 0)
  • Developer confidence: Significantly improved

The $500/month cost has paid for itself many times over in reduced incident impact and faster iteration.

Lessons Learned

  • Flag hygiene matters: We schedule flag cleanup to avoid technical debt
  • Default to off: New flags should be disabled by default
  • Document flag purpose: Every flag needs an owner and expiration date