Automation Tips | Jan 20, 2026 | 8 min read

Automation Not Working? Here's How to Fix It (Step-by-Step)

Written by Go Rogue Ops Team

The 8:43 AM Panic

It's 8:43 AM. You're sipping your coffee when Slack lights up:

"Hey, did we get any leads yesterday? The CRM is empty."

Your heart sinks. You check Zapier. Last successful run: 36 hours ago. You've been bleeding leads for a day and a half without knowing.

Now what?

Whether it's Zapier, Make, n8n, or custom code—when automation breaks, every minute counts. Leads are lost. Orders aren't processed. Customers don't get notifications.

This guide walks you through exactly how to diagnose and fix broken automation, even if you're not technical. We'll cover:

  • How to tell WHAT broke
  • How to tell WHY it broke
  • How to fix it (or decide you need help)
  • How to prevent it from breaking again

Let's get your automation back online.

Step 1: Confirm It's Actually Broken

Sometimes the issue isn't the automation—it's the data. Before panicking, run these quick checks:

Check the Source

Is the trigger actually happening?

  • No new leads might mean no traffic (not an automation problem)
  • No new orders might mean the website is down (a different problem)
  • No new form submissions might mean the form itself is broken

Verify that the input data exists before blaming the automation.

Check Recent Changes

  • Did YOU change something yesterday?
  • Did a team member modify the workflow?
  • Did you update a connected app?
  • Did you change passwords or permissions?

Most "mysterious" breaks have a simple cause: someone changed something.

Check Status Pages

  • Zapier: status.zapier.com
  • Make: status.make.com
  • Google Cloud: status.cloud.google.com
  • Your other tools: [toolname] status page

If there's a known outage, you wait. Nothing else to do. If you're questioning whether Zapier is reliable enough for your use case, read our data-driven Zapier reliability assessment.

Check Error Logs

  • Zapier: Task History → Filter by "Failed" or "Held"
  • Make: Scenario execution history → Filter errors
  • n8n: Executions → Filter by failed

Look for error messages. They'll guide the next step.

Decision Point

  • Status page shows outage? → Wait for resolution
  • You made recent changes? → Revert them
  • Error logs show issues? → Continue to Step 2
  • No obvious cause? → Continue to Step 2
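
If you want to hand this decision point to a teammate (or a script), it can be written down as a few ordered checks. This is only a sketch: the `Checks` fields and the `next_action` function are names made up for this example, and the order simply mirrors the list above.

```python
from dataclasses import dataclass


@dataclass
class Checks:
    outage_reported: bool  # a status page shows an active incident
    recent_changes: bool   # someone edited the workflow, an app, or credentials
    errors_in_logs: bool   # the run history shows failed or held runs


def next_action(checks: Checks) -> str:
    """Return the next move, ruling out the cheapest explanations first."""
    if checks.outage_reported:
        return "Wait for resolution"
    if checks.recent_changes:
        return "Revert recent changes"
    if checks.errors_in_logs:
        return "Continue to Step 2 (start from the logged error)"
    return "Continue to Step 2 (no obvious cause)"


print(next_action(Checks(outage_reported=False, recent_changes=True, errors_in_logs=True)))
```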

Step 2: Identify Where It Failed

Every automation platform has execution logs. Here's how to find them.

Zapier: Task History

  1. Go to "Task History" in left sidebar
  2. Filter: "Held" or "Failed"
  3. Click most recent failed task
  4. Read the error message
  5. Note which step failed (it's highlighted)

Make: Scenario Executions

  1. Open the scenario that's not working
  2. Click "History" tab
  3. Find failed executions (red icons)
  4. Click to see which module failed
  5. Expand to see what data was passed

n8n: Execution History

  1. Click "Executions" in sidebar
  2. Find failed execution
  3. Click to expand
  4. See which node failed and the error message

Common Error Messages (What They Mean)

| Error Message | What It Means | Quick Fix |
| --- | --- | --- |
| "Authentication failed" | API key expired or invalid | Reconnect account |
| "Required field missing" | Data not passing through | Check field mapping |
| "Resource not found" | Record deleted or moved | Verify IDs still valid |
| "Rate limit exceeded" | Too many requests | Add delay or reduce frequency |
| "Timeout" | Response took too long | Check server/connection |
| "Invalid format" | Data type mismatch | Add formatter step |

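If you export failed runs or forward error emails to a script, the common error messages above can be turned into a simple lookup. The sketch below assumes plain substring matching is good enough; real platforms word their errors differently, so treat the keywords as a starting point. `ERROR_HINTS` and `suggest_fix` are hypothetical names, not part of any platform.

```python
from typing import Optional, Tuple

# (keyword, what it means, quick fix), copied from the common error messages above.
ERROR_HINTS = [
    ("authentication failed", "API key expired or invalid", "Reconnect account"),
    ("required field missing", "Data not passing through", "Check field mapping"),
    ("resource not found", "Record deleted or moved", "Verify IDs still valid"),
    ("rate limit exceeded", "Too many requests", "Add delay or reduce frequency"),
    ("timeout", "Response took too long", "Check server/connection"),
    ("invalid format", "Data type mismatch", "Add formatter step"),
]


def suggest_fix(error_message: str) -> Optional[Tuple[str, str]]:
    """Return (meaning, quick fix) for the first matching keyword, or None."""
    text = error_message.lower()
    for keyword, meaning, fix in ERROR_HINTS:
        if keyword in text:
            return meaning, fix
    return None


print(suggest_fix("Error: Rate limit exceeded (429)"))
```

A `None` result means the message is not one of the common ones, which is a hint to read the full error before guessing.
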
Step 3: The 8 Most Common Culprits (And How to Fix Each)

After diagnosing hundreds of broken workflows, we've found that these 8 causes account for roughly 95% of failures. The percentages below are approximate shares among these eight causes.

Culprit #1: Authentication Expired (30% of failures)

What happened: The connected app's password changed, an API key was rotated, or an OAuth token expired.

How to recognize: Error message includes "authentication," "unauthorized," or "401." A "403" or "forbidden" error usually points to permissions instead (see Culprit #7).

How to fix:

  1. Find the failing step in your workflow
  2. Click "Reconnect" or "Reauthorize"
  3. Log in again with correct credentials
  4. Test the workflow
  5. Done

Prevent recurrence:

  • Enable email notifications for auth issues
  • Use service accounts (not personal accounts)
  • Document which account is connected where

Culprit #2: Field Mapping Broke (25% of failures)

What happened: Source app changed field names or structure. Your automation can't find the data anymore.

How to recognize: Error shows "Required field missing," "undefined," or "null."

How to fix:

  1. Run a test in the trigger step to see current data format
  2. Compare to what your automation expects
  3. Update field mappings to match new structure
  4. Re-map affected fields
  5. Test end-to-end

Real example: A Zapier workflow pulled "Lead Source" from a form. The form owner renamed the field to "Source of Lead," and the automation broke because the exact field name no longer existed. Fix: update the field mapping to the new name. A 5-minute fix.
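
If your workflow runs custom code (or you can export a sample payload), the comparison in steps 1 and 2 can be done mechanically. This is a minimal sketch, assuming the trigger data arrives as a flat dictionary; `diff_fields` is a hypothetical helper and the sample values are invented.

```python
from typing import Dict, Iterable, List


def diff_fields(expected: Iterable[str], payload: Dict[str, object]) -> Dict[str, List[str]]:
    """Compare the field names a workflow expects with the ones it actually received."""
    expected_set = set(expected)
    actual = set(payload)
    return {
        "missing": sorted(expected_set - actual),     # expected, but not in the payload
        "unexpected": sorted(actual - expected_set),  # in the payload, but not expected
    }


# Invented sample mirroring the renamed-field example above.
sample = {"Email": "ana@example.com", "Source of Lead": "Webinar"}
report = diff_fields(["Email", "Lead Source"], sample)
print(report)
```

A field that shows up as "missing" next to a similar-looking "unexpected" one is usually a rename.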

Culprit #3: API Changes (15% of failures)

What happened: The service updated their API without warning (or you missed the warning).

How to recognize:

  • Error mentions API version
  • Error says "deprecated endpoint"
  • Everything worked yesterday, nothing changed on your end

How to fix:

  1. Check the service's API changelog or developer blog
  2. Identify what changed
  3. Update your workflow to use new API version/endpoint
  4. May require rebuilding parts of workflow
  5. Test thoroughly

When to call for help: If API changes are extensive, you may need a developer. This isn't a quick fix.

Culprit #4: Rate Limiting (10% of failures)

What happened: You're making too many requests too fast.

How to recognize: Error shows "Rate limit exceeded," "429," or "too many requests."

How to fix:

  1. Add delays between actions (Zapier: "Delay" step, Make: "Sleep" module)
  2. Reduce frequency (run every 15 min instead of every 5 min)
  3. Batch requests if possible
  4. Upgrade API plan if hitting limits regularly
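
For custom code, the usual equivalent of a delay step is retrying with exponential backoff. Here is a minimal sketch under some assumptions: `RateLimited` is a hypothetical exception your own request function raises when it sees a 429, and the delays (1s, 2s, 4s, ...) are illustrative, not tuned to any particular API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical: raised by your request function on a 429 response."""


def call_with_backoff(request: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call request(); on RateLimited, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error instead of hiding it
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the function can be tested without actually waiting. If the API returns a Retry-After header, honoring that value is usually better than a fixed schedule.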

Prevent recurrence:

  • Check API rate limits when building
  • Add buffer (don't run at 90% of limit)
  • Monitor usage over time

Culprit #5: Data Format Changes (10% of failures)

What happened: Source changed how they format dates, numbers, or text.

How to recognize:

  • Workflow runs but data looks wrong
  • Errors about "invalid date format"
  • Numbers treated as text (or vice versa)

How to fix:

  1. Add a formatter step before the failing action
  2. Convert data to expected format
  3. For dates: In Zapier, use the Formatter step → "Date / Time"
  4. For numbers: In Zapier, use the Formatter step → "Numbers"
  5. For text: In Zapier, use the Formatter step → "Text"
  6. Test with real data
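
In a code step, the same idea looks like trying each date format you have seen and converting to one standard output. This is a sketch, not a complete parser: the formats in `KNOWN_FORMATS` are assumptions, so replace them with the ones your source actually sends.

```python
from datetime import datetime

# Assumed input formats; adjust to match what your source app sends.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]


def to_iso_date(raw: str) -> str:
    """Convert a date string in any known format to YYYY-MM-DD."""
    cleaned = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"Unrecognized date format: {raw!r}")


print(to_iso_date("01/20/2026"))  # prints 2026-01-20
```

Failing loudly on an unknown format is deliberate: a wrong date that slips through silently is harder to catch than an error in the log.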

Culprit #6: Missing Dependencies (5% of failures)

What happened: Workflow depends on another workflow or record that no longer exists.

How to recognize:

  • "Record not found"
  • "Dependency missing"
  • Workflow worked when linked record existed

How to fix:

  1. Identify the missing dependency
  2. Either recreate it or update workflow to not require it
  3. Add error handling for future occurrences

Culprit #7: Permissions Changed (3% of failures)

What happened: Someone revoked access or changed permissions in connected app.

How to recognize:

  • "Access denied"
  • "Insufficient permissions"
  • "403 Forbidden"

How to fix:

  1. Check permissions in source/destination apps
  2. Grant necessary permissions
  3. Reconnect if needed

Prevent recurrence:

  • Use service accounts with stable permissions
  • Document required permissions
  • Limit who can change permissions

Culprit #8: Edge Case Data (2% of failures)

What happened: Real-world data broke your assumptions.

How to recognize:

  • Works most of the time, fails occasionally
  • Error is inconsistent
  • Failed records have something unusual

How to fix:

  1. Examine the specific record that failed
  2. Identify what's different about it
  3. Add handling for that case
  4. Add validation/filtering before processing

Example: Lead capture worked fine until someone entered a phone number with letters. The automation expected only digits and crashed. Fix: Add a step to strip non-numeric characters before processing.
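
In a code step, the phone number fix from the example is a one-liner. This sketch assumes you want ASCII digits only; `digits_only` is a hypothetical helper name.

```python
import re


def digits_only(phone: str) -> str:
    """Remove everything except the digits 0-9 from a phone number."""
    return re.sub(r"[^0-9]", "", phone)


print(digits_only("(555) 010-2345 ext"))  # prints 5550102345
```

Note that this also drops a leading "+", so keep the country code handling separate if your destination app needs it.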

Step 4: Test Your Fix

Don't assume it's fixed. Test it.

Testing Checklist

1. Test manually first

  • Run the workflow with test data
  • Watch each step execute
  • Verify data flows correctly through every step

2. Test with real data

  • Use recent actual data (not fake test data)
  • Process 3-5 real transactions
  • Verify end result is correct in destination system

3. Test edge cases

  • Empty fields
  • Special characters
  • Maximum lengths
  • Wrong data types

4. Monitor for 24-48 hours

  • Watch for new failures
  • Check success rate
  • Review any errors that appear
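
If part of your workflow is custom code, the edge cases in item 3 can be run as a quick batch. The sketch below is one way to do it: `run_edge_cases` and the sample inputs are made up for illustration, and `handler` stands in for whatever function processes one field in your workflow.

```python
from typing import Callable, Dict

# Illustrative inputs covering the four edge-case types in the checklist.
EDGE_CASES: Dict[str, object] = {
    "empty": "",
    "special_characters": "O'Brien <ana@example.com>",
    "maximum_length": "x" * 255,
    "wrong_type": 12345,
}


def run_edge_cases(handler: Callable[[object], object],
                   cases: Dict[str, object]) -> Dict[str, str]:
    """Run handler on each case and record 'ok' or the exception type."""
    results = {}
    for name, value in cases.items():
        try:
            handler(value)
            results[name] = "ok"
        except Exception as exc:
            results[name] = f"failed: {type(exc).__name__}"
    return results


# A handler that assumes every value is text fails on the number.
print(run_edge_cases(lambda value: value.strip(), EDGE_CASES))
```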

When You Know It's Fixed

  • ✅ Manual test passes
  • ✅ Real data processes correctly
  • ✅ No new errors in 24 hours
  • ✅ End result verified in destination

If it breaks again immediately: You didn't identify the real problem. Go back to Step 2. Many recurring failures stem from common automation pitfalls that are preventable with proper setup.

Step 5: Prevent Future Breaks

Once it's fixed, prevent recurrence. This is the step most people skip—and why they end up fixing the same problems repeatedly.

Immediate Actions

1. Set up monitoring

  • UptimeRobot to ping workflow endpoints (free tier available)
  • Error notifications enabled for all critical workflows
  • Daily summary email of runs

2. Document what broke and how you fixed it

  • Future you will thank present you
  • Include the error messages
  • Include the exact solution steps

3. Add error handling

  • Try/catch blocks where possible
  • Fallback actions for common failures
  • Graceful degradation (partial success better than total failure)

4. Review and improve

  • Could this have been prevented?
  • Is the workflow too fragile?
  • Should we rebuild it differently?
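
For workflows with custom code, the error handling in item 3 might look like the sketch below: try the main action, fall back to a safer one, and keep going so one bad record doesn't stop the rest. `process_all`, `primary`, and `fallback` are hypothetical names; the fallback could be something like writing the record to a spreadsheet for manual follow-up.

```python
from typing import Callable, Iterable, List, Tuple


def process_all(records: Iterable[object],
                primary: Callable[[object], object],
                fallback: Callable[[object], object]) -> Tuple[List[object], List[Tuple[object, str]]]:
    """Process every record; a failure in one record never aborts the batch."""
    done: List[object] = []
    failed: List[Tuple[object, str]] = []
    for record in records:
        try:
            done.append(primary(record))
        except Exception as primary_error:
            try:
                # Fallback action, e.g. park the record for manual review.
                done.append(fallback(record))
            except Exception:
                # Both paths failed: remember the record and the original error.
                failed.append((record, str(primary_error)))
    return done, failed


done, failed = process_all([1, 0, 2], lambda r: 10 // r, lambda r: "queued")
print(done, failed)
```

Anything that ends up in `failed` is what your error notification should report.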

Long-Term Prevention

Create a maintenance schedule:

  • Weekly: Review error logs
  • Monthly: Test critical workflows manually
  • Quarterly: Review and optimize

Set up proactive checks:

  • Health check workflow that tests your other workflows
  • Runs daily, alerts if any issues
  • Catches problems before they cause damage
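
A health check doesn't have to be elaborate. A minimal sketch, assuming you can get the time of each workflow's last successful run (from an export, a log sheet, or a platform API): flag anything that has been quiet longer than a threshold. `stale_workflows`, the workflow names, and the 24-hour default are all assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def stale_workflows(last_success: Dict[str, datetime], now: datetime,
                    max_gap: timedelta = timedelta(hours=24)) -> List[str]:
    """Return names of workflows whose last successful run is older than max_gap."""
    return sorted(name for name, ts in last_success.items() if now - ts > max_gap)


# The scenario from the intro: last successful run was 36 hours ago.
now = datetime(2026, 1, 20, 8, 43)
runs = {
    "lead-capture": now - timedelta(hours=36),
    "order-sync": now - timedelta(hours=2),
}
print(stale_workflows(runs, now))  # prints ['lead-capture']
```

Run daily with an alert on any non-empty result, a check like this would flag the lead-capture workflow well before a day and a half had passed.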

This is what we do with our maintenance plans: proactive monitoring, quick fixes, updates when APIs change. For businesses where downtime costs money, it's cheaper than scrambling. We don't build and bail.

When to Call for Help

Some fixes require expertise. Know when to escalate instead of wasting hours on trial and error.

Call for Help If:

1. Error message is incomprehensible

  • Technical jargon you don't understand
  • Multiple layers of nested errors
  • No clear solution in documentation

2. Fix requires code

  • Custom API calls needed
  • Complex data transformation
  • Security/authentication issues beyond basic reconnection

3. It keeps breaking

  • You've "fixed" it 3+ times
  • Same error recurring
  • Different errors each time

4. Multiple workflows affected

  • Systemic issue (not isolated)
  • Might indicate architectural problem
  • May need to rebuild from scratch

5. Stakes are too high

  • Revenue-critical workflow
  • Customer-facing process
  • Can't afford trial-and-error

What to Have Ready When You Call

  • Error messages (screenshots help)
  • What you've tried already
  • When it started breaking
  • How often it breaks
  • Business impact (cost of downtime)
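
If you've never put a number on business impact, a rough estimate is enough. The sketch below uses a simple lost-leads formula; every figure in the example is invented for illustration, so plug in your own.

```python
def downtime_cost(hours_down: float, leads_per_hour: float,
                  close_rate: float, avg_deal_value: float) -> float:
    """Rough revenue at risk: leads missed x share that would close x deal value."""
    return hours_down * leads_per_hour * close_rate * avg_deal_value


# Invented numbers: 36 hours down, 1 lead every 2 hours, 10% close rate, $2,000 deals.
estimate = downtime_cost(hours_down=36, leads_per_hour=0.5,
                         close_rate=0.10, avg_deal_value=2000)
print(f"Estimated revenue at risk: ${estimate:,.0f}")
```

This ignores leads you can recover by replaying failed runs, so treat it as an upper bound.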

Support Options

| Option | Cost | Speed | Best For |
| --- | --- | --- | --- |
| Platform support (Zapier, Make) | Free | Slow | Simple issues |
| Freelancer | $50-150/hour | Fast | One-time fixes |
| Agency | $100-200/hour | Varies | Complex rebuilds |
| Maintenance retainer | $300-800/month | Same-day | Ongoing reliability |

Platform support is free but slow. Freelancers are fast but offer no ongoing relationship. A maintenance retainer is proactive instead of reactive: we catch issues before they cause damage.

Quick Reference: Troubleshooting Flowchart

Automation not working?

  1. Check status pages → If outage, wait
  2. Check if trigger is firing → If no trigger data, fix source
  3. Check error logs → Note the error message
  4. Match error to culprit list above
  5. Apply the fix
  6. Test thoroughly
  7. Set up monitoring
  8. Document what happened

Time estimates:

  • Authentication expired: 5-15 minutes
  • Field mapping broke: 15-30 minutes
  • Rate limiting: 10-20 minutes
  • API changes: 1-4 hours (or call for help)
  • Edge case handling: 30-60 minutes

The Key Insight

When automation breaks, here's your game plan:

  1. Confirm it's actually broken (check the source, recent changes, and status pages)
  2. Identify where it failed (error logs)
  3. Diagnose the culprit (8 common causes)
  4. Test your fix (don't assume it worked)
  5. Prevent future breaks (monitoring + maintenance)

Most failures are fixable with these steps. Some require expertise.

The key: catch failures fast. The longer automation is broken, the more expensive the fix. Every hour of downtime is leads lost, orders missed, customers frustrated.

This is why "set it and forget it" is a lie. Automation needs care. APIs change. Platforms have outages. Your business evolves. Without someone watching, small problems become expensive disasters.

Question everything. Automate the rest.

Tired of Being the One to Fix Broken Automation at 2 AM?

Our maintenance plans include:

  • 24/7 monitoring — we catch issues before you notice
  • Quick fixes — usually same-day resolution
  • Proactive updates — API changes handled automatically
  • Priority support — no waiting in queue

Starting at $300/month for basic systems. For businesses where downtime costs money, it's cheaper than scrambling.

Book Free Audit or See Maintenance Plans →
