Automation Not Working? Here's How to Fix It (Step-by-Step)

The 8:43 AM Panic

It's 8:43 AM. You're sipping your coffee when Slack lights up:

"Hey, did we get any leads yesterday? The CRM is empty."

Your heart sinks. You check Zapier. Last successful run: 36 hours ago. You've been bleeding leads for a day and a half without knowing.

Now what?

Whether it's Zapier, Make, n8n, or custom code—when automation breaks, every minute counts. Leads are lost. Orders aren't processed. Customers don't get notifications.

This guide walks you through exactly how to diagnose and fix broken automation, even if you're not technical. We'll cover:

How to tell WHAT broke
How to tell WHY it broke
How to fix it (or decide you need help)
How to prevent it from breaking again

Let's get your automation back online.

Step 1: Confirm It's Actually Broken

Sometimes the issue isn't the automation—it's the data. Before panicking, run these quick checks:

Check the Source

Is the trigger actually happening?

No new leads might mean no traffic (not an automation problem)
No new orders might mean the website is down (a different problem)
No new form submissions might mean the form itself is broken

Verify that the input data exists before blaming the automation.

Check Recent Changes

Did YOU change something yesterday?
Did a team member modify the workflow?
Did you update a connected app?
Did you change passwords or permissions?

Most "mysterious" breaks have a simple cause: someone changed something.

Check Status Pages

Zapier: status.zapier.com
Make: status.make.com
Google Cloud: status.cloud.google.com
Your other tools: [toolname] status page

If there's a known outage, you wait. Nothing else to do. If you're questioning whether Zapier is reliable enough for your use case, read our data-driven Zapier reliability assessment.

Check Error Logs

Zapier: Task History → Filter by "Failed" or "Held"
Make: Scenario execution history → Filter errors
n8n: Executions → Filter by failed

Look for error messages. They'll guide the next step.

Decision Point

Status page shows outage? → Wait for resolution
You made recent changes? → Revert them
Error logs show issues? → Continue to Step 2
No obvious cause? → Continue to Step 2

Step 2: Identify Where It Failed

Every automation platform has execution logs. Here's how to find them.

Zapier: Task History

Go to "Task History" in left sidebar
Filter: "Held" or "Failed"
Click most recent failed task
Read the error message
Note which step failed (it's highlighted)

Make: Scenario Executions

Open the scenario that's not working
Click "History" tab
Find failed executions (red icons)
Click to see which module failed
Expand to see what data was passed

n8n: Execution History

Click "Executions" in sidebar
Find failed execution
Click to expand
See which node failed and the error message

Common Error Messages (What They Mean)

Error Message	What It Means	Quick Fix
"Authentication failed"	API key expired or invalid	Reconnect account
"Required field missing"	Data not passing through	Check field mapping
"Resource not found"	Record deleted or moved	Verify IDs still valid
"Rate limit exceeded"	Too many requests	Add delay or reduce frequency
"Timeout"	Response took too long	Check server/connection
"Invalid format"	Data type mismatch	Add formatter step
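
If you triage a lot of failures, the table above can be turned into a small lookup. This is only a sketch: the `QUICK_FIXES` dictionary and the `suggest_fix` function are hypothetical names, and real platforms word their errors differently, so treat the patterns as a starting point.

```python
# Hypothetical lookup built from the error table above.
# Keys are lowercase substrings to search for in an error message.
QUICK_FIXES = {
    "authentication failed": "Reconnect account",
    "required field missing": "Check field mapping",
    "resource not found": "Verify IDs still valid",
    "rate limit exceeded": "Add delay or reduce frequency",
    "timeout": "Check server/connection",
    "invalid format": "Add formatter step",
}


def suggest_fix(error_message: str) -> str:
    """Return the quick fix for the first known pattern found in the message."""
    text = error_message.lower()
    for pattern, fix in QUICK_FIXES.items():
        if pattern in text:
            return fix
    return "No match: read the full error and continue to Step 3"


print(suggest_fix("Error 429: Rate limit exceeded for this app"))
```

Matching on substrings is deliberately loose. It will miss errors phrased in an unfamiliar way, which is your cue to read the full message yourself.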

Step 3: The 8 Most Common Culprits (And How to Fix Each)

Across the hundreds of broken workflows we've diagnosed, these 8 causes account for roughly 95% of failures.

Culprit #1: Authentication Expired (30% of failures)

What happened: Connected app changed password, API key rotated, OAuth token expired.

How to recognize: Error message includes "authentication," "unauthorized," or "401." (A "403 Forbidden" usually points to permissions instead; see Culprit #7.)

How to fix:

Find the failing step in your workflow
Click "Reconnect" or "Reauthorize"
Log in again with correct credentials
Test the workflow
Done

Prevent recurrence:

Enable email notifications for auth issues
Use service accounts (not personal accounts)
Document which account is connected where

Culprit #2: Field Mapping Broke (25% of failures)

What happened: Source app changed field names or structure. Your automation can't find the data anymore.

How to recognize: Error shows "Required field missing," "undefined," or "null."

How to fix:

Run a test in the trigger step to see current data format
Compare to what your automation expects
Update field mappings to match the new structure
Re-map the affected fields in every later step that uses them
Test end-to-end

Real example: A Zapier workflow pulled "Lead Source" from a form. The form owner renamed the field to "Source of Lead." The automation broke because the exact field name no longer existed. Fix: update the field mapping to the new name. A 5-minute fix.
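
If you can export a sample of what the trigger sends, comparing it against what the workflow expects takes a few lines. The sketch below assumes the payload is a flat dictionary. The function name `find_mapping_problems` and the sample data are made up for illustration.

```python
def find_mapping_problems(expected_fields, payload):
    """Compare the fields a workflow expects with the fields a trigger sent."""
    missing = [name for name in expected_fields if name not in payload]
    unexpected = [name for name in payload if name not in expected_fields]
    return {"missing": missing, "unexpected": unexpected}


# Hypothetical sample data mirroring the "Lead Source" example.
expected = ["Email", "Lead Source"]
sample_payload = {"Email": "jane@example.com", "Source of Lead": "Webinar"}

report = find_mapping_problems(expected, sample_payload)
print(report)
```

A field that shows up as "missing" next to a similarly named "unexpected" field is usually a rename.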

Culprit #3: API Changes (15% of failures)

What happened: The service updated their API without warning (or you missed the warning).

How to recognize:

Error mentions API version
Error says "deprecated endpoint"
Everything worked yesterday, nothing changed on your end

How to fix:

Check the service's API changelog or developer blog
Identify what changed
Update your workflow to use new API version/endpoint
May require rebuilding parts of workflow
Test thoroughly

When to call for help: If API changes are extensive, you may need a developer. This isn't a quick fix.

Culprit #4: Rate Limiting (10% of failures)

What happened: You're making too many requests too fast.

How to recognize: Error shows "Rate limit exceeded," "429," or "too many requests."

How to fix:

Add delays between actions (Zapier: "Delay by Zapier" step, Make: "Sleep" module)
Reduce frequency (run every 15 min instead of every 5 min)
Batch requests if possible
Upgrade API plan if hitting limits regularly

Prevent recurrence:

Check API rate limits when building
Add buffer (don't run at 90% of limit)
Monitor usage over time
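
If your workflow runs custom code, the "add delays" idea is often implemented as retry with exponential backoff. Here is a minimal sketch. `RateLimitError` is a hypothetical exception standing in for however your HTTP client reports a 429, and the delays are illustrative.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised when a service answers with HTTP 429."""


def call_with_backoff(action, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run action(), waiting twice as long after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return action()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error.
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...


# Demo with a fake action that is rate-limited twice, then succeeds.
# The waits are recorded instead of slept so the demo runs instantly.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


waits = []
result = call_with_backoff(flaky_request, sleep=waits.append)
print(result, waits)
```

If the service sends a Retry-After header, waiting for that long is usually better than guessing.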

Culprit #5: Data Format Changes (10% of failures)

What happened: Source changed how they format dates, numbers, or text.

How to recognize:

Workflow runs but data looks wrong
Errors about "invalid date format"
Numbers treated as text (or vice versa)

How to fix:

Add a formatter step before the failing action (in Zapier this is "Formatter by Zapier"; other platforms have equivalent functions)
Convert data to expected format
For dates: Use Formatter → "Date / Time"
For numbers: Use Formatter → "Numbers"
For text: Use Formatter → "Text"
Test with real data
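
In a code step, a formatter is just a function that accepts the formats you have seen and outputs one standard format. The sketch below is built on assumptions: the `KNOWN_FORMATS` list is hypothetical, slash dates are treated as month/day/year, and commas in numbers are treated as thousands separators. Adjust all three to your source.

```python
from datetime import datetime

# Hypothetical list of formats the source has been seen to send.
# Assumption: slash dates are month/day/year.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]


def to_iso_date(raw: str) -> str:
    """Convert a date string in any known format to YYYY-MM-DD."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # Not this format; try the next one.
    raise ValueError(f"Unrecognized date format: {raw!r}")


def to_number(raw: str) -> float:
    """Convert text like '1,234.50' to a float (commas assumed to be separators)."""
    return float(raw.replace(",", "").strip())


print(to_iso_date("03/15/2024"), to_number("1,234.50"))
```

Raising an error on an unknown format is intentional. A loud failure is easier to catch than a wrong date that gets quietly written to your CRM.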

Culprit #6: Missing Dependencies (5% of failures)

What happened: Workflow depends on another workflow or record that no longer exists.

How to recognize:

"Record not found"
"Dependency missing"
Workflow worked when linked record existed

How to fix:

Identify the missing dependency
Either recreate it or update workflow to not require it
Add error handling for future occurrences

Culprit #7: Permissions Changed (3% of failures)

What happened: Someone revoked access or changed permissions in connected app.

How to recognize:

"Access denied"
"Insufficient permissions"
"403 Forbidden"

How to fix:

Check permissions in source/destination apps
Grant necessary permissions
Reconnect if needed

Prevent recurrence:

Use service accounts with stable permissions
Document required permissions
Limit who can change permissions

Culprit #8: Edge Case Data (2% of failures)

What happened: Real-world data broke your assumptions.

How to recognize:

Works most of the time, fails occasionally
Error is inconsistent
Failed records have something unusual

How to fix:

Examine the specific record that failed
Identify what's different about it
Add handling for that case
Add validation/filtering before processing

Example: Lead capture worked fine until someone entered a phone number with letters. The automation expected only digits and crashed. Fix: add a step that strips non-numeric characters before processing.
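
The phone number fix might look like this in a code step. It's a sketch: the function names and the 7-digit minimum are assumptions, so pick a threshold that fits the numbers you actually receive.

```python
import re


def digits_only(phone: str) -> str:
    """Remove every character that is not a digit 0-9."""
    return re.sub(r"[^0-9]", "", phone)


def clean_phone(phone: str, min_digits: int = 7):
    """Return the cleaned number, or None if too few digits remain."""
    digits = digits_only(phone)
    return digits if len(digits) >= min_digits else None


print(clean_phone("(555) 123-4567 ext"))
print(clean_phone("call me"))
```

Returning None instead of crashing lets a later filter step route the record to manual review.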

Step 4: Test Your Fix

Don't assume it's fixed. Test it.

Testing Checklist

1. Test manually first

Run the workflow with test data
Watch each step execute
Verify data flows correctly through every step

2. Test with real data

Use recent actual data (not fake test data)
Process 3-5 real transactions
Verify end result is correct in destination system

3. Test edge cases

Empty fields
Special characters
Maximum lengths
Wrong data types

4. Monitor for 24-48 hours

Watch for new failures
Check success rate
Review any errors that appear
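
To check the success rate during the monitoring window, you can export the run history and count. The sketch below assumes each run is a dictionary with a `status` field. That shape is hypothetical, and your platform's export will look different.

```python
def success_rate(runs):
    """Return the percentage of runs whose status is 'success'."""
    if not runs:
        return 0.0
    succeeded = sum(1 for run in runs if run["status"] == "success")
    return round(100 * succeeded / len(runs), 1)


def failed_ids(runs):
    """List the ids of runs that did not succeed, for follow-up."""
    return [run["id"] for run in runs if run["status"] != "success"]


# Hypothetical export of four recent runs.
runs = [
    {"id": 1, "status": "success"},
    {"id": 2, "status": "success"},
    {"id": 3, "status": "failed"},
    {"id": 4, "status": "success"},
]

print(success_rate(runs), failed_ids(runs))
```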

When You Know It's Fixed

✅ Manual test passes
✅ Real data processes correctly
✅ No new errors in 24 hours
✅ End result verified in destination

If it breaks again immediately: You didn't identify the real problem. Go back to Step 2. Many recurring failures stem from common automation pitfalls that are preventable with proper setup.

Step 5: Prevent Future Breaks

Once it's fixed, prevent recurrence. This is the step most people skip—and why they end up fixing the same problems repeatedly.

Immediate Actions

1. Set up monitoring

UptimeRobot to ping workflow endpoints (free tier available)
Error notifications enabled for all critical workflows
Daily summary email of runs

2. Document what broke and how you fixed it

Future you will thank present you
Include the error messages
Include the exact solution steps

3. Add error handling

Try/catch blocks where possible
Fallback actions for common failures
Graceful degradation (partial success better than total failure)

4. Review and improve

Could this have been prevented?
Is the workflow too fragile?
Should we rebuild it differently?
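
For the error-handling item above, here is what a try/catch with a fallback can look like in custom code. This is a sketch under assumptions: `send_to_crm` and `save_for_review` are hypothetical stand-ins for your real actions, and the outage is simulated.

```python
def process_lead(lead, primary, fallback):
    """Try the primary action; on any failure, run the fallback instead."""
    try:
        primary(lead)
        return "primary"
    except Exception as error:  # Broad on purpose: every failure should reach the fallback.
        fallback(lead, error)
        return "fallback"


review_queue = []


def send_to_crm(lead):
    raise ConnectionError("CRM unavailable")  # Simulated outage.


def save_for_review(lead, error):
    # Keep the lead and the reason so nothing is silently lost.
    review_queue.append({"lead": lead, "error": str(error)})


outcome = process_lead({"email": "jane@example.com"}, send_to_crm, save_for_review)
print(outcome, len(review_queue))
```

The lead ends up in a review queue rather than disappearing, which is the partial success that graceful degradation aims for.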

Long-Term Prevention

Create a maintenance schedule:

Weekly: Review error logs
Monthly: Test critical workflows manually
Quarterly: Review and optimize

Set up proactive checks:

Health check workflow that tests your other workflows
Runs daily, alerts if any issues
Catches problems before they cause damage
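
One simple health check is to flag any workflow whose last successful run is older than you'd expect. The sketch below assumes you can get a last-success timestamp for each workflow. The workflow names, the 24-hour threshold, and the fixed timestamps are all illustrative.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_success, now, max_gap=timedelta(hours=24)):
    """Return True when the last successful run is older than max_gap."""
    return now - last_success > max_gap


def stale_workflows(last_success_by_name, now, max_gap=timedelta(hours=24)):
    """Return the names of workflows that need attention, sorted."""
    return sorted(
        name
        for name, last_success in last_success_by_name.items()
        if is_stale(last_success, now, max_gap)
    )


# Hypothetical snapshot: one workflow last succeeded 36 hours ago.
now = datetime(2024, 1, 2, 8, 43, tzinfo=timezone.utc)
last_runs = {
    "lead-capture": now - timedelta(hours=36),
    "order-sync": now - timedelta(hours=2),
}

print(stale_workflows(last_runs, now))
```

Run on a schedule and wired to an alert, a check like this would have flagged the 36-hour gap from the opening story long before the Slack message arrived.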

This is what we do with our maintenance plans: proactive monitoring, quick fixes, updates when APIs change. For businesses where downtime costs money, it's cheaper than scrambling. We don't build and bail.

When to Call for Help

Some fixes require expertise. Know when to escalate instead of wasting hours on trial and error.

Call for Help If:

1. Error message is incomprehensible

Technical jargon you don't understand
Multiple layers of nested errors
No clear solution in documentation

2. Fix requires code

Custom API calls needed
Complex data transformation
Security/authentication issues beyond basic reconnection

3. It keeps breaking

You've "fixed" it 3+ times
Same error recurring
Different errors each time

4. Multiple workflows affected

Systemic issue (not isolated)
Might indicate architectural problem
May need to rebuild from scratch

5. Stakes are too high

Revenue-critical workflow
Customer-facing process
Can't afford trial-and-error

What to Have Ready When You Call

Error messages (screenshots help)
What you've tried already
When it started breaking
How often it breaks
Business impact (cost of downtime)

Support Options

Option	Cost	Speed	Best For
Platform support (Zapier, Make)	Free	Slow	Simple issues
Freelancer	$50-150/hour	Fast	One-time fixes
Agency	$100-200/hour	Varies	Complex rebuilds
Maintenance retainer	$300-800/month	Same-day	Ongoing reliability

Platform support is free but slow. Freelancers are fast but no ongoing relationship. A maintenance retainer is proactive instead of reactive—we catch issues before they cause damage.

Quick Reference: Troubleshooting Flowchart

Automation not working?

Check status pages → If outage, wait
Check if trigger is firing → If no trigger data, fix source
Check error logs → Note the error message
Match error to culprit list above
Apply the fix
Test thoroughly
Set up monitoring
Document what happened

Time estimates:

Authentication expired: 5-15 minutes
Field mapping broke: 15-30 minutes
Rate limiting: 10-20 minutes
API changes: 1-4 hours (or call for help)
Edge case handling: 30-60 minutes

The Key Insight

When automation breaks, here's your game plan:

Confirm it's actually broken (check status pages)
Identify where it failed (error logs)
Diagnose the culprit (8 common causes)
Test your fix (don't assume it worked)
Prevent future breaks (monitoring + maintenance)

Most failures are fixable with these steps. Some require expertise.

The key: catch failures fast. The longer automation is broken, the more expensive the fix. Every hour of downtime is leads lost, orders missed, customers frustrated.

This is why "set it and forget it" is a lie. Automation needs care. APIs change. Platforms have outages. Your business evolves. Without someone watching, small problems become expensive disasters.

Question everything. Automate the rest.

Tired of Being the One to Fix Broken Automation at 2 AM?

Our maintenance plans include:

24/7 monitoring — we catch issues before you notice
Quick fixes — usually same-day resolution
Proactive updates — API changes handled automatically
Priority support — no waiting in queue

Starting at $300/month for basic systems. For businesses where downtime costs money, it's cheaper than scrambling.

Book Free Audit or See Maintenance Plans →
