What's Actually Happening

A Rundeck job either fails to start or stops partway through, and the execution is marked as failed instead of succeeded.

The Error You'll See

```bash
# In Rundeck UI:
Execution failed: Node step failed against: node1
```

Node connectivity error:

```bash
Error: Failed to connect to node: Connection refused
```

Authentication error:

```bash
Error: Authentication failed for user: rundeck
```

Script error:

```bash
Error: Script failed with exit code: 1
```

Permission error:

```bash
Error: Access denied: Insufficient permissions
```

Why This Happens

  1. Node unreachable - Target node not accessible
  2. SSH key issues - Invalid or missing SSH credentials
  3. Job misconfigured - Wrong options or node filters
  4. Script errors - Job script fails during execution
  5. Permission denied - User lacks job execution permission
  6. Resource exhaustion - Node out of disk, memory, or CPU
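
As a first triage step, an error line can be mapped to one of the causes above. The helper below is a minimal sketch: `classify_rundeck_error` is a hypothetical name, and the patterns are assumptions based on the sample messages in this article plus two common Linux error strings, not an official Rundeck error catalog.

```bash
#!/usr/bin/env bash
# classify_rundeck_error MESSAGE
# Hypothetical helper: prints the most likely cause for an error line.
# Patterns are illustrative assumptions, not a complete list.
classify_rundeck_error() {
  case "$1" in
    *"Connection refused"*|*"Failed to connect"*)
      echo "node unreachable" ;;
    *"Authentication failed"*)
      echo "ssh key or credential issue" ;;
    *"exit code"*)
      echo "script error" ;;
    *"Access denied"*|*"Insufficient permissions"*)
      echo "permission denied" ;;
    *"No space left"*|*"Cannot allocate memory"*)
      echo "resource exhaustion" ;;
    *)
      echo "unknown: check job configuration and execution log" ;;
  esac
}

# Example usage:
classify_rundeck_error "Error: Failed to connect to node: Connection refused"
# prints: node unreachable
```

Anything that falls through to the last branch usually needs the full execution log, which Step 1 covers.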

Step 1: Check Job Execution Log

```bash
# View the execution in the Rundeck UI:
# Activity -> Executions -> click the failed execution

# Via API:
curl -H "Accept: application/json" \
  -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/execution/<execution-id>" | jq .

# Get execution output (JSON, so jq can parse it):
curl -H "Accept: application/json" \
  -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/execution/<execution-id>/output" | jq .

# Check execution state:
curl -H "Accept: application/json" \
  -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/execution/<execution-id>/state" | jq .

# View execution log in the UI:
# Activity -> Executions -> Execution -> Log Output

# Check for error messages:
# Look for "Error:", "Failed:", "Exception:" in the log

# Download the execution log as plain text:
curl -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/execution/<execution-id>/output?format=text"
```

Step 2: Check Node Connectivity

```bash
# List configured nodes:
curl -H "Accept: application/json" \
  -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/project/myproject/resources" | jq .

# Check node status in UI:
# Project -> Nodes -> view node status

# Test node connectivity from Rundeck:
# In the UI, run a harmless ad hoc command such as "uptime" against the node
# (Commands page, with a node filter like "name: node1")

# Manual SSH test:
ssh rundeck@node1

# Check SSH configuration:
grep ssh /etc/rundeck/framework.properties

# Confirm the SSH key exists (print its fingerprint, never the private key):
sudo -u rundeck ssh-keygen -lf /var/lib/rundeck/.ssh/id_rsa

# Verify SSH key permissions (id_rsa should be 600):
ls -la /var/lib/rundeck/.ssh/

# Test SSH as the rundeck user:
sudo -u rundeck ssh node1

# Check SSH config:
cat /var/lib/rundeck/.ssh/config

# Node definition in resources.yaml:
# node1:
#   hostname: node1.example.com
#   username: rundeck
#   ssh-keypath: /var/lib/rundeck/.ssh/id_rsa
```

Step 3: Verify Job Configuration

```bash
# View job definition (YAML export):
curl -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/job/<job-id>?format=yaml"

# Check job options:
# In UI: Jobs -> Job -> Edit

# Job node filter:
# The node filter determines which nodes run the job
# Examples:
#   name: node1
#   tags: production
#   osFamily: linux

# Job steps:
# Steps execute in order; verify each step's configuration

# Workflow strategy:
#   node-first: run all steps on one node before moving to the next node
#   step-first: run each step on all nodes before moving to the next step

# Check job options validation:
# Required options must be provided

# Job timeout:
# Verify the execution timeout is long enough for the slowest node

# Retry configuration:
# Check retry count and delay
```

Step 4: Fix SSH Authentication

```bash
# Check SSH key:
ls -la /var/lib/rundeck/.ssh/id_rsa

# Generate SSH key (PEM format, which older Rundeck SSH libraries may require):
sudo -u rundeck ssh-keygen -t rsa -b 4096 -m PEM -f /var/lib/rundeck/.ssh/id_rsa

# Copy key to target nodes:
sudo -u rundeck ssh-copy-id rundeck@node1

# Or manually:
cat /var/lib/rundeck/.ssh/id_rsa.pub | ssh rundeck@node1 'cat >> ~/.ssh/authorized_keys'

# Fix permissions:
chmod 700 /var/lib/rundeck/.ssh
chmod 600 /var/lib/rundeck/.ssh/id_rsa
chmod 644 /var/lib/rundeck/.ssh/id_rsa.pub

# Test passwordless SSH:
sudo -u rundeck ssh node1 'whoami'

# Use password auth instead (node attributes in the node definition;
# the password is read from a secure job option named sshPassword):
#   username: rundeck
#   ssh-authentication: password
#   ssh-password-option: option.sshPassword

# SSH config for the rundeck user's OpenSSH client.
# Warning: this disables host key verification; use only on trusted networks.
cat > /var/lib/rundeck/.ssh/config << 'EOF'
Host *
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
  LogLevel ERROR
EOF

chown rundeck:rundeck /var/lib/rundeck/.ssh/config
chmod 600 /var/lib/rundeck/.ssh/config
```
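
The permission requirements (700 on the directory, 600 on the private key) can be checked in one call. This sketch assumes GNU `stat` (`stat -c %a`), as found on most Linux distributions; `check_ssh_perms` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# check_ssh_perms DIR [KEY_NAME]
# Warns when the .ssh directory is not 700 or the private key is not 600.
# Returns 0 if both are correct, 1 on a mismatch, 2 if a path is missing.
check_ssh_perms() {
  local dir=$1 key=${2:-id_rsa} rc=0 mode
  mode=$(stat -c '%a' "$dir") || return 2
  [ "$mode" = "700" ] || { echo "WARN: $dir is $mode, expected 700"; rc=1; }
  mode=$(stat -c '%a' "$dir/$key") || return 2
  [ "$mode" = "600" ] || { echo "WARN: $dir/$key is $mode, expected 600"; rc=1; }
  [ "$rc" -eq 0 ] && echo "OK: permissions look correct"
  return "$rc"
}

# Example usage:
#   check_ssh_perms /var/lib/rundeck/.ssh
```

OpenSSH refuses private keys that are readable by other users, so a wrong mode here can look like an authentication failure.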

Step 5: Check Script Errors

```bash
# View script error in execution log:
# Activity -> Executions -> Log Output

# Common script errors:

# 1. Command not found:
#    Ensure the command is in PATH or use its full path

# 2. Permission denied:
#    Check execute permission:
chmod +x script.sh
#    Check that the user has permission to run it

# 3. Environment variable missing:
#    Set it in job options or in the script

# 4. Working directory wrong:
#    Use absolute paths or cd to the correct directory

# Test script manually:
ssh node1 sudo -u rundeck /path/to/script.sh

# Check script exit code:
/path/to/script.sh; echo $?

# Add error handling at the top of the script:
#   #!/bin/bash
#   set -e  # Exit on error
#   set -x  # Print commands

# Debug script execution:
bash -x /path/to/script.sh

# Check script output redirection:
# Ensure stdout/stderr are captured
```

Step 6: Check User Permissions

```bash
# Check the admin ACL policy:
cat /etc/rundeck/admin.aclpolicy

# User permissions in UI:
# User Profile -> Edit Profile -> ACL

# Required permissions:
# - job:run
# - node:run
# - execution:read

# List all ACL policy files:
ls -l /etc/rundeck/*.aclpolicy

# Check the user's Rundeck roles (default file-based realm;
# each line has the form "username: password,role1,role2"):
grep '^<username>' /etc/rundeck/realm.properties

# Confirm which user the API token belongs to:
curl -H "Accept: application/json" \
  -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/user/info" | jq .

# Add a user to the admin group:
# ACL groups are Rundeck roles, not OS groups. With the file-based realm,
# append ",admin" to the user's line in /etc/rundeck/realm.properties

# Check project role mapping:
grep -i role /etc/rundeck/framework.properties

# Create ACL policy:
cat > /etc/rundeck/myproject.aclpolicy << 'EOF'
description: Project permissions
context:
  project: myproject
for:
  resource:
    - equals:
        kind: job
      allow: [create, delete]
  job:
    - allow: [create, read, update, delete, run]
  node:
    - allow: [read, run]
by:
  group: admin
EOF
```

Step 7: Check Node Resources

```bash
# Check node resources during execution:
ssh node1 'free -m'
ssh node1 'df -h'
ssh node1 'uptime'

# Disk space:
df -h /tmp
df -h /var

# Memory:
free -m
cat /proc/meminfo

# CPU load:
uptime
top -bn1 | head -20

# Check for zombie processes (state starts with Z):
ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'

# Check process limits:
ulimit -a

# Check logs on node:
journalctl -xe
tail -n 100 /var/log/syslog

# Check Rundeck temporary files:
ls -la /tmp/rundeck/

# Clean up old executions:
# In Rundeck UI: Project Settings -> Edit Configuration -> Execution History Clean
```

Step 8: Fix Environment Issues

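```bash
# Check environment variables seen by the job (add to the script):
env | sort

# Set environment at the top of the job script:
export PATH=$PATH:/custom/bin
export JAVA_HOME=/usr/lib/jvm/java-11

# Pass values from the job definition:
# Job options are exposed to scripts as RD_OPTION_<NAME>
# (for example, option "myVar" becomes $RD_OPTION_MYVAR).
# Over SSH this requires "AcceptEnv RD_*" in the node's sshd_config.

# Check the rundeck user's environment on the server:
sudo -u rundeck env

# Server identity settings in framework.properties:
grep '^framework.server' /etc/rundeck/framework.properties
# framework.server.name = rundeck
# framework.server.hostname = rundeck.example.com

# Node-specific values:
# Node attributes are exposed as RD_NODE_<ATTRIBUTE>
# (for example, $RD_NODE_NAME and $RD_NODE_HOSTNAME),
# subject to the same AcceptEnv requirement over SSH.

# Debug environment in execution:
echo "PATH=$PATH"
echo "HOME=$HOME"
```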

Step 9: Handle Job Scheduling Issues

```bash
# Check job schedule:
curl -H "Accept: application/json" \
  -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/job/<job-id>/info" \
  | jq '{scheduled, scheduleEnabled, nextScheduledExecution}'

# Verify schedule in UI:
# Jobs -> Job -> Schedule tab

# Quartz schedule syntax:
# Seconds Minutes Hours Day-of-Month Month Day-of-Week [Year]
# 0 0 12 * * ?    # Every day at noon

# Test schedule:
# Use "Run Job Now" to test manually

# Check scheduler status:
curl -H "Accept: application/json" \
  -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/system/info" | jq .system.scheduler

# Failed scheduled executions:
# Check execution history for patterns (same time of day, same node)

# Schedule disabled:
# Job -> Edit -> Schedule -> Enabled checkbox

# Concurrent executions:
# Check whether overlapping executions conflict
# Review the "Multiple Executions" setting in the job config
```

Step 10: Rundeck Verification Script

```bash
# Create verification script:
cat << 'EOF' > /usr/local/bin/check-rundeck-job.sh
#!/bin/bash

JOB_ID=${1:-""}
RD_URL=${2:-"http://localhost:4440"}
RD_TOKEN=${3:-""}
PROJECT=${4:-"myproject"}

# GET an API path and return JSON
rd_get() {
  curl -s -H "Accept: application/json" \
    -H "X-Rundeck-Auth-Token: $RD_TOKEN" "$RD_URL/api/41/$1"
}

echo "=== Rundeck Status ==="
rd_get "system/info" | jq '.system'

echo ""
echo "=== Project List ==="
rd_get "projects" | jq -r '.[].name'

echo ""
echo "=== Nodes ==="
rd_get "project/$PROJECT/resources" | jq 'keys'

echo ""
echo "=== Recent Executions ==="
rd_get "project/$PROJECT/executions?max=5" \
  | jq '.executions[] | {id, status, started: ."date-started".date}'

if [ -n "$JOB_ID" ]; then
  echo ""
  echo "=== Job Info ==="
  rd_get "job/$JOB_ID/info" | jq '.'

  echo ""
  echo "=== Job Executions ==="
  rd_get "job/$JOB_ID/executions?max=5" \
    | jq '.executions[] | {id, status, started: ."date-started".date}'
fi

echo ""
echo "=== SSH Key Check ==="
ls -la /var/lib/rundeck/.ssh/

echo ""
echo "=== Recommendations ==="
echo "1. Verify node connectivity with SSH test"
echo "2. Check SSH key permissions (600)"
echo "3. Review job configuration and node filters"
echo "4. Test script manually on target node"
echo "5. Verify user has execution permissions"
echo "6. Check node resources (disk, memory)"
echo "7. Review execution log for errors"
EOF

chmod +x /usr/local/bin/check-rundeck-job.sh

# Usage (the project argument is optional and defaults to myproject):
#   /usr/local/bin/check-rundeck-job.sh <job-id> http://rundeck:4440 <token> [project]
```

Rundeck Job Checklist

| Check | Expected |
|---|---|
| Node connectivity | SSH works |
| SSH key | Valid and correct permissions |
| Job config | Correct options and nodes |
| Script | Executes successfully |
| Permissions | User can run job |
| Node resources | Adequate disk/memory |
| Environment | Variables set correctly |

Verify the Fix

```bash
# After fixing Rundeck job issues

# 1. Test node connectivity
ssh rundeck@node1
# Expected: connected successfully

# 2. Run job manually
# UI: Jobs -> Job -> Run Job Now
# Expected: job completes

# 3. Check execution log
# Activity -> Executions -> Log Output
# Expected: no errors

# 4. Verify schedule
# Jobs -> Job -> Schedule
# Expected: schedule active

# 5. Test SSH key
sudo -u rundeck ssh node1 'hostname'
# Expected: returns the node name

# 6. Check recent executions
curl -H "Accept: application/json" \
  -H "X-Rundeck-Auth-Token: <token>" \
  "http://rundeck:4440/api/41/project/myproject/executions?max=5"
# Expected: shows successful runs
```

  • [Fix Jenkins Build Stuck](/articles/fix-jenkins-build-stuck)
  • [Fix Ansible Host Unreachable](/articles/fix-ansible-host-unreachable)
  • [Fix SSH Connection Refused](/articles/fix-ssh-connection-refused)