Introduction
When an EKS node group launches EC2 instances but kubectl get nodes never shows them, the instances are failing somewhere between EC2 startup and kubelet registration. The control plane is usually healthy. The problem is that the nodes cannot complete bootstrap, authenticate to the cluster, or reach the API endpoint and required AWS services.
Symptoms
- EKS creates EC2 instances, but no new nodes appear in the cluster
- The node group stays in Creating or reports NodeCreationFailure
- Instances are reachable in EC2, yet kubelet never joins the cluster
- The issue started after changing IAM roles, VPC settings, endpoint mode, or custom AMIs
Common Causes
- The node IAM role is missing required EKS worker policies
- The node cannot reach the cluster endpoint because of subnet, NAT, or security group issues
- Custom user data or AMI changes broke the bootstrap process
- The cluster endpoint mode no longer matches the node subnet routing model
Step-by-Step Fix
1. Inspect node group health and failure details
EKS health messages usually point you toward IAM, bootstrap, or network reachability before you even log into an instance.
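The issue codes returned by the describe-nodegroup command in this step can be sorted into those same areas. The sketch below is one possible way to do that: the function name categorize_nodegroup_issues and the grouping of codes into iam, network, and launch-or-registration are assumptions made for illustration, not an official AWS classification. It expects tab-separated "code, message" lines on stdin.

```shell
# Reads "CODE<TAB>message" lines on stdin and prefixes each with a rough category.
# The category mapping is an illustrative assumption, not an AWS-defined grouping.
categorize_nodegroup_issues() {
  while IFS="$(printf '\t')" read -r code message; do
    [ -n "$code" ] || continue
    case "$code" in
      IamNodeRoleNotFound|IamInstanceProfileNotFound|AccessDenied)
        category="iam" ;;
      Ec2SubnetNotFound|Ec2SubnetInvalidConfiguration|Ec2SecurityGroupNotFound|InsufficientFreeAddresses|ClusterUnreachable)
        category="network" ;;
      NodeCreationFailure|AsgInstanceLaunchFailures)
        category="launch-or-registration" ;;
      *)
        category="other" ;;
    esac
    printf '%s\t%s\t%s\n' "$category" "$code" "$message"
  done
}

# Usage (requires AWS credentials; names are the placeholders used in this guide):
# aws eks describe-nodegroup \
#   --cluster-name my-cluster \
#   --nodegroup-name my-nodegroup \
#   --query "nodegroup.health.issues[].[code,message]" \
#   --output text | categorize_nodegroup_issues
```

A NodeCreationFailure code lands in the broadest bucket on purpose, since it only says that instances launched without registering; the remaining steps narrow down whether IAM, networking, or bootstrap is responsible.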
aws eks describe-nodegroup \
--cluster-name my-cluster \
--nodegroup-name my-nodegroup \
--query "nodegroup.[status,health.issues]"
2. Verify the node IAM role has the expected worker policies
The node role must be able to describe the cluster, work with networking, and pull from ECR for normal managed-node operation.
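The list-attached-role-policies command in this step prints what is attached; a small filter can turn that into a list of what is missing. The sketch below assumes the expected set is the three AWS managed policies commonly attached to EKS node roles (AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, AmazonEC2ContainerRegistryReadOnly). That set, the variable EXPECTED_POLICIES, and the function name missing_node_policies are assumptions for illustration. Your cluster may legitimately differ, for example if the CNI policy is attached to a separate service account role.

```shell
# Assumed standard policy set for an EKS node role; adjust for your environment.
EXPECTED_POLICIES="AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy AmazonEC2ContainerRegistryReadOnly"

# Reads attached policy names on stdin (separated by spaces, tabs, or newlines)
# and prints every expected policy that is not in that list.
missing_node_policies() {
  attached="$(tr -s '[:space:]' '\n')"
  for policy in $EXPECTED_POLICIES; do
    if ! printf '%s\n' "$attached" | grep -qxF "$policy"; then
      echo "$policy"
    fi
  done
}

# Usage (requires AWS credentials; my-node-role is the placeholder from this guide):
# aws iam list-attached-role-policies --role-name my-node-role \
#   --query "AttachedPolicies[].PolicyName" --output text | missing_node_policies
```

Empty output means all three expected policies are attached; any printed name is a candidate cause to investigate before moving on to networking.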
aws iam list-attached-role-policies --role-name my-node-role
3. Check that nodes can reach the EKS API endpoint
Private and public endpoint settings must align with the subnets and routes used by the node group.
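To make the alignment concrete, the sketch below maps the two endpoint flags from resourcesVpcConfig (endpointPublicAccess and endpointPrivateAccess) to what each combination implies for node connectivity. The function name describe_endpoint_mode and the wording of the hints are assumptions for illustration; treat the output as a reminder of what to check, not a diagnosis.

```shell
# Prints what the cluster endpoint mode implies for node connectivity.
# $1 = endpointPublicAccess, $2 = endpointPrivateAccess (true/false, any letter case).
describe_endpoint_mode() {
  public="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  private="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
  case "$public/$private" in
    true/false)
      echo "public-only: nodes reach the API over its public address, so private subnets need a NAT route" ;;
    true/true)
      echo "public-and-private: nodes inside the VPC use the private endpoint; check security groups allow 443" ;;
    false/true)
      echo "private-only: nodes must reach the API from inside the VPC; check security groups and VPC DNS settings" ;;
    *)
      echo "unknown: expected true/false values, got '$1' '$2'"
      return 1 ;;
  esac
}

# Usage (requires AWS credentials; my-cluster is the placeholder from this guide):
# set -- $(aws eks describe-cluster --name my-cluster \
#   --query "cluster.resourcesVpcConfig.[endpointPublicAccess,endpointPrivateAccess]" \
#   --output text)
# describe_endpoint_mode "$1" "$2"
```

The lowercase conversion is there because the AWS CLI text output prints booleans as True and False.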
aws eks describe-cluster \
--name my-cluster \
--query "cluster.resourcesVpcConfig"
4. Inspect bootstrap and kubelet logs on a failing node
If the instance launched but still did not register, the bootstrap sequence and kubelet logs usually reveal the real blocker.
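Once the kubelet log is captured with the journalctl command in this step, a rough first pass is to look for substrings that tend to accompany authentication, network, or certificate failures. The sketch below is a heuristic only: the function name triage_kubelet_log and the chosen patterns are assumptions, real log wording varies by kubelet version and AMI, and a match is a hint about where to look rather than proof of the cause.

```shell
# Reads kubelet log text on stdin and prints one hint per matched failure family.
# The patterns are illustrative assumptions, not an exhaustive or official list.
triage_kubelet_log() {
  kubelet_log="$(cat)"
  matched=0
  if printf '%s\n' "$kubelet_log" | grep -qiE 'unauthorized|forbidden'; then
    echo "auth: the API rejected the node; check the node IAM role and its cluster access mapping"
    matched=1
  fi
  if printf '%s\n' "$kubelet_log" | grep -qiE 'i/o timeout|no such host|connection refused|no route to host'; then
    echo "network: the node could not reach an endpoint; check routes, NAT, DNS, and security groups"
    matched=1
  fi
  if printf '%s\n' "$kubelet_log" | grep -qi 'x509'; then
    echo "tls: certificate validation failed; check the cluster CA and endpoint passed at bootstrap"
    matched=1
  fi
  if [ "$matched" -eq 0 ]; then
    echo "unknown: no known pattern matched; read the log manually"
  fi
}

# Usage on a failing node:
# sudo journalctl -u kubelet -n 200 --no-pager | triage_kubelet_log
```

If nothing matches, the kubelet may never have started, in which case the bootstrap output from user data is the next place to look.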
sudo journalctl -u kubelet -n 100 --no-pager
Prevention
- Keep node IAM roles on the standard EKS managed policy set unless you have a clear reason to customize them
- Validate endpoint access design whenever you move nodes into new private or public subnets
- Be careful with custom AMIs and user data that replace the normal EKS bootstrap flow
- Monitor node group health events and cluster node count together during rollouts
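As a sketch of the last point, the functions below compare a node group's desired size with the number of nodes that actually report Ready. The function names count_ready_nodes and check_rollout are hypothetical, and the usage assumes managed node groups, whose nodes carry the eks.amazonaws.com/nodegroup label.

```shell
# Counts nodes whose STATUS column starts with "Ready" in the output of
# "kubectl get nodes --no-headers" (read from stdin). NotReady is not counted.
count_ready_nodes() {
  awk '$2 ~ /^Ready/ { n++ } END { print n + 0 }'
}

# Compares desired and ready counts; returns non-zero while the rollout lags.
# $1 = desired node count, $2 = ready node count.
check_rollout() {
  desired="$1"
  ready="$2"
  if [ "$ready" -ge "$desired" ]; then
    echo "ok: $ready/$desired nodes Ready"
  else
    echo "lagging: $ready/$desired nodes Ready"
    return 1
  fi
}

# Usage (requires AWS credentials and cluster access; names are placeholders):
# desired="$(aws eks describe-nodegroup --cluster-name my-cluster \
#   --nodegroup-name my-nodegroup \
#   --query "nodegroup.scalingConfig.desiredSize" --output text)"
# ready="$(kubectl get nodes -l eks.amazonaws.com/nodegroup=my-nodegroup \
#   --no-headers | count_ready_nodes)"
# check_rollout "$desired" "$ready"
```

Running this alongside the node group health query during a rollout shows instances that launch but never register as a gap that persists between the two numbers.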