What's Actually Happening

A Google Cloud Compute Engine instance fails to start. It either stays stuck in provisioning, or the VM ends up in an error status.

The Error You'll See

Instance failed:

```bash
gcloud compute instances describe my-instance --zone=us-central1-a

status: "TERMINATED"
lastStopTimestamp: "2024-01-01T00:00:00.000Z"
```

Provisioning error:

```bash
gcloud compute operations list --filter="status=RUNNING OR status=DONE" --limit=10

Error: The resource 'projects/my-project/zones/us-central1-a/instances/my-instance' was not found
```

Quota exceeded:

```bash
Quota 'CPUS' exceeded. Limit: 8 in region us-central1.
```

Why This Happens

  1. Quota exceeded - Project quota limit reached
  2. Boot disk issue - Boot disk corrupted or full
  3. Machine type unavailable - Zone doesn't have capacity
  4. Service account issue - Service account deleted or changed
  5. Network issue - VPC or subnet misconfigured
  6. Image not available - Image deleted or unavailable
  7. Resource limits - CPU, memory, or GPU limits
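Several of the causes above can be told apart from the error text alone. The sketch below is a hypothetical helper (`classify_start_error` is not part of gcloud) that maps a message to a rough category. The patterns are assumptions based on the sample errors in this article, plus the `ZONE_RESOURCE_POOL_EXHAUSTED` code that Compute Engine reports when a zone lacks capacity, so adjust them to the messages you actually see.

```bash
# Hypothetical helper: map an error message to a likely cause.
# The patterns are illustrative, not an exhaustive list of GCE errors.
classify_start_error() {
  case "$1" in
    *Quota*exceeded*)               echo "quota" ;;
    *ZONE_RESOURCE_POOL_EXHAUSTED*) echo "zone-capacity" ;;
    *"was not found"*)              echo "missing-resource" ;;
    *[Ss]"ervice account"*)         echo "service-account" ;;
    *)                              echo "unknown" ;;
  esac
}
```

Calling `classify_start_error "Quota 'CPUS' exceeded. Limit: 8 in region us-central1."` prints `quota`, which points at Step 2.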

Step 1: Check Instance Status

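```bash
# List all instances in the project:
gcloud compute instances list

# Show full details for one instance:
gcloud compute instances describe my-instance --zone=us-central1-a

# Read the boot log from the serial port:
gcloud compute instances get-serial-port-output my-instance --zone=us-central1-a

# Check recent operations (targetLink is a full URL, so match on its suffix):
gcloud compute operations list --filter='targetLink~instances/my-instance$'
```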
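The status field usually tells you which step to look at next. As a minimal sketch, the hypothetical `next_step` function below reads the status with `--format="value(status)"` and prints a suggestion. The suggestions are assumptions drawn from the steps in this article, not an official decision tree.

```bash
# Hypothetical helper: print the instance status and a suggested next step.
# Usage: next_step INSTANCE ZONE
next_step() {
  local status
  status="$(gcloud compute instances describe "$1" --zone="$2" --format="value(status)")" || return 1
  case "$status" in
    RUNNING)              echo "$status: instance is up; check SSH and firewall rules" ;;
    PROVISIONING|STAGING) echo "$status: still starting; check quota and zone capacity if it stays here" ;;
    TERMINATED)           echo "$status: instance is stopped; start it and watch the operation result" ;;
    *)                    echo "$status: see the serial port output for details" ;;
  esac
}
```

For example, `next_step my-instance us-central1-a` prints a single line that begins with the current status.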

Step 2: Check Project Quota

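```bash
# Show project details, including project-wide quotas:
gcloud compute project-info describe --project=my-project

# Show only the project-wide quotas:
gcloud compute project-info describe --project=my-project --format="yaml(quotas)"

# Regional quotas such as CPUS are reported per region:
gcloud compute regions describe us-central1 --project=my-project --format="yaml(quotas)"

# Common quotas:
# - CPUS
# - IN_USE_ADDRESSES
# - DISKS_TOTAL_GB
# - PREEMPTIBLE_CPUS

# Request quota increase:
# Cloud Console -> IAM & Admin -> Quotas
```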
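Reading the raw quota list by eye is error-prone, so it can help to compute the remaining headroom. The sketch below is a hypothetical `quota_shortfalls` filter that reads `metric,limit,usage` rows on stdin and prints the quotas with less free capacity than you need. The gcloud invocation in the comment is an assumption about how to produce that CSV (flattening the `quotas` list); check its output shape in your environment before relying on it.

```bash
# Hypothetical helper: read "metric,limit,usage" CSV rows on stdin and print
# every quota whose remaining headroom is below the amount you need.
# Usage: ... | quota_shortfalls NEEDED
#
# One possible way to produce the input (assumed, verify the output first):
#   gcloud compute regions describe us-central1 \
#     --flatten="quotas[]" \
#     --format="csv[no-heading](quotas.metric,quotas.limit,quotas.usage)" \
#     | quota_shortfalls 4
quota_shortfalls() {
  awk -F, -v need="$1" '
    NF >= 3 {
      remaining = $2 - $3
      if (remaining < need)
        printf "%s: %g of %g used, %g free\n", $1, $3, $2, remaining
    }'
}
```

An empty result means every listed quota has at least the requested headroom.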

Step 3: Check Boot Disk

```bash
gcloud compute disks list

gcloud compute disks describe my-instance-disk --zone=us-central1-a

# Check disk size:
gcloud compute disks describe my-instance-disk --zone=us-central1-a --format="yaml(sizeGb)"

# Check if disk attached:
gcloud compute disks describe my-instance-disk --zone=us-central1-a --format="yaml(users)"

# Create disk from snapshot if corrupted:
gcloud compute disks create my-new-disk \
  --source-snapshot=my-snapshot \
  --zone=us-central1-a
```

Step 4: Check Machine Type Availability

```bash
# List available machine types:
gcloud compute machine-types list --zones=us-central1-a

# Check specific machine type:
gcloud compute machine-types describe n1-standard-4 --zone=us-central1-a

# If unavailable, try different zone:
gcloud compute machine-types describe n1-standard-4 --zone=us-central1-b
```

Step 5: Check Serial Console

```bash
# Get serial port output:
gcloud compute instances get-serial-port-output my-instance --zone=us-central1-a

# Enable serial console:
gcloud compute instances add-metadata my-instance \
  --metadata=serial-port-enable=true \
  --zone=us-central1-a

# Connect to serial console:
gcloud compute connect-to-serial-port my-instance --zone=us-central1-a
```

Step 6: Check Service Account

```bash
# Show the service account attached to the instance:
gcloud compute instances describe my-instance --zone=us-central1-a --format="yaml(serviceAccounts)"

# Check service account exists:
gcloud iam service-accounts list

# Update service account (the instance must be stopped first):
gcloud compute instances set-service-account my-instance \
  --service-account=my-sa@my-project.iam.gserviceaccount.com \
  --scopes=default \
  --zone=us-central1-a
```
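The two lookups above can be combined into one check. The sketch below is a hypothetical `check_instance_sa` function that reads the attached account's email from the instance and then asks IAM whether that account can be described. It assumes the instance has at most one service account and that you have permission to view it. A failed lookup can also mean missing permissions rather than a deleted account.

```bash
# Hypothetical helper: report whether the service account attached to an
# instance can still be found.
# Usage: check_instance_sa INSTANCE ZONE
check_instance_sa() {
  local email
  email="$(gcloud compute instances describe "$1" --zone="$2" \
    --format="value(serviceAccounts[0].email)")" || return 1
  if [ -z "$email" ]; then
    echo "no service account attached"
    return 0
  fi
  if gcloud iam service-accounts describe "$email" >/dev/null 2>&1; then
    echo "$email exists"
  else
    echo "$email is missing or not visible to you"
    return 1
  fi
}
```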

Step 7: Check Network Configuration

```bash
# Check instance network:
gcloud compute instances describe my-instance --zone=us-central1-a --format="yaml(networkInterfaces)"

# Check VPC:
gcloud compute networks list

gcloud compute networks describe default

# Check subnet:
gcloud compute networks subnets list --regions=us-central1

# Check firewall rules:
gcloud compute firewall-rules list
```

Step 8: Try Different Zone

```bash
# Check zones in region (":" matches the region name inside the region URL):
gcloud compute zones list --filter="region:us-central1"

# Create instance in different zone:
gcloud compute instances create my-instance \
  --zone=us-central1-b \
  --machine-type=n1-standard-4 \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud
```

Step 9: Check Logs

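```bash
# Cloud Logging (instance_id is the numeric instance ID, not the instance name):
INSTANCE_ID="$(gcloud compute instances describe my-instance --zone=us-central1-a --format="value(id)")"
gcloud logging read "resource.type=gce_instance AND resource.labels.instance_id=${INSTANCE_ID}" --limit=10

# Activity logs (resourceName is a full path, so match on a substring):
gcloud logging read 'resource.type=gce_instance AND protoPayload.resourceName:"instances/my-instance"' --limit=20

# Check for errors:
gcloud logging read "severity>=ERROR AND resource.type=gce_instance" --limit=10
```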

Step 10: Recreate Instance

```bash
# Stop instance:
gcloud compute instances stop my-instance --zone=us-central1-a

# Create machine image:
gcloud compute machine-images create my-image \
  --source-instance=my-instance \
  --source-instance-zone=us-central1-a

# Create new instance from image:
gcloud compute instances create my-new-instance \
  --zone=us-central1-a \
  --source-machine-image=my-image

# Or restore from snapshot:
gcloud compute disks create my-disk \
  --source-snapshot=my-snapshot \
  --zone=us-central1-a

# Boot from the restored disk (delete the original instance first, or pick a new name):
gcloud compute instances create my-instance \
  --disk=name=my-disk,boot=yes \
  --zone=us-central1-a
```

GCP GCE Instance Not Starting Checklist

| Check | Command | Expected |
| --- | --- | --- |
| Instance status | `gcloud compute instances describe` | RUNNING |
| Quota | `gcloud compute project-info` | Within limits |
| Boot disk | `gcloud compute disks describe` | Healthy |
| Machine type | `gcloud compute machine-types` | Available |
| Serial output | `get-serial-port-output` | Check for errors |
| Service account | `gcloud iam service-accounts` | Active |

Verify the Fix

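```bash
# Confirm the instance reports RUNNING:
gcloud compute instances describe my-instance --zone=us-central1-a --format="yaml(status)"

# Confirm the boot log shows no errors:
gcloud compute instances get-serial-port-output my-instance --zone=us-central1-a

# Confirm quota usage is within limits:
gcloud compute project-info describe --format="yaml(quotas)"

# Confirm the instance appears in the list:
gcloud compute instances list --filter="name=my-instance"

# Connect over SSH using the instance's external IP:
ssh user@external-ip
```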
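An instance can take a while to move from PROVISIONING through STAGING to RUNNING, so a single status check may be misleading. The sketch below is a hypothetical `wait_for_running` function that polls the status until it reads RUNNING or a retry budget runs out. The defaults (30 attempts, 10 seconds apart) are arbitrary assumptions.

```bash
# Hypothetical helper: poll until the instance reports RUNNING or the
# attempts run out.
# Usage: wait_for_running INSTANCE ZONE [ATTEMPTS] [DELAY_SECONDS]
wait_for_running() {
  local attempts="${3:-30}"
  local delay="${4:-10}"
  local status=""
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    status="$(gcloud compute instances describe "$1" --zone="$2" \
      --format="value(status)" 2>/dev/null)"
    if [ "$status" = "RUNNING" ]; then
      echo "RUNNING"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up; last status: ${status:-unknown}" >&2
  return 1
}
```

For example, `wait_for_running my-instance us-central1-a && ssh user@external-ip` only attempts the SSH connection once the instance is up.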

  • [Fix AWS EC2 Instance Not Starting](/articles/fix-aws-ec2-instance-not-starting)
  • [Fix Azure VM Not Starting](/articles/fix-azure-vm-not-starting)
  • [Fix GCP GKE Cluster Not Starting](/articles/fix-gcp-gke-cluster-not-starting)