Categories
Pick a stack to drill into. Fixes are grouped by the tool that actually broke.
Kubernetes
3 fixes · Pods, nodes, networking, RBAC.
How to Fix CrashLoopBackOff in Kubernetes
A practical, step-by-step guide to diagnosing and fixing CrashLoopBackOff errors in Kubernetes pods.
Fix ImagePullBackOff and ErrImagePull in Kubernetes
Why ImagePullBackOff happens and how to fix it — registry auth, wrong tags, and private images.
Pod Stuck in Pending or Unschedulable in Kubernetes
Diagnose why your pod won't schedule — node resources, taints, affinity, and missing PVCs.
Docker
3 fixes · Images, containers, networks, volumes.
Fix Docker "permission denied while trying to connect to the Docker daemon socket"
The classic Docker socket permission error and the safe way to fix it without `chmod 777`.
Fix Docker "no space left on device"
Reclaim disk used by dangling images, stopped containers, build cache and volumes.
Fix docker-compose "network ... not found"
Why Docker Compose can't find an external network and how to declare or create it.
Linux
3 fixes · Systemd, networking, permissions.
Fix "Too many open files" on Linux
Raise the open file descriptor limit the right way — per process, per user, and system-wide.
Debug a Failed systemd Service
From `Active: failed` to a clean restart — how to read systemd logs and fix unit files.
Fix "Permission denied" Binding to Port 80 or 443 on Linux
Bind to privileged ports without running your app as root, using capabilities or sysctl.
GitLab CI/CD
3 fixes · Runners, pipelines, caching.
Fix GitLab CI Job Stuck on "pending"
Pipeline waits forever for a runner. The fix is almost always tags, scope, or capacity.
GitLab CI Cache Not Being Restored Between Jobs
Why your `node_modules` or `.cache/` keeps re-downloading and how to make the cache actually hit.
Fix GitLab CI "jobs config should contain at least one visible job"
Lint errors on `.gitlab-ci.yml` and how to debug invalid pipeline syntax fast.
AWS
3 fixes · IAM, EC2, S3, VPC, EKS.
Fix AWS S3 403 Access Denied
A checklist for S3 403 errors — bucket policy, IAM, KMS, Object Ownership and Block Public Access.
Fix EKS Worker Nodes Not Joining the Cluster
Nodes stay missing from `kubectl get nodes`. Walk through aws-auth, IAM, security groups and user-data.
Fix AWS "is not authorized to perform: sts:AssumeRole"
Both sides of an AssumeRole call need to agree — fix the trust policy and the calling principal.
Terraform
3 fixes · State, providers, drift.
Fix Terraform "Error acquiring the state lock"
What to do when Terraform refuses to plan because the state is locked — and how to avoid it.
Fix Terraform Provider Version Mismatch
Lock-file drift after a CI upgrade — how to align providers across environments.
Detect and Fix Terraform Drift
When the real cloud no longer matches your state. Reconcile without nuking resources.
Nginx
3 fixes · Reverse proxy, SSL, performance.
Fix Nginx 502 Bad Gateway
Nginx is healthy, but every request 502s. The upstream is the suspect — here's how to confirm and fix.
Fix Nginx SSL Handshake Failures
TLS errors from clients or curl, and how to read OpenSSL output to pinpoint the cause.
Fix Nginx 413 Request Entity Too Large
Allow larger uploads by raising `client_max_body_size` in the right context: `http`, `server`, or `location`.