Experience Required: At least 5 years of relevant GCP support and administration experience. Total experience may be higher, but 5 years of GCP experience is a must.
Certification Required: At least one active Google Cloud certification (e.g., Associate Cloud Engineer, Professional Cloud Architect, or Professional Cloud DevOps Engineer).
Job Summary
We are looking for a highly skilled Level 3 GCP Support Engineer responsible for providing advanced technical support, incident management, and change management for mission-critical cloud environments hosted on Google Cloud Platform (GCP). The ideal candidate will have deep hands-on expertise across GCP services, strong automation and IaC skills with Terraform, and the ability to troubleshoot complex multi-layer cloud infrastructure issues.
Key Responsibilities
1. Incident Management (L3 Escalation)
- Act as the primary escalation point for complex incidents requiring deep technical analysis.
- Troubleshoot and resolve P1/P2 priority issues related to compute, storage, networking, databases, IAM, Kubernetes (GKE), logging/monitoring, and cloud security.
- Perform detailed root cause analysis (RCA) and implement permanent fixes to avoid repeat incidents.
- Liaise with Google Cloud Support to escalate vendor-level issues where necessary.
- Ensure SLAs, SLOs, and incident response times are consistently met.
2. Change Management
- Plan, validate, and implement controlled changes in production and non-production environments following ITIL processes.
- Perform impact analysis, risk assessment, and rollback planning for all assigned changes.
- Maintain change documentation and ensure compliance with organizational and audit standards.
3. GCP Administration & Operations
- Manage, configure, and optimize core GCP services: Compute Engine, Cloud Storage, VPC, Cloud SQL, Cloud Run, GKE, Load Balancers, Cloud Functions, and IAM.
- Monitor platform performance and troubleshoot reliability, performance, and capacity issues.
- Implement cost optimization strategies and assist with budget and quota management.
- Maintain logging, monitoring, and alerting via Cloud Monitoring, Cloud Logging, and third-party tools.
4. Infrastructure as Code (Terraform)
- Build, maintain, and enhance infrastructure-as-code (IaC) templates using Terraform.
- Standardize deployments and enforce modular, reusable Terraform code structures.
- Integrate Terraform with CI/CD pipelines for automated provisioning.
- Conduct security and compliance reviews for Terraform builds and cloud resources.
5. Automation & Process Improvement
- Develop scripts and automations using Python, Bash, or cloud-native tools to reduce manual operational overhead.
- Identify opportunities for service improvements and drive optimization initiatives.
- Create runbooks, SOPs, and automation frameworks to streamline repeated tasks.
6. Security & Compliance
- Implement and maintain GCP security best practices, including IAM least privilege, VPC security, firewall rules, CMEK, and security posture management.
- Assist in internal/external audits and ensure cloud environment compliance with organizational policies.
- Monitor and respond to security alerts, vulnerabilities, and threats.
7. Documentation & Knowledge Sharing
- Maintain comprehensive documentation for infrastructure, procedures, incident reports, and change logs.
- Mentor junior engineers (L1/L2) and conduct knowledge transfer sessions.
- Contribute to technical architecture reviews and best-practice recommendations.
Required Qualifications
- At least 5 years of hands-on experience in Google Cloud Platform support and administration.
- Strong understanding of GCP architecture, networking, security, IAM, compute, storage, and container services.
- Proven experience handling 24×7 support, on-call rotations, and production-critical workloads.
- Advanced proficiency with Terraform for IaC, including modules, workspaces, and GCP providers.
- Hands-on experience with CI/CD tools such as Cloud Build, Jenkins, GitLab, or GitHub Actions.
- Strong troubleshooting skills across Linux, networks, APIs, and cloud-native services.
- At least one active GCP Associate/Professional certification is mandatory.
- Deep understanding of ITIL processes (Incident, Problem, Change Management).
- Strong communication, analytical thinking, and technical documentation skills.
Preferred Skills (Good to Have)
- Experience with Kubernetes/GKE cluster operations and troubleshooting.
- Familiarity with monitoring and observability tools (Datadog, Prometheus, Grafana, ELK, etc.).
- Python/Go scripting experience.
Soft Skills
- Strong problem-solving and analytical skills.
- Ability to work independently under pressure.
- High sense of ownership and accountability.
- Excellent verbal and written communication for customer-facing scenarios.