- Management of cloud server infrastructure, including monitoring, scaling, performance tuning, and capacity planning.
- Automating infrastructure by developing scripts and programs for routine and complex tasks and infrastructure projects.
- Troubleshooting and debugging production issues and outages; driving and implementing RCAs; providing day-to-day support to developers, engineers, and QA; and automating routine tasks.
- Provisioning and management of clusters and containers (Docker, Rancher).
- Setting up backups, replication, and archiving; implementing disaster recovery.
- Regular performance and security audits; patching and upgrading systems and resources; benchmarking and capacity planning. Scanning for, identifying, and fixing security issues and vulnerabilities; safeguarding application and system information against accidental or unauthorized damage by implementing best practices and recommendations.
- 3 years of experience and a good educational background, preferably in computer science or engineering.
- Hands-on experience in system administration and troubleshooting, primarily on the Linux platform.
- Extensive exposure to provisioning, managing, and optimizing cloud infrastructure on a mainstream cloud provider (AWS).
- Experience managing infrastructure for front-end and back-end applications built on technologies like Node.js, using load balancers and web servers such as NGINX and ELB.
- Experience conceptualizing and implementing CI/CT/CD pipelines using tools like Chef, Puppet, Jenkins, Git, Artifactory, Docker, and containers.
- Understanding of database tools and technologies (PostgreSQL, MySQL, MongoDB, Redis), including backup and recovery, scaling, and performance tuning.
- Sound understanding of, and experience administering and implementing, monitoring, alerting, and trending tools and process flows (Nagios/Icinga, New Relic, PagerDuty, Loggly, etc.).
- Knowledge of best practices for using and implementing security tools and techniques (VPN, SSH, encryption, tunnels, etc.).