Sr. Site Reliability Engineer
Description
Albert’s mission is to foster a software platform that uses big data and machine learning to drastically accelerate the invention of new formulations and novel materials. We are looking for a Sr. Site Reliability Engineer to solve complex problems related to cloud infrastructure services and to build automation that prevents problems from recurring. This role suits those who have a passion for building robust, developer-friendly services at scale and who enjoy working on cloud infrastructure as well as helping set development and deployment standards and conventions across the organization. The SRE team’s goal is to ensure that our cloud infrastructure offers a wonderful developer experience and is easy to build, test, deploy, measure, and scale.
What you’ll do
- Act as a passionate representative of the Henkel and Albert product and brand.
- Regularly work (up to 50% of your time) as part of the development team, defining and implementing features and improvements in service architecture. We firmly believe that a deep understanding of the product helps you design an optimal solution.
- Work with the Site Reliability Engineering (SRE) team on shared full-stack ownership of a collection of services and/or technology areas.
- Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of all the microservices.
- Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance.
- Serve as the authority for end-to-end performance and operability.
- Demonstrate a clear understanding of automation and orchestration principles.
- Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
- Use a deep understanding of the service topology and its dependencies to troubleshoot issues and define mitigations.
You will have
- 8+ years of engineering experience, with at least 2 years spent as an SRE.
- Proven experience working on large-scale enterprise API services involving multi-tenancy, machine learning, microservices, and SQL/NoSQL infrastructure.
- Knowledge of various API protocols, including REST, GraphQL and gRPC.
- Knowledge of backend technologies such as Node.js, FastAPI, and Swagger, and of AWS services including but not limited to ElastiCache, DynamoDB, MySQL, API Gateway, Route 53, ALB, Lambda, S3, CloudFront, and CloudWatch.
- Proficiency in IaC (Infrastructure as Code), preferably using Terraform.
- Hands-on experience with an observability stack, including centralized log management, metrics, and tracing.
- Familiarity with CI/CD tools like CircleCI and with performance testing using k6.
- A desire to bring more automation and standards to an Engineering organization.
- A desire to build high-performance APIs with lower latencies (< 200>
- Ability to work in a fast-paced environment and learn from peers and leaders.
- Ability to lead technically, mentor other engineers, and help the team grow through active participation in recruiting and related activities.
Why Albert
We have a huge impact. Albert is a small team with a big reach. Our platform facilitates the invention of materials for tens of thousands of customers and hundreds of thousands of applications – from coatings used on rockets to adhesives used in electric vehicles to 3D-printed medical devices.
Position
Site reliability engineering
Salary
- 22 – 27 Lakh/Year INR
- Payroll
- Onsite
- Bangalore, Karnataka, India