Allah‑u‑Abha Rodrigues | ML Engineer • SWE

Research Overview

Modern DevOps practices are evolving beyond simple automation to incorporate artificial intelligence and machine learning capabilities. My research explores how intelligent systems can predict, prevent, and automatically resolve infrastructure issues before they impact users.

This research is driven by the increasing complexity of cloud-native applications and the need for self-healing systems that can operate reliably at scale. By combining traditional DevOps practices with AI-powered insights, we can create more resilient and efficient infrastructure.

Key Research Areas

Predictive Infrastructure Scaling

Active

Developing ML models that can predict resource needs and automatically scale infrastructure before demand spikes occur.
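
As a rough illustration of the idea, the sketch below forecasts the next load sample with a naive drift forecast (last value plus the recent average step) and converts it into a replica count. It stands in for a trained model and is not the model used in this research; the function names, the 20% headroom, and the per-replica capacity are all assumptions made for the example.

```python
import math

def forecast_next(history, window=3):
    """Naive drift forecast: last value plus the average step over the last `window` intervals."""
    if len(history) < window + 1:
        raise ValueError("need at least window + 1 samples")
    trend = (history[-1] - history[-window - 1]) / window
    return history[-1] + trend

def replicas_needed(predicted_load, capacity_per_replica, headroom=0.2, min_replicas=1):
    """Convert a predicted load into a replica count, padded by a safety headroom (assumed 20%)."""
    target = predicted_load * (1 + headroom)
    return max(min_replicas, math.ceil(target / capacity_per_replica))

# Illustrative request rates (requests per second), not measured data.
requests_per_second = [100, 120, 140, 160, 180]
predicted = forecast_next(requests_per_second)
print(predicted, replicas_needed(predicted, capacity_per_replica=50))  # 200.0 5
```

Scaling on the forecast rather than on the current reading is what lets capacity arrive before the spike instead of after it.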

Automated Incident Detection and Response

In Progress

Creating intelligent monitoring systems that can detect anomalies, diagnose issues, and implement fixes automatically.
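
The detection step can be sketched with a trailing-window z-score, a simple statistical baseline rather than the monitoring system described here. The window size, the threshold of 3 standard deviations, and the latency values are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def detect_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing window by more than `threshold` sigmas."""
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any change at all is treated as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Illustrative latency samples in milliseconds, with one spike at index 6.
latency_ms = [100, 102, 98, 101, 99, 100, 250, 101, 100]
print(detect_anomalies(latency_ms))  # [6]
```

Diagnosis and automated fixes would build on top of a signal like this one, for example by mapping each anomaly type to a remediation runbook.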

ML-Powered Deployment Strategies

Research Phase

Implementing deployment pipelines that use machine learning to optimize rollout strategies and minimize risk.
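
One way to picture this is a risk score that decides how cautiously a release is rolled out. The sketch below uses a logistic score with hand-picked weights; the feature names, weights, bias, and stage percentages are hypothetical placeholders for values that would be learned from historical deployment data.

```python
import math

# Hypothetical weights; in practice these would be learned from past deployments.
WEIGHTS = {"lines_changed": 0.002, "files_touched": 0.05, "recent_failures": 0.8}
BIAS = -3.0

def failure_risk(features):
    """Logistic score in (0, 1) estimating how likely a deployment is to fail."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1 / (1 + math.exp(-z))

def rollout_plan(risk):
    """Map a risk score to traffic percentages for successive rollout stages."""
    if risk < 0.1:
        return [50, 100]
    if risk < 0.5:
        return [10, 50, 100]
    return [1, 5, 25, 50, 100]

small_change = {"lines_changed": 50, "files_touched": 2, "recent_failures": 0}
large_change = {"lines_changed": 1000, "files_touched": 20, "recent_failures": 2}
print(rollout_plan(failure_risk(small_change)))  # [50, 100]
print(rollout_plan(failure_risk(large_change)))  # [1, 5, 25, 50, 100]
```

Riskier changes get more and smaller stages, so a bad release reaches fewer users before it is caught.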

Intelligent Monitoring and Alerting

Planning

Building smart alerting systems that reduce noise and focus on actionable insights using natural language processing.
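
A minimal sketch of text-based noise reduction, assuming that near-duplicate alerts differ mostly in numeric identifiers: alerts are tokenized, compared by Jaccard similarity, and grouped so that one notification can represent each group. The 0.6 threshold and the alert messages are illustrative assumptions, and a production system would likely use richer language models than token overlap.

```python
import re

def tokens(message):
    """Lowercase word tokens, dropping digit-only tokens since they vary between duplicates."""
    return {t for t in re.findall(r"[a-z0-9]+", message.lower()) if not t.isdigit()}

def jaccard(a, b):
    """Size of the intersection divided by size of the union of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def group_alerts(messages, threshold=0.6):
    """Greedily assign each alert to the first group whose representative is similar enough."""
    groups = []  # list of (representative token set, member messages)
    for msg in messages:
        toks = tokens(msg)
        for rep, members in groups:
            if jaccard(rep, toks) >= threshold:
                members.append(msg)
                break
        else:
            groups.append((toks, [msg]))
    return [members for _, members in groups]

alerts = [
    "High CPU usage on node 12",
    "High CPU usage on node 47",
    "Disk space low on volume data",
    "high cpu usage on node 3",
]
for group in group_alerts(alerts):
    print(len(group), group[0])
```

Here the three CPU alerts collapse into one group and the disk alert stays separate, so four raw alerts become two notifications.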

Current Findings

Scaling Accuracy

Predictive scaling models show 90% accuracy in forecasting resource needs up to 2 hours in advance, reducing over-provisioning by 35%.

Incident Response Time

Automated incident response systems reduce mean time to resolution (MTTR) by 70% compared to manual processes.
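
For reference, MTTR and its reduction can be computed as sketched below. The incident timestamps are made-up values chosen so the arithmetic is easy to follow; they are not the measurements behind the figure above.

```python
from datetime import datetime

def mttr_minutes(incidents):
    """Mean time to resolution in minutes for (opened, resolved) datetime pairs."""
    durations = [(resolved - opened).total_seconds() / 60 for opened, resolved in incidents]
    return sum(durations) / len(durations)

def percent_reduction(before, after):
    """Relative improvement of `after` over `before`, as a percentage."""
    return (before - after) * 100 / before

# Illustrative incidents only: manual resolutions of 60 and 140 minutes,
# automated resolutions of 20 and 40 minutes.
manual = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 0)),
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 16, 20)),
]
automated = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 20)),
    (datetime(2024, 1, 4, 14, 0), datetime(2024, 1, 4, 14, 40)),
]
print(percent_reduction(mttr_minutes(manual), mttr_minutes(automated)))  # 70.0
```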

Alert Noise Reduction

ML-powered alerting reduces false positives by 80% while maintaining a 99.9% detection rate for critical issues.
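
The two metrics can be made concrete as follows, assuming "detection rate" means recall on critical issues and "false positives" are alerts raised with no real issue behind them. The counts are illustrative, not the underlying experimental data.

```python
def detection_rate(true_positives, false_negatives):
    """Recall: share of real critical issues that raised an alert."""
    return true_positives / (true_positives + false_negatives)

def false_positive_reduction(baseline_fp, new_fp):
    """Fraction of baseline false positives eliminated by the new alerting."""
    return (baseline_fp - new_fp) / baseline_fp

# Illustrative counts only.
print(false_positive_reduction(baseline_fp=500, new_fp=100))   # 0.8
print(detection_rate(true_positives=999, false_negatives=1))   # 0.999
```

Reporting both numbers together matters because suppressing alerts aggressively cuts false positives but can also hide real incidents.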

Implementation Case Study

HapSTR Deployment Pipeline Optimization

During my internship at HapSTR, I implemented an intelligent CI/CD pipeline that reduced deployment time from 2 days to 45 minutes (a reduction of about 98%) while improving reliability. The system incorporated:

Automated testing with 88% code coverage
Blue-green deployment for zero-downtime releases
Predictive failure detection using historical deployment data
Real-time monitoring and automated rollback capabilities
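
The blue-green and automated rollback steps listed above can be sketched as below. This is a simplified model and not the HapSTR pipeline itself: BlueGreenRouter, release, and the deploy and health_check hooks are hypothetical names introduced for the example.

```python
class BlueGreenRouter:
    """Tracks which of two environments receives live traffic."""

    def __init__(self, live="blue"):
        self.live = live

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def switch(self):
        self.live = self.idle

def release(router, deploy, health_check):
    """Deploy to the idle environment, switch traffic, and roll back if unhealthy.

    `deploy` and `health_check` are caller-supplied callables (hypothetical hooks).
    Returns True if the release stayed live, False otherwise.
    """
    target = router.idle
    deploy(target)
    if not health_check(target):
        return False  # never exposed to users; the old environment stays live
    router.switch()
    if not health_check(router.live):
        router.switch()  # automated rollback to the previous environment
        return False
    return True

router = BlueGreenRouter()
ok = release(router, deploy=lambda env: None, health_check=lambda env: True)
print(ok, router.live)  # True green
```

Because the previous environment keeps running until the new one passes its checks, rollback is a traffic switch rather than a redeploy, which is what makes zero-downtime releases possible.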

Future Directions

The next phase of this research will focus on developing more sophisticated AI models that can understand complex system interactions and make autonomous decisions. Key areas include:

Self-healing infrastructure that can automatically fix common issues
Cross-cloud optimization strategies for multi-region deployments
AI-powered capacity planning for long-term resource allocation
Intelligent cost optimization algorithms for cloud spending

Tools & Technologies

Infrastructure

AWS, Docker, Kubernetes, Terraform, Prometheus

AI/ML Stack

Python, TensorFlow, Scikit-learn, MLflow, Grafana

DevOps Automation & Intelligence