About
An IT Infrastructure Specialist with 13 years of experience, I have developed a profound expertise in systems
engineering, particularly in managing high-performance computing (HPC) clusters, performance tuning, and systems
automation. My skill set has evolved to encompass modern cloud technologies and DevOps practices, with a specialized
focus on integrating AI and ML infrastructures. I am proficient in AWS, Azure, Docker, Kubernetes, Git, Jenkins, and other
pivotal DevOps tools, and have substantial experience in optimizing cloud environments specifically for AI and ML
applications. This includes automating workflows and implementing advanced security measures with Palo Alto solutions.
My expertise is not just technical; it involves driving innovation and efficiency in complex IT infrastructures, ideally suited
for supporting academic institutions in developing and enhancing their educational and research programs in AI, ML, and
related rapidly advancing technological fields.
Skills & Expertise
Operating systems
Cyber Security
Linux
AWS
Cloud
Virtualization
Large scale data centers
Distributed Systems
Network Security
Databases
Storage
Python
Memory Management
Artificial Intelligence
Research Interests
Cloud Computing
Computer Network
Web Development Technology
Internet of Things
Virtual Reality
Cybersecurity
Networking
Experience
- Data Center Launch: Played a pivotal role in the launch of Meta's first data center in Chandler, focusing on GPU-based systems and spearheading large-scale data center collaboration.
• High-Performance Computing (HPC) Management: Expertly managed and optimized high-performance computing clusters to ensure robust performance and high availability.
• Cluster Health Monitoring: Conducted extensive monitoring of cluster health and system alerting using advanced tools like Prometheus, Grafana, and Splunk.
• Configuration Management: Utilized Chef for efficient configuration management, significantly reducing operational burdens.
• Performance Tuning: Implemented strategies for low-latency performance tuning and optimization, specifically for Linux-based systems.
• Kubernetes Deployment: Deployed Kubernetes to enhance microservices deployment in CI/CD pipelines, thereby improving containerization processes.
• Web Application Optimization: Enhanced file system performance for LAMP stacks and optimized Nginx and HAProxy load balancers for high-traffic web applications.
• Scripting for Automation: Developed Bash and Python scripts to automate system configurations, effectively minimizing manual errors.
• Systemic Issue Resolution: Conducted thorough debugging and root cause analysis, playing a key role in minimizing downtime and maximizing system reliability.
• Collaboration and Knowledge Sharing: Collaborated with cross-functional teams and vendors to enhance hardware serviceability, delivered global impact by resolving tickets across multiple sites, and established an extensive knowledge base for the SRE team, leading to reduced issue resolution times.
- Data Center Architecture and Management: Oversaw the design, implementation, and management of data centers dedicated to regional government operations, ensuring robust, secure, and efficient infrastructure.
• Primary Data Center: SwitchNAP, Las Vegas, Nevada.
• Secondary Data Center: QTS, Hillsboro, Oregon.
• Enterprise Compute Architecture: Led the deployment of Cisco UCS B-200 M4 and M5 hardware in primary data centers, aligning with SCAG's high-performance computing requirements.
• Virtualization Solutions: Masterminded the implementation and management of VMware hypervisor 7.0.2, vCenter Server, and vROps 8.3, enhancing virtualization capabilities across the agency.
• Active Directory and Network Services: Engineered and maintained Active Directory, DNS, DHCP, and Print Services, ensuring robust and secure network services.
• AWS and Azure Cloud Integration: Managed AWS Route 53 for public DNS, S3, Glacier storage, and Azure cloud integrations, facilitating a seamless hybrid cloud infrastructure.
• Linux/Windows Server Environment: Administered a mixed environment of Windows Server (2012, 2016, 2019, 2022) and Linux (RHEL 7.8), focusing on system optimization and security.
• Enterprise Storage Solutions: Architected and managed storage solutions using EMC Unity, NetApp AFF A800, and NetApp StorageGRID, ensuring data integrity and availability.
• Disaster Recovery and High Availability: Developed and implemented disaster recovery strategies for compute, storage, and backup systems, ensuring high availability of critical applications.
• Cloud and Virtualization Projects: Spearheaded key projects like Azure VMware Cloud Datacenter solutions, Citrix Virtual Desktop Infrastructure for remote work, and AWS-based backup solutions.
• Infrastructure Modernization: Played a pivotal role in migrating infrastructure services to a hybrid cloud model, upgrading Active Directory systems, and implementing multi-factor authentication across the network.
- Data Center Design and Management: Directed the design, implementation, and management of two data centers, focusing on high availability and disaster recovery.
• Primary Data Center: AWS [US-West-1, US-West-2].
• Secondary Data Center: AWS [US-East-1, US-East-2].
• Virtualization and Compute Solutions: Implemented and managed Cisco UCS B-200 M3, M4, and M5 blades and VMware vSphere 6.5/6.7 solutions, enhancing virtualization infrastructure.
• VMware and Hyper-V Management: Configured VMware vROps for comprehensive monitoring of blades and VMs, and managed Hyper-V server environments.
• Citrix Solutions: Supported Citrix XenApp 6.5 solutions and designed Citrix NetScaler load balancers, ensuring efficient application delivery and load balancing.
• Network Services Administration: Managed DNS and DHCP services, upgrading server infrastructure to Windows 2012 R2/2016.
• Cloud Services Integration: Worked extensively with AWS services, including Route 53, S3, Glacier, and Azure cloud integrations, facilitating a seamless hybrid cloud environment.
• Microsoft Ecosystem Management: Supported Exchange 2010, Office 365 migration, and managed KMS servers for Windows and Office products.
• Security and Compliance: Configured ZIX Server for email encryption and implemented security patches via Microsoft SCCM.
• Linux and Windows Server Management: Installed and configured a mix of Windows and Linux file servers, catering to diverse client requirements.
• Storage and Backup Solutions: Specialized in configuring EMC Unity, NetApp AFF A800, and NetApp StorageGRID for storage solutions. Deployed Veeam and CommVault backup solutions, integrating with AWS S3 for enhanced data protection.
• Active Directory Administration: Oversaw the management of all Active Directory services and domain
- Advanced Data Recovery Solution: Spearheaded the development and implementation of a sophisticated data recovery solution, significantly enhancing data protection capabilities for enterprise clients.
Education
University College of Engineering (UCE)
Conferences & Seminars (1)
Technovation
Certificates & Licenses (2)
AWS Certified Solutions Architect Professional
AWS DevOps Engineer Professional