Research Article, November 2024

Design and Evaluation of Resource-Aware AI Services Using Serverless Functions on the Cloud

Abstract

As demand for artificial intelligence (AI) services continues to scale, cloud-native paradigms such as serverless computing have emerged as critical enablers of efficient, elastic, and cost-effective AI deployments. This study investigates the design and evaluation of resource-aware AI services using serverless functions in the cloud. We explore architectural models, resource allocation mechanisms, and scheduling techniques tailored to dynamic AI workloads. Using published benchmarks and container orchestration logs, we compare function performance under different resource-aware policies. Our findings indicate that resource-aware adaptation can reduce cold start latency by 35% and improve execution throughput by up to 42% across heterogeneous workloads. The paper contributes a lightweight evaluation framework and discusses implications for sustainability in large-scale AI inference environments.
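To make the idea of resource-aware adaptation concrete, the sketch below shows one way such a policy might look for serverless AI inference: sizing a function's memory tier to its model footprint and keeping a small warm pool to absorb the arrival rate. All names, tiers, and thresholds here are hypothetical illustrations, not the policies or framework evaluated in the paper.

```python
# Illustrative sketch of a resource-aware allocation policy for serverless
# AI inference functions. Tiers, thresholds, and heuristics are hypothetical.

from dataclasses import dataclass


@dataclass
class WorkloadProfile:
    """Observed characteristics of one AI inference workload."""
    avg_exec_ms: float    # mean execution time at the baseline tier
    model_size_mb: float  # memory footprint of the loaded model
    req_per_min: float    # recent request arrival rate


# Hypothetical memory tiers (MB) offered by the FaaS platform.
MEMORY_TIERS = [512, 1024, 2048, 4096]


def select_memory_tier(profile: WorkloadProfile) -> int:
    """Pick the smallest tier that fits the model with headroom.

    On many commercial FaaS platforms, larger memory allocations also
    receive proportionally more CPU, so latency-sensitive workloads may
    justify a bigger tier than the model footprint alone requires.
    """
    needed = profile.model_size_mb * 1.5  # headroom for runtime + activations
    for tier in MEMORY_TIERS:
        if tier >= needed:
            return tier
    return MEMORY_TIERS[-1]


def warm_pool_size(profile: WorkloadProfile) -> int:
    """Keep enough pre-warmed instances to absorb the arrival rate.

    A crude heuristic: expected concurrency is arrival rate times
    service time (Little's law), so holding that many warm instances
    lets most requests avoid the cold-start penalty.
    """
    service_s = profile.avg_exec_ms / 1000.0
    concurrency = (profile.req_per_min / 60.0) * service_s
    return max(1, round(concurrency))


if __name__ == "__main__":
    p = WorkloadProfile(avg_exec_ms=120.0, model_size_mb=600.0, req_per_min=300.0)
    print(select_memory_tier(p))  # 1024: smallest tier covering 600 MB * 1.5
    print(warm_pool_size(p))
```

A real policy would refresh the profile from orchestration logs and re-evaluate both decisions periodically; this sketch only illustrates the shape of the mapping from workload characteristics to resource configuration.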

Keywords

Serverless Computing; Cloud AI; Resource-Aware Scheduling; FaaS; Container Orchestration; Edge-Cloud Continuum
Details
Volume 5
Issue 2
Pages 6-11
ISSN 3821-5947