DEV Community

Daya Shankar

404 bio not found

Cold Starts, Model Loading, and Their Impact on Latency SLAs

Comments
10 min read

Operational Risks of Running Large Multi-Tenant Kubernetes Clusters

1
Comments
10 min read
Hosted control plane: when it simplifies operations and when it adds complexity

Comments
11 min read
Serving LLMs on IaaS: throughput vs latency tuning with practical guardrails

1
Comments
9 min read
Memory Ballooning Effects in Virtualized Cloud Environments

Comments
6 min read
Hybrid Orchestration Basics: Avoiding Single-Provider Risks in 2026

Comments
7 min read
GPU Scheduling Deep Dive: How Cloud Providers Allocate GPUs for Multi-Tenant AI Workloads

Comments
9 min read
How to Set Up Edge Infrastructure for Low-Latency Production Apps in India

Comments
9 min read
Managed Cloud Infrastructure: What’s Included, What’s Not, and Why It Matters

Comments
5 min read
Private connectivity vs VPN: when to upgrade your network architecture

Comments
6 min read
How PCIe, NVLink, and NUMA Topology Affect GPU Scheduling Outcomes

2
Comments
9 min read
The Role of Edge & Distributed Data Centers in Reducing Compute Latency

Comments
10 min read
How GPU Cloud Providers Handle Long-Tail Job Backlogs

2
Comments
7 min read
How to Design a Scalable Architecture for Cloud Applications

1
Comments
6 min read