ECO-AIM-AI-002

Name: No inference batching

Category: AIM

Family: AI

Primary layer: ai

System layers: ai

Description

Serving each inference request on its own, without batching, pays the fixed per-call overhead on every request and lowers throughput.
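A rough cost model can make the overhead argument concrete. The sketch below assumes each model call has a fixed overhead plus a per-item cost; the function name and the numbers (5 ms overhead, 0.5 ms per item) are illustrative assumptions, not measurements from any real system.

```python
def total_time_ms(num_requests, batch_size, fixed_overhead_ms, per_item_ms):
    """Estimate total serving time when requests are grouped into batches.

    Assumes every model call costs a fixed overhead plus a per-item cost.
    """
    num_calls = -(-num_requests // batch_size)  # ceiling division
    return num_calls * fixed_overhead_ms + num_requests * per_item_ms


# Illustrative numbers only: 64 requests, 5 ms per-call overhead, 0.5 ms per item.
unbatched = total_time_ms(64, 1, fixed_overhead_ms=5.0, per_item_ms=0.5)
batched = total_time_ms(64, 16, fixed_overhead_ms=5.0, per_item_ms=0.5)
print(unbatched, batched)  # 352.0 52.0
```

Under these assumptions the per-item work is identical in both cases; the difference comes entirely from paying the fixed overhead 64 times instead of 4.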

Impact

  • confidence: 0.7
  • notes: Often significant for GPU inference.
  • type: cpu

Detection

  • languages: python, infra
  • method: trace
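As one possible reading of trace-based detection, the sketch below computes the share of inference spans that handled a single input. The span layout (a dict with `name` and `batch_size` keys) and the span name `model.predict` are hypothetical; a real tracing backend will expose spans and attributes differently.

```python
def unbatched_ratio(spans, span_name="model.predict"):
    """Return the fraction of inference spans that processed exactly one input.

    `spans` is a hypothetical list of dicts; a span without a `batch_size`
    attribute is treated as a single-item call.
    """
    sizes = [s.get("batch_size", 1) for s in spans if s.get("name") == span_name]
    if not sizes:
        return 0.0
    return sum(1 for size in sizes if size == 1) / len(sizes)


# Hypothetical trace export: three inference calls and one unrelated span.
spans = [
    {"name": "model.predict", "batch_size": 1},
    {"name": "model.predict", "batch_size": 1},
    {"name": "model.predict", "batch_size": 8},
    {"name": "http.request"},
]
ratio = unbatched_ratio(spans)
```

A ratio close to 1.0 under sustained load would suggest the pattern is present; what threshold counts as a finding is left open here.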

Remediation

  • guidance: Introduce request batching at the gateway or the inference server.
  • tradeoffs: Adds latency at low request volume while a batch fills; batch size and wait time need tuning.
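The guidance and tradeoffs above can be sketched as a small micro-batcher: requests queue up, and a worker thread flushes a batch when it is full or when a wait deadline expires. This is a minimal sketch using only the Python standard library, not a production server; `MicroBatcher`, `predict_batch`, `max_batch_size`, and `max_wait_s` are hypothetical names, and the stand-in model simply doubles its inputs.

```python
import queue
import threading
import time
from concurrent.futures import Future


class MicroBatcher:
    """Collects single requests and runs them through the model in batches."""

    def __init__(self, predict_batch, max_batch_size=8, max_wait_s=0.01):
        # predict_batch is a hypothetical callable: list of inputs -> list of outputs.
        self._predict_batch = predict_batch
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, item):
        """Enqueue one request; the returned Future resolves with its result."""
        future = Future()
        self._queue.put((item, future))
        return future

    def _run(self):
        while True:
            batch = [self._queue.get()]  # block until at least one request arrives
            deadline = time.monotonic() + self._max_wait_s
            # Keep filling until the batch is full or the wait deadline passes.
            while len(batch) < self._max_batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self._queue.get(timeout=remaining))
                except queue.Empty:
                    break
            try:
                outputs = self._predict_batch([item for item, _ in batch])
                for (_, future), output in zip(batch, outputs):
                    future.set_result(output)
            except Exception as exc:  # propagate a model failure to every caller
                for _, future in batch:
                    future.set_exception(exc)


batch_sizes = []


def fake_predict_batch(items):
    """Stand-in for a real model call; records how many items each call received."""
    batch_sizes.append(len(items))
    return [x * 2 for x in items]


batcher = MicroBatcher(fake_predict_batch, max_batch_size=4, max_wait_s=0.05)
futures = [batcher.submit(i) for i in range(8)]
results = [f.result(timeout=5) for f in futures]
```

The two parameters map directly onto the tradeoff: a larger `max_wait_s` gives batches more time to fill but delays a lone request by up to that amount, while `max_batch_size` caps how much work a single model call takes on. Many inference servers ship their own batching feature, which is usually preferable to hand-rolled code like this.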

Pattern examples

No pattern examples provided.

Remediation examples

No remediation examples provided.