Build AWS Batch job definitions for Fargate and EC2 with container configs.
Last verified: May 2026
Build AWS Batch job definition configs with container properties, retry strategies, and timeouts.
Required Fields
jobDefinitionName
type
containerProperties.image
containerProperties.resourceRequirements

AWS Batch simplifies running batch computing workloads by automatically provisioning compute resources and scheduling jobs. A job definition specifies the container image, vCPU and memory requirements, environment variables, mount points, retry strategies, and timeout settings. With support for both Fargate and EC2 compute environments, you need to choose the right platform capabilities and configure resource requirements accordingly. The Batch Job Definition Builder generates complete job definition JSON with proper container properties, retry logic, and platform-specific configurations.
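As a rough illustration of the required fields listed above, the following sketch checks a job definition dictionary for them before it is registered. The helper name `missing_fields`, the job name, and the image URI are hypothetical placeholders, not part of the builder or of any AWS SDK.

```python
# Dotted paths of the fields this page lists as required.
REQUIRED_FIELDS = [
    "jobDefinitionName",
    "type",
    "containerProperties.image",
    "containerProperties.resourceRequirements",
]


def missing_fields(job_def):
    """Return the required dotted paths that are absent or empty in job_def."""
    missing = []
    for path in REQUIRED_FIELDS:
        node = job_def
        for key in path.split("."):
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if not node:  # None, empty string, or empty list
            missing.append(path)
    return missing


# Hypothetical minimal definition; the name and image are placeholders.
minimal = {
    "jobDefinitionName": "nightly-etl",
    "type": "container",
    "containerProperties": {
        "image": "example.com/etl:latest",
        "resourceRequirements": [
            {"type": "VCPU", "value": "1"},
            {"type": "MEMORY", "value": "2048"},
        ],
    },
}

print(missing_fields(minimal))                # nothing missing
print(missing_fields({"type": "container"}))  # three paths missing
```

Running a check like this locally is only a convenience; the Batch API remains the authority on what it accepts.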
Your team runs nightly ETL on a 24/7 EC2 cluster ($800/month). Most jobs run for ~2 hours. The builder generates a Fargate-based Batch job definition: 4 vCPU, 16 GB memory, and a retry strategy with evaluateOnExit (retry on transient errors, fail on OOM). Combined with a job queue that prioritizes Fargate Spot, you pay for ~2 hours/day of compute instead of a cluster that sits idle most of the day. New cost: ~$50/month. Annual savings: $9K, plus the operational simplification of not managing EC2 capacity at all.
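The savings figure in this scenario follows from simple arithmetic, sketched below. The dollar amounts are the scenario's own illustrative numbers, not AWS list prices; the variable names are made up for this example.

```python
# Figures taken from the scenario; they are illustrative, not AWS prices.
ec2_monthly = 800       # always-on EC2 cluster, USD per month
fargate_monthly = 50    # estimated Fargate Spot cost, USD per month
job_hours_per_day = 2   # approximate nightly ETL runtime

annual_savings = (ec2_monthly - fargate_monthly) * 12
utilization = job_hours_per_day / 24  # share of the day the cluster does work

print(f"Annual savings: ${annual_savings:,}")
print(f"Cluster utilization: {utilization:.1%}")
```

The low utilization figure is what makes pay-per-run compute attractive here: most of the always-on cluster's bill covers idle time.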
Always use evaluateOnExit rules instead of blanket retries. Retrying on exit code 137 (typically an OOM kill) is wasteful: the next run will OOM too unless you also bump memory. Retry on the exit codes your application uses for transient errors (exit code 1 in this example), and fail fast on 137 with a clear error message so the engineer knows to right-size memory.
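A retry strategy along these lines might look like the sketch below. The `retryStrategy` field names (`attempts`, `evaluateOnExit`, `onExitCode`, `onReason`, `action`) follow the Batch job definition schema; the `decide` helper is a hypothetical, simplified local model of first-match rule evaluation (exact match or trailing `*` prefix match only), included just to show how the rules are intended to behave. It is not how AWS Batch is implemented.

```python
retry_strategy = {
    "attempts": 3,
    "evaluateOnExit": [
        {"onExitCode": "137", "action": "EXIT"},  # OOM kill: do not retry
        {"onExitCode": "1", "action": "RETRY"},   # assumed transient error code
        {"onReason": "*", "action": "EXIT"},      # anything else: fail fast
    ],
}


def _matches(pattern, value):
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    return value == pattern


def decide(rules, exit_code, reason=""):
    """Simplified model: the first rule whose conditions all match wins."""
    for rule in rules:
        checks = []
        if "onExitCode" in rule:
            checks.append(_matches(rule["onExitCode"], str(exit_code)))
        if "onReason" in rule:
            checks.append(_matches(rule["onReason"], reason))
        if checks and all(checks):
            return rule["action"]
    return "RETRY"  # with no matching rule, the job is retried


for code in (137, 1, 2):
    print(code, decide(retry_strategy["evaluateOnExit"], code))
```

Rule order matters: placing the 137 rule first ensures an OOM kill exits before any broader rule can match it.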
Fargate Spot for batch is the killer cost optimization. Most batch jobs are checkpoint-able or idempotent, so Spot interruption is acceptable. Set up a job queue with Fargate Spot priority + Fargate on-demand fallback. You'll save 60-70% on compute with minimal operational impact.
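One way to express the Spot-first, on-demand-fallback queue is through the queue's `computeEnvironmentOrder`, where lower `order` values are tried first. The sketch below builds a request body in the shape accepted by `aws batch create-job-queue`; the queue and compute environment names are hypothetical, and both environments are assumed to exist already (one of type FARGATE_SPOT, one of type FARGATE).

```python
import json


def build_job_queue(name, spot_env, on_demand_env, priority=10):
    """Build a job queue request that tries Fargate Spot before on-demand."""
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": priority,
        "computeEnvironmentOrder": [
            {"order": 1, "computeEnvironment": spot_env},       # tried first
            {"order": 2, "computeEnvironment": on_demand_env},  # fallback
        ],
    }


# Hypothetical names for illustration only.
queue = build_job_queue("etl-queue", "etl-fargate-spot", "etl-fargate")
print(json.dumps(queue, indent=2))
```

The printed JSON could be saved to a file and passed to the CLI with `--cli-input-json`, after checking it against your own account's environment names.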
Don't use Fargate for jobs that need GPUs or more memory than Fargate offers; those need EC2. Fargate has no GPU support, and its vCPU/memory pairing rules are restrictive: 1 vCPU allows at most 8 GB and 4 vCPU at most 30 GB, with the largest size (16 vCPU) topping out at 120 GB. Beyond that, an EC2 compute environment with custom AMIs is the right choice.
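The pairing rules can be checked before submitting a definition. The sketch below encodes the memory values allowed for sizes up to 4 vCPU, matching the limits quoted above (8 GB at 1 vCPU, 30 GB at 4 vCPU). The table is deliberately partial and the function name is hypothetical; consult the current AWS Batch documentation for the authoritative list.

```python
# Allowed memory values in MiB per vCPU setting, for sizes up to 4 vCPU.
# Deliberately partial: other sizes are not covered by this sketch.
FARGATE_MEMORY_MIB = {
    0.25: [512, 1024, 2048],
    0.5: [1024, 2048, 3072, 4096],
    1: list(range(2048, 8192 + 1, 1024)),    # 2 GB to 8 GB
    2: list(range(4096, 16384 + 1, 1024)),   # 4 GB to 16 GB
    4: list(range(8192, 30720 + 1, 1024)),   # 8 GB to 30 GB
}


def is_valid_fargate_size(vcpu, memory_mib):
    """Return True if the vCPU/memory pair appears in the table above."""
    if vcpu not in FARGATE_MEMORY_MIB:
        raise ValueError(f"vCPU value {vcpu} is not covered by this sketch")
    return memory_mib in FARGATE_MEMORY_MIB[vcpu]


print(is_valid_fargate_size(4, 16384))  # the 4 vCPU / 16 GB ETL job
print(is_valid_fargate_size(1, 16384))  # too much memory for 1 vCPU
```

A check like this catches mismatched pairs early, before the registration call rejects them.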
The builder constructs Batch job definition JSON with: type (container, multinode), platform capabilities (FARGATE or EC2), container properties (image, vCPU, memory, GPU resource requirements, environment, mount points, networkConfiguration, executionRoleArn, jobRoleArn), retry strategy (attempts + evaluateOnExit rules), timeout, and parameters. Output is generated as aws batch register-job-definition commands and Terraform aws_batch_job_definition resources.
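To make the output shape concrete, here is a minimal sketch of how such a definition could be assembled and handed to the CLI. It is not the builder's actual code: the function name, job name, image URI, role ARN, environment variable, and default values are all hypothetical, while the JSON keys follow the Batch job definition schema.

```python
import json


def build_job_definition(name, image, vcpu, memory_mib, execution_role_arn,
                         attempts=3, timeout_seconds=3 * 3600):
    """Assemble a Fargate container job definition as a plain dict."""
    return {
        "jobDefinitionName": name,
        "type": "container",
        "platformCapabilities": ["FARGATE"],
        "containerProperties": {
            "image": image,
            # Batch expects resource values as strings.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpu)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
            "executionRoleArn": execution_role_arn,
            "environment": [{"name": "STAGE", "value": "prod"}],
            "networkConfiguration": {"assignPublicIp": "ENABLED"},
        },
        "retryStrategy": {"attempts": attempts},
        "timeout": {"attemptDurationSeconds": timeout_seconds},
    }


# All identifiers below are placeholders for illustration.
job_def = build_job_definition(
    name="nightly-etl",
    image="123456789012.dkr.ecr.us-east-1.amazonaws.com/etl:latest",
    vcpu=4,
    memory_mib=16384,
    execution_role_arn="arn:aws:iam::123456789012:role/example-execution-role",
)

print(json.dumps(job_def, indent=2))
# Save the JSON above as job-definition.json, then register it with:
print("aws batch register-job-definition "
      "--cli-input-json file://job-definition.json")
```

Keeping the definition as a dict until the last step makes it easy to render the same data into other formats, such as a Terraform resource.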