Fly.io Pricing in 2026: Plans, Cost & Free Trial
Every Fly.io plan, what's actually included at each tier, and whether the cost holds up against the alternatives.
Fly.io Plans & Pricing
Fly.io uses pay-as-you-go pricing for machine compute, storage, and bandwidth. Unlike seat-based or fixed-plan platforms, you pay for the resources your applications actually consume, metered per second of machine runtime. For AI applications, this means understanding your machine sizing requirements (CPU and RAM determine cost), storage needs for model weights (volumes are priced per GB/month), and bandwidth for returning AI responses. Fly.io includes a free monthly allowance that covers lightweight development and testing, with no payment method required.
| Plan | Price | Best For |
|---|---|---|
| Hobby (Free) | Free | Individuals & light usage |
| Pay-as-you-go | $5/mo | Most popular choice |
| Scale | Custom | Enterprise & custom needs |
Is Fly.io Worth the Price?
Fly.io's pricing is competitive with raw cloud providers while including the managed infrastructure (TLS, anycast networking, rolling deployments) that would take significant engineering time to set up yourself. For AI applications that need global distribution, multi-region deployment carries no additional per-region infrastructure cost, which is a significant advantage: deploying to 3 regions on AWS means 3x the EC2, load balancer, and networking costs, while scaling across regions on Fly.io adds compute cost only for the additional machines. The persistent VM model means you pay for machines whether or not they are serving requests, unlike serverless per-invocation billing. For AI APIs with consistent traffic, where machines are almost always processing requests, this is more economical. For APIs with very sparse traffic (occasional requests), a serverless platform such as RunPod Serverless may be cheaper. Understanding your traffic pattern is essential for Fly.io cost optimization.
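The trade-off between an always-on machine and per-invocation billing can be sketched as a break-even calculation. This is only an illustration: the function names are hypothetical, and the monthly machine price and per-request price in the usage lines are placeholder values, not quoted rates from Fly.io or any serverless provider.

```python
def break_even_requests(always_on_monthly_usd: float, per_request_usd: float) -> float:
    """Monthly request count at which both billing models cost the same."""
    if per_request_usd <= 0:
        raise ValueError("per_request_usd must be positive")
    return always_on_monthly_usd / per_request_usd


def cheaper_model(requests_per_month: float,
                  always_on_monthly_usd: float,
                  per_request_usd: float) -> str:
    """Return which billing model is cheaper for a given monthly request volume."""
    serverless_cost = requests_per_month * per_request_usd
    return "serverless" if serverless_cost < always_on_monthly_usd else "always-on"


# Placeholder prices (assumptions for illustration only).
MACHINE_USD = 30.0    # hypothetical always-on machine, per month
REQUEST_USD = 0.001   # hypothetical serverless price per request

print(round(break_even_requests(MACHINE_USD, REQUEST_USD)))
print(cheaper_model(5_000, MACHINE_USD, REQUEST_USD))
print(cheaper_model(200_000, MACHINE_USD, REQUEST_USD))
```

Below the break-even volume the per-invocation model wins; above it, the always-on machine does. Substituting your own measured request volume and current published prices gives a first-order answer before any deeper benchmarking.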
Fly.io operates on a usage-based model without named tiers. The free monthly allowance covers 3 shared-CPU VMs (256MB RAM each), 3GB persistent volume storage, and 160GB outbound data transfer. Beyond the free allowance, compute is billed per second of machine runtime. Shared-CPU VMs start around $1.94/month for 256MB RAM. Performance CPU VMs, which provide dedicated compute for AI workloads, cost more. GPU Machines (A10 and A100) are billed at separate GPU rates; contact Fly.io for current GPU pricing. Volumes are billed at $0.15/GB/month. Outbound bandwidth beyond the free allowance is billed per GB.
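Per-second billing means a machine that runs only part of the month costs a proportional share of its always-on monthly price. The sketch below prorates the $1.94/month shared-CPU figure quoted above; the 30-day billing month and the function name are assumptions made for illustration, not Fly.io's documented billing formula.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day billing month


def prorated_compute_cost(monthly_price_usd: float, runtime_seconds: float) -> float:
    """Cost of running a machine for runtime_seconds, given its always-on monthly price."""
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must be non-negative")
    return monthly_price_usd * runtime_seconds / SECONDS_PER_MONTH


# A 256MB shared-CPU VM that runs 8 hours a day for 30 days.
runtime = 8 * 3600 * 30
print(f"${prorated_compute_cost(1.94, runtime):.2f}")
```

Stopping machines when they are idle is therefore the main lever for reducing compute cost under this model.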
Fly.io Free Trial — What's Included?
Fly.io's free tier (Hobby plan) provides a monthly resource allowance with no credit card required. Three shared-CPU VMs, 3GB volume storage, and 160GB bandwidth are included free every month. This allowance is sufficient for deploying a small AI application, testing the deployment workflow, and evaluating Fly.io for your production needs. Adding a payment method enables access to larger VMs and GPU Machines beyond the free allowance.
Frequently Asked Questions
How much does Fly.io cost for an AI backend?
A single always-on Python AI service on Fly.io running a performance-2x machine (2 dedicated CPU, 4GB RAM) costs approximately $31/month. Add a 10GB persistent volume for model storage at $1.50/month. Total for a minimal production AI backend: around $32-40/month depending on traffic and configuration. GPU Machines are significantly more expensive — contact Fly.io for current GPU pricing.
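The arithmetic behind that estimate can be written out as a small helper. It uses the figures quoted in this guide ($31/month for the machine, $0.15/GB/month for volumes); the function name is hypothetical, and bandwidth is left out because it depends on traffic.

```python
def monthly_backend_cost(machine_usd: float = 31.0,
                         volume_gb: float = 10.0,
                         volume_rate_usd_per_gb: float = 0.15) -> float:
    """Machine cost plus volume storage cost for one month (bandwidth excluded)."""
    return machine_usd + volume_gb * volume_rate_usd_per_gb


print(f"${monthly_backend_cost():.2f}")  # → $32.50
```

The gap between this $32.50 baseline and the upper end of the quoted range is where bandwidth and configuration differences show up.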
Fly.io does not charge for deploys or builds. You are charged only for running machine time, storage, and outbound bandwidth. Deploying a new version of your AI application (which briefly runs both old and new versions for zero-downtime switching) incurs a small charge for the overlap period, but this is minimal.
Yes, Fly.io has a free tier. Its free monthly allowance covers 3 small VMs, 3GB storage, and 160GB bandwidth, with no credit card required. This is sufficient for deploying test AI services, experimenting with the deployment model, and evaluating Fly.io before committing to paid resources.
Yes, Fly.io runs a startup program that provides credits for early-stage companies. Apply through the Fly.io website; qualifying startups receive credits that offset compute costs during early product development. Check fly.io/docs for current program details and eligibility requirements.
Fly.io is generally more cost-effective than AWS for small to medium AI applications when you factor in the managed infrastructure included (TLS, networking, rolling deployments). A comparable AWS setup (EC2 + ALB + ACM + deployment tooling) costs more in both dollars and engineering time. At very large scale, AWS reserved instances and spot instances can be cheaper with dedicated infrastructure optimization.
Fly.io provides a pricing calculator at fly.io/pricing. Estimate your machine size (RAM and CPU needed for your AI model and framework), number of regions, volume storage for model weights, and expected outbound bandwidth (AI responses are verbose — a chat API handling 1,000 messages per day at 500 words each generates roughly 3.5MB of outbound data daily). For GPU Machines, request a quote from Fly.io's sales team. Running a small AI API in a single region with a 10GB volume and modest traffic typically costs $30-60/month on Fly.io.
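The 3.5MB figure in the chat example can be reproduced with a rough words-to-bytes conversion. The sketch assumes about 7 bytes per word (an average English word plus a space and light protocol overhead); that constant and the function name are illustrative assumptions, so measure real payloads before relying on the result.

```python
def daily_outbound_mb(messages_per_day: int,
                      words_per_message: int,
                      bytes_per_word: float = 7.0) -> float:
    """Rough outbound data per day in megabytes (1 MB = 1,000,000 bytes)."""
    return messages_per_day * words_per_message * bytes_per_word / 1_000_000


per_day = daily_outbound_mb(1_000, 500)
print(f"{per_day:.1f} MB/day, {per_day * 30:.0f} MB over 30 days")
```

At this volume, a month of traffic is on the order of 100MB, a tiny fraction of the 160GB allowance.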
Fly.io charges for the machines running in each region, not a per-region fee. Adding a second region means running additional machine instances in that region — the cost is the machine time in that region, same as your primary region. Inbound data transfer is free; outbound bandwidth counts toward your monthly allowance across all regions combined. Multi-region deployment on Fly.io is significantly cheaper than equivalent multi-region setups on traditional cloud providers where each region requires separate VPC, load balancer, and networking infrastructure.
When a Fly.io machine exceeds its memory allocation, the process is killed and the machine restarts. For AI services loading large models into memory, this results in a service restart and cold start delay. Prevent this by choosing a machine size with adequate memory for your model plus operating overhead — add 25-30% headroom above the model's documented VRAM/RAM requirement. Monitor memory usage in the Fly.io metrics dashboard and configure alerts before OOM conditions occur. Upgrading machine size is a configuration change that takes effect on the next deployment.
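The headroom rule above can be turned into a quick sizing check. This is a sketch under stated assumptions: the list of candidate memory sizes is hypothetical rather than Fly.io's actual machine catalog, and the function names are illustrative.

```python
from typing import Optional, Sequence


def required_memory_gb(model_gb: float, headroom: float = 0.30) -> float:
    """Model memory requirement plus fractional headroom (0.30 = 30%)."""
    return model_gb * (1.0 + headroom)


def pick_memory_size(model_gb: float,
                     available_gb: Sequence[int] = (2, 4, 8, 16, 32),  # hypothetical sizes
                     headroom: float = 0.30) -> Optional[int]:
    """Smallest candidate size that fits the model plus headroom, or None if none fits."""
    need = required_memory_gb(model_gb, headroom)
    for size in sorted(available_gb):
        if size >= need:
            return size
    return None


# A model documented as needing 6GB, with 30% headroom, needs about 7.8GB.
print(pick_memory_size(6.0))  # → 8
```

A `None` result signals that the model will not fit any candidate size with the chosen headroom, which is a cue to look at a larger machine class or a smaller model.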
Standard Fly.io machines are not preempted — they run continuously until you stop or destroy them. This is an important advantage for AI workloads: an inference request in progress will not be interrupted by infrastructure events. Unlike AWS spot instances or RunPod Community Cloud pods that can be preempted with short notice, Fly.io machines run with consistent availability. For AI applications where inference interruption would cause user-facing errors, Fly.io's non-preemptible machines provide better reliability than spot or community cloud alternatives.
Fly.io's free plan includes 160GB of outbound bandwidth monthly; paid usage is billed per GB beyond this allowance. For AI response streaming — where LLM responses are delivered as token streams — each streaming response consumes bandwidth as it's delivered. A 1,000-token response at 4 bytes per token is about 4KB of outbound data. A high-traffic AI API handling 100,000 streaming requests per month at average 2,000 tokens would consume approximately 800MB of bandwidth — well within Fly.io's free allowance. Monitor bandwidth consumption in the Fly.io dashboard if your AI service handles very high message volumes.
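The streaming estimate above is simple multiplication, written out here so it can be rerun with different traffic numbers. The 4 bytes per token and the 160GB allowance come from this guide; the function name is hypothetical, and streaming framing overhead is ignored as a simplifying assumption.

```python
FREE_ALLOWANCE_GB = 160  # free monthly outbound allowance quoted in this guide


def monthly_streaming_mb(requests_per_month: int,
                         tokens_per_response: int,
                         bytes_per_token: float = 4.0) -> float:
    """Outbound data per month in megabytes (1 MB = 1,000,000 bytes)."""
    return requests_per_month * tokens_per_response * bytes_per_token / 1_000_000


used_mb = monthly_streaming_mb(100_000, 2_000)
share = used_mb / (FREE_ALLOWANCE_GB * 1_000)
print(f"{used_mb:.0f} MB, {share:.1%} of the free allowance")
```

Even at 100,000 requests per month, text responses use about half a percent of the allowance, so bandwidth only becomes a cost driver at much higher volumes or with larger payloads.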