Optimization · 4 min read · Apr 18, 2026

The Three Quality Tiers: Low, Medium, High, and When Each Pays Off

Three tiers, a 40x price spread, and one decision that drives most of your cost. Here is the math on when low is fine, when medium is obvious, and when high earns its keep.

OpenAI kept the three-tier structure from GPT Image 1 through 1.5, and every signal says GPT Image 2 keeps it too. Low, medium, high. Simple label, big price spread.

On 1.5, low-quality 1024x1024 is $0.005 per image. High-quality 1792x1024 is $0.20. That is a 40x spread for the same prompt once you count both the tier and the larger size. Most teams either overpay by defaulting to high, or underpay by defaulting to low and then regenerate so often that the total comes out higher.

Skip the regret. Here is the decision tree.

What the tiers actually do

Tier is not resolution. Resolution is separate. Tier is the compute per image, which correlates with sampling steps, the refinement pass, and in Image 2 likely the internal embedding size.

Low runs fast and takes big swings. Good for overall composition and mood, not detail. Medium is the sweet spot for photographic and illustrative work. High gets the fiddly stuff: hands with correct finger counts, text that reads, architectural symmetry, fine fabric patterns.

The sub-3-second generation figure for Image 2 is the medium tier. Expect high to take 4 to 6 seconds and low to come in under a second.

The math

Assume 10,000 images a month.

Default to high. 10,000 x $0.20 = $2,000. Human-reject rate 5 percent means 500 regens at $0.20 = $100. Total $2,100.

Try low first, upgrade on failure. 10,000 x $0.005 = $50. Low failure rate 30 percent means 3,000 regens at medium $0.04 = $120. Of those, 10 percent still fail, so 300 at high $0.20 = $60. Total $230. About nine times cheaper.
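The same arithmetic as a small sketch you can rerun with your own numbers. The prices and failure rates are the ones quoted in this post, not an official price list, and the `Stage` type and `escalationCost` helper are names made up for illustration.

```typescript
// One rung of the escalation ladder.
// failureRate is the share of images at this stage that get rejected
// and move on to the next stage.
interface Stage {
  name: string;
  pricePerImage: number;
  failureRate: number;
}

// Total spend when every image starts at the first stage and only
// rejected images are regenerated at the next one.
function escalationCost(images: number, stages: Stage[]): number {
  let remaining = images;
  let total = 0;
  for (const stage of stages) {
    total += remaining * stage.pricePerImage;
    remaining = remaining * stage.failureRate;
  }
  return total;
}

// Default to high: 10,000 images plus a 5 percent human-reject regen.
const alwaysHigh = 10_000 * 0.2 + 10_000 * 0.05 * 0.2;

// Low first, then medium, then high (figures assumed from the post).
const lowFirst = escalationCost(10_000, [
  { name: "low", pricePerImage: 0.005, failureRate: 0.3 },
  { name: "medium", pricePerImage: 0.04, failureRate: 0.1 },
  { name: "high", pricePerImage: 0.2, failureRate: 0 },
]);

console.log(alwaysHigh.toFixed(2), lowFirst.toFixed(2));
console.log((alwaysHigh / lowFirst).toFixed(1) + "x");
```

The failure rates are the sensitive inputs. If your low tier fails far more often than 30 percent, most images pay for two generations and the gap narrows quickly.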

[Figure: a cost comparison chart across the three tiers]

The math only works with an automated failure detector. For tasks like "is the text readable," a second model as judge works. For subjective brand fit, you need a human, and the savings evaporate because humans cost more than the image did.

When low is fine

Thumbnails displayed at 400px or smaller. Detail loss is invisible.
Mood boards. Generating 50 variants to pick from.
Background plates. Bokeh, blur, heavy texture.
Re-rendered inputs. If the image feeds a video generator as a first frame, you lose detail anyway.

On Image 2, low closes the gap with medium more than on 1.5. Early Arena comparisons suggest low-tier 2 looks close to medium-tier 1.5.

When medium is the default

Medium is the default for e-commerce product shots at 800 to 1200 pixels, marketing illustrations, editorial art, and anything that holds up at normal viewing distance but is not printed.

On 1.5, medium is roughly $0.04 for 1024x1024 and $0.08 for 1792x1024. Until OpenAI posts official pricing, assume Image 2 lands in the same band as GPT Image 1.5. Medium is where 80 percent of production traffic lives.

When high earns it

Print output. Billboard, packaging, poster. Physical media exposes the medium-vs-high gap.
Dense text. Paragraphs, not headlines. High tier in Image 2 is where 99 percent glyph accuracy actually holds.
Hands, faces, fine symmetry. Close-up portraits and architecture need the refinement pass.
Hero shots. Homepage image. Pay the $0.20.
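The three lists above collapse into a short decision function. This is a sketch, assuming you can tag each job up front. The `Job` fields and the `pickTier` name are hypothetical, and the only hard threshold, 400px, comes from the thumbnail rule in this post.

```typescript
type Tier = "low" | "medium" | "high";

// Hypothetical job description; every field is optional.
interface Job {
  displayWidthPx?: number; // widest on-screen size, if known
  printed?: boolean;       // billboard, packaging, poster
  denseText?: boolean;     // paragraphs rather than headlines
  closeUpDetail?: boolean; // hands, faces, fine symmetry
  heroShot?: boolean;      // homepage image and similar
  disposable?: boolean;    // mood boards, background plates, re-rendered inputs
}

function pickTier(job: Job): Tier {
  // Anything on the "high earns it" list wins, whatever the display size.
  if (job.printed || job.denseText || job.closeUpDetail || job.heroShot) {
    return "high";
  }
  // Throwaway or heavily processed output does not need detail.
  if (job.disposable) {
    return "low";
  }
  // Thumbnails at 400px or smaller hide the detail loss.
  if (job.displayWidthPx !== undefined && job.displayWidthPx <= 400) {
    return "low";
  }
  // Everything else: product shots, marketing and editorial art.
  return "medium";
}

console.log(pickTier({ displayWidthPx: 1000 }));
```

Checking the high conditions first is a deliberate choice: a small image with dense text still needs the high tier, so size alone should never downgrade it.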

Cost-aware code

example.ts

import { fal } from "@fal-ai/client";

// passesJudge is not defined here. Supply your own detector that resolves
// to true when the image is acceptable for the prompt.
declare function passesJudge(imageUrl: string, prompt: string): Promise<boolean>;

async function generateWithBudget(prompt: string) {
  // Cheapest tier first; escalate only when the judge rejects the result.
  const tiers = ["low", "medium", "high"] as const;
  for (const quality of tiers) {
    const result = await fal.subscribe("fal-ai/gpt-image-1.5/edit", {
      // or fal-ai/gpt-image-2 once available
      input: {
        prompt,
        image_urls: ["https://example.com/base.jpg"],
        quality,
        image_size: "square_hd",
        num_images: 1,
      },
    });
    const image = result.data.images[0];
    if (await passesJudge(image.url, prompt)) {
      return { url: image.url, quality };
    }
  }
  // Every tier was rejected, including high.
  throw new Error("failed even at high tier");
}

passesJudge is your detector. Simple version: a cheap vision model with a yes-or-no question. Rigorous: CLIP similarity above a threshold, OCR accuracy for text prompts, face-landmark presence for portraits.
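For text prompts, the OCR check can be as simple as counting how many expected words show up in the recognized text. The sketch below assumes you already have the OCR output as a string from whatever engine you use. `wordAccuracy`, `passesTextCheck`, and the 0.95 threshold are illustrative choices, and the normalization only handles ASCII letters and digits.

```typescript
// Lowercase, strip punctuation, split on whitespace.
// ASCII-only on purpose; extend the character class for other scripts.
function normalizeWords(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

// Share of expected words found in the OCR output.
// Each recognized word can satisfy only one expected word.
function wordAccuracy(expected: string, recognized: string): number {
  const want = normalizeWords(expected);
  if (want.length === 0) return 1;
  const counts = new Map<string, number>();
  for (const w of normalizeWords(recognized)) {
    counts.set(w, (counts.get(w) ?? 0) + 1);
  }
  let hits = 0;
  for (const w of want) {
    const n = counts.get(w) ?? 0;
    if (n > 0) {
      hits++;
      counts.set(w, n - 1);
    }
  }
  return hits / want.length;
}

// The threshold is an assumption; tune it on your own rejects.
function passesTextCheck(expected: string, recognized: string, threshold = 0.95): boolean {
  return wordAccuracy(expected, recognized) >= threshold;
}

console.log(passesTextCheck("Grand Opening Sale", "GRAND OPENING SALE!"));
console.log(passesTextCheck("Grand Opening Sale", "Grand Openlng Sale"));
```

A word-level check is strict on purpose: one wrong glyph fails the whole word, which matches how a reader experiences a typo on a poster.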

[Figure: a diagram showing the tier escalation path with decision points]

The one trap

Do not mix tiers across a batch that needs visual consistency. Tier transitions show. The refinement pass shifts color and detail signature enough that medium and high images next to each other look like separate shoots.

If you need consistency on a budget, pick medium for the whole set. Do not save on wallpaper and spend on the hero. It will look wrong.
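One way to enforce this in a pipeline is to assign the tier per consistency group rather than per image. The sketch below is one possible policy, not the only one: each group gets the highest tier any member asked for, clamped to a budget cap. `BatchItem`, `uniformTiers`, and the `cap` parameter are hypothetical names.

```typescript
type Tier = "low" | "medium" | "high";

const TIER_ORDER: readonly Tier[] = ["low", "medium", "high"];

// Hypothetical batch entry: images sharing a group must look like one shoot.
interface BatchItem {
  id: string;
  group: string;
  wanted: Tier;
}

// Returns one tier per group: the highest tier requested inside the
// group, clamped to the cap so a budget ceiling applies to the whole set.
function uniformTiers(items: BatchItem[], cap: Tier = "high"): Map<string, Tier> {
  const highestRank = new Map<string, number>();
  for (const item of items) {
    const rank = TIER_ORDER.indexOf(item.wanted);
    highestRank.set(item.group, Math.max(highestRank.get(item.group) ?? 0, rank));
  }
  const capRank = TIER_ORDER.indexOf(cap);
  const result = new Map<string, Tier>();
  highestRank.forEach((rank, group) => {
    result.set(group, TIER_ORDER[Math.min(rank, capRank)]);
  });
  return result;
}

const plan = uniformTiers(
  [
    { id: "hero", group: "spring-campaign", wanted: "high" },
    { id: "wallpaper", group: "spring-campaign", wanted: "low" },
    { id: "thumb", group: "blog", wanted: "low" },
  ],
  "medium",
);
console.log(plan.get("spring-campaign"), plan.get("blog"));
```

With the cap set to medium, the hero and the wallpaper both render at medium, which is the budget option described above. Raise the cap to high and the whole campaign moves up together instead of splitting.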

The upgrade path

When Image 2 drops, tier usage shifts down one notch. What needed high will need medium. What needed medium will need low. Plan for 30 to 50 percent cost reduction on existing workload if you re-tune.
