The offline pipeline's primary objective is regression testing: catching failures, drift, and latency regressions before they reach production.
I tested the new ChatGPT Images 2.0 model on 10 real-world prompts to see how it performs across different scenarios.