Model name (e.g. "llama3.1:8b").
Input prompt.
Callback invoked per chunk; must be @safe.
Optional system prompt.
Optional base64-encoded images (multimodal).
Structured output: pass JSONValue("json") for JSON mode, or a JSON Schema object to constrain the response shape.
How long to keep the model loaded.
Typed generation options.
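For the structured-output parameter, Ollama's underlying "format" field accepts either the string "json" or a JSON Schema object. A sketch of such a schema (the field names here are illustrative, not part of this API):

```json
{
  "type": "object",
  "properties": {
    "answer": { "type": "string" },
    "confidence": { "type": "number" }
  },
  "required": ["answer"]
}
```

When a schema is supplied, the model is constrained to emit JSON matching it; with plain "json" mode, only well-formed JSON is guaranteed, not any particular shape.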
Streaming text generation; calls onChunk for every response token.
Each call to onChunk receives one NDJSON chunk containing a "response" string token and a "done" boolean. The final chunk has "done": true and carries usage/timing metadata.
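For illustration, the raw NDJSON stream from Ollama's /api/generate endpoint looks like the following (one JSON object per line); the exact metadata fields on the final chunk, such as durations and token counts, can vary by Ollama version:

```json
{"model":"llama3.1:8b","created_at":"2024-01-01T00:00:00Z","response":"The","done":false}
{"model":"llama3.1:8b","created_at":"2024-01-01T00:00:00Z","response":" sky","done":false}
{"model":"llama3.1:8b","created_at":"2024-01-01T00:00:01Z","response":"","done":true,"done_reason":"stop","total_duration":1234567890,"prompt_eval_count":26,"eval_count":42,"eval_duration":987654321}
```

Intermediate chunks carry one token in "response" with "done": false; the terminal chunk has an empty "response", "done": true, and the timing/usage counters.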