Every interaction with a chat model starts by creating a `Chat`.

## Creating a Chat
The simplest entrypoint is `Chat.fromPath`:

```swift
import Quaynor

let chat = try await Chat.fromPath(
    modelPath: "huggingface:bartowski/Qwen_Qwen3-0.6B-GGUF/Qwen_Qwen3-0.6B-Q4_K_M.gguf"
)
```
If you want to share one loaded model across multiple chat sessions, load the `Model` first:

```swift
import Quaynor

let model = try await Model.load(
    modelPath: "huggingface:bartowski/Qwen_Qwen3-0.6B-GGUF/Qwen_Qwen3-0.6B-Q4_K_M.gguf"
)

let chat1 = try Chat(model: model)
let chat2 = try Chat(model: model)
```
## Asking and streaming

`chat.ask` returns a `TokenStream`:

```swift
let stream = try chat.ask("Why is the sky blue?")
```
To wait for the full answer:

```swift
let full = try await stream.completed()
print(full)
```
To stream tokens as they arrive:

```swift
for try await token in stream {
    print(token, terminator: "")
}
```
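The streaming loop is plain `AsyncSequence` iteration, so the print-while-accumulating pattern can be sketched with a stand-in `AsyncStream` in place of Quaynor's `TokenStream` (the literal tokens below are illustrative, not model output):

```swift
import Foundation

// Stand-in for a TokenStream: any AsyncSequence of String chunks
// behaves the same way in the `for await` loop.
let tokens = ["The", " sky", " scatters", " blue", " light."]
let stream = AsyncStream<String> { continuation in
    for token in tokens { continuation.yield(token) }
    continuation.finish()
}

// Print each chunk as it arrives while also building the full answer.
var full = ""
for await token in stream {
    print(token, terminator: "")
    full += token
}
print()
```

The same loop body works unchanged against a real `TokenStream`; only the source of the chunks differs.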
## Chat history

Quaynor keeps the conversation history inside the `Chat` instance.
Read it:

```swift
let messages = try await chat.getChatHistory()
print(messages.count)
```
Replace it:

```swift
try await chat.setChatHistory([
    .message(role: .system, content: "You are concise."),
    .message(role: .user, content: "Summarize the task.")
])
```
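Because the history round-trips through these two calls, housekeeping such as trimming old turns can be sketched on top of them. This is a hedged sketch that assumes the returned history is an array-like collection you can slice and pass straight back; the keep-last-four policy is purely illustrative:

```swift
// Hedged sketch: cap the history at the four most recent entries
// before continuing the conversation.
let history = try await chat.getChatHistory()
if history.count > 4 {
    try await chat.setChatHistory(Array(history.suffix(4)))
}
```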
## System prompt and context

Set the system prompt when creating the chat:

```swift
let chat = try await Chat.fromPath(
    modelPath: "/path/to/model.gguf",
    systemPrompt: "You are a precise engineering assistant.",
    contextSize: 4096
)
```
Reset the current context while optionally changing defaults:

```swift
try await chat.resetContext(systemPrompt: "You are a code reviewer.")
```
Or just clear the accumulated history:

```swift
try await chat.resetHistory()
```
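Putting the pieces together, starting a clean review pass might look like the following hedged sketch, which uses only the calls shown above (the prompt strings are illustrative):

```swift
// Drop the old conversation and set a new default system prompt,
// then ask in the fresh context.
try await chat.resetContext(systemPrompt: "You are a code reviewer.")

// Chaining ask(...) with completed() waits for the whole answer at once.
let review = try await chat.ask("What should I check first in a Swift diff?").completed()
print(review)
```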
## Template variables

Some models expose extra chat-template switches, such as reasoning toggles:

```swift
let chat = try await Chat.fromPath(
    modelPath: "/path/to/model.gguf",
    templateVariables: ["enable_thinking": true]
)
```
Update them later, or read the current values back:

```swift
try await chat.setTemplateVariable(name: "enable_thinking", value: false)

let variables = try await chat.getTemplateVariables()
print(variables)
```
## GPU

GPU acceleration is enabled by default. Disable it with `useGpu: false` when needed:
```swift
let model = try await Model.load(
    modelPath: "/path/to/model.gguf",
    useGpu: false
)
```