An overview of the Jan‑v1‑4B model from the Hugging Face Hub:
Jan‑v1‑4B is the inaugural model in the Jan family, built for agentic reasoning and problem-solving within the Jan App, an AI assistant platform from Menlo Research. It extends the earlier Lucy model, scaling up to a larger base for improved performance (Hugging Face).
Powered by Qwen3‑4B‑thinking, the model targets advanced reasoning and robust tool integration, making it well suited to complex, multi-step tasks (Hugging Face).
Users can access Jan‑v1 simply by selecting it in the Jan App interface; no additional setup is required (Hugging Face).
To run Jan‑v1 locally, two popular frameworks are supported:
vLLM:

```shell
vllm serve janhq/Jan-v1-4B \
  --host 0.0.0.0 \
  --port 1234 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```

llama.cpp (llama-server):

```shell
llama-server --model Jan-v1-4B-Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 1234 \
  --jinja \
  --no-context-shift
```

For optimal performance, the following inference parameters are recommended:
- temperature: 0.6
- top_p: 0.95
- top_k: 20
- min_p: 0.0
- max_tokens: 2048
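Both launch commands above expose an OpenAI-compatible chat-completions endpoint on port 1234, so a request applying these recommended settings might look like the following sketch (the endpoint path and the `model` value are assumptions inferred from the launch commands, not taken from the model card):

```python
import json
import urllib.request

# Recommended sampling parameters from the model card, plus assumed model name.
payload = {
    "model": "janhq/Jan-v1-4B",  # assumed; must match whatever the server is serving
    "messages": [{"role": "user", "content": "Outline the steps to rename a git branch."}],
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "min_p": 0.0,
    "max_tokens": 2048,
}

def build_request(host: str = "localhost", port: int = 1234) -> urllib.request.Request:
    """Build a chat-completions request against the locally running server."""
    return urllib.request.Request(
        f"http://{host}:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# To actually send it (requires one of the servers above to be running):
# with urllib.request.urlopen(build_request()) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Note that `top_k` and `min_p` are extensions beyond the core OpenAI schema; both vLLM and llama-server accept them in their OpenAI-compatible endpoints.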
There’s also a GGUF-formatted version, Jan‑v1‑4B‑GGUF, which offers multiple quantization options (4-bit, 5-bit, 6-bit, and 8-bit) for efficient local deployment (Hugging Face).
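As a rough back-of-the-envelope check on why these quantization levels matter, the storage for the weights of a 4-billion-parameter model scales linearly with bits per weight. This sketch deliberately ignores KV cache, activations, and per-tensor quantization overhead, so real GGUF files will be somewhat larger:

```python
PARAMS = 4e9  # a 4-billion-parameter model

def approx_weight_gb(bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes (decimal GB)."""
    return PARAMS * bits_per_weight / 8 / 1e9

# Weight footprint at each quantization level offered, plus 16-bit for contrast.
sizes = {bits: round(approx_weight_gb(bits), 1) for bits in (4, 5, 6, 8, 16)}
# 4-bit weights come to roughly 2 GB versus roughly 8 GB at 16-bit,
# which is what makes local deployment on modest hardware practical.
```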
| Attribute | Description |
|---|---|
| Model Type | Open-source, 4-billion-parameter agentic LLM |
| Architecture | Lucy-based, leveraging Qwen3-4B-thinking |
| Benchmarks | 91.1% SimpleQA accuracy; strong chat/instructional performance |
| Deployment | Integrated in Jan App; local support via vLLM and llama.cpp |
| Settings | Recommended inference parameters (temp, top_p/k, etc.) |
| Quant Variant | GGUF version with efficient quantization support |