Introducing Hito 2B: Structured Reasoning in a Small Model
We are releasing Hito 2B, our most capable small model yet. With a novel Cognitive Framework that organizes thinking into explicit stages, Hito 2B achieves strong reasoning performance while remaining efficient enough to run on consumer hardware.
Today we are releasing Hito 2B, a 2-billion-parameter language model that brings structured reasoning to the small model space. Fine-tuned from Qwen3.5-2B using our proprietary training methodology, Hito 2B represents a significant step forward in what compact models can achieve.
The Problem with Small Model Reasoning
Small language models have traditionally struggled with complex reasoning tasks. They can generate fluent text, but when faced with multi-step problems, they often lose track of their own logic, make inconsistent claims, or fail to self-correct obvious errors.
We asked ourselves: what if the model's reasoning process was visible and structured, rather than hidden in an opaque chain-of-thought? What if we could teach a model to think in stages?
Introducing the Cognitive Framework
Hito 2B uses a novel Cognitive Framework that organizes thinking into explicit, nested tags within a <think>...</think> envelope. These are not just decorative labels. They constrain the model's policy distribution, forcing it to allocate generation steps to each cognitive stage sequentially.
The framework includes five cognitive stages:
- Comprehension: Understanding the problem (<understand>, <curious>, <connect>)
- Retrieval: Accessing relevant knowledge (<recall>, <compare>, <simulate>)
- Deliberation: Working through the logic (<logic>, <plan>, <anticipate>, <imagine>)
- Verification: Checking the work (<doubt>, <verify>, <careful>)
- Metacognition: Reflecting on the process (<reflect>, <honest>, <limits>, <emotion>)
This structured approach enables something powerful: first-class self-correction within a single response. The sequence <doubt> followed by <verify> followed by an updated <commit> allows the model to catch and fix its own mistakes in real-time, observable in the output rather than hidden across multiple turns.
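Because the stages are explicit tags rather than free-form prose, the reasoning trace is machine-checkable. The sketch below illustrates this with a hypothetical transcript (our own construction, not actual model output) and a small parser that recovers the stage sequence:

```python
import re

# Hypothetical Hito 2B transcript illustrating the doubt -> verify
# self-correction sequence described above (not actual model output).
transcript = (
    "<think>"
    "<understand>Find x^3 + 1/x^3 given x + 1/x = 3.</understand>"
    "<logic>Cube the identity: (x + 1/x)^3 = x^3 + 1/x^3 + 3(x + 1/x).</logic>"
    "<doubt>Did I expand the cube correctly?</doubt>"
    "<verify>27 = x^3 + 1/x^3 + 9, so x^3 + 1/x^3 = 18.</verify>"
    "</think>"
    "The answer is 18."
)

def stage_sequence(text: str) -> list[str]:
    """Return the opening cognitive tags in the order they appear."""
    return re.findall(
        r"<(understand|curious|connect|recall|compare|simulate|"
        r"logic|plan|anticipate|imagine|doubt|verify|careful|"
        r"reflect|honest|limits|emotion)>",
        text,
    )

print(stage_sequence(transcript))  # -> ['understand', 'logic', 'doubt', 'verify']
```

A downstream harness could use exactly this kind of check to confirm that a verification stage actually followed deliberation before trusting the answer.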
Benchmark Performance
The results speak for themselves. In head-to-head comparisons with the Qwen3.5-2B base model under matched conditions:
| Benchmark | Category | Hito 2B | Base | Delta |
|---|---|---|---|---|
| GSM8K | Math word problems | 60% | 25% | +35 |
| MATH-500 | Competition math | 15% | 5% | +10 |
| ARC-Challenge | Scientific reasoning | 75% | 65% | +10 |
| HumanEval-style | Code synthesis | 95% | 90% | +5 |
| Macro average | Reasoning | 61.3% | 46.3% | +15.0 |
The +35 point improvement on GSM8K is particularly notable. This benchmark has been a persistent challenge for small models, and Hito 2B's structured reasoning approach makes a dramatic difference.
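The macro averages in the last row follow directly from the four per-benchmark scores:

```python
# Per-benchmark accuracies from the table above (percent).
hito = {"GSM8K": 60, "MATH-500": 15, "ARC-Challenge": 75, "HumanEval-style": 95}
base = {"GSM8K": 25, "MATH-500": 5, "ARC-Challenge": 65, "HumanEval-style": 90}

macro_hito = sum(hito.values()) / len(hito)  # 61.25, reported as 61.3%
macro_base = sum(base.values()) / len(base)  # 46.25, reported as 46.3%
print(macro_hito, macro_base, macro_hito - macro_base)  # -> 61.25 46.25 15.0
```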
Efficiency That Matters
Perhaps surprisingly, structured reasoning also improves efficiency. By constraining the model to follow a defined cognitive path, we prevent the "unproductive expansion loops" that plague many reasoning models.
- Median thinking length: ~25% shorter than base model
- Typical response time: Under 10 seconds on hard problems (vs. 33 seconds for base)
- No quality sacrifice: Shorter responses with better answers
What Hito 2B Can Do
We have validated Hito 2B across diverse reasoning challenges:
- Abstract Reasoning: Solves ARC-AGI grid puzzles (fluid intelligence tests)
- Symbolic Mathematics: Derives competition-level algebra solutions
- Statistical Reasoning: Identifies confounding variables and correlation-causation gaps
- Bayesian Reasoning: Correctly computes posterior probabilities, overcoming base-rate neglect
- Deductive Logic: Solves Knights-and-Knaves puzzles via systematic case analysis
- Self-Referential Reasoning: Engages metacognitively with its own nature without making false claims of consciousness
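The systematic case analysis behind Knights-and-Knaves puzzles is worth making concrete. Here is a minimal brute-force solver for a classic two-islander instance (the puzzle is our own illustrative example, not one from the evaluation set):

```python
from itertools import product

# Classic puzzle: A says "We are both knaves."
# Knights always tell the truth; knaves always lie.
def consistent(a_is_knight: bool, b_is_knight: bool) -> bool:
    statement = (not a_is_knight) and (not b_is_knight)  # "we are both knaves"
    # A knight's statement must be true; a knave's must be false.
    return statement == a_is_knight

solutions = [
    (a, b) for a, b in product([True, False], repeat=2) if consistent(a, b)
]
print(solutions)  # -> [(False, True)]: A is a knave, B is a knight
```

Enumerating all truth assignments and filtering for consistency is exactly the case analysis the model is asked to perform in prose.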
How to Use Hito 2B
Hito 2B is available today through multiple channels:
Python (Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "hitonet/hito-2b", torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("hitonet/hito-2b", trust_remote_code=True)

messages = [{"role": "user", "content": "If x + 1/x = 3, what is x^3 + 1/x^3?"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True, enable_thinking=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=4000, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
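Since the reasoning arrives inside a <think>...</think> envelope, applications that only want the final answer can strip it after decoding. A minimal sketch (assuming the envelope appears verbatim in the decoded text, as described above):

```python
import re

def strip_thinking(decoded: str) -> str:
    """Remove the <think>...</think> envelope, keeping only the final answer."""
    return re.sub(r"<think>.*?</think>", "", decoded, flags=re.DOTALL).strip()

sample = "<think><logic>27 - 9 = 18</logic></think>The answer is 18."
print(strip_thinking(sample))  # -> The answer is 18.
```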
Ollama (GGUF Quantizations)
```shell
# Recommended (1.4 GB)
ollama run hf.co/hitonet/hito-2b-GGUF:Q5_K_M

# Smaller footprint (1.2 GB)
ollama run hf.co/hitonet/hito-2b-GGUF:Q4_K_M

# Lossless (3.6 GB)
ollama run hf.co/hitonet/hito-2b-GGUF:F16
```
Hosted API
```shell
curl https://api.hitonet.com/v1/chat/completions \
  -H "Authorization: Bearer $HITONET_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "hito-2b", "messages": [{"role": "user", "content": "Hello"}]}'
```
New users get $1 in free API credits at platform.hitonet.com.
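The /v1/chat/completions path suggests an OpenAI-compatible request schema; if that assumption holds, the curl example translates to Python with only the standard library. A sketch (build_chat_request is our own helper name):

```python
import json
import urllib.request

def build_chat_request(api_key: str, content: str) -> urllib.request.Request:
    """Build the same request the curl example sends, assuming an
    OpenAI-compatible /v1/chat/completions schema."""
    payload = {
        "model": "hito-2b",
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        "https://api.hitonet.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("YOUR_API_KEY", "Hello")
# Send with: urllib.request.urlopen(req)
print(req.full_url, req.get_method())
```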
Training Methodology
Hito 2B was trained using a two-stage proprietary pipeline:
Stage 1: Progressive LoRA Merging (PLM)
Multiple rounds of LoRA fine-tuning on curated structured-reasoning data, with each round's adapter merged into the base before the next. This internalizes the Cognitive Framework grammar while retaining base capabilities.
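At the weight level, each round's merge is just folding the adapter's low-rank product back into the base matrix before the next round trains a fresh adapter. A toy NumPy sketch of that generic mechanic (illustrative only; matrix sizes, rank, and scaling are our stand-in values, not the proprietary pipeline's):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                      # toy hidden size and LoRA rank
W = rng.normal(size=(d, d))      # stand-in for a base weight matrix

def merge_lora(W, A, B, alpha=16):
    """Fold one round's low-rank adapter (B @ A) into the base weights,
    with the usual alpha / rank scaling."""
    return W + (alpha / A.shape[0]) * (B @ A)

# Progressive merging: each round fine-tunes a new adapter against the
# previously merged weights, then folds it in before the next round.
for round_ in range(3):
    A = rng.normal(scale=0.01, size=(r, d))  # stand-in for a trained adapter
    B = rng.normal(scale=0.01, size=(d, r))
    W = merge_lora(W, A, B)

print(W.shape)  # merged weights keep the base shape: (8, 8)
```

The key property is that after each merge the model is again a plain dense checkpoint, so the next round's adapter trains against everything learned so far.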
Stage 2: Group Relative Policy Optimization (GRPO)
A custom reward formula with explicit reasoning-answer consistency signals, trained on our proprietary reasoning dataset. This reinforces behaviors that produce measurable capability gains.
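While the custom reward formula is proprietary, the group-relative part of GRPO is standard: each sampled completion is scored against the mean and spread of its own sampling group. A minimal sketch of that advantage computation:

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantage: normalize each completion's reward
    by its group's mean and standard deviation."""
    mu = statistics.fmean(rewards)
    sigma = statistics.pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# One prompt, four sampled completions scored by the reward function.
rewards = [1.0, 0.0, 1.0, 0.0]
print(group_relative_advantages(rewards))  # roughly [+1, -1, +1, -1]
```

Completions that beat their group average get positive advantage and are reinforced; within-group normalization removes the need for a separate value network.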
Licensing
Hito 2B is released under the Hitonet Community License:
- Personal/hobby use: Yes, with attribution
- Academic research: Yes, with attribution and citation
- Non-commercial open-source: Yes, with attribution
- Commercial use: Requires written permission (contact [email protected])
Get Started
Download Hito 2B today:
- Hugging Face: huggingface.co/hitonet/hito-2b
- GGUF Quantizations: huggingface.co/hitonet/hito-2b-GGUF
- API Access: platform.hitonet.com
- Chat Interface: chat.hitonet.com
We cannot wait to see what you build with Hito 2B.