BOTTENSOR
NPC Fin · Live Inference
NPC Model Family.
Purpose-built
for the real world.

Fine-tuned, domain-specific language models. Fast inference. Open weights. Bottensor builds specialized AI — one model per problem, shipped end-to-end.

Explore ModelsAPI Docs
7
NPC Models
32B
Flagship Size
1
Live Today
Open
Weights
Qwen2.5-32BUnslothQLoRAvLLMGPTQ 4-bitHuggingFaceOpenAI-Compatible APIOpen WeightsSpecialist ModelsFine-TunedQwen2.5-32BUnslothQLoRAvLLMGPTQ 4-bitHuggingFaceOpenAI-Compatible APIOpen WeightsSpecialist ModelsFine-Tuned
NPC Model Family

Seven models. One intelligence stack.

Purpose-built models, each fine-tuned for a single domain using QLoRA + Unsloth. Composable on their own — together they form a complete AI stack.

01
Live
NPC Fin
Finance · 32B

A 32B finance specialist fine-tuned on millions of curated market examples. Quantized for production inference and serving live today via OpenAI-compatible API.

32B BaseQLoRA SFTGPTQ 4-bit
HuggingFace →
02
In Training
NPC Fast
Distilled · Edge

A small, distilled model for latency-critical inference. Optimized for edge deployment, agent loops, and high-throughput serving where every millisecond matters.

7B / 3BDistilledEdge-Ready
03
Planned
NPC Coder
Code Generation

Production-grade code generation across the languages we use most — Python, TypeScript, Solidity, Java. Trained on private high-quality repositories.

32B BaseCode SFTTool Use
04
Planned
NPC Reason
Chain-of-Thought

Deep reasoning for complex multi-step analysis. Logic trees, step-by-step decomposition, and verifiable chains for problems that need a model to actually think.

32B BaseCoT / ToTReward Model
05
Planned
NPC Agent
Tool Use · Planning

Built for autonomous agent workflows. Tool calling, multi-step planning, and execution pipelines for real-world automation that has to run unattended.

32B BaseTool UsePlanning
06
Planned
NPC Context
Long Context

Extended context window for document understanding, codebase reasoning, and retrieval-free workflows. For when 128K isn’t enough.

32B Base1M ContextRetrieval
07
Planned
NPC Robo
Robotics · Sim2Real

Vision-language-action model for robotic control. Continuous learning policies that train in simulation and transfer to physical hardware at the edge.

VLASim2RealEdge
Flagship Model

NPC Fin

A 32B parameter finance specialist — fine-tuned on proprietary market data, quantized for production inference, and serving live today through an OpenAI-compatible API.

model_spec.json
ModelNPC Fin 32B
BaseQwen2.5-32B-Instruct
MethodQLoRA SFT → Merged → GPTQ 4-bit
Parameters32 billion
QuantizationGPTQ 4-bit (19GB)
Training Data~32K examples · ~60M tokens
Context Length4,096 tokens
API FormatOpenAI-compatible
ServingvLLM
Training Data Distribution
market_signal35%
finance_general25%
logic_tree20%
macro12%
cross_market8%
Open
NPC Fin GPTQ
Production · 19GB
Open
NPC Fin FP16
Reference · 65GB
Open
SFT Adapter
LoRA · SFT
Open
Tool-Use Adapter
LoRA · Tools
Open
NPC FinPRM
Reward Model · 7B
Inference API

Access NPC Fin — live now

OpenAI-compatible API for direct access to NPC Fin. First 100 API keys are free with 1M tokens each.

{}
OpenAI-compatible
Drop-in replacement
Fast inference
vLLM-served · sub-second
Open weights
Available on HuggingFace
Get Free API KeyTry PlaygroundAPI Docs
About

Bottensor — a Falcon Hash company.

We build small, fast, specialized AI models for problems generalists can't solve well. The NPC Model Family is our long-term project: one fine-tuned model per real-world domain, shipped with open weights and an OpenAI-compatible API.

Specialist over generalist

A 32B model trained on the right data beats a 400B generalist that has seen everything once. Every NPC model targets a single domain.

Open weights, open access

We release model weights and adapters on HuggingFace. Roughly 25% of our work is closed (data, recipes), 75% is open (weights, code, evals).

Built end-to-end

Data curation, fine-tuning with QLoRA + Unsloth, quantization, and serving with vLLM. We run the whole pipeline so we can iterate fast.

R
Ram (dude.npc)
Founder · Bottensor

7+ years software engineering. MS in Computer Science. Builds the NPC Model Family end-to-end — data, training, serving, product.

Pricing

Simple, honest pricing.

Start free. Upgrade when you ship. Pricing below is preview — final tiers locked in at GA.

Developer
Freeforever

For prototypes, hobby projects, and exploring NPC Fin.

  • 1M tokens / month
  • OpenAI-compatible API
  • Open weights on HuggingFace
  • Community support
Get Free Key
Most Popular
Pro
$49/month

For production apps and small teams shipping with NPC models.

  • 10M tokens / month
  • Higher rate limits
  • Streaming + tool use
  • Email support
Join Waitlist
Enterprise
Custom

For teams that need dedicated capacity, custom fine-tunes, or on-prem.

  • Unlimited tokens
  • Dedicated inference
  • Custom fine-tunes
  • SLA + private support
Contact Us
Stack

Built with

End-to-end AI infrastructure — from data pipelines to production inference.

Qwen2.5-32B
Unsloth
QLoRA
vLLM
Python
PyTorch
GPTQ 4-bit
HuggingFace
OpenAI API
Next.js
TypeScript
MongoDB
RunPod
A100 / H100
Vercel

Specialized models.
Shipped end-to-end.

Get an API key in under a minute. 1M tokens free, no credit card.

Get Free API KeyTry Playground