Tokenisation in AI/LLMs: How It Works and How Pricing Is Calculated
- Siddhesh Kadam

- 1 day ago
- 3 min read

Artificial Intelligence models like OpenAI GPT and Anthropic Claude are widely used in applications such as chatbots, automation tools, content generation, analytics, and more.
These models do not process text the way humans read it.
They work with tokens.
Understanding tokenisation is essential because it directly affects:
💰 Cost
⚡ Performance
📏 Input and output limits
🔹 What is Tokenisation?
Tokenisation is the process of breaking text into smaller units called tokens.
A token can be:
A full word
Part of a word
A character
A symbol
Input:
Builddevops is amazing!
Possible tokens:
["Builddevops", " is", " amazing", "!"]
Key points:
Tokens are not always complete words
Spaces are often included
Words may be split into smaller parts
🔹 Why Tokenisation is Needed
AI models operate on numbers, not text.
The processing pipeline looks like this:
Text → Tokenisation → Token IDs → Model → Output Token IDs → Decoded Text
⚙️ How It Works Internally

Everything inside the model works using numerical representations of tokens.
🔹 Types of Tokenisation
1. Word-Based
Input : "I love AI"
Output: ["I", "love", "AI"]
2. Subword Tokenisation (Most Common)
Input : "unbelievable"
Output: ["un", "believ", "able"]
This approach:
Handles unknown words
Reduces vocabulary size
Improves efficiency
3. Character-Based
Input : "AI"
Output: ["A", "I"]
🔹 Tokenisation Algorithms
Common techniques include:
Byte Pair Encoding (BPE) – used in GPT models
WordPiece – used by Google BERT
SentencePiece – language-independent
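As an illustration of the core idea behind BPE, here is a minimal sketch of a single merge step: count adjacent symbol pairs and merge the most frequent pair into one new symbol. (This is a toy example, not the actual GPT implementation, which works on bytes and learned merge tables.)

```python
from collections import Counter

def most_frequent_pair(symbols):
    """Count adjacent symbol pairs and return the most frequent one."""
    pairs = Counter(zip(symbols, symbols[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(symbols, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged

symbols = list("banana")            # ['b', 'a', 'n', 'a', 'n', 'a']
pair = most_frequent_pair(symbols)  # ('a', 'n') occurs twice
symbols = merge_pair(symbols, pair)
print(symbols)                      # ['b', 'an', 'an', 'a']
```

Repeating this step thousands of times on a training corpus is how a BPE vocabulary of subword units is built.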
🔹 Tokens vs Words
| Text Type   | Approx Tokens |
|-------------|---------------|
| 1 word      | ~1.3 tokens   |
| 1 sentence  | 10–20 tokens  |
| 1 paragraph | ~100 tokens   |
👉 Rule of thumb:
1 token ≈ 4 characters in English
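This rule of thumb gives a quick estimator when an exact tokeniser is not at hand (a rough heuristic for English only; use tiktoken for real counts):

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: ~1 token per 4 characters of English text."""
    return max(1, round(len(text) / 4))

# "Builddevops is amazing!" is 23 characters → estimate ~6 tokens,
# which matches the actual tiktoken count shown later in this post.
print(estimate_tokens("Builddevops is amazing!"))  # 6
```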
🔹 Context Window (Token Limit)
Each model has a maximum number of tokens it can process in one request.
This includes:
Input tokens
Output tokens
If the total exceeds the limit:
Input may be truncated
Or the request may fail
💰 How Pricing Works in LLMs
Pricing is based on the number of tokens processed.
Both are counted:
Input tokens
Output tokens
✅ Pricing Formula
Total Cost = (Input Tokens / 1000 × Input Price per 1K tokens) + (Output Tokens / 1000 × Output Price per 1K tokens)
Important:
Pricing is typically quoted per 1,000 tokens
Divide the token count by 1000 before multiplying by the per-1K price
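The formula translates directly into a small helper function. The prices below are illustrative placeholders, not any provider's actual rates:

```python
def llm_cost(input_tokens, output_tokens,
             input_price_per_1k, output_price_per_1k):
    """Total cost: divide each token count by 1000, then apply the per-1K price."""
    return ((input_tokens / 1000) * input_price_per_1k
            + (output_tokens / 1000) * output_price_per_1k)

# Same numbers as the worked example below: 100 in, 200 out.
print(round(llm_cost(100, 200, 0.01, 0.02), 6))  # 0.005
```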
🔍 Example Calculation
Assume:
Input tokens = 100
Output tokens = 200
Pricing:
Input = $0.01 per 1000 tokens
Output = $0.02 per 1000 tokens
Step 1: Input Cost
(100 / 1000) × 0.01 = $0.001
Step 2: Output Cost
(200 / 1000) × 0.02 = $0.004
✅ Final Cost
Total = $0.001 + $0.004 = $0.005
⚠️ Common Mistake
Incorrect calculation:
100 × 0.01 = $1 (treats the per-1K price as a per-token price)
Correct approach:
(100 / 1000) × 0.01 = $0.001
🔹 Why Tokens Matter
1. Cost
More tokens increase usage cost.
2. Performance
Higher token count can increase response time.
3. Limits
Exceeding token limits can cause failures or truncated responses.
🔹 Practical Examples
Short Input
"Summarize this article"
Low token usage → low cost
Large Input
"Analyze 10,000 lines of logs or text data"
High token usage → higher cost
🔹 Optimisation Tips
✔️ Remove unnecessary text
✔️ Avoid repeating information
✔️ Keep prompts concise
✔️ Limit output length
✔️ Preprocess large inputs before sending
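The preprocessing tip can be as simple as collapsing runs of whitespace before sending text to the model. A minimal sketch:

```python
import re

def compact_prompt(text: str) -> str:
    """Collapse runs of whitespace — a cheap way to trim wasted tokens."""
    return re.sub(r"\s+", " ", text).strip()

messy = "Summarize   this \n\n  article,   please."
print(compact_prompt(messy))  # Summarize this article, please.
```

For large log files, more aggressive preprocessing (deduplicating repeated lines, keeping only error entries) pays off even more.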
🔹 Simple Token Counting Example
[root@siddhesh ~]# cat /usr/local/bin/Token_Counting.py
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
text = "Builddevops is amazing!"
tokens = enc.encode(text)

print("==== Full Sentence ====")
print("Text:", text)
print("Token count:", len(tokens))
print("Tokens:", tokens)

print("\n==== Word-wise Tokens ====")
words = text.split()
for word in words:
    word_tokens = enc.encode(word)
    decoded = [enc.decode([t]) for t in word_tokens]
    print(f"\nWord: {word}")
    print(f"Token count: {len(word_tokens)}")
    print(f"Token IDs: {word_tokens}")
    print(f"Decoded Tokens: {decoded}")
[root@siddhesh ~]#
Output
[root@siddhesh ~]# python3 /usr/local/bin/Token_Counting.py
==== Full Sentence ====
Text: Builddevops is amazing!
Token count: 6
Tokens: [11313, 3667, 3806, 374, 8056, 0]
==== Word-wise Tokens ====
Word: Builddevops
Token count: 3
Token IDs: [11313, 3667, 3806]
Decoded Tokens: ['Build', 'dev', 'ops']
Word: is
Token count: 1
Token IDs: [285]
Decoded Tokens: ['is']
Word: amazing!
Token count: 3
Token IDs: [309, 6795, 0]
Decoded Tokens: ['am', 'azing', '!']
[root@siddhesh ~]#
🚀 Key Takeaways
✔️ AI models process tokens, not words
✔️ Tokenisation converts text into numerical form
✔️ Pricing depends on total tokens used
✔️ Always divide tokens by 1000 for cost calculation
✔️ Managing tokens improves efficiency and cost control
Conclusion
Tokenisation is a fundamental concept in modern AI systems. It affects how models understand input, generate output, and calculate usage costs.
A clear understanding of tokens helps in designing efficient, scalable, and cost-aware AI applications.



















