# Chunking

> How Isaacus breaks up long documents into smaller chunks

Every language model, Isaacus models included, has a limit on the number of [tokens](/tokenization) it can process at once. This limit is known as the model's maximum sequence length or context window.

In some cases, we work around this limit by breaking down long texts into smaller chunks through a process called **chunking**.

We use [semchunk](https://github.com/isaacus-dev/semchunk), the most popular semantic chunking algorithm (which we developed ourselves), to chunk texts so that chunks are unlikely to break off in the middle of important sentences and paragraphs.

Chunks created by the Isaacus API will often correspond to separate clauses and sections in a document.
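To give a feel for why chunks tend to follow a document's structure, here is a much-simplified sketch of hierarchical splitting. It is not `semchunk`'s actual implementation: the separator list, the whitespace-based `count_tokens` function, and the helper names are all assumptions made for illustration. The idea is to split on the largest structural boundary first, only descend to smaller boundaries when a piece is still too long, and then merge neighbouring pieces back together while they fit.

```python
# Illustrative only: a whitespace "tokenizer" stands in for a real one.
def count_tokens(text: str) -> int:
    return len(text.split())

# Hypothetical separators, ordered from largest to smallest structural unit.
SEPARATORS = ["\n\n", "\n", " "]

def merge_pieces(pieces: list[str], chunk_size: int, joiner: str) -> list[str]:
    # Greedily glue neighbouring pieces together while they still fit.
    merged: list[str] = []
    for piece in pieces:
        if merged and count_tokens(merged[-1] + joiner + piece) <= chunk_size:
            merged[-1] = merged[-1] + joiner + piece
        else:
            merged.append(piece)
    return merged

def split_recursively(text: str, chunk_size: int, level: int = 0) -> list[str]:
    text = text.strip()
    if not text:
        return []
    # Stop once the text fits, or once there are no smaller separators left.
    if count_tokens(text) <= chunk_size or level >= len(SEPARATORS):
        return [text]
    separator = SEPARATORS[level]
    pieces: list[str] = []
    for part in text.split(separator):
        pieces.extend(split_recursively(part, chunk_size, level + 1))
    return merge_pieces(pieces, chunk_size, separator)

text = "1. The client is happy.\n\n2. The lawyer is busy today."
chunks = split_recursively(text, chunk_size=5)
print(chunks)
```

In this toy run, the first paragraph fits within the chunk size and is kept whole, while the second is one token too long and is split at a word boundary instead.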

You can customize how chunking is performed through the **chunk size** and **chunk overlap ratio** parameters in our API.
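As a rough illustration of how these two parameters interact, the sketch below slides a fixed-size window over a list of tokens. It assumes that the overlap ratio is the fraction of the chunk size shared between consecutive chunks; the function name `sliding_chunks` and the whitespace tokenization are hypothetical, and real chunking also takes semantic boundaries into account.

```python
import math

def sliding_chunks(tokens: list[str], chunk_size: int, overlap_ratio: float) -> list[list[str]]:
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in the range [0, 1)")
    # Assumption: the overlap is the chunk size scaled by the ratio, rounded down.
    overlap = math.floor(chunk_size * overlap_ratio)
    stride = chunk_size - overlap  # how far the window advances each step
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the window has reached the end of the text
    return chunks

tokens = "The client is happy with the advice".split()
windows = sliding_chunks(tokens, chunk_size=4, overlap_ratio=0.5)
for window in windows:
    print(window)
```

With a chunk size of 4 and an overlap ratio of 0.5, each chunk repeats the last two tokens of the chunk before it, so context that straddles a boundary appears in both chunks.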

By default, the chunk size is the maximum input length of the model being used, less overhead. Overhead includes [boilerplate tokens](/pricing/costs#boilerplate-tokens) and, for models that take an [Isaacus Query Language](/iql) query as input, the number of tokens in the longest statement in that query.
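The arithmetic behind the default can be sketched as follows. The function name and the numbers used here are made up for illustration; actual maximum input lengths and boilerplate token counts depend on the model.

```python
def default_chunk_size(
    max_input_length: int,
    boilerplate_tokens: int,
    longest_statement_tokens: int = 0,
) -> int:
    # Overhead is whatever must share the context window with the chunk itself.
    overhead = boilerplate_tokens + longest_statement_tokens
    size = max_input_length - overhead
    if size < 1:
        raise ValueError("overhead leaves no room for any text")
    return size

# Hypothetical figures: a 512-token model, 6 boilerplate tokens,
# and a query whose longest statement is 10 tokens long.
print(default_chunk_size(512, 6, 10))  # 496

# A model that takes no query only loses the boilerplate tokens.
print(default_chunk_size(512, 6))  # 506
```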

You are also free to prechunk your text before sending it to an Isaacus model, or to not chunk it at all. In the latter case, if your text is too long, we will have to truncate it to fit within the model's context window.
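The difference between truncating and chunking can be seen in a small sketch. It assumes that truncation keeps the start of the text and drops the rest, which this page does not specify; `truncate_to_window` and the whitespace tokenization are hypothetical.

```python
def truncate_to_window(tokens: list[str], max_tokens: int) -> list[str]:
    # Assumption: truncation keeps the first max_tokens tokens and discards the rest.
    return tokens[:max_tokens]

tokens = "The client is happy with the advice".split()
kept = truncate_to_window(tokens, max_tokens=4)
dropped = tokens[len(kept):]

print(kept)     # ['The', 'client', 'is', 'happy']
print(dropped)  # ['with', 'the', 'advice']
```

Anything that falls outside the window is never seen by the model, which is why chunking is usually preferable for long documents.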

The following snippet shows how you can use `semchunk` to chunk text the way we do:

```python
import semchunk

# NOTE: We use a low chunk size here for demonstration purposes.
chunker = semchunk.chunkerify('isaacus/kanon-2-tokenizer', chunk_size=3)

text = "The client is happy."

chunks = chunker(text)

print(chunks) # expected output: ['The client is', 'happy.']
```

If you run into problems with the way your text is being chunked, you can open an issue on the [`semchunk` GitHub repository](https://github.com/isaacus-dev/semchunk/issues) or [reach out to us directly](https://isaacus.com/support).
