Boilerplate tokens
The first difference between the number of tokens sent to an API endpoint and the number of tokens passed to a model is that boilerplate tokens can be added to inputs after the endpoint receives them. Boilerplate tokens are typically, but not always, used to structure inputs into the format the model expects. The table below shows the number of boilerplate tokens added to inputs for each of our models, alongside a description of what those tokens are used for.

Model | Number of boilerplate tokens | Description |
---|---|---|
kanon-universal-classifier | | Statements are formatted alongside input texts in the format <|startoftext|>{statement}<|endoftext|>{text}<|endoftext|>. |
kanon-universal-classifier-mini | | Statements are formatted alongside input texts in the format <|startoftext|>{statement}<|endoftext|>{text}<|endoftext|>. |
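
To make the template in the table concrete, here is a minimal Python sketch that fills it with a statement and a text and counts the special markers it adds. The names `TEMPLATE`, `format_input` and `count_boilerplate_markers` are hypothetical and are not part of the API. The count also assumes that each `<|...|>` marker is encoded as a single token, which this section does not state explicitly.

```python
import re

# Template taken from the table above. The surrounding helper names are
# illustrative only and do not correspond to any real API.
TEMPLATE = "<|startoftext|>{statement}<|endoftext|>{text}<|endoftext|>"

# Matches special markers such as <|startoftext|> and <|endoftext|>.
MARKER_PATTERN = re.compile(r"<\|[a-z]+\|>")


def format_input(statement: str, text: str) -> str:
    """Place a statement and a text into the template shown in the table."""
    return TEMPLATE.format(statement=statement, text=text)


def count_boilerplate_markers() -> int:
    """Count the special markers in the template.

    Assumption: each marker is encoded as exactly one token. If the
    tokenizer splits a marker into several tokens, the real number of
    boilerplate tokens would be higher than this count.
    """
    return len(MARKER_PATTERN.findall(TEMPLATE))


if __name__ == "__main__":
    print(format_input("This is a confidentiality clause.", "Each party shall..."))
    print(count_boilerplate_markers())
```

Under that one-token-per-marker assumption, the number of tokens the model sees would be the token count of the statement plus the token count of the text plus the marker count returned by `count_boilerplate_markers`.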