# Embedding

> Quantify the relevance and semantic similarity of data

**Embedding** or **vectorization** is the process of converting content into vectors of numbers that, when compared mathematically, quantify how similar the underlying pieces of content are in meaning.
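To make "compared mathematically" concrete, here is a minimal sketch that scores made-up vectors with cosine similarity, one common comparison measure. The three-dimensional vectors and the `cosine_similarity` helper are illustrative assumptions, not output from any Isaacus model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional "embeddings" (real embeddings have far more dimensions).
lease = [0.9, 0.1, 0.2]
tenancy = [0.8, 0.2, 0.3]
patent = [0.1, 0.9, 0.4]

# Vectors pointing in similar directions score closer to 1.
print(f"lease vs tenancy: {cosine_similarity(lease, tenancy):.2f}")
print(f"lease vs patent: {cosine_similarity(lease, patent):.2f}")
```

In this toy data, `lease` and `tenancy` point in nearly the same direction, so they score higher than `lease` and `patent`.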

Embeddings are used to power semantic search engines, text classification, and cluster analysis, as well as the retrieval component of retrieval-augmented generation (RAG) applications.
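As a rough illustration of the semantic search use case, the sketch below ranks a handful of documents against a query by dot product and keeps the best matches. The `top_k` helper, the document titles, and the hand-written vectors are all hypothetical; in practice the vectors would come from an embedding model.

```python
def dot(a, b):
    """Return the dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query_embedding, document_embeddings, k=2):
    """Return (index, score) pairs for the k highest-scoring documents."""
    scores = [(i, dot(query_embedding, d)) for i, d in enumerate(document_embeddings)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical documents with made-up, unit-length embeddings.
documents = ["Billing and payment terms", "Account termination", "Acceptable use policy"]
document_embeddings = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8], [0.6, 0.0, 0.8]]
query_embedding = [1.0, 0.0, 0.0]

# Rank the documents against the query and print the best two.
results = top_k(query_embedding, document_embeddings, k=2)
for index, score in results:
    print(f"{score:.2f}  {documents[index]}")
```

A linear scan like this is fine for small collections; larger collections typically use a vector index instead.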

Isaacus currently offers the world's most accurate legal embedding model, [Kanon 2 Embedder](/models/introduction#embedding), available through our [embedding endpoint](/api-reference/embeddings/embedding).

For a complete specification of the parameters and response format of our embedding endpoint, please refer to the [API reference documentation](/api-reference/embeddings/embedding).

## Usage

Our [embedding endpoint](/api-reference/embeddings/embedding) takes one or more texts as input and outputs an embedding for each text.

The code snippet below demonstrates how you could use our embedding endpoint to assess the semantic similarity of search queries to a document. Please consult our [quickstart guide](/quickstart) first if you haven't set up your Isaacus account and API client.

<CodeGroup>
  ```python Python theme={null}
  import httpx # NOTE `httpx` is already a dependency of `isaacus`.
  import numpy as np # NOTE you may need to `pip install numpy`.

  from isaacus import Isaacus

  # Create an Isaacus API client.
  # NOTE see https://docs.isaacus.com/quickstart to learn how to get an API key.
  client = Isaacus(api_key="PASTE_YOUR_API_KEY_HERE")

  # Download the GitHub terms of service as an example.
  tos = httpx.get("https://examples.isaacus.com/github-tos.md").text

  # Embed the terms of service.
  document_response = client.embeddings.create(
      model="kanon-2-embedder",
      texts=tos, # You can pass a single text or a list of texts here.
      task="retrieval/document",
      # dimensions=1792, # You may optionally wish to specify a lower dimension.
  )

  # Embed our search queries.
  query_responses = client.embeddings.create(
      model="kanon-2-embedder",
      texts=[
          "What are GitHub's billing policies?", # This is a relevant query.
          "What are Microsoft's billing policies?", # This is an irrelevant query.
      ],
      task="retrieval/query",
      # dimensions=1792, # You may optionally wish to specify a lower dimension.
  )

  # Unpack the embeddings.
  document_embedding = document_response.embeddings[0].embedding

  query_embeddings = query_responses.embeddings
  relevant_query_embedding = query_embeddings[0].embedding
  irrelevant_query_embedding = query_embeddings[1].embedding

  # Compute the similarity between the queries and the document.
  relevant_similarity = np.dot(relevant_query_embedding, document_embedding)
  irrelevant_similarity = np.dot(irrelevant_query_embedding, document_embedding)

  # Log the results.
  print(f"Similarity of relevant query to the document: {relevant_similarity * 100:.2f}")
  print(f"Similarity of irrelevant query to the document: {irrelevant_similarity * 100:.2f}")
  ```

  ```javascript JavaScript theme={null}
  import { Isaacus } from 'isaacus';

  // Define a function to compute the dot product of two vectors.
  function dot(a, b) {
      let sum = 0;
      for (let i = 0; i < a.length; i++) {
          sum += a[i] * b[i];
      }
      return sum;
  }

  // Create an Isaacus API client.
  // NOTE see https://docs.isaacus.com/quickstart to learn how to get an API key.
  const client = new Isaacus({ apiKey: "PASTE_YOUR_API_KEY_HERE" });

  // Download the GitHub terms of service as an example.
  // NOTE this uses the global `fetch` available in Node.js 18+ and browsers.
  const tos = await (await fetch("https://examples.isaacus.com/github-tos.md")).text();

  // Embed the terms of service.
  const document_response = await client.embeddings.create({
      model: "kanon-2-embedder",
      texts: tos, // You can pass a single text or an array of texts here.
      task: "retrieval/document",
      // dimensions: 1792, // You may optionally wish to specify a lower dimension.
  });

  // Embed our search queries (relevant + irrelevant).
  const query_responses = await client.embeddings.create({
      model: "kanon-2-embedder",
      texts: [
          "What are GitHub's billing policies?", // This is a relevant query.
          "What are Microsoft's billing policies?", // This is an irrelevant query.
      ],
      task: "retrieval/query",
      // dimensions: 1792, // You may optionally wish to specify a lower dimension.
  });

  // Unpack the embeddings.
  const document_embedding = document_response.embeddings[0].embedding;

  const query_embeddings = query_responses.embeddings;
  const relevant_query_embedding = query_embeddings[0].embedding;
  const irrelevant_query_embedding = query_embeddings[1].embedding;

  // Compute the similarity between the queries and the document.
  const relevant_similarity = dot(relevant_query_embedding, document_embedding);
  const irrelevant_similarity = dot(irrelevant_query_embedding, document_embedding);

  // Log the results.
  console.log(`Similarity of relevant query to the document: ${(relevant_similarity * 100).toFixed(2)}`);
  console.log(`Similarity of irrelevant query to the document: ${(irrelevant_similarity * 100).toFixed(2)}`);
  ```
</CodeGroup>

The output should look something like this:

```
Similarity of relevant query to the document: 52.87
Similarity of irrelevant query to the document: 24.86
```
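The snippets above score similarity with a raw dot product, which equals cosine similarity only when both vectors have unit length. This page does not state whether the returned embeddings are normalized, so the sketch below is a hedged fallback: it checks each vector's length and divides by the norms when needed. The helper names and the tolerance are assumptions for illustration.

```python
import math


def dot(a, b):
    """Return the dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_norm(v):
    """Return the Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


def is_unit_length(v, tolerance=1e-3):
    """Check whether a vector's length is within `tolerance` of 1."""
    return abs(l2_norm(v) - 1.0) <= tolerance


def similarity(a, b):
    """Dot product for unit-length vectors; otherwise cosine similarity."""
    if is_unit_length(a) and is_unit_length(b):
        return dot(a, b)
    return dot(a, b) / (l2_norm(a) * l2_norm(b))


unit_a = [0.6, 0.8]
unit_b = [1.0, 0.0]
scaled_a = [3.0, 4.0]  # Same direction as unit_a, but five times longer.

# Both calls give the same score because only direction matters.
same = similarity(unit_a, unit_b)
rescaled = similarity(scaled_a, unit_b)
print(f"{same:.2f} {rescaled:.2f}")
```

If the embeddings you receive already have unit length, the plain dot product used above gives the same ranking with less work.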
