TemperStack
Intermediate · 12 min read · Updated Mar 18, 2026

How to classify text with embeddings in n8n

Quick Answer

Text embeddings classification in n8n involves creating a workflow that connects to an AI service to generate embeddings, then uses classification nodes to categorize text. You'll need to configure AI service credentials, set up data preprocessing, and create classification logic using comparison nodes.

Prerequisites

  1. Basic understanding of n8n workflows
  2. OpenAI API key or similar AI service credentials
  3. Knowledge of text classification concepts
  4. Familiarity with JSON data structures
Step 1: Set up AI Service Credentials

Navigate to Settings > Credentials in your n8n workspace. Click + Add Credential and select your AI service (OpenAI, Cohere, or Hugging Face). Enter your API key and test the connection. Save the credential with a descriptive name like OpenAI-Embeddings.
Tip
Store multiple AI service credentials as backup options for better reliability.
Step 2: Create Input Data Node

Add a Manual Trigger or Webhook node to start your workflow. Configure the input to accept text data by adding a JSON payload structure:
{
  "text": "Your text to classify",
  "categories": ["category1", "category2"]
}
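Before passing the payload on, it can help to validate it in a Code node so malformed requests fail early. This is a minimal sketch; the field names `text` and `categories` match the example payload above, and `validatePayload` is a hypothetical helper, not an n8n built-in:

```javascript
// Hypothetical validation helper for the webhook payload shown above.
function validatePayload(payload) {
  if (typeof payload.text !== 'string' || payload.text.trim() === '') {
    throw new Error('Payload must include a non-empty "text" string');
  }
  if (!Array.isArray(payload.categories) || payload.categories.length === 0) {
    throw new Error('Payload must include a non-empty "categories" array');
  }
  return payload;
}
```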
Step 3: Add Text Preprocessing Node

Insert a Code node after your trigger. Add JavaScript code to clean and prepare your text:
// Assumes the Code node runs once per item; adjust the return shape
// if your n8n version expects { json: ... }.
const cleanText = $json.text
  .toLowerCase()
  .replace(/[^\w\s]/g, '') // strip punctuation, keep letters, digits, whitespace
  .trim();

return {
  originalText: $json.text,
  cleanedText: cleanText,
  categories: $json.categories
};
Tip
Text preprocessing improves embedding quality and classification accuracy.
Step 4: Configure OpenAI Embeddings Node

Add an OpenAI node and select Get Embeddings operation. Set the Input Text field to {{ $json.cleanedText }}. Choose the embedding model (recommended: text-embedding-ada-002). Select your previously created credentials.
Tip
Use consistent embedding models throughout your workflow for better comparison results.
Step 5: Create Reference Embeddings

Add another OpenAI node to generate embeddings for your classification categories. Use an Item Lists node to split the categories array into one item per category, then connect it to the OpenAI node. Set the input to the field holding each category name after the split (for example {{ $json.categories }}), so each category name is embedded individually rather than the whole JSON object.
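To make the split concrete, this sketch mimics what the Item Lists "Split Out Items" operation does to the categories array: one incoming item becomes one item per category. The field name `category` here is illustrative; the actual output field depends on your node configuration:

```javascript
// Sketch of splitting a categories array into one item per category,
// mirroring the Item Lists node's behavior.
function splitOutCategories(item) {
  return item.categories.map((category) => ({ category }));
}

const items = splitOutCategories({ text: 'hello', categories: ['billing', 'support'] });
// items: [{ category: 'billing' }, { category: 'support' }]
```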
Step 6: Calculate Similarity Scores

Add a Code node to compute cosine similarity between text and category embeddings:
// 'OpenAI' and 'OpenAI1' must match the names of your two embeddings
// nodes; adjust the field paths to your nodes' actual output structure.
const textEmbedding = $('OpenAI').first().json.embedding;
const categoryEmbeddings = $('OpenAI1').all();

// Cosine similarity: dot product of the two vectors divided by the
// product of their magnitudes.
function cosineSimilarity(a, b) {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}

// Score every category against the input text's embedding.
const similarities = categoryEmbeddings.map(cat => ({
  category: cat.json.input,
  similarity: cosineSimilarity(textEmbedding, cat.json.embedding)
}));

return { similarities };
Tip
Cosine similarity values range from -1 to 1, with higher values indicating better matches.
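To see the range in action, here is the same cosine similarity function applied to tiny 3-dimensional vectors (real embeddings have far more dimensions, e.g. 1536 for text-embedding-ada-002):

```javascript
// Worked example of cosine similarity on small vectors.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((s, v) => s + v * v, 0));
  const magB = Math.sqrt(b.reduce((s, v) => s + v * v, 0));
  return dot / (magA * magB);
}

cosineSimilarity([1, 0, 0], [1, 0, 0]);  // 1  (same direction)
cosineSimilarity([1, 0, 0], [0, 1, 0]);  // 0  (orthogonal)
cosineSimilarity([1, 0, 0], [-1, 0, 0]); // -1 (opposite direction)
```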
Step 7: Determine Classification Result

Add a final Code node to select the highest similarity score:
const similarities = $json.similarities;

// Pick the category with the highest similarity score.
const bestMatch = similarities.reduce((best, current) =>
  current.similarity > best.similarity ? current : best
);

return {
  // 'Code' refers to the preprocessing node from step 3; rename to match.
  originalText: $('Code').first().json.originalText,
  predictedCategory: bestMatch.category,
  confidence: bestMatch.similarity,
  allScores: similarities
};
Tip
Set confidence thresholds to handle uncertain classifications appropriately.
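A threshold can be applied in the same Code node so low-similarity results are flagged instead of silently accepted. This is a sketch; the 0.8 cutoff and the 'uncertain' label are illustrative values to tune on your own data:

```javascript
// Sketch of a confidence threshold for the classification result.
function applyThreshold(bestMatch, threshold = 0.8) {
  return bestMatch.similarity >= threshold
    ? bestMatch.category
    : 'uncertain';
}
```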
Step 8: Add Output and Error Handling

Connect your final node to a Webhook Response or Set node to output results. Add Error Trigger and Stop and Error nodes to handle API failures gracefully. Configure retry logic in the OpenAI nodes with Retry on Fail: 3 times.

Troubleshooting

API rate limit errors
Add Wait nodes between API calls or implement exponential backoff in your Code nodes. Consider upgrading your API plan for higher rate limits.
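Exponential backoff inside a Code node can look like the following sketch. `callApi` stands in for whatever API call you are wrapping (it is not a real n8n function), and the delays double on each retry:

```javascript
// Sketch of exponential backoff: retry a failing async call with
// doubling delays (500 ms, 1 s, 2 s, ...) before giving up.
async function withBackoff(fn, maxRetries = 3, baseDelayMs = 500) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxRetries) throw err;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}
```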
Low classification accuracy
Improve text preprocessing by handling special characters, stemming, or lemmatization. Use more descriptive category names or add example texts for each category.
High API costs
Cache embeddings using a Redis or Database node to avoid recalculating identical text embeddings. Batch process multiple texts in single API calls.
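The caching idea can be sketched as a lookup keyed by the cleaned text. Here `getEmbedding` is a hypothetical stand-in for the API call; in n8n you would back the cache with a Redis or Database node, since in-memory state in a Code node does not persist across executions:

```javascript
// Sketch of an embedding cache: only call the API for unseen text.
const cache = new Map();

async function cachedEmbedding(text, getEmbedding) {
  if (cache.has(text)) return cache.get(text);
  const embedding = await getEmbedding(text);
  cache.set(text, embedding);
  return embedding;
}
```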
Workflow timeout issues
Enable Save Intermediate Results in workflow settings. Split large text processing into smaller chunks using SplitInBatches node.
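The batching step can be pictured with a plain chunking function that mirrors what SplitInBatches does, with the batch size as a parameter you tune:

```javascript
// Sketch of chunking an array of texts into fixed-size batches,
// mirroring the SplitInBatches node's behavior.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

chunk(['a', 'b', 'c', 'd', 'e'], 2); // [['a','b'], ['c','d'], ['e']]
```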
