The Embedding_Siyabasa API provides high-quality text embedding models specifically designed for the Sinhala language. Generate embeddings for Sinhala words, phrases, and sentences using our latest model, UgannA_SiyabasaV2. These language-specific embeddings power advanced NLP tasks such as semantic search, text classification, and document clustering, delivering more accurate and context-aware results than traditional keyword-based approaches.

Key features:
  • Language-specific: Optimized exclusively for Sinhala text
  • 300-dimensional embeddings: Rich semantic representations
  • FastText architecture: Proven performance for morphologically rich languages
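
FastText-style models are generally described as building a word's vector from its character n-grams, which is one reason they tend to cope well with inflected word forms. The sketch below illustrates that idea only. It is simplified, `char_ngrams` is a hypothetical helper rather than part of the Embedding_Siyabasa API, and the 3-to-6 range is an assumption borrowed from common FastText defaults, not a documented setting of UgannA_SiyabasaV2.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word, in the style FastText uses.

    Hypothetical helper for illustration; n-gram range is an assumed default.
    """
    # Boundary markers let the model tell prefixes and suffixes apart
    # from the same characters occurring mid-word.
    marked = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the marked word.
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


# Two related Sinhala word forms share some of their n-grams,
# so their vectors are built from partly overlapping pieces.
shared = set(char_ngrams("ගස")) & set(char_ngrams("ගසක්"))
print(sorted(shared))
```

Note that Python slices by Unicode code point, so Sinhala vowel signs and the hal kirima count as separate characters in this sketch.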

What Are Embeddings?

Text embeddings are numerical representations (vectors) of words and sentences. They capture the semantic relationships and contextual meaning of language, so that texts with similar meanings map to nearby vectors and machines can compare text by meaning rather than by exact wording. Models trained specifically on a single language can provide a more nuanced and accurate understanding of that language than broad, multilingual models.
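
Closeness between two embedding vectors is commonly measured with cosine similarity. The following is a minimal sketch using only the Python standard library. The three-dimensional vectors are made-up toy values (vectors from this API would have 300 dimensions), and `cosine_similarity` is a hypothetical helper, not a function provided by the API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings (assumed values, not real model output).
v_query = [0.2, 0.7, 0.1]
v_close = [0.25, 0.65, 0.05]
v_far = [0.9, -0.1, 0.3]

print(cosine_similarity(v_query, v_close))  # higher score: similar direction
print(cosine_similarity(v_query, v_far))    # lower score: different direction
```

In a semantic search setting, the same comparison would be run between a query embedding and each document embedding, with results ranked by score.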

Why Specialized Sinhala Embeddings Matter

While multilingual models exist, they often underperform for languages with unique linguistic characteristics like Sinhala. Our research shows that language-specific embeddings provide significantly better results for:
  • Semantic understanding of Sinhala’s rich morphology
  • Contextual accuracy in Sinhala sentence structures
  • Domain-specific applications tailored to Sinhala content
