AI tends to make things up. That's unappealing to just about anyone who uses it on a regular basis, but especially to businesses, for which erroneous results could hurt the bottom line. Half of the workers responding to a recent survey from Salesforce say they worry answers from their company's generative AI-powered systems are inaccurate.
While no technique can solve these "hallucinations," some can help. For example, retrieval-augmented generation, or RAG, pairs an AI model with a knowledge base to provide the model supplemental info before it answers, serving as a sort of fact-checking mechanism.
Entire businesses have been built on RAG, thanks to the sky-high demand for more reliable AI. Voyage AI is one of these. Founded by Stanford professor Tengyu Ma in 2023, Voyage powers RAG systems for companies including Harvey, Vanta, Replit, and SK Telecom.
"Voyage is on a mission to enhance search and retrieval accuracy and efficiency in enterprise AI," Ma said in an interview. "Voyage solutions [are] tailored to specific domains, such as coding, finance, legal, and multilingual applications, and tailored to a company's data."
To spin up RAG systems, Voyage trains AI models to convert text, documents, PDFs, and other forms of data into numerical representations called vector embeddings. Embeddings capture the meaning and relationships between different data points in a compact format, making them useful for search-related applications, like RAG.
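As a rough intuition for what an embedding model does, here is a toy sketch in Python. The hashing-based embedder below is purely illustrative (it stands in for a trained model such as Voyage's, which learns its mapping from data); the key properties it shares with the real thing are a fixed output dimension and cheap vector comparison:

```python
import hashlib
import math

def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Toy stand-in for a trained embedding model: hashes each word into
    one of `dim` buckets, then normalizes to unit length. A real model
    learns this mapping so similar meanings land on nearby vectors."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Both vectors are unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))
```

Whatever the length of the input, the output vector has a fixed size, which is what makes fast similarity search over millions of documents practical.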

Voyage uses a particular type of embedding called contextual embedding, which captures not only the semantic meaning of data but the context in which the data appears. For example, given the word "bank" in the sentences "I sat on the bank of the river" and "I deposited money in the bank," Voyage's embedding models would generate different vectors for each instance of "bank," reflecting the different meanings implied by the context.
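The contrast can be illustrated with made-up numbers. The three-dimensional vectors below are invented for the example (real contextual models output hundreds or thousands of dimensions), but the comparison they demonstrate is the same:

```python
# Hypothetical vectors for "bank" in two different sentences.
# The numbers are invented for illustration; a real contextual model
# would produce high-dimensional vectors with the same property.
bank_river = [0.9, 0.1, 0.2]   # "I sat on the bank of the river"
bank_money = [0.1, 0.95, 0.3]  # "I deposited money in the bank"

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

# A static (context-free) embedding assigns "bank" one vector, so the
# two occurrences would score 1.0; a contextual embedding keeps them
# apart, so a river-related query won't retrieve banking documents.
print(cosine(bank_river, bank_money))
```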
Voyage hosts and licenses its models for on-premises, private cloud, or public cloud use, and fine-tunes its models for clients that opt to pay for this service. The company isn't unique in that regard (OpenAI, too, has a tailorable embedding service), but Ma claims that Voyage's models deliver better performance at lower costs.
"In RAG, given a question or query, we first retrieve relevant info from an unstructured knowledge base, like a librarian searching books from a library," he explained. "Conventional RAG methods often struggle with context loss during information encoding, leading to failures in retrieving relevant information. Voyage's embedding models have best-in-class retrieval accuracy, which translates to the end-to-end response quality of RAG systems."
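The retrieve-then-answer flow Ma describes can be sketched end to end. In this minimal sketch, a bag-of-words embedder stands in for a trained embedding model, and the final "generate" step just assembles a prompt (a real system would call an embedding API and a language model); the documents and query are invented for the example:

```python
import math

# Toy knowledge base; in practice this would be a company's documents.
docs = [
    "Refunds are processed within 5 business days.",
    "Our headquarters are in Palo Alto.",
    "Support is available around the clock via chat.",
]

def tokens(text):
    return [w.strip(".,?!").lower() for w in text.split()]

vocab = sorted({w for d in docs for w in tokens(d)})

def embed(text):
    # Toy stand-in for an embedding model: count vocabulary words,
    # then normalize to unit length.
    vec = [float(tokens(text).count(w)) for w in vocab]
    n = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / n for x in vec]

# 1. Index: embed every document in the knowledge base once.
index = [(d, embed(d)) for d in docs]

def retrieve(query, k=1):
    # 2. Retrieve: embed the query, rank documents by cosine similarity.
    q = embed(query)
    ranked = sorted(index, key=lambda p: -sum(x * y for x, y in zip(q, p[1])))
    return [d for d, _ in ranked[:k]]

# 3. Generate: prepend the retrieved context to the model's prompt, so
# the answer is grounded in the knowledge base rather than invented.
query = "Where are the headquarters located?"
context = retrieve(query)[0]
prompt = f"Context: {context}\nQuestion: {query}\nAnswer:"
```

Retrieval quality is the step Voyage competes on: if step 2 surfaces the wrong document, the model in step 3 answers from the wrong context no matter how capable it is.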
Lending weight to those bold claims is an endorsement from OpenAI chief rival Anthropic; an Anthropic support doc describes Voyage's models as "state of the art."
"Voyage's approach uses vector embeddings trained on the company's data to provide context-aware retrievals," Ma said, "which significantly improves retrieval accuracy."
Ma says that Palo Alto-based Voyage has just over 250 customers. He declined to answer questions about revenue.
In September, Voyage, which has around a dozen employees, closed a $20 million Series A round led by CRV with participation from Wing VC, Conviction, Snowflake, and Databricks. Ma says that the cash infusion, which brings Voyage's total raised to $28 million, will support the launch of new embedding models and will let the company double its size.

