Labeling and annotation platforms might not get the attention flashy new generative AI models do. But they’re essential. The data on which many models train must be labeled, or the models wouldn’t be able to interpret that data during the training process.
Annotation is a vast undertaking, requiring thousands to millions of annotations for the larger and more sophisticated datasets in use. To help ease the burden, Eric Landau and Ulrik Hansen founded Encord, which they describe as a “data development” platform for companies managing and preparing their data for AI models.
Now, the company has an additional $30 million in its coffers thanks to a Series B round led by Next47. Bringing Encord’s war chest to $50 million, the new capital will be put toward doubling the size of Encord’s product, engineering and AI research teams over the next six months and expanding the company’s San Francisco offices, they said.
“By the end of the year, we expect to grow our team to 100 employees, up from 70 currently,” he added. “We now have dual headquarters in London and San Francisco, with team members across the globe.”
Landau first started working with big data systems while conducting research into particle physics as an undergraduate student at Stanford. Hansen worked in global markets at J.P. Morgan, where he dealt in emerging market derivatives.
Hansen says that the seed of the idea for Encord came while he was working on data-intensive AI projects during a computer science master’s program at Imperial College London. Frustrated by the time-consuming nature of data curation and labeling, Hansen met with Landau, whom he knew from the entrepreneurial scene in London, about ways they might solve the data problem together.
“Combining Hansen’s software development expertise with my insights from quantitative research to automate data development, we launched the first iteration of Encord’s product during Y Combinator in the spring of 2021,” Landau told Dakidarts. “The Encord platform equips enterprises with tools to prepare their data for AI and assess how effectively that data supports their models.”
With the size of the data annotation and labeling market estimated to grow to $3.6 billion by 2027, Encord is one of many vendors competing for contracts. Besides the elephant in the room (Scale AI), there are startups like Datasaur, which lets customers create models automatically from sets of labels; Heartex, which is building an open source data “development” platform; and data annotation tooling provider Dataloop.
Encord stands apart, Landau says, with the versatility of its platform.
Using Encord, teams can explore and visualize datasets (including image, video and voice datasets) pulled in from private and public cloud storage and compare the performance of different models trained on the same sets. The platform attempts to detect model accuracy issues and suggest additional training data that could help to rectify those issues.
“Unlike piecemeal solutions that only address specific parts of your data stack, Encord lets you consolidate all your data workflows in one platform,” Landau said. “Through this consolidation, companies gain traceability that sheds light on the often opaque ‘black box’ of AI, helping to understand why a model makes specific decisions.”
Encord’s strategy seems to be working well so far. The company has 120 customers, including Philips, buzzy AI startup Synthesia and healthcare providers Cedars-Sinai and Northwell Health, as well as contracts with unnamed military and government agencies. Landau claims that Encord increased revenue 4x over the last year and that it could be cash-flow positive by 2025 if it weren’t continuing to grow headcount.
“We’re feeling the opposite of a slowdown,” Landau said. “That being said, we are aware of the broader market conditions and have taken a conservative approach to deploying capital.”
Other participants in the new funding round included Y Combinator, CRV and Crane Venture Partners.