Maybe U2 had it right: There is something even better than the real thing, at least as far as data is concerned.
The still nascent market of synthetic data, or artificially manufactured data, seems to be having a moment right now.
In the last year-plus, several large companies, including , and , have all talked openly of their use of synthetic data.
Then last October, acquired New York-based synthetic data generator . The next month, chip giant said it was creating an for training AI networks.
Interest in the space has even worked its way into the venture capital world. Only about two dozen companies in the space have received funding in the last two years, according to 附近上门 data. But in the last several months, some startups have raised significant rounds, including:
- San Diego-based synthetic data creator closed a $50 million Series B funding round led by in October.
- Austria-based synthetic data generator raised a $25 million Series B led by in January.
- Israel-based , a platform using synthetic data for visual AI applications, closed a $50 million Series B led by last month.
While those rounds may not be huge, they are substantial, considering synthetic data is a concept few understand.
What is synthetic data?
In the simplest terms, synthetic data is information that is artificially manufactured rather than created by real-world activities and events. That’s important because developing AI and ML (machine learning) projects requires immense amounts of data. However, real-world data can be expensive and difficult to collect.
“A lot of data collection is done manually,” said , CEO and co-founder of Datagen. “That can be very, very slow.”
Real-world data can also be biased and “dirty,” prone to incorrect labeling or other human error since it’s gathered manually.
Synthetic data eliminates many of those issues while also being easier and faster to collect, and allowing developers to more quickly produce the algorithms and AI models they need.
“It’s completely revolutionized the way developers work,” Zuk said.
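To make the idea concrete, here is a minimal sketch of one very simple way to manufacture data: estimate the statistics of a small real sample, then draw as many artificial values as needed from a distribution with the same statistics. This is only an illustration, not how Datagen or any company named here generates data; commercial generators use far richer models. The function names and the sample measurements are hypothetical.

```python
import random
import statistics


def fit_gaussian(real_values):
    """Estimate the mean and standard deviation of a small real sample."""
    return statistics.mean(real_values), statistics.stdev(real_values)


def generate_synthetic(real_values, n, seed=0):
    """Draw n artificial values from a Gaussian fitted to the real sample.

    Assumption: the real data is roughly bell-shaped. That is a
    simplification chosen to keep the example short.
    """
    mu, sigma = fit_gaussian(real_values)
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical "real" measurements (heights in cm), made up for illustration.
real_heights = [162.0, 175.5, 168.2, 181.0, 170.3, 158.9, 177.4]

# Seven slow-to-collect measurements become 1,000 synthetic ones.
synthetic_heights = generate_synthetic(real_heights, n=1000)
```

The synthetic values follow the overall shape of the real sample without copying any individual record, which is the property that makes this kind of data quick to produce in bulk.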
Why now?
Zuk said last year proved pivotal. “Prior to 2021, very few companies understood synthetic data,” he said. “But then big companies started to publish outcomes using it. That changes things.”
Before last year, Datagen mainly got customers through outbound sales calls, Zuk said. That changed last year as it became clear many big tech companies were adopting the new kind of data.
“We started getting about 10 inbound sales requests a week,” he said.
The market does seem to be catching up. estimates that by 2024, . The market for synthetic data generation grew to more than $110 million last year and is expected to get to $1.15 billion by 2027, according to a report published by research firm .
Privacy push
Another driving force behind synthetic data’s growth the past few years is privacy. While companies may be drowning in data, they can’t always use it.
“You may not be able to use the data you have because of regulations,” said , CEO and co-founder of . “Synthetic data avoids that issue.”
That is especially important considering two of the leading sectors using AI and synthetic data, finance and health, are also highly regulated.
“I think health has fueled synthetic data’s growth,” he said. “Not just are there regulations around privacy issues, but health care data also can be extremely rare.”
Getting money
Investors also rarely understood synthetic data, despite its evolution over two decades.
Now, “investors understand it much better,” Golshan said. “We could have raised 2x the amount we did if we had wanted it.”
His company gets one or two inbound VC calls a week asking when it will raise a Series C round, he added.
Zuk agreed, saying his experience raising Datagen’s Series B last month was much different than when it locked down its $18.5 million Series A in February 2021.
“All the tier one investors in the U.S. were happy to take the first call,” he said with a laugh. “In 2018, maybe one out of every 10 took the call.”
, a partner at , which led Datagen’s Series B, said when he first started to learn about synthetic data years ago, he thought it was counterintuitive to use such data for AI/ML models.
However, as those models create simulations of the real world, using simulated data did not seem unreasonable, he added.
Vitus said that while some data experts still question synthetic data, he sees a future for the industry. “It’s an idea whose time has come,” he said.
With large companies including Amazon and Microsoft already showing interest in synthetic data, others are likely to follow. Data platform , valued at more than $7 billion, just announced .
“I think it’s logical to think companies like and will come in,” Golshan said. “I’m sure there will be others.”
Image: iStock