SEOUL, Oct. 28 (Korea Bizwire) — In recent years, artificial intelligence (AI)-assisted diagnosis based on endoscopic and computed tomography images has emerged in many areas of medicine, including gastric cancer, one of the leading causes of cancer-related deaths in South Korea.
But it is hard to secure enough quality endoscopic images from real clinical cases to train an AI diagnosis program, as signs of gastric cancer appear in so many different locations and forms. Privacy issues also matter.
Lee Won-seop, CEO of CN.AI Inc., a South Korean startup that generates synthetic data for AI, said artificially generated information is the key to solving the data shortage.
“Having enough of the right data is the most important and challenging part of building AI,” he said in an interview with Yonhap News Agency earlier this week.
“My company creates synthetic data based on statistics of the original to help companies collect quality data for their AI engines.”
Lee, who started his engineering career at Samsung Electronics Co. about 10 years ago, cited his company’s project to design an AI-powered gastric cancer diagnosis program with the Samsung Medical Center a year ago.
He received around 5,000 endoscopic images covering 13 sections of the stomach, far short of the 200,000 images required to program the system. Some sections had no data at all.
To make up for the shortage, his company digitally generated thousands of necessary images of lesions in gastric tissues.
“We’ve collected image data for about one year. We needed images both with cancer and without cancer, and we wanted enough data for each section,” explained the 36-year-old. “We filled in the blanks with synthetic data.”
Synthetic data refers to information that is artificially generated by computer simulations or algorithms as an alternative to real-world data.
It has been welcomed in a variety of fields, especially AI engineering, as collecting quality data from the real world is complicated, expensive and time-consuming.
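As a rough illustration of the idea Lee describes, generating synthetic data "based on statistics of the original," the following Python sketch fits a mean and standard deviation to a small set of real measurements and draws new values from a normal distribution with those parameters. The sample values, the function name `synthesize` and the choice of a normal distribution are all assumptions made for illustration. The article does not describe CN.AI's actual method, and synthesizing medical images in practice would call for far more elaborate generative models.

```python
import random
import statistics


def synthesize(real_values, count, seed=0):
    """Draw `count` synthetic values from a normal distribution
    fitted to `real_values`.

    Hypothetical helper for illustration only, not CN.AI's method.
    """
    mean = statistics.mean(real_values)
    stdev = statistics.stdev(real_values)  # sample standard deviation
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [rng.gauss(mean, stdev) for _ in range(count)]


# Made-up "real" measurements standing in for a scarce original dataset.
real = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7]
synthetic = synthesize(real, count=1000)

print(len(synthetic))
print(round(statistics.mean(real), 2), round(statistics.mean(synthetic), 2))
```

The synthetic values share the overall statistics of the originals without copying any single record, which is the property that makes such data attractive when real samples are scarce or sensitive.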
The rise of autonomous vehicles put a spotlight on the synthetic data industry a few years ago, as digitally generated driving scenarios are considered essential to building a safe autonomous driving program.
South Korea has also seen the boom, and CN.AI, launched in 2019, was the first mover. It was the only company in the industry when it started operations three years ago, but it now has about five rivals in the country.
“We don’t only generate synthetic data but also program AI solutions for our clients,” Lee said. “Some big companies want just synthetic data, but most want us to design their AI engine using synthetic data.”
His company posted 1.3 billion won (US$918,000) in sales in 2020 and 1.4 billion won in 2021. This year, sales are projected to rise to 1.8 billion won.
It had nine business partners last year and has 35 this year, ranging from large companies and government institutions to medical centers.
Lee said his company is now eyeing the fast-growing global synthetic data market, which is projected to expand to $26.1 billion in 2024.
“We are planning to go overseas,” he said. “We are working on establishing a branch in Silicon Valley and have hired a branch president to attract investors.”