A team from KU’s computer science and engineering department takes first place in an international artificial intelligence (AI) question answering challenge, beating Google
Professor Jaewoo Kang’s research team develops an AI question answering (QA) model for the biomedical domain
BioBERT, an AI model pre-trained on 18 million biomedical articles
The team’s BioBERT article published in Bioinformatics, the world’s leading journal in bioinformatics
▲ From left to right: PhD student Wonjin Yoon (Korea University), Professor Georgios Paliouras (BioASQ organizer), and graduate student Minbyul Jeong (Korea University) pose for a commemorative photo at the BioASQ Awards.
A research team from the Department of Computer Science and Engineering at Korea University (President Jin Taek Chung) won the BioASQ challenge, an international competition for AI models that answer biomedical questions, defeating the Google team and last year’s winner, the Fudan University team. The five-member KU team consists of graduate students Wonjin Yoon, Jinhyuk Lee, Donghyeon Kim, and Minbyul Jeong and their supervisor, Professor Jaewoo Kang.
Now in its seventh year, the BioASQ challenge is the longest-running annual competition for biomedical QA systems and is sponsored by Google, the U.S. National Institutes of Health (NIH) and the European Union (EU). The BioASQ 7b Phase B challenge, in which the team participated, consists of questions whose answers can be found in a given article.
For example, participants are given an article about colon cancer and asked, "What are the gene mutations involved in the recurrence of colon cancer?" Submissions are evaluated against correct answers prepared in advance by experts, and the results are announced after expert review.
The KU team owes its success to the AI model BioBERT, which extends the deep learning-based Bidirectional Encoder Representations from Transformers (BERT) model to the biomedical domain. BioBERT was developed through a collaboration between Professor Kang’s team (jointly led by Dr. Jinhyuk Lee and PhD student Wonjin Yoon) and Naver Corp.’s Clova AI Research team (Researcher Sungdong Kim).
The BioBERT model is designed to understand the meaning of words in the context of a sentence. It is pre-trained on 18 million biomedical articles so that it acquires the contextual information about words needed to understand articles that require domain-specific knowledge. Based on this information, the model finds answers to biomedical questions in a given paper.
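The final step, finding an answer inside a given passage, can be illustrated with a minimal sketch. It assumes the span-selection scheme commonly used with BERT-style QA models, in which the model assigns each token a start score and an end score and the highest-scoring span is returned as the answer. The team's exact procedure is not described here, so the function name `best_answer_span`, the placeholder tokens, and the scores below are all hypothetical and made up for illustration.

```python
from typing import List, Tuple


def best_answer_span(
    start_scores: List[float],
    end_scores: List[float],
    max_answer_len: int = 5,
) -> Tuple[int, int]:
    """Return inclusive (start, end) token indices of the highest-scoring span.

    Hypothetical helper: a span's score is start_scores[start] + end_scores[end].
    """
    if len(start_scores) != len(end_scores) or not start_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    best = (0, 0)
    best_score = float("-inf")
    for i, start_score in enumerate(start_scores):
        # Only consider ends at or after the start, within the length limit.
        for j in range(i, min(i + max_answer_len, len(end_scores))):
            score = start_score + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


# Placeholder passage tokens and made-up scores (not real model output).
tokens = ["Mutations", "in", "GENE_A", "and", "GENE_B", "are", "discussed", "."]
start_scores = [0.1, -1.0, 3.2, -0.5, 1.0, -2.0, -1.5, -3.0]
end_scores = [-1.0, -2.0, 1.1, -0.8, 2.9, -1.0, -1.2, -3.0]

start, end = best_answer_span(start_scores, end_scores)
answer = " ".join(tokens[start : end + 1])
print(answer)  # prints "GENE_A and GENE_B"
```

In a real system the scores would come from the pre-trained model rather than being written by hand, and the length limit would be tuned to the kinds of answers expected.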
The BioBERT paper was officially published in August in Bioinformatics, the world’s most prestigious journal in bioinformatics. Since it first appeared online in late January, the article has been cited more than 40 times, an unusually high number for an article that had not yet been officially published. It is also drawing wide attention from academia and beyond, having been cited by world leaders in AI research such as Google, Carnegie Mellon University (CMU), and AllenAI.
Professor Kang's research team (Wonjin Yoon, Jinhyuk Lee, Donghyeon Kim and Minbyul Jeong) participated in the BioASQ event after optimizing the model for the challenge and ranked first on all five tests, outperforming the Google team and the defending champion Fudan University team. The final results of the competition are available at http://bioasq.org/participate/seventh-challenge-winners.
The study is significant in that it substantially improved the performance of existing biomedical QA systems by pre-training the model on biomedical corpora. Expectations are high that the model can lead to the development of a clinically meaningful decision support tool.
The results of the challenge were announced at the BioASQ workshop held on September 20 in Würzburg, Germany, and drew considerable attention from the wider academic community and from industry, including Google and global pharmaceutical companies.