Biobanks, as a bridge between clinical medicine and research, are essential infrastructure for advanced medical research, such as precision medicine. A biobank is an organized repository of biomaterial resources and their corresponding data. Achieving precision medicine for disease prevention and treatment requires advanced biotechnologies, computational tools, and large-scale databases that can explain individual variability [1,2]. The biological samples and data held in biobanks, including demographic, genetic, clinical, and environmental information, must be protected through anonymization and identification procedures to safeguard donors’ privacy and security [1]. An international standard regarding these issues is provided by the EU Data Protection Directive [1]. Contemporary biobanks also have the important tasks of applying new technologies in biology and IT/ICT (information technology/information and communications technology) systems, implementing a standard coding system for biomaterials (such as the Standard PREanalytical Coding [SPREC] for biospecimens), and resolving ethical and legal issues.
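To make the donor-protection step above more concrete, the following is a minimal sketch of one common building block, keyed pseudonymization of donor identifiers. It is an illustration only and not a procedure described in the cited sources: the field names (`donor_id`, `name`, `phone`), the function names, and the 16-character truncation are all assumptions, and a real biobank would follow its own governance and legal requirements.

```python
import hashlib
import hmac


def pseudonymize(donor_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for a donor ID using HMAC-SHA256.

    The same donor ID and key always give the same pseudonym, so samples
    and data from one donor can still be linked without exposing the ID.
    """
    digest = hmac.new(secret_key, donor_id.encode("utf-8"), hashlib.sha256)
    # Truncation to 16 hex characters is an illustrative choice only.
    return digest.hexdigest()[:16]


def deidentify(record: dict, secret_key: bytes,
               direct_identifiers=("name", "phone")) -> dict:
    """Drop direct identifiers and replace the donor ID with a pseudonym.

    `direct_identifiers` is a hypothetical list of fields to remove.
    """
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in direct_identifiers and key != "donor_id"
    }
    cleaned["pseudonym"] = pseudonymize(record["donor_id"], secret_key)
    return cleaned


if __name__ == "__main__":
    # Hypothetical record and key, for demonstration only.
    key = b"replace-with-a-securely-stored-key"
    record = {"donor_id": "D-0001", "name": "Jane Doe",
              "phone": "000-0000", "age_band": "40-49"}
    print(deidentify(record, key))
```

Because the pseudonym depends on a secret key, re-identification requires access to that key, which would be held separately from the research data. This is pseudonymization rather than full anonymization; remaining fields such as age or location can still be identifying in combination and would need separate review.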
A well-known DNA bank network is the Electronic Medical Records and Genomics (eMERGE) network in the United States (https://emerge-network.org/), funded by the National Human Genome Research Institute (NHGRI). This nationwide network links DNA biorepositories with large-scale Electronic Medical Record (EMR) data to support genomic research. Its organizers emphasized not only biomaterial quality but also data quality, as exemplified by phenotype data, some of which were extracted from EMR data using specially developed algorithms. Through this network, researchers in genomics, statistics, ethics, informatics, and clinical medicine have been deeply involved in projects aiming to implement genomic medicine.
The UK Biobank (http://www.ukbiobank.ac.uk/) is conducting an ongoing prospective cohort study with the goal of contributing to people’s health. It collects biomaterials and large-scale biomedical data, including imaging data. Between 2008 and May 2019, 689 studies based on the biobank were published [3]; however, the authors of that analysis stated that, judging from the published studies, precision medicine still has additional challenges to overcome. The biobank currently provides large-scale data to researchers all over the world, and this data provision is expected to make a major contribution to precision medicine.
The National Biobank of Korea (NBK, http://nih.go.kr/biobank/cmm/main/mainPage.do) was established in 2002. Since 2008, the NBK has been operating the Korea Biobank Project (KBP) with support from the Ministry of Health and Welfare, in collaboration with hospital-based biobanks. Following on from the phase 1 KBP, the fourth 5-year project began this year with disease-specific biobank networks comprising 10 main biobanks and 24 collaborative biobanks, and it aims to make progress towards the implementation of precision medicine. The Korea Biobank Network (KBN) portal, which serves as a catalog of the biobank, was established in 2019 to operate and manage the system efficiently. The fourth project focuses on data collection using IT/ICT and informatics tools, as well as on well-defined biomedical samples for advanced future research.
Informaticians and data scientists will be increasingly needed to improve biobank systems and perform research for precision medicine, and Healthcare Informatics Research will be able to publish core knowledge to support these goals. Throughout the world, biobank data are becoming more important as the variety of data forms increases, ranging from text to images and life-logs. Data resources have shown considerable value in spurring data-driven thinking and research.