

Healthc Inform Res, Volume 28(4); 2022
Sharmin, Chow, and Govia: Development of a Secondary Dental-Specific Database for Active Learning of Genetics in Dentistry Programs



Dental students study the genetics of tooth and facial development through didactic lectures only. Meanwhile, scientists’ knowledge of genetics is rapidly expanding, over and above what is commonly found in textbooks. Therefore, students studying dentistry are often unfamiliar with the burgeoning field of genetic data and biological databases. There is also a growing interest in applying active learning strategies to teach genetics in higher education. We developed a secondary database called “Genetics for Dentistry” to use as an active learning tool for teaching genetics in dentistry programs. The database archives genomic and proteomic data related to enamel and dentin formation.


We took a systematic approach to identify, collect, and organize genomic and proteomic tooth development data from primary databases and literature searches. The data were checked for accuracy and exported to Ragic to create an interactive secondary database.


“Genetics for Dentistry,” which is in its initial phase, contains information on all the human genes involved in enamel and dentin formation. Users can search the database by gene name, protein sequence, chromosomal location, and other keywords related to protein and gene function.


“Genetics for Dentistry” will be introduced as an active learning tool for teaching genetics at the School of Dentistry of the University of Alberta. Activities using the database will supplement lectures on genetics in the dentistry program. We hope that incorporating this database as an active learning tool will reduce students’ cognitive load in learning genetics and stimulate interest in new branches of science, including bioinformatics and precision dentistry.

I. Introduction

Genes regulate the development of teeth and their surrounding oral structures, and more than 200 genes are involved in tooth development [1,2]. The embryonic development of the head and neck is controlled by a complex network of genes and protein regulators [3]. Alterations of these genetic regulators lead to congenital disorders of teeth and developmental anomalies [4]. With advances in gene sequencing technology, knowledge and genetic data are rapidly expanding. Primary databases such as GenBank have been developed to store large datasets, including sequences, interactions, and structural data from all organisms [5]. Meanwhile, secondary databases such as PROSITE are generated by deriving information from primary databases to meet specific research needs [6]. Currently, there are no secondary databases that compile dental-specific genomic datasets.
Students studying dentistry at the University of Alberta learn about the genetics of tooth and facial development, the contribution of genes to oral health, hereditary tooth disorders, and the genetic regulation of developmental anomalies. However, they are not introduced to the expanding sources of genetic data and biological databases. Genetics and its newly emerging branches will dictate the future of medicine and dentistry. Bioinformatics, a division of genetics, is concerned with annotating and analyzing genetic data using computer tools. Precision dentistry aims to integrate knowledge obtained from gene analysis to provide oral health care tailored to match individual genetic profiles [7]. However, current teaching methods do not expose students to cutting-edge developments in genetics and bioinformatics.
There is a growing interest in using active learning strategies to teach genetics in higher education [8]. However, at the dental school of the University of Alberta, genetics is taught traditionally, from the textbook only. Without active learning approaches, students often experience high cognitive loads as they navigate the complexities of gene regulation. Science education through lectures alone is less effective [9] and is a leading cause of students losing interest in science at the undergraduate level [10]. Many educational institutions worldwide are supplementing or replacing lectures with active learning activities in their classrooms [11,12]. Active learning is supported by constructivist learning theory, according to which learning is a process of “making meaning,” which occurs more effectively when learners build their own understanding [13,14]. This learning approach allows students to achieve deep levels of understanding and enables them to analyze, evaluate, and synthesize ideas [15].
Teaching genetics in the dentistry program is hindered by a two-sided problem: the absence of a dental-specific genomic database and the lack of active learning strategies. In this context, we aimed to develop a dental-specific secondary database that can serve as a dental-focused genetic knowledge base and an active learning tool for teaching genetics in the dentistry program. Teaching with a database will educate students about the growing field of genetics and bioinformatics and reduce the cognitive load of memorizing complex gene regulations in tooth and oral development. In this manuscript, we report the development of a prototype (phase 1) of a secondary database called “Genetics for Dentistry.”

II. Methods

Our database currently includes human-specific genomic and proteomic data related to two cellular processes: amelogenesis (enamel formation) and dentinogenesis (dentin formation). Our endeavors to perform systematic data collection and database development are described below.

1. Data Collection and Curation

The first step was identifying the list of human genes involved in enamel and dentin formation from the genomic database of the National Library of Medicine [5]. Information related to each gene and encoded protein, chromosome location, mutations, and disease data were collected from literature and database searches. The primary sources used for data collection are listed in Table 1 [5,16-21]. The extracted data were archived in a spreadsheet. Duplicates were removed. The authors SG and NS were involved in data collection, and NS and AC checked the data for accuracy.

2. Data Validation and Database Modeling

Literature searches were conducted to validate the information included in the spreadsheet. Genes were organized according to their cellular function (dentin or enamel formation). Genes or proteins with no documented dental- or oral-specific function were removed from the list. After data collection, the spreadsheet was converted to a comma-separated values (CSV) file.
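The curation steps described above (removing duplicates, dropping genes with no documented dental or oral function, organizing by cellular process, and exporting to CSV) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual workflow: the column names, the `curate` and `to_csv` helpers, and the sample rows (including the `PLACEHOLDER1` entry) are hypothetical, since the paper does not publish its spreadsheet layout.

```python
import csv
import io

# Hypothetical column names; the paper does not specify its spreadsheet layout.
FIELDS = ["gene_symbol", "process", "chromosome", "dental_function"]


def curate(rows):
    """Drop duplicate genes and genes lacking a documented dental/oral function."""
    seen = set()
    kept = []
    for row in rows:
        symbol = row["gene_symbol"].strip().upper()
        if symbol in seen:
            continue  # duplicate entry for a gene already kept
        if not row["dental_function"].strip():
            continue  # no documented dental- or oral-specific function
        seen.add(symbol)
        kept.append({**row, "gene_symbol": symbol})
    # Organize by cellular process, then alphabetically by gene symbol.
    kept.sort(key=lambda r: (r["process"], r["gene_symbol"]))
    return kept


def to_csv(rows):
    """Serialize curated rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, lineterminator="\n", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative rows only; these are not copied from the database.
sample_rows = [
    {"gene_symbol": "ENAM", "process": "enamel formation",
     "chromosome": "4", "dental_function": "enamelin, an enamel matrix protein"},
    {"gene_symbol": "enam ", "process": "enamel formation",
     "chromosome": "4", "dental_function": "enamelin, an enamel matrix protein"},
    {"gene_symbol": "DSPP", "process": "dentin formation",
     "chromosome": "4", "dental_function": "dentin sialophosphoprotein"},
    {"gene_symbol": "PLACEHOLDER1", "process": "dentin formation",
     "chromosome": "1", "dental_function": ""},
]

curated = curate(sample_rows)
csv_text = to_csv(curated)
print(csv_text)
```

In this sketch, the duplicate check runs on a normalized gene symbol so that entries differing only in case or stray whitespace collapse into one record. In the study itself, accuracy was checked manually by the authors.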

3. Incorporation of Data in the Database

We used Ragic, an online platform for database development [22], to create “Genetics for Dentistry.” Figure 1 represents a schematic diagram of the database development process. The color theme of the headings and background of the database was chosen for aesthetic harmony. We added options for users to search the database by cellular process (dentin formation, enamel formation), protein sequence, gene name, chromosome location, and keywords related to gene and protein function. Three-dimensional protein structure and metabolic pathway data are incorporated into the database as links to external tools.

4. Expansion and Application of the Database as an Active Learning Tool

In the second phase of database development, we will incorporate information from human genes involved in various stages of tooth development (bud-bell stage and tooth eruption). The database will be piloted among the students at the University of Alberta’s School of Dentistry as an active learning tool for teaching genetics. In-class group activities and projects will be designed to use the database. Questionnaires will be distributed to discern students’ perceptions of their acquired knowledge of genetics and how the active learning tool contributed to their genetics learning. The flowchart of the research plan and progress is shown in Figure 2.

III. Results

“Genetics for Dentistry” currently archives information on 59 genes. This number represents all human genes listed in GenBank (as of January 2022) as being involved in amelogenesis (enamel formation) and dentinogenesis (dentin formation). Users can search the database by cellular process (dentin formation or enamel formation), chromosome location, gene name, protein sequence, or keywords (Figure 3A). We have enabled multiple filter options, allowing users to refine their searches. A user, for example, can search the database for genes involved in a specific function and located at a specific chromosomal site. Once a gene is selected, the user can access its gene sequence, gene ID, alternative gene symbols, chromosome and cytogenetic location, and a description of its function in tooth development (Figure 3C). For proteomic data, users can access the protein encoded by a given gene, learn about the role of the protein in tooth development, obtain the protein sequence, read about mutations identified in the gene, and consult the supporting literature (Figure 3). “Genetics for Dentistry” also enables users to directly analyze the 3D protein structure and cellular pathways from the AlphaFold, WikiPathways, and Reactome pathway knowledge bases (Figure 3D). Ragic offers options to publish the database on a website and share the database URL and data collection URL separately.
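The multiple-filter search described above, in which a user combines a cellular process with a chromosomal location or a keyword, amounts to applying every selected filter together. A minimal Python sketch of that logic follows. It does not use or represent Ragic's own search interface; the `GeneRecord` class, the `search` function, and the three sample records are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    # Hypothetical record layout, loosely following the fields in Table 1.
    symbol: str
    process: str
    chromosome: str
    function: str


def search(records, process=None, chromosome=None, keyword=None):
    """Return records that satisfy every filter that is set (logical AND)."""
    results = []
    for rec in records:
        if process is not None and rec.process != process:
            continue
        if chromosome is not None and rec.chromosome != chromosome:
            continue
        if keyword is not None and keyword.lower() not in rec.function.lower():
            continue
        results.append(rec)
    return results


# Illustrative records only; these are not copied from the database.
records = [
    GeneRecord("AMELX", "enamel formation", "X", "amelogenin, an enamel matrix protein"),
    GeneRecord("ENAM", "enamel formation", "4", "enamelin, an enamel matrix protein"),
    GeneRecord("DSPP", "dentin formation", "4", "dentin sialophosphoprotein"),
]

# Genes involved in enamel formation and located on chromosome 4.
hits = search(records, process="enamel formation", chromosome="4")
print([rec.symbol for rec in hits])  # prints ['ENAM']
```

Leaving a filter unset means it places no constraint on the results, so calling `search(records)` with no arguments returns every record.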

IV. Discussion

Genomic and proteomic data are rapidly expanding, generating new insights in genomic medicine. A profound understanding of genetics and bioinformatics is needed to analyze the rapidly expanding datasets. Dental students at the University of Alberta study the genetics of tooth and facial development through didactic lectures only, with no active learning opportunities. We have developed a secondary database called “Genetics for Dentistry” to use as an active learning tool for teaching genetics in dentistry programs. The benefits of active learning are well-established in higher education. Freeman et al. [23] conducted a meta-analysis of 225 studies and reported that active learning improves examination performance and reduces failure rates. Students’ performance on concept inventories and other assessments was also improved by active learning. Similar studies found that active learning positively impacts students’ ability to retain and understand new material [24]. Active learning strategies such as the immediate feedback assessment technique, group projects, genetic sequence analyses, and an interactive application called “Quantitative Genetics in Shiny” have been successfully applied in undergraduate genetics courses [25-27].
Adams et al. [26] implemented and evaluated the analysis of actual genetic data to enhance students’ understanding of pharmacogenomics. In an evaluation of this active learning approach, 60% of the students reported a better understanding of pharmacogenomics because of the opportunity to analyze data. Similar to Adams et al. [26], we aim to introduce “Genetics for Dentistry” as an active learning tool for dentistry students. Group projects will be designed to lead the students to analyze and annotate dental-specific genomic and proteomic data from the database. Active interactions with the actual genomic data will stimulate deep learning and enhance students’ understanding of the complex genomic regulation of tooth and oral development. Traditional genetics teaching expects students to remember genomic information that is rapidly expanding and not easily retained. Learning to analyze data from a dental-specific database, in contrast, will improve their understanding and be a life-long skill that they can use to analyze and annotate complex genomic data. Activities involving the database will supplement the genetics lectures.
Active learning strategies are not free of limitations. Developing an interactive learning environment can be time-consuming, and not all students in a large class may meaningfully participate [28]. Despite these limitations, we hope that active interactions with tooth-related genetic data will improve students’ understanding of complex genetic regulation and pique interest in precision dentistry and bioinformatics.


This project is funded by the School of Dentistry Education Research Fund (SDERF), Educational Research and Scholarship Unit, University of Alberta.


Conflict of Interest

No potential conflict of interest relevant to this article was reported.

Figure 1
Schematic overview of the steps to build the secondary database “Genetics for Dentistry.”
Figure 2
Flowchart representing the overall research plan. This manuscript describes the initial stage of database development (phase 1) (highlighted in red).
Figure 3
Snapshot from the “Genetics for Dentistry” database. (A) The database enables users to search data by cellular process (enamel or dentin formation), gene name, protein sequence, protein function, or chromosome number. (B) The home page of the database, with the search options on the left (red box). (C) Selecting an individual gene will open its page, showing details of all the information. Protein sequences are shown in a red box. This page also provides external links to the gene sequence, protein page from NCBI (National Center for Biotechnology Information), original literature, metabolic pathways, and the three-dimensional structure of the protein. The access to the external links is enlarged and shown in (D).
Table 1
Contents of the “Genetics for Dentistry” database and its primary sources
Database content | Description | Primary source
Information about the gene, access to DNA sequence | Users can access the gene name, symbol, gene ID, general description of the gene, and the DNA sequence. | National Library of Medicine [5,21]
Chromosome location | The location of the gene in the chromosome with its specific cytogenetic position is listed in the database. | OMIM: Online Mendelian Inheritance in Man [20]
Protein sequence | Users can learn the names of protein(s) encoded by the gene and obtain the entire protein sequence. | National Library of Medicine [21]
Three-dimensional protein structure | The dataset provides access to the three-dimensional protein structure and homology models. | AlphaFold Protein Structure Database [16]
Cellular pathways | Users have access to cellular and metabolic pathways where the listed protein is known to play roles. | Reactome pathway knowledgebase [17]; WikiPathways [18]
Function of the protein in oral and tooth development | The dental and oral-specific functions of the gene are listed in the database. | PubMed [19]
Mutations | Mutations identified in the gene that cause dental and oral diseases are listed in the database. | OMIM: Online Mendelian Inheritance in Man [20]; PubMed [19]
Dental and oral disease | If a mutation is known to cause an oral or dental disorder, a description of that physiological condition or developmental anomaly is listed in the database. | OMIM: Online Mendelian Inheritance in Man [20]; PubMed [19]

“Genetics for Dentistry” is a secondary database developed by deriving dental-specific genomic information from multiple primary sources. The table summarizes the database contents, description, and primary source of that information.


1. Thesleff I, Pirinen S. Dental anomalies: genetics. Encyclopedia of life sciences. Chichester, UK: John Wiley & Sons Ltd; 2005.
2. Klein ML, Nieminen P, Lammi L, Niebuhr E, Kreiborg S. Novel mutation of the initiation codon of PAX9 causes oligodontia. J Dent Res 2005;84(1):43-7.
3. Hooper JE, Feng W, Li H, Leach SM, Phang T, Siska C, et al. Systems biology of facial development: contributions of ectoderm and mesenchyme. Dev Biol 2017;426(1):97-114.
4. Smith CE, Poulter JA, Antanaviciute A, Kirkham J, Brookes SJ, Inglehearn CF, et al. Amelogenesis imperfecta; genes, proteins, and pathways. Front Physiol 2017;8:435.
5. Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res 2014;42(Database issue):D32-7.
6. Sigrist CJ, Cerutti L, Hulo N, Gattiker A, Falquet L, Pagni M, et al. PROSITE: a documented database using patterns and profiles as motif descriptors. Brief Bioinform 2002;3(3):265-74.
7. Divaris K. Precision dentistry in early childhood: the central role of genomics. Dent Clin North Am 2017;61(3):619-25.
8. Smith MK, Wood WB. Teaching genetics: past, present, and future. Genetics 2016;204(1):5-10.
9. Wood WB. Innovations in teaching undergraduate biology and why we need them. Annu Rev Cell Dev Biol 2009;25:93-112.
10. Seymour E, Hewitt NM. Talking about leaving: why undergraduates leave the sciences. Boulder (CO): Westview Press; 1996.
11. Smith MK, Vinson EL, Smith JA, Lewin JD, Stetzer MR. A campus-wide study of STEM courses: new perspectives on teaching practices and perceptions. CBE Life Sci Educ 2014;13(4):624-35.
12. Lewin JD, Vinson EL, Stetzer MR, Smith MK. A campus-wide investigation of clicker implementation: the status of peer discussion in STEM classes. CBE Life Sci Educ 2016;15(1):ar6.
13. Hernandez-Serrano J, Choi I, Jonassen DH. Integrating constructivism and learning technologies. In: Spector JM, Anderson TM, editors. Integrated and holistic perspectives on learning, instruction and technology. Dordrecht, Netherlands: Springer; 2000. p. 103-28.
14. Greening T. Building the constructivist toolbox: an exploration of cognitive technologies. Educ Technol 1998;38(2):23-35.
15. Anderson LW, Krathwohl D. A taxonomy for learning, teaching, and assessing: a revision of Bloom’s taxonomy of educational objectives complete edition. New York (NY): Longman; 2001.
16. Jumper J, Hassabis D. Protein structure predictions to atomic accuracy with AlphaFold. Nat Methods 2022;19(1):11-2.
17. Fabregat A, Sidiropoulos K, Garapati P, Gillespie M, Hausmann K, Haw R, et al. The reactome pathway knowledgebase. Nucleic Acids Res 2016;44(D1):D481-7.
18. Pico AR, Kelder T, van Iersel MP, Hanspers K, Conklin BR, Evelo C. WikiPathways: pathway editing for the people. PLoS Biol 2008;6(7):e184.
19. National Library of Medicine. National Center for Biotechnology Information [Internet]. Bethesda (MD): National Library of Medicine; c2022 [cited at 2022 Sep 30]. Available from:
20. Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 2005;33(Database issue):D514-7.
21. NCBI Resource Coordinators. Database resources of the national center for biotechnology information. Nucleic Acids Res 2016;44(D1):D7-19.
22. Ragic! Make everyone a data expert [Internet]. Seattle (WA): Ragic Inc; c2021 [cited at 2022 Sep 30]. Available from:
23. Freeman S, Eddy SL, McDonough M, Smith MK, Okoroafor N, Jordt H, et al. Active learning increases student performance in science, engineering, and mathematics. Proc Natl Acad Sci U S A 2014;111(23):8410-5.
24. Prince M. Does active learning work? A review of the research. J Eng Educ 2004;93(3):223-31.
25. Lee WT, Jabot ME. Incorporating active learning techniques into a genetics class. J Coll Sci Teach 2011;40(4):94-100.
26. Adams SM, Anderson KB, Coons JC, Smith RB, Meyer SM, Parker LS, et al. Advancing pharmacogenomics education in the core PharmD curriculum through student personal genomic testing. Am J Pharm Educ 2016;80(1):3.
27. Neyhart JL, Watkins E. An active learning tool for quantitative genetics instruction using R and shiny. Nat Sci Educ 2020;49(1):e20026.
28. Carbone E. Students behaving badly in large classes. New Dir Teach Learn 1999;(77):35-43.



Copyright © 2024 by Korean Society of Medical Informatics.
