

Healthc Inform Res > Volume 17(1); 2011 > Article
Lee and Park: Development of Data Models for Nursing Assessment of Cancer Survivors Using Concept Analysis



Sharing of cancer-related information among healthcare professionals is crucial to ensuring the quality of long-term care for cancer survivors. Appropriate distribution of the essential facts can be achieved using data models. The purpose of this study was to develop and validate suitable data models for use in the nursing assessment of cancer survivors.


The models developed in this study were based on a modification of concept analysis developed by Walker and Avant. Our approach involved determining the purpose of the analysis, identifying data elements, defining these elements and their uses, determining critical attributes, value sets, and cardinalities, and ultimately constructing data models which were examined externally by domain experts.


We developed 112 data models with 112 data elements, 29 critical attributes, 102 value sets, and 6 data types for the assessment of cancer survivors. External validation revealed that the data elements, critical attributes, and value sets proposed were comprehensive, relevant, and sufficiently useful to encompass nursing issues related to cancer survivors.


Data models developed in this study will contribute to ensuring the semantic consistency of data collected from cancer survivors, which will improve the quality of nursing assessments and in turn translate to improved long-term patient care.

I. Introduction

There are approximately 28 million cancer survivors worldwide [1]. The number of cancer survivors will increase steadily in the coming years as the average age of the world's population increases. A cancer survivor is defined as anyone who has been diagnosed with, is living with, or has recovered from cancer, as well as family members who are affected by a diagnosis of cancer [2].
There is a growing need to improve the physical, social, psychological, and spiritual well-being of cancer survivors during the course of deciding treatment options, enduring treatments, and surviving the disease [2,3]. The data elements required to assess the physical, social, psychological, and spiritual well-being of cancer survivors need to be continuously tracked and integrated into the cancer survivor's follow-up care, even years after becoming cancer free [4,5].
The sharing of cancer-related information among healthcare professionals is key to the quality of long-term care for cancer survivors [5]. Cancer-related information can only be shared if it is represented in a way that all healthcare professionals can understand [6]. The ability to exchange clinical data between different computer systems and to maintain data consistently in a longitudinal electronic recording system is also important for ensuring the quality of long-term cancer care [5]. The ability of all healthcare professionals and multiple systems to understand the clinical data is known as semantic interoperability. One way to ensure semantic interoperability is to model data by specifying the following [7]:
  • What are the key data elements?

  • What are the critical attributes?

  • What is the possible value set for each attribute?

  • Are the attributes optional or mandatory?

  • What other rules need to be expressed?
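The five questions above map naturally onto a small record structure. The following Python sketch is illustrative only: the class and field names (`DataElement`, `CriticalAttribute`, `mandatory`) are assumptions made for this example and are not taken from any of the modeling standards cited in this article.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalAttribute:
    """One qualifier of a data element (hypothetical structure)."""
    name: str
    value_set: tuple[str, ...]   # permissible values for this attribute
    mandatory: bool = False      # cardinality: optional (False) or mandatory (True)


@dataclass
class DataElement:
    """A key clinical concept together with its critical attributes."""
    name: str
    attributes: list[CriticalAttribute] = field(default_factory=list)

    def mandatory_attributes(self) -> list[str]:
        """Names of the attributes that must always be recorded."""
        return [a.name for a in self.attributes if a.mandatory]


# Illustrative element; the values below are placeholders, not study data.
example = DataElement(
    name="example element",
    attributes=[
        CriticalAttribute("severity", ("mild", "moderate", "severe"), mandatory=True),
        CriticalAttribute("radiation", ("yes", "no")),
    ],
)
print(example.mandatory_attributes())  # ['severity']
```

Any additional rules (the fifth question) would need a richer representation than this sketch provides.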

Research on data modeling has been underway in various countries, including Australia, the Netherlands, the USA, and Korea, and through international standards development organizations such as Health Level Seven (HL7). The names for these models include the openEHR archetype in Australia [8], the detailed clinical model in the Netherlands and HL7 [9], the clinical element model at Intermountain Healthcare in the USA [10], and the clinical contents model in Korea [11]. However, these works have been limited to the medical domain [8-11]. Although nurses and physicians sometimes handle the same situations, they often view these situations in different ways, and so data models for the medical domain cannot be used in the nursing domain. It has thus been emphasized over the past few years that the nursing profession should develop its own data models [7].
The benefits of data models are that they allow an accurate correspondence of clinical data in a consistent, safe, and meaningful way, and they can adapt to the changing information needs of different healthcare professions and institutions [12]. As an increasing number of people are affected by cancer, and various workforces participate in cancer management, a data model is needed to collect and share clinical data to enable improvements in the quality and efficiency of cancer care. The purpose of this study was thus to develop and validate data models for the nursing assessment of cancer survivors using concept analysis.

II. Methods

The development of the data models was guided by a modification of the concept analysis developed by Walker and Avant [13], which was chosen because its process [14-16] is well suited for the development of data models (Figure 1).
The concept analysis of Walker and Avant comprises the following steps [13]: 1) identifying the concept, 2) determining the purposes of the analysis, 3) defining the concept and its uses, 4) determining the critical attributes, 5) constructing the cases, 6) identifying the antecedents and consequences, and 7) defining the empirical referents. The concept analysis usually has a single purpose, such as the generation of a theoretical model or of a measurement instrument for a particular concept of interest. Thus, identifying a concept is followed by determining its purpose. However, for developing the data model, we determined the purpose of the analysis before we identified any data element, because we analyzed more than one data element with a particular purpose, such as exchanging and sharing clinical data in an electronic health record (EHR) system. Since we dealt with more than one data element, we defined multiple data elements in this study, and their uses in step 3. In step 4 we determined not only the critical attributes, but also the value sets, data types, and cardinalities of the critical attributes. In step 5 we constructed data models by connecting data elements with critical attributes, value sets, data types, and cardinalities. In step 6 we identified antecedents and consequences to provide further clarity of the data elements by internal validation. However, we did not express antecedents and consequences in the data models. Step 7, defining the empirical referents, was omitted in this study because the data models themselves were already measurable, having specified critical attributes with value sets, data types, and cardinalities.

1. Determining the Purposes of the Analysis

The purpose of this analysis was to develop data models for the nursing assessment of cancer survivors that will enable the collection and sharing of clinical data among healthcare professionals and between healthcare institutions.

2. Identifying the Data Elements

Data elements for the nursing assessment of cancer survivors were identified by extracting key concepts from clinical nursing statements used to describe signs and symptoms and nursing diagnoses in the electronic nursing records of cancer patients who were hospitalized in or visited the outpatient department of a tertiary hospital. For example, we extracted the key concept "discomfort" from the following statements: "discomfort is present," "decreased pharyngolarynx discomfort," "complains of discomfort," and "no discomfort after eating." In addition, to supplement the data elements, we reviewed medical and nursing dictionaries, nursing literature such as textbooks, nursing and medical articles on oncology, and nursing terminology classifications such as international nursing diagnosis and the International Classification for Nursing Practice (ICNP), and we consulted nurse experts. The extracted data elements were classified into physical, psychological, and spiritual domains based on previous research [2].

3. Defining the Data Elements and Their Uses

Data elements were defined to clarify their meaning by referring to text definitions of concepts in the ICNP or formal definitions of concepts from the Systematized Nomenclature of Medicine-Clinical Terms. We also used medical and nursing dictionaries to obtain definitions of the data elements. We identified the uses of the data elements in nursing practice by reviewing nursing forms, nursing statements, and the research published in nursing and medical articles.

4. Determining the Critical Attributes, Value Sets, and Cardinalities

We determined the critical attributes of data elements, which are qualifiers or modifiers to represent data elements in more detail, by reviewing nursing statements, nursing forms, and the relevant literature. We then identified possible value sets for critical attributes by referring to nursing statements and the relevant literature. The cardinality of each attribute was determined. Nurse experts participated in determining the critical attributes, value sets, and cardinalities. Finally, the data type of each attribute was classified based on the HL7 data type list [17], such as "Integer (INT)," "String (ST)," "Physical Quantity (PQ)," and "Ratios (RTO)" [18].

5. Constructing Data Models

We constructed data models by linking each data element with critical attributes, value sets, data types, and cardinalities. We present each data model in table form.

6. Validation of Data Models

Two nursing terminology experts and nine nurse informaticists reviewed the process of data model development as well as the data models themselves, and they also reviewed the clarity of the data models by identifying antecedents and consequences. The two nursing terminology experts have been engaged in teaching and research related to terminology for more than 10 years, and the nine nurse informaticists have more than 5 years of experience as clinical nurses: three in internal medicine units, three in surgical units, and three in oncology nursing units. Currently, four of them work as nurse informaticists at an EMR center in a tertiary hospital. Five of them have doctoral degrees in nursing informatics and four have master's degrees in the same field.
In addition, an expert panel of eleven clinicians comprising four oncology nurses, five nurse researchers, one clinical doctor, and one social worker verified the face validity of the data models. The members of the expert panel have worked in oncology for at least five years. Six of them have doctoral degrees and five have master's degrees. The questions used to check the face validity were developed based on the criteria published for evaluating the content and modeling structure of health terminology in previous studies, such as "usefulness" [19], "reusability" [19,20], "non-ambiguity" [21-23], "comprehensiveness" [19,21-23], "non-redundancy" [19,21,22], and "clinical relevancy" [19]. During this external validation, 11 questions on clinical relevancy, usefulness, reusability, non-ambiguity, comprehensiveness, and non-redundancy were asked, and the responses were scored on a 5-point Likert scale (from 1 = strongly agree to 5 = strongly disagree). The distribution of responses is presented as frequencies.
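The frequency summary of Likert responses can be sketched as follows. This is a minimal illustration rather than the analysis actually run in the study: the function name `agreement_rate` and the sample responses are hypothetical, and only the scale direction (1 = strongly agree, 5 = strongly disagree) comes from the text.

```python
from collections import Counter


def agreement_rate(scores: list[int]) -> float:
    """Percentage of responses that are 'strongly agree' (1) or 'agree' (2)."""
    if not scores:
        raise ValueError("no responses to summarize")
    counts = Counter(scores)
    return 100.0 * (counts[1] + counts[2]) / len(scores)


# Hypothetical responses to one question; not data from the study.
sample = [1, 2, 2, 3, 5]
print(Counter(sample))          # frequency of each score
print(agreement_rate(sample))   # 60.0
```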

III. Results

1. Identifying and Defining the Data Elements

Clinical nursing statements from 98 patients, whose time since surgery ranged from 2 days to 20 months, were analyzed to extract data elements. The mean age of the patients was 51 years (± 5.5), and 53 patients (54.1%) were male. Fifty-four patients (55.1%) had gastrointestinal cancer (e.g., stomach, colon), 29 (29.6%) had breast cancer, 10 (10.2%) had gynecological cancer, and 5 (5.1%) had other types of cancer (e.g., lung, liver). In total, 112 data elements were identified. Forty-four data elements (39.3%) were obtained by extracting key concepts from clinical nursing statements, 64 (57.1%) were identified from the literature review, and 4 (3.6%) were identified from the experts' evaluation. Table 1 presents the final data elements that we identified for the nursing assessment of cancer survivors. Sixty data elements were classified into the physical domain, 37 into the psychological domain, 10 into the cognitive domain, 4 into the social domain, and 4 into the spiritual domain. Posttraumatic growth and adaptation could not be assigned to a single domain because they have psychological, social, cognitive, and spiritual aspects. Data elements were also classified into three groups based on the direction of judgment (i.e., positive, negative, or neutral) (Table 2).

2. Defining Critical Attributes, Value Sets, Cardinalities, and Data Types

Table 3 lists the critical attributes that we identified to express the data elements, and the frequency of these critical attributes used therein. In total, 29 critical attributes were identified. Occurrence, progression, duration, severity, and frequency appeared in more than 60% of the data elements, while interpretation, onset, and anatomical site appeared in about 20%. Example value sets of these attributes are presented in Table 4. In total, 102 value sets were identified. In the model development process, we identified pairs of data elements that are used interchangeably in nursing practice but have different critical attributes or value sets. Examples are "discharge" and "drainage," and "weight loss" and "emaciation." The data element "drainage" has the critical attribute "device" with a value such as "Hemovac" or "rubber," whereas the data element "discharge" does not have the critical attribute "device." "Weight loss" and "emaciation" differ with regard to the critical attribute "severity": the data element "emaciation" does not require this attribute because it already means excessive leanness. This illustrates how data modeling can improve the accuracy of nurses' documentation.
The cardinality of each attribute, that is, whether the attribute was optional or mandatory, was defined. The data type of each attribute was classified based on the HL7 data type list [17]. For example, precise numbers that result from counting and enumerating (e.g., -1, 0, and 3398129) have the data type "integer number"; quantities that are measured or computed from other real numbers (e.g., 56.3 and 165.5) have the data type "real number"; values coded in an ordered form (e.g., 0 = rarely, 1 = sometimes, 2 = often, and 3 = always) are "coded ordinal"; values coded in text form (e.g., localized and generalized) are "coded text"; values recorded freely as text are "text"; and dates and times (i.e., yyyymmddhhmm) are "date & time."
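The six data types can be written down as a simple enumeration. The sketch below is an illustration under assumptions: the names `DataType`, `FREQUENCY_CODES`, and `parse_date_time` are invented for this example, and the date value is arbitrary. Only the type labels, the ordinal coding, and the yyyymmddhhmm pattern come from the text.

```python
from datetime import datetime
from enum import Enum


class DataType(Enum):
    """The six data types used for critical attributes in this study."""
    INTEGER_NUMBER = "integer number"
    REAL_NUMBER = "real number"
    CODED_ORDINAL = "coded ordinal"
    CODED_TEXT = "coded text"
    TEXT = "text"
    DATE_TIME = "date & time"


# Coded ordinal example from the text: the integer codes carry an order.
FREQUENCY_CODES = {0: "rarely", 1: "sometimes", 2: "often", 3: "always"}


def parse_date_time(stamp: str) -> datetime:
    """Parse a yyyymmddhhmm stamp; raises ValueError if it does not match."""
    return datetime.strptime(stamp, "%Y%m%d%H%M")


print(len(DataType))                    # 6
print(parse_date_time("201103151430"))  # 2011-03-15 14:30:00
```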

3. Constructing Data Models

Using the 112 data elements, 29 critical attributes, 102 value sets, and 6 data types, we developed 112 data models. Table 5 presents a representative example of a data model for pain. The critical attributes of pain had the following value sets:
  • Severity - absent, tolerable, mild, moderate, and severe, or from 0 (no pain) to 10 (very severe pain) on a visual analog scale.

  • Progression - acute and chronic.

  • Duration - seconds, minutes, and hours (over which the pain persists).

  • Frequency - very rarely, sometimes, often, and always.

  • Onset - gradual, sudden, and intermittent.

  • Time sequence - intermittent, continuous, and waxing and waning.

  • Regularity - regular and irregular.

  • Occurrence - yyyymmddhhmm.

  • Anatomical site - free text.

  • Characteristic - prick, ache, burn, throb, dull, and sharp.

  • Radiation - yes and no.

The data type of severity and frequency is "coded ordinal"; that of progression, onset, time sequence, regularity, characteristic, and radiation is "coded text"; that of occurrence is "date & time"; and that of anatomical site is "text." In terms of cardinality, "severity" was defined as mandatory (Table 5).
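To show how such a model could constrain documentation, the following sketch encodes a subset of the pain attributes and checks a record against them. It is a hypothetical illustration, not the representation used in the study: the dictionary layout and the `validate` function are assumptions, only the verbal severity scale is included (the 0-10 scale is left out for brevity), and attributes other than severity are treated as optional by assumption because the text states the cardinality only for severity.

```python
from datetime import datetime

# Subset of the pain data model; layout and optional flags are assumptions.
PAIN_MODEL = {
    "severity": {
        "data_type": "coded ordinal",
        "values": ["absent", "tolerable", "mild", "moderate", "severe"],
        "mandatory": True,
    },
    "progression": {"data_type": "coded text",
                    "values": ["acute", "chronic"], "mandatory": False},
    "frequency": {"data_type": "coded ordinal",
                  "values": ["very rarely", "sometimes", "often", "always"],
                  "mandatory": False},
    "radiation": {"data_type": "coded text",
                  "values": ["yes", "no"], "mandatory": False},
    "anatomical site": {"data_type": "text", "values": None, "mandatory": False},
    "occurrence": {"data_type": "date & time", "values": None, "mandatory": False},
}


def validate(record: dict) -> list[str]:
    """Return the problems found in a pain record; an empty list means valid."""
    problems = []
    for name, spec in PAIN_MODEL.items():
        if spec["mandatory"] and name not in record:
            problems.append(f"missing mandatory attribute: {name}")
    for name, value in record.items():
        spec = PAIN_MODEL.get(name)
        if spec is None:
            problems.append(f"unknown attribute: {name}")
        elif spec["values"] is not None and value not in spec["values"]:
            problems.append(f"value {value!r} not in value set of {name}")
        elif spec["data_type"] == "date & time":
            try:
                datetime.strptime(value, "%Y%m%d%H%M")
            except ValueError:
                problems.append(f"{name} is not in yyyymmddhhmm form")
    return problems


print(validate({"severity": "mild", "occurrence": "201103151430"}))  # []
print(validate({"progression": "acute"}))
```

The second call reports the missing mandatory attribute, which is the kind of omission a free-text nursing statement would not catch.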
We grouped together data models with the same critical attributes, which yielded 57 groups according to the combinations of attributes. For example, distress, sadness, and loneliness belong to the same group: the data models for these data elements share the critical attributes duration, frequency, occurrence, progression, and severity (Table 6).
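Grouping by identical attribute combinations can be sketched by keying each data element on the set of its attribute names. The function name `group_by_attributes` and the dictionary layout are assumptions for this example. The attribute lists are those given in the text for distress, sadness, loneliness, and pain; the other data models are omitted.

```python
from collections import defaultdict

SHARED = {"duration", "frequency", "occurrence", "progression", "severity"}

# Attribute names per data element, as listed in the text.
MODELS = {
    "distress": set(SHARED),
    "sadness": set(SHARED),
    "loneliness": set(SHARED),
    "pain": SHARED | {"onset", "time sequence", "regularity",
                      "anatomical site", "characteristic", "radiation"},
}


def group_by_attributes(models: dict[str, set[str]]) -> list[list[str]]:
    """Group element names whose attribute sets are exactly equal."""
    groups = defaultdict(list)
    for element, attrs in models.items():
        groups[frozenset(attrs)].append(element)  # frozenset is hashable
    return sorted(sorted(elements) for elements in groups.values())


print(group_by_attributes(MODELS))
# [['distress', 'loneliness', 'sadness'], ['pain']]
```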

4. Validation of Data Models

Some domain experts suggested that several cardinalities of the data models be revised. One of the domain experts indicated that the value set "absent, tolerable, mild, moderate, and severe" for the critical attribute "severity" in the pain data element gave a limited picture. In clinical practice, most pain is assessed using a 0-10 scale, which allows a more detailed analysis. Accordingly, the 0-10 visual analog scale was added as a value set for the severity attribute of the pain model.
The expert panel of clinicians suggested developing new data models for body image, posttraumatic growth, insight, and knowledge. More than 80% of the expert panel rated the 112 data models as "strongly agree" or "agree" on the questions of "usefulness," "reusability," "non-ambiguity," "comprehensiveness," and "non-redundancy." For "clinical relevancy," 70.1% responded "strongly agree" or "agree" to the question "Is the data element clinically meaningful?" (Table 7).

IV. Discussion

Healthcare professionals of various types in a variety of hospitals need to follow up cancer patients to monitor or prevent recurrences or secondary cancers even after successful treatment of the primary cancer. The ability to share cancer-related information among many healthcare professionals and different hospitals is a prerequisite for maintaining the quality of cancer care. For clinical data to be shared and exchanged, the data should be semantically interoperable. One way of ensuring semantic interoperability is to develop data models of the clinical data.
Concepts should be analyzed to develop data models. In the area of oncology nursing, concept analyses have been limited to specific concepts such as "cancer symptom cluster" [24], "symptom experience" [25], "cancer survivorship" [26], "psychological distress" [27], "suffering" [28], and "symptom disclosure" [29]. Common signs and symptoms of cancer survivors, and survivorship issues related to nursing assessment have not been analyzed to date. From this background, we developed 112 data models for the nursing assessment of cancer survivors using concept analysis by analyzing nursing documentations, reviewing the literature, and consulting nurse experts.
We extracted data elements describing physical problems from nursing documents, and data elements describing psychological and social problems, which are usually overlooked in nursing practice [30], from a literature review of articles. Similarly, we extracted data elements describing positive judgments (e.g., spiritual interests and posttraumatic growth) from the literature review and from the suggestions of our external domain experts. Using a data model describing positive spiritual or psychological changes makes it possible for nurses to document positive outcomes in nursing practice.
The data models developed in this study have cancer-specific attributes. For example, fatigue in a healthy population can be described as "acute fatigue," which can be relieved by sleep and rest. However, cancer-related fatigue can only be chronic because it is present over a long period of time and cannot be completely relieved by sleep and rest [1]. Thus, the data model for fatigue in healthy people has an attribute "progression" to describe acute and chronic fatigue; however, the equivalent model for cancer-related fatigue does not need the attribute "progression" because cancer-related fatigue is always chronic [1].
The data models, data types, and cardinalities of the critical attributes developed in the present study were found to be valid. Even though we developed questions to check the face validity of the models based on earlier studies [19-23], the reliability and validity of the questionnaires were not evaluated rigorously. Thus, we suggest a further study to evaluate the reliability and validity of the questions used to test face validity. Although we evaluated the applicability of the model indirectly by having nurses with clinical experience in oncology evaluate the model, we also suggest a further study to test the direct applicability of the model to oncology nursing practice.
Nursing statements used in the current electronic nursing record (ENR) system in Korea comprise simple phrases that describe the judgment on a key data element (e.g., pulse deficit and severe numbness). However, using a data model with critical attributes, value sets, data types, and cardinalities for the ENR allows the key data elements to be documented in more detail and consistently. This will improve the quality of nursing records and, in turn, make the nursing record reusable for research and future practice.
In this study, we were able to develop data models by connecting data elements, critical attributes, and value sets and specifying the data types and cardinalities of the critical attributes. Data models can be used in ENR or EHR systems. The outcomes of this study will contribute to standardized nursing assessment for cancer survivors and improve use of data in clinical practice and research.


This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (No. 2009-0074695 and No. 2010-0010468).


No potential conflict of interest relevant to this article was reported.


1. American Cancer Society. Cancer facts & figures 2008. 2008. Atlanta: American Cancer Society; p. 57.

2. Twombly R. What's in a name: who is a cancer survivor? J Natl Cancer Inst 2004;96:1414-1415. PMID: 15467027.
3. Centers for Disease Control and Prevention (CDC). Cancer survivorship: United States, 1971-2001. MMWR Morb Mortal Wkly Rep 2004;53:526-529. PMID: 15215740.
4. Feuerstein M, Findley P. The cancer survivor's guide: the essential handbook to life after cancer. 2006. New York: Marlowe & Co; p. 1.

5. Whippen D, Deering MJ, Ambinder EP. Advancing high-quality cancer care: cancer biomedical informatics grid supports personalized medicine and the electronic health record. J Oncol Pract 2007;3:208-211. PMID: 20859412.
6. Graybeal J. Achieving semantic interoperability. The MMI guides- navigating the world of marine metadata. 2009. cited at 2011 Mar 15. Marine Metadata Initiative; Available from:

7. Hovenga E, Garde S, Heard S. Nursing constraint models for electronic health records: a vision for domain knowledge governance. Int J Med Inform 2005;74:886-898. PMID: 16115795.
8. Beale T. Archetypes: constraint-based domain models for future-proof information systems. In: Baclawski K, Kilov H, editors. Eleventh OOPSLA workshop on behavioral semantics: serving the customer. 2002. Boston: Northeastern University; p. 16-32.

9. Goossen WT. Using detailed clinical models to bridge the gap between clinicians and HIT. Stud Health Technol Inform 2008;141:3-10. PMID: 18953119.
10. Huff SM. The GE-Intermountain Healthcare alliance: a new paradigm for EHR development. Proceedings of the 4th EHR Symposium; 2009; Seoul, KR; p. 14-28.

11. Kim Y, Huff SM, Ahn SJ, Cho KH, Koh YT. Clinical contents model: definition, methods, and practical use. Proceedings of the 6th Asia Pacific Association for Medical Informatics (APAMI); 2009 Nov 22-24. Hiroshima, JP; p. W-03.

12. Johnson SB. Generic data modeling for clinical repositories. J Am Med Inform Assoc 1996;3:328-339. PMID: 8880680.
13. Walker LO, Avant KC. Strategies for theory construction in nursing. 1988. Norwalk, CT: Appleton & Lange.

14. Matteson P, Hawkins JW. Concept analysis of decision making. Nurs Forum 1990;25:4-10. PMID: 2235655.
15. Hawks JH. Power: a concept analysis. J Adv Nurs 1991;16:754-762. PMID: 1869724.
16. Henneman EA, Lee JL, Cohen JI. Collaboration: a concept analysis. J Adv Nurs 1995;21:103-109. PMID: 7897060.
17. Schadow G, Biron P, McKenzie L, Grieve G, Pratt D. HL7 version 3 standard. Data types - abstract specification, release 1. 2004. cited at 2010 Mar 15. Health Level Seven™ Inc.; Available from:

18. Dolin RH, Alschuler L, Boyer S, Beebe C, Behlen FM, Biron PV, Shabo Shvo A. HL7 clinical document architecture: Release 2. J Am Med Inform Assoc 2006;13:30-39. PMID: 16221939.
19. Chute CG, Cohn SP, Campbell JR. A framework for comprehensive health terminology systems in the United States: development guidelines, criteria for selection, and public policy implications. ANSI healthcare informatics standards board vocabulary working group and the computer-based patient records institute working group on codes and structures. J Am Med Inform Assoc 1998;5:503-510. PMID: 9824798.
20. Bakhshi-Raiez F, Cornet R, de Keizer NF. Development and application of a framework for maintenance of medical terminological systems. J Am Med Inform Assoc 2008;15:687-700. PMID: 18579838.
21. Kim TY, Coenen A, Hardiker N. A quality improvement model for healthcare terminologies. J Biomed Inform 2010;43:1036-1043. PMID: 20723616.
22. Cimino JJ, Clayton PD, Hripcsak G, Johnson SB. Knowledge-based approaches to the maintenance of a large controlled medical terminology. J Am Med Inform Assoc 1994;1:35-50. PMID: 7719786.
23. Campbell JR, Carpenter P, Sneiderman C, Cohn S, Chute CG, Warren J. Phase II evaluation of clinical coding schemes: completeness, taxonomy, mapping, definitions, and clarity: CPRI work group on codes and structures. J Am Med Inform Assoc 1997;4:238-251. PMID: 9147343.
24. Kim HJ, McGuire DB, Tulman L, Barsevick AM. Symptom clusters: concept analysis and clinical implications for cancer nursing. Cancer Nurs 2005;28:270-282. PMID: 16046888.
25. Armstrong TS. Symptoms experience: a concept analysis. Oncol Nurs Forum 2003;30:601-606. PMID: 12861321.
26. Doyle N. Cancer survivorship: evolutionary concept analysis. J Adv Nurs 2008;62:499-509. PMID: 18373612.
27. Ridner SH. Psychological distress: concept analysis. J Adv Nurs 2004;45:536-545. PMID: 15009358.
28. Fochtman D. The concept of suffering in children and adolescents with cancer. J Pediatr Oncol Nurs 2006;23:92-102. PMID: 16476783.
29. Sun Y, Knobf MT. Concept analysis of symptom disclosure in the context of cancer. ANS Adv Nurs Sci 2008;31:332-341. PMID: 19033748.
30. Wen KY, Gustafson DH. Needs assessment for cancer patients and their families. Health Qual Life Outcomes 2004;2:11. PMID: 14987334.
Figure 1
The process of Walker and Avant's concept analysis and its modification for data model development.
Table 1
Data elements (n = 112) that were identified for the nursing assessment of cancer survivors
Table 2
The grouping of data elements of nursing assessment for cancer survivors based on the direction of meaning judgment of data elements
Table 3
The critical attributes that we identified to express the data elements, and the frequency of these critical attributes used therein

Values are presented as number (%).

Table 4
Example value sets of the critical attributes
Table 5
A representative example of a data model for pain
Table 6
Groups of data models in accordance with combinations of critical attributes
Table 7
External validation: responses of domain experts.

Values are presented as number (%).

a This question was asked of 11 external domain experts about the 112 data models. Except for this case, the total number of answers is 1,232.


Copyright © 2023 by Korean Society of Medical Informatics.
