I. Introduction
Generative artificial intelligence (AI)-powered large language models (LLMs) have considerable potential to advance nursing by facilitating learning, enhancing digital literacy, and promoting critical thinking [1,2]. Integrating AI-assisted chatbot technologies into problem-based learning environments can offer nurses valuable practical experience [3]. Nursing practice generally adheres to a structured, patient-centered process that includes five sequential stages: assessment, diagnosis, planning, implementation, and evaluation [4]. Adherence to this process is essential for ensuring safe and cost-effective nursing care. Among these stages, nursing diagnosis involves assessing patients’ health conditions and identifying nursing-related problems.
In contrast to medical diagnoses, nursing diagnoses enable nurses to evaluate patient conditions within their professional scope, identify nursing problems, and develop appropriate care plans [5]. The Joint Commission on Accreditation of Healthcare Organizations (JCAHO) mandates the active use of nursing diagnoses as part of accreditation requirements for healthcare institutions in the United States [6]. Furthermore, nursing diagnosis is a globally adopted approach in nursing practice, and the Korean Ministry of Health and Welfare has incorporated it as a key component of institutional accreditation standards [7]. However, formulating nursing diagnoses requires extensive expertise and experience, which makes the process particularly challenging for novice and inactive nurses. Consequently, this process is frequently time-consuming and may suffer from a lack of quality assurance. Moreover, accurate nursing diagnoses demand thorough data collection and analysis; however, workforce shortages often prevent nurses from gathering the required patient information effectively [8].
Various studies report that nursing documentation consumes between 25% and 41% of nurses’ total working hours [7,9,10]. The shift to electronic health records has further increased the documentation workload, and although documentation efficiency typically improves with experience, high attrition rates have been observed during the adaptation period [11].
LLMs are AI language models that utilize extensive neural networks, often containing billions of parameters [12]. They learn from vast amounts of unlabeled text using self-supervised learning and have demonstrated remarkable performance across multiple tasks, thereby driving innovation in natural language processing research [13]. This advancement underscores the need for general-purpose models capable of addressing a wide range of tasks, rather than relying solely on specialized supervised models [14]. In November 2022, OpenAI released Chat Generative Pre-trained Transformer (ChatGPT), based on the GPT-3.5 model, and has subsequently introduced newer versions. ChatGPT is an AI-based LLM that is continually improved using supervised and reinforcement learning techniques. Unlike traditional search engines that offer generalized information, ChatGPT has gained notable attention for its ability to generate personalized responses to specific queries [15].
Research on generative AI for medical documentation is expanding rapidly. For instance, some studies have demonstrated that surgical reports, which formerly took over 15 minutes to generate, can now be produced in under 10 seconds using AI [16]. Other studies have shown that AI-generated discharge summaries can reduce the documentation burden on healthcare professionals [17,18]. Applying this AI-driven automation to nursing documentation could help clinical nurses reduce their documentation workload and increase the time available for direct patient care [19,20]. However, the real-world application of generative AI in hospital settings remains in its infancy, and privacy concerns have hindered the large-scale implementation of AI models trained on actual clinical data for nursing and medical documentation.
To address these challenges, this study aims to develop and evaluate a generative AI-based nursing diagnosis recommendation system that utilizes virtual patient data. The objective is to improve the efficiency and feasibility of nursing processes, including diagnosis, intervention, and evaluation. The specific objectives of this study were as follows:
1) Compare the time required for nursing documentation between conventional manual electronic nursing records (ENRs) and AI-assisted nursing diagnosis recommendations.
2) Compare the quality of nursing documentation generated manually by nurses with that produced using the generative AI-based system.
3) Evaluate the AI-generated nursing documentation in terms of accuracy, comprehensiveness, usability, ease of use, and fluency.
II. Methods
1. Participants
The participants in this study were nurses with clinical experience who voluntarily agreed to participate after fully understanding the study’s purpose and procedures. The specific inclusion and exclusion criteria are described below. The required sample size was calculated using the G*Power 3.1 program (https://www.gpower.hhu.de). Assuming a medium effect size (0.5), a significance level of 0.05, and a statistical power of 0.9, a minimum of 36 participants was required. With an anticipated dropout rate of 10%, 40 participants were ultimately recruited [21,22].
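The sample-size figure above can be approximated in code. The sketch below uses only the Python standard library and assumes a one-tailed paired t-test, since that configuration reproduces the reported minimum of 36 participants (a two-tailed test under the same settings yields a larger sample); the correction formula is the standard normal approximation with a small-sample t adjustment.

```python
import math
from statistics import NormalDist

def paired_t_sample_size(effect_size: float, alpha: float, power: float) -> int:
    """Approximate n for a one-tailed paired t-test.

    Normal-approximation formula plus a small-sample t-correction
    (n0 + z_alpha^2 / 2). Assumption: one-tailed test, which matches
    the reported minimum of 36 participants.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical z
    z_beta = NormalDist().inv_cdf(power)
    n0 = ((z_alpha + z_beta) / effect_size) ** 2
    return math.ceil(n0 + z_alpha ** 2 / 2)    # round up to whole participants

n = paired_t_sample_size(effect_size=0.5, alpha=0.05, power=0.9)
print(n)  # 36

# Inflate for the anticipated 10% dropout rate.
with_dropout = math.ceil(n / (1 - 0.10))
print(with_dropout)  # 40
```

With these inputs the minimum sample is 36, and inflating for 10% dropout yields the 40 participants actually recruited.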
Participant recruitment was conducted in collaboration with the College of Nursing at Seoul National University and the College of Nursing at Ajou University in South Korea. To facilitate recruitment, the research team provided the institutions with comprehensive details on the study’s purpose, inclusion and exclusion criteria, participation schedule, and other pertinent information. Additionally, the institutions were asked to post a recruitment notice link on their online bulletin boards and social media platforms. Interested individuals could indicate their willingness to participate by using the provided link or QR code.
Applicants were required to complete and submit an application form that included their name, age, gender, prior research participation experience, nursing license status, clinical experience, and contact information. Participant names and contact details were accessible only to the research team. The principal investigator and co-researchers coordinated the participation schedules through individual communication with the applicants. Recruitment and testing were conducted online between August 1, 2024, and August 20, 2024.
The inclusion criteria were as follows: registered nurses aged 21 to 50 years; holders of a valid South Korean nursing license; and nurses with at least 3 months of clinical experience in a general hospital or higher-level medical institution. The exclusion criteria were as follows: nurses with less than 3 months of clinical experience in a medical institution; and nurses who had tested a generative AI-based ENR system within 4 weeks prior to participating in this study.
2. Procedure
The study procedure consisted of two phases: traditional nursing documentation and generative AI-assisted nursing documentation. All evaluations used a standardized method across all participants. After obtaining informed consent, participants completed a pre-survey to provide their demographic information. Participants were then asked to document ENRs based on a clinical scenario related to a disease they were familiar with, chosen from 110 provided virtual patient scenarios. In the second phase, participants used a generative AI-based nursing diagnosis recommendation system to document nursing records. The time taken for each documentation method was recorded in seconds.
1) Step 1 (Traditional nursing documentation)
In this phase, participants documented ENRs based on a selected clinical scenario using their clinical nursing experience. Participants selected one of the following familiar nursing documentation methods: NANDA (North American Nursing Diagnosis Association), SOAPIE (subjective, objective, assessment, plan, intervention, evaluation), Focus DAR (data, action, response), or narrative documentation.
The study was conducted through Zoom meetings, where participants also completed a questionnaire collecting demographic information such as gender, age, nursing license status, and clinical nursing experience. Individual Zoom sessions were conducted with each of the 40 nurse participants rather than group sessions. Each session began with a 5-minute pre-survey, followed by a 10-minute explanation of the ENR system. Participants then proceeded to complete the documentation tasks. Following completion of both documentation methods, a 5-minute post-survey and a 10-minute individual interview were conducted. The SmartENR Standard version—an ENR system developed by DKMediInfo (https://www.smartenr.com/) for training nursing students and newly licensed nurses—was used to document nursing assessments, diagnoses, interventions, and outcomes for virtual patients (Figure 1).
2) Step 2 (Generative AI-based nursing documentation recommendation)
In this phase, participants documented ENRs using a generative AI-based nursing diagnosis recommendation system that utilizes virtual patient data. This step aimed to evaluate the effectiveness of the AI-assisted nursing documentation approach. Participants who completed Step 1 then used the SmartENR AI version, developed by the research team. They first entered the patient’s basic demographic information and selected a nursing documentation method. Next, nurses provided a brief, one-line description of the patient’s condition in the prompt field. The generative AI model, trained on existing nursing records, then generated a recommended nursing record based on the provided information. The SmartENR AI version is a cloud-based system that integrates the GPT-4 API—a large language model customized for the South Korean nursing documentation environment. It automatically generates ENRs in various formats, including NANDA, SOAPIE, Focus DAR, and narrative documentation (Figure 2).
The process of using the generative AI-powered system was as follows:
- Select a nursing documentation format from the left-side menu.
- Enter the basic information of the virtual patient using an option-based input system.
- Provide a brief description of the patient’s condition in the designated prompt field and click the “Generate Nursing Record” button.
- Review the AI-generated nursing diagnosis recommendations and, if necessary, modify them based on clinical judgment before copying and pasting the content into the ENR system.
- Participants were allowed to revise the generated content freely according to their clinical judgment.
- The total time was measured from the moment the prompt was entered, through the AI generation and copy-paste process, until the completion of any revisions and the clicking of the save button.
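The workflow above can be illustrated with a short sketch. The helper below is hypothetical (the actual SmartENR AI prompt template and model configuration are not published); it merely shows how the chosen documentation format, the option-based demographic inputs, and the one-line condition description might be combined into a single prompt for a chat-completion model:

```python
# Hypothetical sketch of prompt assembly; the real SmartENR AI
# template is not described in the paper.
FORMATS = {"NANDA", "SOAPIE", "Focus DAR", "Narrative"}

def build_prompt(fmt: str, demographics: dict, condition: str) -> str:
    """Combine the format choice, demographics, and a one-line
    condition description into a single model instruction."""
    if fmt not in FORMATS:
        raise ValueError(f"Unsupported documentation format: {fmt}")
    patient = ", ".join(f"{k}: {v}" for k, v in demographics.items())
    return (
        "You are assisting with Korean electronic nursing records.\n"
        f"Write a nursing record in {fmt} format.\n"
        f"Patient: {patient}\n"
        f"Condition: {condition}\n"
    )

prompt = build_prompt(
    "SOAPIE",
    {"age": 67, "sex": "M", "unit": "general ward"},
    "Post-op day 2 after hip replacement, reports pain 6/10.",
)
print(prompt)
```

In the actual system the resulting prompt would be sent to the cloud-hosted model, and the returned record would then be reviewed, edited, and pasted into the ENR as described above.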
3) Step 3 (Usability evaluation)
After completing all nursing documentation tasks, participants completed a usability evaluation. The evaluation involved a survey that assessed the system’s accuracy, comprehensiveness, usability, ease of use, and fluency, matching the dimensions listed in the study objectives. The multiple-choice questionnaire included five items.
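Ratings of this kind are typically summarized as item means with standard deviations on the 5-point scale. A minimal sketch with hypothetical responses (the actual participant ratings are not reproduced here):

```python
from statistics import mean, stdev

# Hypothetical 5-point Likert ratings for one usability item;
# real participant data are not shown in this section.
ease_of_use = [5, 5, 4, 5, 5, 4, 5, 3, 5, 5]

m, sd = mean(ease_of_use), stdev(ease_of_use)
print(f"{m:.2f} ± {sd:.2f}")  # 4.60 ± 0.70
```

Each of the five survey items would be summarized this way before comparison across categories.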
4) Step 4 (Open-ended questions)
Following the multiple-choice survey, participants answered open-ended questions during individual interviews. Their opinions were solicited on the following three items:
- How would you evaluate the nursing records generated by the generative AI system?
- What advantages did you perceive in the AI-generated nursing records?
- What disadvantages or areas for improvement did you identify in the AI-generated nursing records?
3. Statistical Analysis
Participants who met the inclusion and exclusion criteria and completed at least one session of the program were included in the analysis. Those who discontinued or did not participate were considered unassessable and excluded. Descriptive statistics (mean, standard deviation, frequency, and percentage) were used to analyze participants’ demographic characteristics, including age, gender, and nursing experience. The time required to complete nursing documentation and the quality of the records were first assessed using the conventional ENR system, and the same parameters were then evaluated after using the generative AI-based nursing diagnosis recommendation system. A paired t-test was conducted to compare pre- and post-intervention differences in documentation time and record quality, thereby verifying statistically significant changes.
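The paired comparison can be sketched as follows, using only the Python standard library and hypothetical documentation times in seconds (not the study’s actual data). Because the standard library has no t-distribution CDF, the sketch reports the t statistic and degrees of freedom for comparison against a critical value:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical documentation times (seconds) for five nurses;
# each nurse documented the same scenario with both methods.
manual_enr = [520, 480, 610, 550, 590]   # traditional documentation
ai_assisted = [310, 300, 400, 330, 360]  # AI-assisted documentation

def paired_t(before: list, after: list) -> tuple:
    """Paired t statistic and degrees of freedom for before-after data."""
    diffs = [b - a for b, a in zip(before, after)]
    se = stdev(diffs) / sqrt(len(diffs))  # standard error of mean difference
    return mean(diffs) / se, len(diffs) - 1

t_stat, df = paired_t(manual_enr, ai_assisted)
print(f"t({df}) = {t_stat:.2f}")  # compare against the critical t value
```

In practice a statistics package (e.g., a paired-samples t-test routine) would also return the p-value; the logic of differencing within participants is the same.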
4. Ethical Considerations
This study adhered to the principles of the Declaration of Helsinki to ensure the safety and ethical protection of research participants. The study protocol was reviewed and approved by the Institutional Review Board of a Ministry of Health and Welfare-designated public institution (Approval No. P01-202407-01-049).
The informed consent form provided detailed information on the study’s purpose, procedures, potential risks and benefits, and data privacy measures. The anonymity and confidentiality of the research data were strictly maintained. All personal information collected was securely managed by the research team and was explicitly designated for research purposes only. Participants received a small compensation for their participation.
IV. Discussion
This study aimed to develop a generative AI-powered nursing documentation system by training an AI model on nursing records generated using virtual patients. A pilot test was conducted with clinically experienced nurses to compare the documentation time and usability evaluation between traditional electronic nursing documentation and AI-generated nursing documentation.
Through this study, we identified the potential of generative AI to effectively reduce nurses’ workload. In particular, the deep learning–based nursing diagnosis recommendation system was shown to enhance nurses’ workflow efficiency and reduce documentation time [2]. The majority of study participants responded positively to the AI-assisted nursing documentation system, with nurses reducing their documentation time by an average of 38.6% compared to traditional methods.
A previous study [16] on surgical records written by physicians reported a 99% reduction in documentation time, decreasing from 7.1 minutes to 5.1 seconds. However, that measurement considered only the AI generation time, and on average, 2.1 edits were required. Similarly, this study found that although AI-generated nursing records were created rapidly, additional time was needed for nurses to compose effective prompts using their clinical knowledge and to transfer the generated content into the ENR system. The time required for editing AI-generated nursing records also varied among nurses. These findings suggest that while generative AI has the potential to significantly reduce documentation time, optimizing prompt design and integrating AI more seamlessly into nurses’ workflow will be crucial for maximizing its efficiency. One key factor contributing to the additional time required in this study, compared to previous research [16], was that most participants were using the practice ENR system for the first time. Had the study been conducted using an ENR system already familiar to the participants, documentation time might have been considerably shorter.
In the usability evaluation, the ease-of-use score was 4.80 ± 0.61 on a 5-point scale, indicating that many participants believed the system would be highly beneficial for novice nurses with less than 1 year of experience. This finding aligns with previous research [17] evaluating AI-generated discharge summaries, where the ease-of-use category received the highest score. These results suggest that generative AI has significant potential for application in healthcare education [1,3]. Currently, the 1-year turnover rate for novice nurses in tertiary hospitals in South Korea reaches 50%, highlighting a major workforce issue [23]. The implementation of generative AI-assisted electronic nursing documentation systems is expected to support nurses in their documentation tasks and potentially reduce the turnover rate among new nurses.
The findings of this study indicate that nurses with 3–5 years of clinical experience documented nursing records significantly faster than those with less than 1 year of experience. However, as clinical experience exceeded 5 years, documentation time gradually increased. This trend was observed in both self-written ENRs and AI-assisted nursing records. Although this study was conducted as a pilot test with a limited number of participants, future research should involve a larger sample size to explore the correlation between clinical experience and documentation efficiency more comprehensively.
We also observed that documentation time varied according to the type of documentation method used. Because different healthcare institutions employ varying electronic medical record (EMR) systems, the structure and format of nursing documentation can differ markedly between hospitals. This inconsistency often necessitates additional training when nurses transition between institutions [7]. To address this challenge, future AI-powered nursing documentation systems should be designed to accommodate multiple formats, ensuring adaptability across diverse clinical settings. Furthermore, robust privacy safeguards must be integrated into the system’s design from the outset to prevent data security breaches [2].
Interviews with participants revealed challenges with entering detailed patient information. For effective clinical implementation, it is crucial to integrate hospital data APIs that allow real-time retrieval of patient data. Access to up-to-date medication regimens, vital sign records, laboratory results, and imaging reports would enable generative AI to generate more precise and contextually relevant nursing documentation, thereby improving both usability and accuracy.
This study has several limitations. First, the AI model was trained solely on nursing records generated from virtual patients rather than real patient data, due to strict privacy regulations that prohibit the use of actual patient records outside healthcare institutions. To overcome this limitation, generative AI models could be deployed on hospital servers for on-premises training—an approach gaining traction with recent technological advancements. Second, the pilot test used a nursing student training ENR system instead of a fully operational hospital EMR system. Future studies should be conducted in real clinical settings using institution-specific EMR systems with professional nurses to enhance the applicability of the findings. Third, because participants selected clinical scenarios with which they were familiar, variability in documentation performance may have arisen from differences in case complexity. Fourth, future research should track and compare the number of edits made by participants to better assess the efficiency and usability of AI-generated documentation.
The future of nursing practice is likely to involve widespread adoption of generative AI to reduce nurses’ workload and promote a more efficient and professional clinical environment. This study provides foundational evidence supporting the integration of generative AI with nursing practice to enhance workflow efficiency.
Ultimately, enhancing the accuracy of AI-generated nursing records by training models on diverse nursing documentation data is critical for effective clinical adoption. Moreover, developing AI systems that produce immediately usable nursing records with minimal modifications will be essential. Such advancements will allow nurses to devote more time to direct patient care, thereby elevating the overall quality of nursing services.