# Bayesian Approach to Predicting Acute Appendicitis Using Ultrasonographic and Clinical Variables in Children

## Abstract

### Objectives

Ultrasound has an established role in the diagnostic pathway for children with suspected appendicitis. Relevant clinical information can influence the diagnostic probability and reporting of ultrasound findings. A Bayesian network (BN) is a directed acyclic graph (DAG) representing variables as nodes connected by directional arrows permitting visualisation of their relationships. This research developed a BN model with ultrasonographic and clinical variables to predict acute appendicitis in children.

### Methods

A DAG was designed through a hybrid method based on expert opinion and a review of the literature to define the model structure. The discretisation and weighting of identified variables were calculated using principal components analysis, which also informed the conditional probability tables of the nodes.

### Results

The acute appendicitis target node was designated as an outcome of interest influenced by four sub-models, including Ultrasound Index, Clinical History, Physical Assessment, and Diagnostic Tests. These sub-models included four sonographic, three blood-test, and six clinical variables. The BN was scenario tested and evaluated for face, predictive, and content validity. A lack of similar networks complicated concurrent and convergent validity evaluation.

### Conclusions

To our knowledge, this is the first BN model developed for the identification of acute appendicitis incorporating imaging variables. It has particular benefit for cases in which variables are missing because prior probabilities are built into corresponding nodes. It will be of use to clinicians involved in ultrasound examination of children with suspected appendicitis, as well as their treating clinicians. Prospective evaluation and development of an online tool will permit validation and refinement of the BN.

## I. Introduction

Ultrasonography is often used in the diagnosis of acute appendicitis in children, particularly in those cases with an atypical or equivocal clinical presentation. The role of medical imaging in clinical decision making can be complex because the information provided on imaging referrals may lack comprehensive clinical history, leaving sonographers and radiologists with an incomplete clinical context [1,2,3,4]. Clinicians performing ultrasound examinations may glean information through discussions with patients and their parent/carer, who may have limited knowledge of clinical data, such as blood test results and vital signs. Advanced integration of Electronic Health Records (EHRs) and information systems can obviate this lack of clinical information and better inform radiology findings by enabling consideration of the broader clinical picture [5]. As access to EHRs increases, the large amounts of data now available present clinicians with additional challenges regarding optimal integration with information obtained during medical imaging. Sonographers evaluating right lower quadrant pain in children may now have access to relevant clinical data that may not have previously been provided by referrers. The development of a predictive model incorporating traditional sonographic variables and relevant clinical information will facilitate a clearer understanding of the role of ultrasound in the diagnosis of appendicitis in children and a more systematic approach to incorporating imaging findings into a broader clinical context.

A Bayesian network (BN) is a directed acyclic graph (DAG) that depicts variables as nodes, with relationships between nodes indicated by directional arrows or arcs, making visualisation of complex associations between variables possible [6]. The variables denoted by nodes can be continuous, but they are commonly discretised into a smaller number of possible states for network and computing simplicity. Care needs to be taken during this reduction of states to avoid oversimplification and the loss of useful information to the model. The states may represent categorical variables (positive/negative), ordered variables (a Likert scale), or thresholds of a continuous variable (temperature). The states attributed to each node can be defined by automated learning through analysis of a dataset, information in publications, standards of practice, and expert opinion, permitting flexibility in BN design [7]. The likelihood of a node's state is quantified by a conditional probability table (CPT) that is calculated using existing clinical datasets, other data, published studies, and/or expert opinion. The condition of a preceding or parent node will influence the likelihood of a child node's state; since the network is acyclic, the relationship between nodes can only be in one direction. Nodes without parents are known as root nodes, and their CPTs are determined by their prior probabilities, that is, the probabilities of their various states in the relevant population. These CPTs can then be multiplied through the network structure to obtain the overall probability of the outcome of interest. Moreover, various scenarios can be quantified and tested on the outcome of interest node through the application of Bayes' theorem to the network node states and their conditional probabilities [8].
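To make the mechanics described above concrete, the following is a minimal sketch of a two-node network in which a root node (with a prior) is the parent of a single child node (with a CPT). The node names and every probability are hypothetical and chosen purely for illustration; they are not taken from the model in this paper.

```python
# Hypothetical two-node network: Disease -> Finding.
# All probabilities are illustrative assumptions, not values from the study.

# Root node: its table is simply the prior over its states.
prior_disease = {"present": 0.30, "absent": 0.70}

# Child node: P(Finding | Disease), one row per parent state.
cpt_finding = {
    "present": {"positive": 0.85, "negative": 0.15},
    "absent": {"positive": 0.10, "negative": 0.90},
}


def marginal_finding(state):
    """P(Finding = state): multiply prior by CPT row and sum over parent states."""
    return sum(prior_disease[d] * cpt_finding[d][state] for d in prior_disease)


def posterior_disease(finding_state):
    """P(Disease | Finding = finding_state) via Bayes' theorem."""
    evidence = marginal_finding(finding_state)
    return {
        d: prior_disease[d] * cpt_finding[d][finding_state] / evidence
        for d in prior_disease
    }


if __name__ == "__main__":
    print(marginal_finding("positive"))
    print(posterior_disease("positive"))
```

Setting the child node to an observed state and reading off the updated parent distribution is the same operation, at small scale, as the scenario testing applied to a full network.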

Development of a BN as a predictive model for the diagnosis of acute appendicitis in children will permit the inclusion of important ultrasound variables familiar to sonographers. These may be combined with clinical variables made available through increased access to EHRs. This will provide a more complete representation of important variables, including ultrasound data, and a graphical depiction of the broader clinical context of children with suspected appendicitis referred for ultrasound. There is potential to further develop a paediatric appendicitis BN as a platform for machine learning and artificial intelligence to make use of available EHR data and streamline triage for children with right lower quadrant pain by informing the BN with their clinical data and identifying patients for whom there is a higher probability of requiring urgent treatment.

Bayesian modelling incorporating medical imaging to inform clinical decision support has previously been applied to a variety of clinical areas and imaging modalities, such as mammographic diagnosis of breast cancer [9], ultrasound diagnosis of thyroid nodules [10], and computed tomography diagnosis of chronic obstructive pulmonary disease [11]. Although BNs have also previously been used to model the diagnosis of acute appendicitis [12], to our knowledge, this study is the first to incorporate ultrasonographic variables.

## II. Methods

Where possible, data used to quantify numerical parameters for the nodes were obtained from a prior study conducted in an Australian paediatric hospital (Human Research Ethics Committee Approval No. HREC/15/QRCH/125) [13]. Children up to 16 years of age referred for ultrasound investigation of the right lower quadrant were recruited with written informed consent from their parent/guardian. Data were collected on study worksheets by sonographers performing the 230 eligible ultrasound studies and collated by the principal investigator. These data included the following ultrasonographic variables: appendix diameter, appearance of peri-appendiceal mesentery, appendiceal wall hyperaemia, and the presence of an appendicolith. Other variables, including clinical history (duration of symptoms, nausea, and temperature), and blood test results (C-reactive protein levels, white blood cell, and neutrophil counts), were collected through review of enrolled patients' EHRs by the principal investigator. All collected data were stored and coded in a spreadsheet.

Some variables of interest were not captured in this study (anorexia, pain migration, and rebound tenderness), and some variables suffered from large amounts of missing data, particularly with respect to patient history and availability of blood test results. This made automated learning of the network structure and parameters impractical; therefore, the DAG was designed through a hybrid learning method [14,15]. This hybrid approach included an initial manual design process based on the elicitation of expert opinion as well as a review of the relevant literature and published data used to define the model structure, sub-models, and variables selected [16,17]. The outcome of interest in the model was the likelihood of appendicitis in a child referred for ultrasound of suspected appendicitis, designated the target node acute appendicitis. The target node was influenced by four sub-models: Ultrasound Index, Clinical History, Physical Assessment, and Diagnostic Tests. Clinical variables not included in the prior study were determined via review of appendicitis scoring systems used in children, including the Paediatric Appendicitis Score (PAS) and the Alvarado Score [18,19]. The second design stage involved the discretisation and weighting of the identified variables, which were calculated using principal components analysis where sufficient data were available; this analysis also informed the conditional probability tables of the nodes.

The Ultrasound Index comprised important primary and secondary sonographic criteria. The primary sonographic criteria consisted of those that necessitated direct visualisation of the appendix: Appendix Diameter (mean outside diameter in mm), ‘X_{mod}’, and Wall Hyperaemia (absent, present), ‘X_{wall}’. Secondary sonographic signs included Mesentery Appearance (echogenic, normal), ‘X_{mes}’, and Appendicolith (present, absent), ‘X_{lith}’.

The Clinical History sub-model was based on prior publications and important factors identified in the PAS [19]. Nodes included Anorexia (present/absent), ‘X_{anor}’; Nausea (present/absent), ‘X_{naus}’; Pain Migration to the right iliac fossa (present/absent), ‘X_{migr}’; and Duration of Symptoms, which was a continuous variable (hours), ‘X_{dur}’. The Physical Assessment sub-model included a patient's temperature with a binary febrile threshold of 38°C (normal/elevated), ‘X_{feb}’, and rebound tenderness (absent/present), ‘X_{tend}’. The Diagnostic Tests sub-model included neutrophil count with a threshold of 7.5 × 10^{9}/L (normal/elevated), ‘X_{neut}’, white cell count with a threshold of 10 × 10^{9}/L (normal/elevated), ‘X_{wcc}’, and C-reactive protein with a threshold of 3 mg/L (normal/elevated), ‘X_{crp}’ [19,20,21,22].
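As an illustration of the threshold-based discretisation described above, the sketch below maps raw measurements onto the binary normal/elevated states. The thresholds are those stated in the text; the dictionary keys, the function name, and the decision to treat a value exactly at the threshold as elevated are assumptions made for this example, since the boundary handling is not specified here.

```python
# Thresholds as stated in the text; key names are hypothetical.
THRESHOLDS = {
    "temperature_c": 38.0,        # X_feb, degrees Celsius
    "neutrophils_e9_per_l": 7.5,  # X_neut, x 10^9/L
    "wcc_e9_per_l": 10.0,         # X_wcc, x 10^9/L
    "crp_mg_per_l": 3.0,          # X_crp, mg/L
}


def discretise(name, value):
    """Return 'normal' or 'elevated', or None when the value is missing.

    Assumption: a value exactly equal to the threshold counts as elevated.
    """
    if value is None:
        return None  # leave the node unobserved rather than guess a state
    return "elevated" if value >= THRESHOLDS[name] else "normal"


if __name__ == "__main__":
    patient = {"temperature_c": 37.2, "wcc_e9_per_l": 14.1, "crp_mg_per_l": None}
    print({name: discretise(name, value) for name, value in patient.items()})
```

Returning `None` for a missing measurement keeps that node unset, so that its prior probability can be used by the network instead.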

The CPT values for the four sub-model indices were calculated by analysing their respective contributing parent variables. Principal components analysis (PCA) was used for the Diagnostic Tests sub-model, which had only continuous variables (X_{wcc}, X_{neut}, and X_{crp}), and categorical principal components analyses (CATPCA) were conducted for the remaining sub-models [23,24]. The CPT of the target acute appendicitis node was calculated by allocating equal importance to each of the four parent nodes and a 90% diagnostic accuracy, allowing for a possible 5% false-positive and 5% false-negative error rate [25]. GeNIe Modeler version 2.2 software (Bayes-Fusion LLC, Pittsburgh, PA, USA) was used to create a DAG that visually represents the BN. All statistical analysis was performed with IBM SPSS Statistics version 22 (IBM Corp., Armonk, NY, USA).
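One possible reading of the equal-weighting rule for the target node can be sketched as follows. This is an assumption made for illustration, not the procedure confirmed by the text: each of the four parent indices is reduced to a binary state (0 = negative, 1 = positive), and the probability of appendicitis rises linearly from 0.05 (no positive parents) to 0.95 (all parents positive), which leaves 5% for each error type. The parent identifiers and the function name are hypothetical.

```python
from itertools import product

# Hypothetical identifiers for the four sub-model parent nodes.
PARENTS = (
    "ultrasound_index",
    "clinical_history",
    "physical_assessment",
    "diagnostic_tests",
)
ERROR_RATE = 0.05  # assumed to apply to false positives and false negatives alike


def target_cpt():
    """Build P(appendicitis = present | parent states) for all 16 combinations.

    Assumption: parents are binary and equally weighted, so the probability
    depends only on the fraction of parents in the positive state.
    """
    cpt = {}
    for states in product((0, 1), repeat=len(PARENTS)):
        positive_fraction = sum(states) / len(PARENTS)
        cpt[states] = ERROR_RATE + (1 - 2 * ERROR_RATE) * positive_fraction
    return cpt


if __name__ == "__main__":
    for states, probability in sorted(target_cpt().items()):
        print(states, round(probability, 3))
```

Under this reading, no single sub-model can drive the target node to certainty on its own, which is one way of encoding equal importance across the four parents.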

Confidence in model validity was considered, with BN structure, parameterisation, and network behaviour evaluated to ensure they were appropriate for the scope of the model [26]. The model was then assessed for the following: nomological validity (the model fits in the broader topic domain in the literature), face validity (the model structure, node discretisation, and parameters fit with expert opinion), content validity (the model consists of relevant factors and relationships and reflects all known possibilities from experts and the literature), concurrent validity (the BN or a sub-section may behave in an identical way to part of another network), convergent validity (similarities exist between models that are in similar domains), discriminant validity (differences exist between models of very different domains), and predictive validity (predictive behaviour of the model is similar to that of the system that it is modelling) [26]. Opportunities for potential applications and adaptations of the model were then considered for future projects.

## III. Results

To determine the CPT for the Ultrasound Index sub-model, a CATPCA was performed using data collected for the four identified primary and secondary ultrasound examination variables. Using varimax rotation and Kaiser normalisation, the first component of the two-dimensional model had an internal consistency coefficient (Cronbach's alpha) of 0.783 and yielded an eigenvalue of 2.372, indicating that 59.30% of the variance was accounted for by this component. The second component had an internal consistency coefficient of 0.352 and an eigenvalue of 1.158, accounting for 28.95% of variance. Together, the two components explained 88.25% of variance (Figure 1). The appendix diameter variable (X_{mod}) was discretised from a continuous variable into five categories: <2.5 mm, 2.7–5.0 mm, 5.2–7.4 mm, 7.5–10 mm, and 10.4–19.0 mm. To improve the face and content validity of the discretisation, these were rounded to the nearest millimetre to reflect empirical values in the BN node (<3 mm, 3–6 mm, 6–8 mm, 8–10 mm, >10 mm). Component loadings from the two CATPCA components were used to calculate the CPT values for the Ultrasound Index sub-model = 0.74 (0.820X_{wall} + 0.940X_{mes} + 0.555X_{lith} + 0.864X_{mod}) + 0.26 (−0.230X_{wall} − 0.203X_{mes} + 0.892X_{lith} − 0.131X_{mod}).
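The weighted combination of component loadings given above can be evaluated directly, as in the following sketch. The component weights and loadings are those reported in the text; the numeric coding of each variable (here 0 or 1, with the diameter category assumed to be rescaled to the same range) and the function name are assumptions for illustration, and the step that converts the resulting score into CPT entries is not shown because it is not specified here.

```python
# Component weights and loadings as reported for the Ultrasound Index.
COMPONENT_WEIGHTS = (0.74, 0.26)
LOADINGS = (
    {"wall": 0.820, "mes": 0.940, "lith": 0.555, "mod": 0.864},
    {"wall": -0.230, "mes": -0.203, "lith": 0.892, "mod": -0.131},
)


def ultrasound_index(x):
    """Weighted sum of the two component scores for one set of coded inputs.

    x maps 'wall', 'mes', 'lith', and 'mod' to numeric codes. The coding
    scheme is an assumption of this sketch, not taken from the study.
    """
    return sum(
        weight * sum(loading[name] * x[name] for name in loading)
        for weight, loading in zip(COMPONENT_WEIGHTS, LOADINGS)
    )


if __name__ == "__main__":
    appendicolith_only = {"wall": 0, "mes": 0, "lith": 1, "mod": 0}
    all_positive = {"wall": 1, "mes": 1, "lith": 1, "mod": 1}
    print(ultrasound_index(appendicolith_only))
    print(ultrasound_index(all_positive))
```

In this form it is easy to see that the appendicolith variable contributes mainly through the second component, whereas the other three variables load mostly on the first.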

The CPT for the Clinical History sub-model was determined using published literature. Duration of symptoms (X_{dur}) was treated as a binary variable with a cut-off at 36 hours, given the relatively brief timeframe of appendicitis pathogenesis in children and the increased risk of perforation after that time period [19,27,28,29,30]. Other nodes were informed through the PAS scoring system [19]. They included anorexia (present/absent), ‘X_{anor}’, nausea (present/absent), ‘X_{naus}’, and pain migration to the right iliac fossa (present/absent), ‘X_{migr}’ [25]. Their respective positive and negative predictive values (PPV and NPV) from the literature were used to determine the latent probability of each variable to calculate the CPT for the sub-model = log[CHppv(0.94X_{naus} + 0.69X_{dur} + 0.70X_{migr} + 0.88X_{anor}) − CHnpv(0.73X_{naus} + 0.13X_{dur} + 0.97X_{migr} + 0.89X_{anor}) + 4].

The CPT for the Physical Assessment sub-model was calculated by running a CATPCA on the two identified variables, X_{tend} and X_{feb}. The first component of the two-dimensional model had an internal consistency coefficient (Cronbach's alpha) of 0.324 and yielded an eigenvalue of 1.046, indicating that 52.30% of the variance was accounted for by this component. The second component had an internal consistency coefficient of 0.095 and an eigenvalue of 0.954, accounting for 47.70% of variance. Component loadings from the two CATPCA components were used to calculate the CPT values for the sub-model = 0.52 (1.092X_{feb} − 0.019X_{tend}) + 0.48 (0.018X_{feb} + 1.024X_{tend}).
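Expanding the expression above shows the net weight attached to each variable. This is arithmetic on the reported coefficients only; the numeric coding of X_{feb} and X_{tend} is not restated here.

```latex
\begin{aligned}
\text{Physical Assessment}
  &= 0.52\,(1.092X_{feb} - 0.019X_{tend}) + 0.48\,(0.018X_{feb} + 1.024X_{tend}) \\
  &= (0.56784 + 0.00864)\,X_{feb} + (-0.00988 + 0.49152)\,X_{tend} \\
  &= 0.57648\,X_{feb} + 0.48164\,X_{tend}
\end{aligned}
```

The two variables therefore contribute in roughly equal measure, which is consistent with the near-even split of variance between the two components.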

For the Diagnostic Tests sub-model, the suitability of PCA was assessed prior to analysis. Inspection of the correlation matrix demonstrated that all of the variables had at least one correlation coefficient greater than 0.3. The overall Kaiser-Meyer-Olkin measure was 0.584 and Bartlett's test of sphericity was statistically significant (*p* < 0.001), indicating that the data were likely factorisable. PCA revealed that the components explained 75.43% (X_{wcc}), 23.41% (X_{neut}), and 1.16% (X_{crp}) of the total variance, respectively, and cumulatively explained 100% of the total variance. Component loadings were used to calculate CPT values for the sub-model = 0.959X_{wcc} + 0.955X_{neut} + 0.657X_{crp}.

A DAG incorporating the variables and their respective CPT values was designed (Figure 2). Nodes were connected via directed arrows or arcs based on literature and the opinions of experts, including a specialist paediatric radiology consultant, a specialist paediatric sonographer, a senior paediatric emergency medical officer, and a professor in statistics, who were able to inform the face and content validity. The BN was evaluated with scenarios to test confidence in predictive validity in cases (Table 1). This involved setting the node states to reflect common clinical scenarios in this patient cohort: general abdominal pain; diffuse bowel inflammation, such as ileitis or colitis; a low probability of appendicitis (Figure 3); and a high probability of appendicitis with all nodes in the network set to states for an optimal positive outcome (Figure 4). No models with integration of ultrasound variables were identified in the literature, making assessment of concurrent and convergent validity difficult. However, a BN without imaging variables consisting of 10 nodes was identified and found to have a similar structure and behaviour to the BN described in this manuscript, with eight nomologically identical nodes found in both networks [12].

## IV. Discussion

To our knowledge, this is the first time a BN model integrating ultrasound variables has been created for the diagnosis of paediatric acute appendicitis. Through careful variable selection and design using dimension reduction techniques, PCA and CATPCA, this model provides a representation of important factors in this patient cohort. The pervasiveness of EHRs and information sharing presents an opportunity for information about a patient's broader clinical context to be made more available to those who may be involved in their clinical pathway. The availability of large volumes of clinical data may present problems. It can be time consuming and impractical for each clinician to continually re-evaluate the information available to them as test results are returned and clinical assessments are made or revised. The potential to evaluate this information constantly through the application of a BN model developed into an online tool may permit patients who have a higher probability of having appendicitis to be seen more quickly at triage or at other stages of their clinical journey.

Much of the information related to the nodes within this BN is already available, but it is often fragmented and not easily accessible. Patients and families may be asked for clinical history information by emergency clinicians that is not provided on an imaging referral. Sonographers may repeat these questions and use that information to tailor their examination, yet the information may not be passed on to a radiologist who would find it valuable in formulating a conclusion to a report. The structure of the proposed BN permits it to be employed as a tool to prompt consideration of important variables by sonographers and radiologists. Although this model is limited in scope to children referred for ultrasound of suspected appendicitis, it provides a potential platform for a broader model encompassing a greater number of variables, such as surgical or medication-related considerations. Limitations of this study included the difficulty of conducting automated learning of network structure and parametrisation due to the sparse dataset available from our prior study and a lack of open access data to use for this purpose. Moreover, published BNs in the same or similar clinical domains were scarce; had more been available, they might have increased confidence in the validation of our BN. Notwithstanding this, as discussed, the model construction approach adopted in this paper is well established.

The use of a visual representation of a decision-making model like a DAG in a BN is an elegant means to enable appreciation of a broader clinical picture. A recent survey of sonographers who perform paediatric appendicitis ultrasound in Australasia highlighted that they often feel removed from clinical decision-making considerations, and that a better appreciation of the degree of clinical suspicion that a child has appendicitis would assist them in performing their examinations [31]. Appendiceal sonography may require dedicated time and focus that can conflict with demands for activity and efficiency [32]. If sonographers and radiologists reporting these examinations were better informed of clinical covariates and their influence on the probability of appendicitis, they could better target their time and resources to cases that are more likely to be positive and therefore expedite surgical review.

The BN is able to accommodate missing data through the use of prior probabilities assigned to nodes informing the overall network. This feature of the network expands the scope of potential utilisation to clinicians outside hospital-based practice, where pathology services may not be readily available and the rate of sonographic visualisation of the appendix is known to be lower than in dedicated paediatric centres [33,34]. Therefore, children who present for ultrasound without blood test information, and in whom the appendix cannot be identified sonographically, can still have their probability of appendicitis determined by the network. This probability would be based on evidence from patients in prior studies and the literature (node prior probabilities), considered along with the data available at the time of their examination (clinical history, physical assessment, and sonographic mesentery appearance), to personalise the diagnostic outcome. For example, if the appendix was not identified and could not be measured or assessed for hyperaemia, the only nodes that would be updated with new information and potentially a defined state would be X_{mes} and X_{lith}. Changing the state of these nodes would then influence the Ultrasound Index CPT accordingly. The latent probabilities of X_{mod} and X_{wall} would not change; their influence on the Ultrasound Index CPT would remain unchanged and based on the prior probabilities that determined their weighting.
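The way an unobserved node falls back on its prior can be sketched with a reduced, two-parent version of a sub-model. Only the mesentery and wall hyperaemia nodes are represented, and every probability below is hypothetical, chosen for illustration rather than taken from the study data.

```python
# Hypothetical prior for the wall hyperaemia node when it cannot be assessed.
PRIOR_WALL = {"present": 0.30, "absent": 0.70}

# Hypothetical CPT: P(index = high | mesentery appearance, wall hyperaemia).
CPT_HIGH = {
    ("echogenic", "present"): 0.90,
    ("echogenic", "absent"): 0.60,
    ("normal", "present"): 0.40,
    ("normal", "absent"): 0.05,
}


def p_index_high(mesentery, wall=None):
    """P(index = high) given the mesentery state and, if known, the wall state.

    When wall is None (appendix not visualised), the unobserved node is
    marginalised out using its prior instead of being assigned a state.
    """
    if wall is not None:
        return CPT_HIGH[(mesentery, wall)]
    return sum(PRIOR_WALL[w] * CPT_HIGH[(mesentery, w)] for w in PRIOR_WALL)


if __name__ == "__main__":
    print(p_index_high("echogenic"))             # wall unobserved
    print(p_index_high("echogenic", "present"))  # wall observed
```

The result with a missing parent always lies between the values obtained for the possible observed states, weighted by how common each state is in the reference population.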

Potential future applications may see this model integrated with an EHR or radiology information system (RIS) to provide alerts to clinicians when a diagnosis probability rises above certain thresholds. More efficient triage may be possible if patients and families are able to answer clinical history questions through an interactive device at reception or triage; their responses could begin to prepopulate the values in the BN to prioritise assessment or ordering of diagnostic or imaging examinations that may be useful. An online tool based on this BN will be developed in collaboration with emergency clinicians and surgeons. Future work is planned to evaluate the BN in a prospective study that will improve validation and permit refinement of the model.

## Acknowledgments

This manuscript is a component of a PhD thesis for Tristan Reddan at the Queensland University of Technology.

## Notes

**Conflict of Interest:** No potential conflict of interest relevant to this article was reported.