### I. Introduction

### II. Methods

### 1. Data Collection

### 2. Statistical Analyses and Model Development

### 3. Preprocessing

### 4. Binary Logistic Regression Modeling

The probability *p* of classifying a participant as having DM (*p* = 1) or not (*p* = 0) based on individual characteristics is modeled by Equation (2), where *p* specifies the probability of DM, *β*_{i} are the regression coefficients associated with the reference group, and *x*_{i} are the explanatory variables. The univariable binary logistic regression modelling was conducted between each explanatory variable and the outcome status (DM/non-DM). Some variables with a *p*-value < 0.2 were included in the multivariable model, while other variables such as ethnicity and sex were selected a priori based on the literature and the experience of the research team. Coefficients and adjusted odds ratios (with 95% CIs) were calculated for each explanatory variable.
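For reference, binary logistic regression is conventionally written in the logit form below. This is the standard textbook parameterisation, assumed here to match the definitions of *p*, *β*_{i}, and *x*_{i} given above; the intercept *β*_{0} and the number of predictors *n* are notation introduced for this sketch and may differ from the paper's Equation (2).

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{i=1}^{n} \beta_i x_i
\quad\Longleftrightarrow\quad
p = \frac{1}{1 + \exp\!\left(-\beta_0 - \sum_{i=1}^{n} \beta_i x_i\right)}
```

Under this form, exponentiating a coefficient, exp(*β*_{i}), gives the odds ratio for a one-unit increase in *x*_{i} with the other explanatory variables held fixed.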

### 5. Artificial Neural Network

The inputs (*x*_{i}, …, *x*_{n}) are trained with target data in order to give an output (*y*_{i}) [13,14]. Weights *w*_{ij} were continuously assigned to the corresponding input features and the gradient of the loss function with respect to each weight,

*x* represents the inputs and *f*(*x*) is the activation function.
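To make the roles of the weights, the inputs, and the activation function concrete, the following is a minimal sketch of a single neuron and one gradient-descent weight update. The sigmoid activation, the squared-error loss, the learning rate, the input values, and all function names are illustrative assumptions, not the network configuration used in the study.

```python
import math

def sigmoid(z):
    """Logistic activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(weights, bias, inputs):
    """Weighted sum of the inputs passed through the activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def gradient_step(weights, bias, inputs, target, learning_rate=0.5):
    """One gradient-descent update for the loss L = 0.5 * (y - target)^2."""
    y = forward(weights, bias, inputs)
    # Chain rule: dL/dz = (y - target) * f'(z), and f'(z) = y * (1 - y) for the sigmoid.
    delta = (y - target) * y * (1.0 - y)
    # dL/dw_i = delta * x_i, so each weight moves against its own gradient.
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias

# Hypothetical normalized inputs and target label (1 = DM), for illustration only.
inputs = [0.2, 0.7, 1.0]
weights, bias = [0.0, 0.0, 0.0], 0.0
before = forward(weights, bias, inputs)
for _ in range(100):
    weights, bias = gradient_step(weights, bias, inputs, target=1.0)
after = forward(weights, bias, inputs)
```

Starting from zero weights the neuron outputs 0.5; repeated updates move the output toward the target. A full network stacks many such units in layers and propagates the same chain-rule gradients backwards through them.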

### 7. Algorithm Steps for the Implementation of Models

Step 1: Load appropriate libraries

Step 2: Import the dataset

Step 3: Do random sampling

Step 4: Normalize the data using min-max normalization

Step 5: Fit the models

Step 6: Perform 10-fold cross-validation

Step 7: Predict outcomes using the models

Step 8: Evaluate the models’ performance
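The eight steps above can be sketched end to end as follows. This is an illustration under stated assumptions rather than the study's code: the paper's software is not named here, so the sketch uses only the Python standard library, generates a synthetic dataset with two hypothetical features (`age`, `bmi`), and uses a plain gradient-descent logistic regression as a stand-in for the fitted models. Sample sizes, the learning rate, and all function names are likewise assumptions.

```python
# Step 1: load libraries (standard library only in this sketch).
import math
import random

# Step 2: "import" the dataset. A synthetic stand-in is generated instead;
# each row is ([age, bmi], label), and both features and labels are hypothetical.
rng = random.Random(0)

def make_row():
    age, bmi = rng.uniform(20, 80), rng.uniform(18, 40)
    label = 1 if (age - 50) / 30 + (bmi - 29) / 11 + rng.gauss(0, 0.3) > 0 else 0
    return [age, bmi], label

data = [make_row() for _ in range(500)]

# Step 3: random sampling (300 rows drawn without replacement).
sample = rng.sample(data, 300)

# Step 4: min-max normalization, x' = (x - min) / (max - min), per feature.
def min_max_normalize(rows):
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)] for row in rows]

X = min_max_normalize([features for features, _ in sample])
y = [label for _, label in sample]

# Step 5: fit a model (gradient-descent logistic regression as a stand-in).
def predict_proba(coef, row):
    z = coef[0] + sum(b * v for b, v in zip(coef[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit(X_train, y_train, epochs=200, learning_rate=0.5):
    coef = [0.0] * (len(X_train[0]) + 1)  # intercept followed by one weight per feature
    n = len(X_train)
    for _ in range(epochs):
        grad = [0.0] * len(coef)
        for row, target in zip(X_train, y_train):
            error = predict_proba(coef, row) - target
            grad[0] += error
            for j, v in enumerate(row):
                grad[j + 1] += error * v
        coef = [c - learning_rate * g / n for c, g in zip(coef, grad)]
    return coef

# Step 6: 10-fold cross-validation; Steps 7 and 8: predict and evaluate per fold.
def cross_validate(X, y, k=10):
    indices = list(range(len(X)))
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds covering every row
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        coef = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        predictions = [1 if predict_proba(coef, X[i]) >= 0.5 else 0 for i in test_idx]
        correct = sum(p == y[i] for p, i in zip(predictions, test_idx))
        accuracies.append(correct / len(test_idx))
    return accuracies

accuracies = cross_validate(X, y)
mean_accuracy = sum(accuracies) / len(accuracies)
```

The sketch follows the listed order, normalizing before the folds are formed. A common variation computes the normalization bounds from the training folds only, so that no information from the held-out fold reaches the model.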

### III. Results

*p* < 0.001). Furthermore, participants with a family history of diabetes had 3.6-fold higher odds of DM than participants with no family history of DM (AOR = 3.56; 95% CI, 1.91–6.79; *p* < 0.001). Participants with high blood pressure and poor oral health also had higher odds of having DM. Sex, ethnicity, fish consumption, and vigorous activity were not statistically significant predictors of DM (*p* > 0.05) (Table 3). The equation for the LR model is shown in Supplementary Table S2.
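To illustrate how an adjusted odds ratio relates to a logistic regression coefficient, the sketch below back-calculates the coefficient and an approximate standard error from the reported family-history estimate (AOR = 3.56; 95% CI, 1.91–6.79). It assumes a Wald-type interval, exp(β ± 1.96·SE), which the text does not confirm; the function names are illustrative.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval

def coefficient_from_odds_ratio(odds_ratio):
    """A logistic regression coefficient is the natural log of its odds ratio."""
    return math.log(odds_ratio)

def standard_error_from_ci(lower, upper, z=Z_95):
    """Approximate SE, assuming a Wald interval exp(beta +/- z * SE)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def wald_interval(beta, se, z=Z_95):
    """Confidence interval on the odds-ratio scale implied by beta and its SE."""
    return math.exp(beta - z * se), math.exp(beta + z * se)

# Values reported in the text for family history of diabetes.
beta = coefficient_from_odds_ratio(3.56)
se = standard_error_from_ci(1.91, 6.79)
lower, upper = wald_interval(beta, se)
print(f"beta = {beta:.2f}, SE = {se:.2f}, implied 95% CI = {lower:.2f}-{upper:.2f}")
```

If the implied interval differs slightly from the reported one, that may reflect rounding or a different interval method (for example, profile likelihood) in the original analysis.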

*p* < 0.001), and between LR and ANN (*p* < 0.001). No statistically significant differences were observed between DT and ANN (*p* = 0.217) (Supplementary Table S4).