### I. Introduction

*k*-means clustering, have been researched to improve their classification or prediction results [7,8]. Xu et al. [7] proposed a cluster-based analysis method that measures perceived stress from physiological signals and accounts for inter-subject differences through a *k*-means clustering process. This method showed better evaluation accuracy than traditional methods without clustering. Sani et al. [8] introduced a method that classifies stressed subjects from electroencephalography signals using an SVM with a radial basis function kernel, achieving a classification rate of 83.33%. However, these techniques require complex and stochastic processing of physiological signals, which makes them poorly suited to building prediction models based on big data and deep learning technology.

### II. Methods

### 1. Dataset

### 2. Statistical Analysis

A *t*-test was conducted between the low-stress and high-stress groups to compare the respective averages of age, sleep time, pulse rate, SBP, DBP, height, weight, and BMI. A chi-square test was conducted to analyze the relationship between stress and the gender, drinking, and smoking variables.

A significance level of *p* < 0.05 for both the *t*-test and the chi-square test was used to determine the appropriate variables for classifying stress.
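The two tests described in this section can be sketched in plain Python. This is a minimal illustration, not the analysis pipeline used in the study: the helper names `welch_t` and `chi_square_2x2` are hypothetical, the unequal-variance (Welch) form of the *t*-test is an assumption because the text does not say which variant was applied, and the sample values are made-up toy numbers.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Assumption: the unequal-variance (Welch) form of the two-sample t-test.
    """
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Returns (statistic, p-value). With one degree of freedom the p-value
    has the closed form erfc(sqrt(statistic / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2))


# Toy data (hypothetical): sleep time in hours for each group.
sleep_low = [7.0, 6.5, 8.0, 7.5, 7.0]
sleep_high = [6.0, 5.5, 6.5, 7.0, 6.0]
t_stat, dof = welch_t(sleep_low, sleep_high)

# Toy 2x2 table (hypothetical): rows = smoker / non-smoker,
# columns = low-stress / high-stress.
chi2_stat, p_value = chi_square_2x2([[30, 20], [20, 30]])
keep_variable = p_value < 0.05  # selection rule stated in the text
```

Turning the *t* statistic into a *p*-value requires the Student *t* distribution, which a statistics library would normally provide; the chi-square *p*-value is exact here only because a 2x2 table has a single degree of freedom.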

### 3. Deep Belief Network

A joint configuration (*v*, *h*) of the visible nodes *v* and hidden nodes *h* can be represented by the following energy function:

$$E(v, h) = -\sum_{i} b_i v_i - \sum_{j} b_j h_j - \sum_{i,j} v_i h_j w_{ij}$$

where $v_i$ is the binary state of visible node $i$, $h_j$ is the binary state of hidden node $j$, $w_{ij}$ is the weight between nodes $i$ and $j$, $b_i$ is the bias term of visible node $i$, and $b_j$ is the bias term of hidden node $j$.
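As a small worked example, the energy of one joint configuration can be computed directly from the quantities defined above. The sketch assumes the standard restricted Boltzmann machine form, in which the energy is the negated sum of the visible-bias term, the hidden-bias term, and the pairwise weight term; the function name `rbm_energy` and all numeric values are hypothetical.

```python
def rbm_energy(v, h, w, b_visible, b_hidden):
    """Energy of a joint binary configuration (v, h).

    Assumes the standard RBM energy:
    E = -sum_i b_i v_i - sum_j b_j h_j - sum_ij v_i h_j w_ij
    """
    visible_term = sum(b_visible[i] * v[i] for i in range(len(v)))
    hidden_term = sum(b_hidden[j] * h[j] for j in range(len(h)))
    interaction = sum(
        v[i] * w[i][j] * h[j]
        for i in range(len(v))
        for j in range(len(h))
    )
    return -visible_term - hidden_term - interaction


# Toy configuration (hypothetical): 3 visible nodes, 2 hidden nodes.
v = [1, 0, 1]
h = [1, 0]
w = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]  # w[i][j] links visible i to hidden j
b_visible = [0.1, 0.0, -0.1]
b_hidden = [0.2, -0.3]

energy = rbm_energy(v, h, w, b_visible, b_hidden)
```

Only nodes in state 1 contribute, so configurations whose active nodes share large positive weights receive lower energy.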

### 4. Deep Learning Platform

*Multilayer-Configuration* object including optimal functions, such as the *sigmoid* activation function.
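For reference, the sigmoid function mentioned here maps any real input to the interval (0, 1). The sketch below is illustrative only and does not reproduce the platform's own implementation; `hidden_activation` is a hypothetical helper that shows how the sigmoid is commonly applied to a node's weighted input plus bias.

```python
import math


def sigmoid(x):
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # avoids overflow for large negative x
    return z / (1.0 + z)


def hidden_activation(v, weights_to_node, bias):
    """Sigmoid of the bias plus the weighted sum of the inputs (hypothetical helper)."""
    return sigmoid(bias + sum(vi * wi for vi, wi in zip(v, weights_to_node)))


# Toy values (hypothetical): three binary inputs feeding one node.
activation = hidden_activation([1, 0, 1], [0.5, 0.1, -0.4], 0.2)
```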

### III. Results

### 1. Dataset Characteristics

The *t*-test was used for continuous variables and the chi-square test for categorical variables. The significance level of both tests was *p* < 0.05. Variables with a *p*-value less than 0.05 were gender, age, sleep time, pulse rate, SBP, height, weight, drinking, and smoking, making a total of nine variables related to stress.

### 2. DBN Model Design

### 3. DBN Model

### IV. Discussion

*p* < 0.05 for each item (physical activity and lifestyle). The input variables of gender, age, sleep time, pulse rate, SBP, height, weight, drinking, and smoking showed a statistically significant relationship with stress in the physical activity and lifestyle data. In other words, stress can be classified with a DBN model consisting of these nine input variables and two output variables (low-stress, high-stress).

Setting hyperparameters in DBN research requires iterative processes and a large number of steps. In addition, fine adjustment of the hyperparameters changes the output values, such as sensitivity, specificity, and accuracy; however, the variation is not large, and the outputs do not change in proportion to the set values. Therefore, in this study, we grouped the output values into profiles while observing them under various hyperparameter values. As a result, profile 4 was confirmed to have the best accuracy and specificity. The accuracy of profile 4 was 66.23%, which is similar to that of NB (65.23%), DT (63.28%), and SVM (66.02%).

As the results show, SVM achieves an accuracy similar to that of the DBN model, but it is time-consuming because, as a supervised learning technique, it requires every piece of data to be labeled with the correct answer. The DBN model, on the other hand, can save time because semi-supervised learning allows the use of unlabeled training samples. Therefore, although the accuracy values of the SVM and DBN models are similar, the DBN model performs better once human labeling time is taken into account.

However, the results of this study have the following limitations. The stress classification model implemented in this work cannot be used to investigate the degree of stress within each of the two classes. To design a more accurate stress classification method, the degree of stress must be studied in greater detail.
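The sensitivity, specificity, and accuracy values compared in this discussion follow from the counts of a binary confusion matrix. The sketch below shows the standard definitions, assuming that high-stress is treated as the positive class; the function name `classification_metrics` and the counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    Assumption: high-stress is the positive class, so
    tp = high-stress classified as high-stress,
    fn = high-stress classified as low-stress,
    tn = low-stress classified as low-stress,
    fp = low-stress classified as high-stress.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Toy counts (hypothetical), not taken from the paper.
sensitivity, specificity, accuracy = classification_metrics(tp=40, fn=10, tn=30, fp=20)
```

Reporting all three values matters when profiles are compared, because a profile can reach similar accuracy while trading sensitivity against specificity.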