Review of machine learning and deep learning application in mine microseismic event classification

Purpose. To put forward the concept of machine learning and deep learning approaches in mining engineering in order to achieve high accuracy in separating mine microseismic (MS) events from non-useful events such as noise events, blasting events and others. Methods. Traditionally applied methods are described and their limited effectiveness in classifying MS events is discussed. A general historical description of machine learning and deep learning methods is briefly elaborated, and different approaches that use these methods to classify MS events are analysed. Findings. MS data acquired from the rock fracturing process and recorded by sensors are inaccurate due to the complex mining environment. They always need preprocessing in order to classify actual seismic events. Traditional detection and classification methods do not always yield precise results, which is especially disappointing when different events have a similar nature. The breakthrough of machine learning and deep learning methods has made it possible to classify various MS events with higher precision than the traditional methods. This paper presents a state-of-the-art review of the application of machine learning and deep learning in identifying mine MS events. Originality. Previously adopted methods are discussed in short, and a brief historical outline of machine learning and deep learning development is presented. Recent advances in discriminating MS events from other events are discussed in the context of these mechanisms, and finally conclusions and suggestions related to the relevant field are drawn. Practical implications. By means of machine learning and deep learning technology, mine microseismic events can be identified accurately, which allows the source location to be determined so as to prevent rock burst.


Introduction
Microseismic (MS) monitoring is a useful short-term rock burst prediction tool that can forecast the occurrence of rock burst by extracting useful signals which propagate from the fracturing process of rock masses [1]. Monitoring design, processing of acquired data and location of event sources are the crucial technical concerns in creating a reliable measure to maintain rock mass stability for the assessment of mine seismic hazard [2], [3]. In order to establish an adequate early warning of rock burst, original MS data that are swamped by noise and other unwanted events must be accurately classified [4]. However, MS signals differ from natural earthquake signals: they have low magnitude and are highly influenced by various background noise sources characterised by an abrupt amplitude, such as human walking, vehicle sounds, electromagnetic interference and blasting vibrations, which give the appearance of MS events [5], [6]. For this reason, differentiating actual MS events from other events is always a complicated task. MS classification is often conducted by an experienced analyst, which is time-consuming and always suffers from the subjective views of professionals [7]. Therefore, extraction of the genuine rock fracturing signal from collected MS signals has always been a topic of discussion among researchers.
Over the years, many classification methods have been introduced to deal with this problem in the MS and seismic fields [8]-[11]. For instance, Yu et al. used fractal characteristics of MS waves to distinguish various types of MS events [12]. Jiang et al. classified background noise, MS events and electromagnetic interference based on the multi-channel joint recognition method of a single event [13]. Jeffery et al. extracted the frequency domain characteristics, duration characteristics and statistical characteristics of MS events during the development of shale gas in the Cold Lake area of Alberta and constructed a model of MS event classification and recognition based on principal component analysis [14]. Dargahi-Noubary proposed a non-stationary random model for identifying underground nuclear explosions and natural earthquakes [15]. Arrowsmith et al. used spectral modulations to distinguish delay-fired mine blasts from other events with an enhanced algorithm [16]. However, classification of MS signals depends on various factors, and it is not feasible to make full use of MS data with such conventional methods: the quantity of small MS events is massive, and automatic discovery of events is frustrated by contamination of the MS signal by surrounding noise and mining activities, which yields unsatisfactory results [17]. Hence, machine learning and deep learning methods have recently been gaining more attention and have been widely utilised for the automatic identification and classification of signals in various seismic fields [18]-[22], because they reduce the computational burden and provide high prediction accuracy. Nevertheless, their application to mine MS is still at a developing stage.
In this paper we present an overview of the work done by various researchers applying machine learning and deep learning methods in mining to identify and classify MS events. The review is organised in five sections. After the introduction, the second section briefly explains previously applied methods of discriminating MS events. The third section presents a brief historical development of machine learning and deep learning methods for a general understanding of the methods. The fourth section covers the classification of events performed by different scholars using machine learning and deep learning approaches, which is our prime concern, and the final section presents conclusions and suggestions in the related field.

Former approaches to microseismic event classification
MS event classification and detection are based on identifying differences between effective signals and environmental noise. The most widely accepted automatic event recognition method in MS data processing is STA/LTA (the ratio of the short-term average to the long-term average). The STA/LTA method was first used for seismic phase identification in the field of natural earthquake research [23]. With the emergence and development of MS monitoring technology, this method has also been used for automatic identification of MS events [24]. The advantages of the STA/LTA method are that its principle is simple, it is easy to implement, and it can meet the requirements of real-time processing. However, this method is usually more effective for events with a higher signal-to-noise ratio (SNR), and it often fails to give satisfactory results for low-SNR events. Various ideas were therefore proposed later to improve its efficiency, and these are briefly described in this section.
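To make the STA/LTA principle concrete, the following is a minimal sketch of a trailing-window detector on signal energy. The window lengths, the threshold and the function names `sta_lta` and `detect_triggers` are illustrative assumptions, not values or code taken from the cited works.

```python
def sta_lta(signal, n_sta, n_lta):
    """STA/LTA ratio of signal energy per sample (0.0 until the LTA window is full).

    Assumes n_sta <= n_lta; both windows end at the current sample.
    """
    energy = [x * x for x in signal]
    # Prefix sums make each window average a constant-time lookup.
    prefix = [0.0]
    for e in energy:
        prefix.append(prefix[-1] + e)
    ratio = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio


def detect_triggers(ratio, threshold):
    """Sample indices where the ratio crosses the threshold from below."""
    return [i for i in range(1, len(ratio))
            if ratio[i] >= threshold and ratio[i - 1] < threshold]


# Hypothetical trace: weak background, a 20-sample burst, weak background again.
noise = [0.01 * (-1) ** k for k in range(200)]
burst = [1.0 * (-1) ** k for k in range(20)]
trace = noise + burst + [0.01 * (-1) ** k for k in range(100)]
triggers = detect_triggers(sta_lta(trace, n_sta=5, n_lta=50), threshold=3.0)
```

With a strong burst the ratio jumps well above the threshold at the onset. For a low-SNR event the short-term energy barely exceeds the long-term level, which illustrates why the method struggles in that regime.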
Cao et al. summarised the important waveform characteristics of different MS signals through time-frequency analysis [25]. By analysing signal features such as frequency characteristics, signal duration, energy release and signal attenuation, various mine MS activities were classified. This helped to reveal the source mechanism and supported rock burst prediction work. Zhao et al. employed a mathematical model using Fisher discriminant analysis [26]. The results confirmed that the proposed method correctly classified regular and blast events with accuracy above 97.1%. Li et al. investigated various characteristics of mine earthquake and blasting signals through the Hilbert-Huang Transform (HHT) and analysed the wave duration and instantaneous energy attributes of both signals under different circumstances. However, no clear line of demarcation was obtained between the characteristics of the two types of signals [27]. Pan et al. identified and calibrated MS events using the STA/LTA method and waveform information, and then went on to establish a structural model of energy-absorbing coupling support subjected to rock burst [28]. Li et al. made a comprehensive analysis of blasting vibrations using HHT and the wavelet analysis method. They discussed the feature extraction and time-frequency distribution of blasting vibration signals and found that HHT is more adaptable when analysing non-stationary signals [29]. Yoones and Mirko employed a new method based on power spectral density (PSD) and applied it to recorded raw data to automatically detect weak events obscured by background noise [30]. The method is more robust than STA/LTA because no prior bandpass filtering is necessary to enhance the SNR. Similar kinds of events were also detected by Wang et al., who constructed a target function of event detection that can detect and restore clean MS events simultaneously [31]. Tselentis et al. proposed a statistical Chi-squared test for event detection and an automatic primary-wave phase picker based on the kurtosis criterion [32]. With the help of the Fourier transform, Zhang et al. investigated the spectral differences between mine tremors, blasting and earthquakes [33]. The experimental analysis revealed differences between earthquakes and blasting in the corner frequency and the maximum spectral value of the compression wave and the shear wave. Hu et al. established a mathematical model using Fisher discriminant analysis and discriminated blasting and MS events with 98.59% accuracy. They found that two factors accounted for the 1.49% of misclassified events: blasting signals containing mechanical vibration noise, and non-blast events that closely followed a blast and were recorded by the system as a single event [34].
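As an illustration of the Fisher discriminant analysis used by Zhao et al. and Hu et al., the sketch below computes the Fisher direction for two classes described by two features. The features (signal duration in seconds, dominant frequency in Hz), the sample values and the function names are hypothetical and chosen only for demonstration; they are not data from the cited studies.

```python
def fisher_direction(class_a, class_b):
    """Fisher direction w = Sw^-1 (mean_a - mean_b) for two-feature samples.

    Returns (w, c), where c is the midpoint of the projected class means.
    """
    def mean(rows):
        n = len(rows)
        return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

    def scatter(rows, m):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    # Within-class scatter matrix and its 2x2 inverse applied to the mean difference.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]
    c = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, c


def classify(sample, w, c):
    """Label a sample 'a' if its projection lies on class a's side of c."""
    return "a" if w[0] * sample[0] + w[1] * sample[1] > c else "b"


# Hypothetical (duration, dominant frequency) samples.
blasts = [(0.2, 180.0), (0.3, 200.0), (0.25, 220.0), (0.35, 190.0)]
ms_events = [(0.8, 60.0), (1.0, 80.0), (0.9, 50.0), (1.1, 70.0)]
w, c = fisher_direction(blasts, ms_events)
label = classify((0.3, 210.0), w, c)
```

The projection reduces each event to a single number, so the decision rule is a simple threshold. Events of different classes whose features overlap, as described for the misclassified cases, end up on the wrong side of that threshold.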
As the subsurface geological environment in mining areas is complicated, and the propagation medium is neither continuous nor homogeneous, MS signals in a mining scenario show more complex characteristics than naturally occurring earthquakes. Distinguishing genuine MS signals from polluted signals is made more difficult because the underground monitoring area is small and most of the monitoring transducers are single-component. The recorded signals are always vague due to the overlapping of multiple types of events. Because of these deficiencies, traditional identification methods suffer from low accuracy and cannot easily distinguish events with precision. This gap can be filled using machine learning and deep learning approaches, which have overshadowed many traditionally applied methodologies in the field of signal processing in performance and reliability.

Fundamental concepts of machine learning and deep learning methods
The concepts of machine learning and deep learning commenced with research into the artificial neural network. Aristotle described "Associationism" in 300 BC, which is the first attempt in human history to understand the brain. However, this approach in modern science began in the mid-1900s, when Warren McCulloch et al. put forward the concept of a mathematical model of a neural network which mimics the functionality of the brain [35]. Frank Rosenblatt proposed the idea of a single-layer neural network called the "Perceptron" in 1957, which allows neurons to learn and processes the elements of the training set one at a time. Later, Minsky et al. pointed out the inability of the Perceptron to solve the XOR and XNOR problems and described its limitations in performing certain functions [36], [37]. Despite some flaws in the algorithm, this early approach was still a trailblazer for modern machine learning methods. A study by Hubel and Wiesel of the cat's visual cortex gave more insight into the complexity of neural network models [38]. However, because the single-layer neural network was not capable of solving some sophisticated functions, neural network research diminished for a decade during the 1970s. In the mid-1980s, Rumelhart et al. introduced the backpropagation (BP) algorithm, which utilises a hidden layer in the neural network. It became a powerful tool for solving pending problems such as XOR and parity; the principal mechanism of this algorithm is to train the neural network effectively through the chain rule [39]. This work revived neural network research. Yann LeCun et al. published a biologically inspired image recognition model, the convolutional neural network (CNN), based on the BP algorithm [40]. This model played a significant role in establishing the foundation for modern computer vision. During the last decade of the 1900s, different statistical approaches were introduced to construct classification algorithms, such as the support vector machine (SVM) and boosting [41], [42]. In comparison with other conventional machine learning methods, these methods are memory efficient and convenient to use but have insufficient learning capability with noisy data [43], [44]. Hochreiter et al. developed the long short-term memory (LSTM) recurrent neural network to handle the exploding and vanishing gradient problem [45]. This milestone revolutionised the prospects of machine learning. With improvements in computer hardware and technology, the computational burden ceased to be an obstacle, and more capable algorithms set new benchmarks for artificial neural networks. Hinton and Simon introduced deep learning, a subset of machine learning that uses a multilayer neural network which is more analogous to the human brain in intelligence [46]. The model drastically changed the future of the artificial neural network and achieved high performance in computer vision, image classification, speech recognition and handwritten digit identification [47], [48]. Supported by big data, cloud computing and other advanced functionality, deep learning reflects the future of machine learning [49]. Furthermore, the advent of the sophisticated generative adversarial network (GAN) and deep reinforcement learning took the era of deep learning to another level [50]. Figure 1 shows the historical development of machine learning and deep learning methods.

Event classification using machine learning and deep learning methods
Machine learning and deep learning methods need input data to create the classifier model. Hence, scholars have derived several parameters from the original waveform and the seismic sources to construct the trained model.

Figure 1. Historical evolution of machine learning and deep learning
Various waveform-based parameters such as time- and frequency-variant parameters, spectral ratio, maximum frequency, the ratio of P- and S-wave amplitudes, signal duration etc. [51], and source parameters such as seismic moment, seismic energy, time of occurrence, stress drop etc., are utilised to construct the event classifier that distinguishes various events in the microseismic data [52]. Manual classification of MS events is a tedious and time-consuming task that requires experienced professionals and may suffer from the biased opinion of the observer [53]. Recently, machine learning and deep learning methods have gained popularity in the field of mine microseismicity for distinguishing MS events from unwanted events. Taking advantage of machine learning and deep learning models, researchers have applied them to the automatic classification of mine MS events.

Classification model based on waveform related parameters
Zhu et al. constructed a support vector machine (SVM) model for the classification of MS events [54]. Using the fractal box-counting dimension of the frequency band as a signal feature of blasting, electromagnetic and MS signals, a 23-dimensional feature vector was established and the SVM model was trained to classify and identify 300 randomly selected sets of data. The proposed method achieved a maximum accuracy of 94% for MS signals and 100% for electromagnetic signals.
Shang et al. employed empirical mode decomposition (EMD) and singular value decomposition (SVD) to extract the features of mine signals [55]. This model uses SVD to compute the singular values of the matrix composed of the real intrinsic mode function (IMF) components obtained through EMD. An SVM then used the EMD-SVD based feature vector to identify the events of the Yongshaba mine in China. A comparison was made between SVM and other machine learning classifiers such as the backpropagation neural network (BPNN) and the Bayes method, and SVM displayed the best performance (93%). Shang et al. later updated this work by applying a BPNN for classification [56]. The input parameters for the BPNN were extracted by merging two methods, frequency slice wavelet transform (FSWT) and SVD, and the method recognised 91% of the signals correctly. To demonstrate the superiority of the adopted methodology, 70 training and 50 test samples of MS and blasting datasets were also used to train three other machine learning models: logistic regression (LR), Bayes and Fisher classifiers. The BPNN method based on FSWT-SVD remained the best performer.
Table 1 shows the comparison of various models. Li built on the model of Shang et al. [56] and classified MS and blasting signals based on local mean decomposition (LMD) and a pattern recognition method [57]. The MS signals were decomposed using LMD, EMD and discrete wavelet transform (DWT), and the energy spectrum and correlation coefficient of the two primary product function (PF) components were used as the feature vector input for pattern recognition. Four machine learning classifiers (artificial neural network (ANN), support vector machine (SVM), logistic regression (LR) and Naive Bayes (NB)) were combined with each method. The final model was trained with sets 1-100 of the rock fracturing and blasting data and tested with sets 101-200. The classification output based on the correlation coefficients of LMD was better than that of EMD and DWT. Similarly, with the LMD-based energy spectrum, ANN and SVM proved significantly better than LR and NB, with the highest accuracy above 90%.
Jia et al. built a least squares support vector machine (LS-SVM) classifier to distinguish low-SNR MS events from noise events using multiscale permutation entropy (MPE) [58]. Similar numbers of MS and noise samples were taken, and the feature vector used to train the classifier was extracted by MPE calculation. Auto-detection of the events was conducted using 70 groups of data as a training set and 30 groups as a test set. In ten random experiments, the classifier achieved a prediction accuracy of more than 90%. The LS-SVM classifier also gave better precision than the traditional STA/LTA and AIC methods in identifying 62 events from 15 waveforms. Figure 2 shows the performance of the different classifiers. The proposed method has real practical value in industry; however, an appropriate number of samples should be chosen in order to limit the computation time.
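The MPE feature vector mentioned above can be sketched as follows: the series is coarse-grained at several scales, and the normalised permutation entropy is computed at each scale. The embedding order, the number of scales and the function names are assumptions made for illustration and are not the settings reported in [58].

```python
import math


def coarse_grain(series, scale):
    """Average consecutive non-overlapping blocks of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def permutation_entropy(series, order=3, delay=1):
    """Normalised permutation entropy in [0, 1] of a 1-D series."""
    counts = {}
    n_patterns = len(series) - (order - 1) * delay
    for i in range(n_patterns):
        window = [series[i + k * delay] for k in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] = counts.get(pattern, 0) + 1
    h = -sum((c / n_patterns) * math.log(c / n_patterns)
             for c in counts.values())
    return h / math.log(math.factorial(order))


def multiscale_permutation_entropy(series, order=3, max_scale=5):
    """One permutation entropy value per scale; usable as a feature vector."""
    return [permutation_entropy(coarse_grain(series, s), order)
            for s in range(1, max_scale + 1)]
```

A strictly monotonic series yields an entropy of zero at every scale, while irregular noise yields values close to one, which is the contrast such a feature vector exposes to a classifier.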
Lin et al. classified and recognised MS multichannel waveforms with a maximum accuracy of 91.13% based on a deep convolutional neural network (DCNN) and spatial pyramid pooling (SPP) [59]. With data acquired from the Dongguashan copper mine in China, multiple waveforms were used as input samples to build the training and testing sets. Because the number of channels obtained from each MS signal is inconsistent, SPP was inserted to normalise the feature maps of the final convolution layer, and MS, blasting and noise events were classified.

Figure 2. Identification accuracy of the three different classifier methods
The proposed method is feasible because no manual feature extraction is needed. In a later model, Lin updated the classifier by combining DCNN with SVM to achieve higher precision than the previous method [60]. The constructed DCNN automatically learns the features of the multichannel waveform, and the SVM is used to classify the multichannel waveforms. To validate the performance, the DCNN model was combined with other classifiers, such as KNN, random forest and SVM, on the same dataset of waveform images. The SVM gave better results than the other classifiers in less time.
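The role of spatial pyramid pooling in handling a variable number of channels can be illustrated with a one-dimensional sketch: whatever the input length, max pooling over a fixed pyramid of bins produces an output of fixed length. The pyramid levels and the function name are assumptions for illustration; the network in [59] operates on multi-dimensional feature maps.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a non-empty 1-D feature map into sum(levels) values.

    Each level splits the input into `bins` nearly equal segments, so the
    output length depends only on `levels`, not on the input length.
    """
    n = len(feature_map)
    pooled = []
    for bins in levels:
        for i in range(bins):
            start = (i * n) // bins                  # floor of i * n / bins
            end = ((i + 1) * n + bins - 1) // bins   # ceil of (i + 1) * n / bins
            pooled.append(max(feature_map[start:end]))
    return pooled


# Two hypothetical events recorded on different numbers of channels.
event_5ch = spatial_pyramid_pool([0.2, 0.9, 0.1, 0.4, 0.7])
event_12ch = spatial_pyramid_pool([0.1 * k for k in range(12)])
```

Both outputs have seven entries, so they can be passed to the same fully connected layer or to a downstream classifier such as an SVM.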
Zhao and Gross applied supervised machine learning algorithms such as SVM to distinguish MS events from noise events [6]. Sixteen input attributes, selected from 71 original time and frequency domain features by neighbourhood component analysis, were used to train the final SVM model. Multiple models trained with various kernel functions (linear, Gaussian, quadratic and cubic) were compared, and the SVM trained with the Gaussian kernel displayed the best performance, with prediction accuracies of 95% and 92% for MS events and noise events respectively.
Peng et al. developed a Gaussian mixture hidden Markov model (GMM-HMM) based on the Mel frequency cepstral coefficient (MFCC) [61]. The proposed model uses a 24-dimensional feature parameter vector acquired from MFCC to train the GMM-HMM, and MS signals containing various events (blasting, noise, electromagnetic interference) from the Dongguashan copper mine in China were classified accurately. Peng et al. subsequently employed a deep learning method for the automatic classification of MS events [53]. The reliability of 35 features in the frequency and time domains was improved using genetic algorithm (GA)-optimised correlation-based feature selection (GA-CFS), and the final 11×50 feature matrix was used as the CNN input. The trained CNN classifier identified MS records of the Huangtoupo copper and zinc mine in China with an accuracy of 98.2%. The model was also compared with seven traditional machine learning classifiers, none of which matched the accuracy of the CNN.
Chen et al. demonstrated how to use CNN and K-means clustering (KC) for seismic event classification and automatic arrival picking [62]. The CNN architecture outperformed the traditional multilayer perceptron (MLP) in classifying signals, and the KC-CNN based model gave better results than KC alone in arrival picking, even with MS data containing noise.
Choi and Pyun studied four types of recorded mine signals (blasting, cleaning, drilling and noise) and extracted various attributes from the signals to build a supervised machine learning model [63]. To train the model, 1796 signals were manually labelled, comprising 317 blasting, 901 drilling, 461 cleaning and 117 noise signals. 80% of the data were used as the training set and 20% were left for the test set; the training set was also cross-validated in 5 groups to avoid overfitting. Out of 22 trained models, the 4 most reliable (quadratic discriminant, boosted tree ensemble, bagged tree ensemble and RUSBoosted tree ensemble) were selected and verified using the test data. The bagged tree ensemble performed better than the other 3 models. The proposed model can be useful in real-time monitoring to replace manual inspection.
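The data partitioning described above (an 80/20 hold-out split followed by 5-fold cross-validation of the training portion) can be sketched as index bookkeeping. The random seed, the shuffling and the function name are assumptions; the cited study does not state how its split was drawn, and this sketch does not stratify by class.

```python
import random


def split_and_fold(n_samples, train_fraction=0.8, n_folds=5, seed=0):
    """Return (train, test, folds) as lists of sample indices.

    `folds` partitions the training indices into n_folds validation groups.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    train, test = indices[:n_train], indices[n_train:]
    # Striding assigns every n_folds-th training index to the same fold.
    folds = [train[k::n_folds] for k in range(n_folds)]
    return train, test, folds


# 1796 labelled signals, as in the study above.
train, test, folds = split_and_fold(1796)
for k, validation in enumerate(folds):
    held_out = set(validation)
    fitting = [i for i in train if i not in held_out]
    # A model would be fitted on `fitting` and scored on `validation` here.
```

With classes as unbalanced as 901 drilling against 117 noise signals, a stratified split that preserves class proportions in each fold would be a natural refinement.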
Song et al. used the Stockwell transform to convert time-domain mine MS signals into two-dimensional time-frequency images, which were then fed as input to train the CNN model [64]. Two image datasets with pixel sizes of 450×350 and 180×140 were examined with the CNN model. The results showed that the smaller 180×140 images gave higher accuracy than the larger ones in distinguishing blasting and rock fracturing signals once the weight tensor shape was tuned to its optimum. Moreover, the study showed that the pixel size of the image has a strong influence on the time required to train the model.

Classification model based on source-related parameters
A multivariate Gaussian distribution implemented by Malovichko proved efficient in separating ordinary and blasting events [3]. The discriminator takes advantage of four seismic and source parameters, such as origin time of blasting, radiation pattern, high- and low-frequency radiation ratio etc. The technique was calibrated to check the performance of the discriminating features for four mines in Australia (A, B, C and D); the discriminator showed better separation of blast and regular events in the datasets of mines A and C. The larger-scale application of the discriminator led to almost 20% (1431 out of 7035) of seismic events being reclassified as blasts.
Vallejos and McKinnon applied LR and ANN to categorise mine seismic and blast events at two mines in Canada, Nickel Rim South and Kidd Creek [65]. Relying on 13 significant parameters, they established classifying indicators and built efficient classification models with LR and ANN that separate the two types of events even in the nonlinear case. 26811 training and 11490 testing sets from the Nickel Rim South mine and 12762 training and 5469 testing sets from the Kidd Creek mine were utilised to verify the results. The logistic and neural network models performed equally well in distinguishing blast and seismic events, with low misclassification, on the datasets of the Nickel Rim South mine. However, the neural network slightly degraded in performance when the area under the ROC curve (AUC) value was lower than 0.9.
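Since the comparison above relies on the area under the ROC curve, a minimal sketch of how AUC can be computed from classifier scores may be useful. It uses the pairwise (Mann-Whitney) formulation; the scores and labels are hypothetical and the function name is an assumption.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win. Labels are 1 (e.g. blast) or 0 (e.g. seismic),
    and both classes must be present.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier outputs for six events.
scores = [0.9, 0.8, 0.3, 0.4, 0.6, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

A value of 1.0 means every blast is scored above every seismic event, while 0.5 corresponds to random ranking. The pairwise form is quadratic in the number of events, so a rank-based implementation would be preferable for datasets of the size reported above.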
Out of 19 parameters, Dong et al. utilised 6 relevant parameters to set up probability density distributions, and three machine learning classifiers (logistic regression, Fisher and Bayesian classifiers) were utilised to develop discriminators that embed simple functions [66]. The proposed method was examined with data from three mines in Australia and Canada to check its feasibility. Back-testing, cross-validation, receiver operating characteristic (ROC) curves and analysis of applied results showed that the logistic regression discriminator surpassed the other classifiers in separating blasts and seismic events for one of the three mines. However, the applicability of this model is confined to small events; larger events still require manual inspection. Dong et al. later proposed an amendment regarding some source parameters for the above method [67]. Logistic and log-logistic models were used to modify the density functions of the blasting occurrence time and the adjacent blasting time difference [67]. The multi-parameter analysis method is better for identifying MS events, but its characteristic parameters are complex, and it is more challenging to implement.
Shang et al. took a dataset of 1600 events from the Yongshaba mine in China and applied a new methodology using ANN with principal component analysis (PCA) as a dimensionality reduction algorithm to distinguish MS events and quarry blasts with 22 seismic source parameters [68]. The data obtained by reducing the dimension through PCA were taken as input for the ANN, and different classifiers such as NB, LR and Fisher classifiers were adopted for comparison. The outcomes with and without PCA were compared with the data of Dong et al. [67]. The proposed PCA-ANN approach outperformed the other classifiers and demonstrated better results than the methodology of Dong et al. [67]. The results show that the introduction of some new parameters boosts the performance of the discriminators.
Zhou et al. constructed a discrimination model based on a backpropagation neural network (BPNN), which utilises the feature vector computed by multiscale permutation entropy (MPE) [69]. The results were compared with other machine learning methods (SVM and NB) and analysed through ROC curves. The proposed method achieved an overall accuracy of 91.67%, higher than the other methods, in classifying rock rupturing and blasting events.
Pu et al. evaluated 10 machine learning models, such as SVM, BPNN, LR and Gaussian process classifier (GPC), for the recognition of microseismic and blasting activity [70]. The strength of each model was evaluated using five different performance indicators, and the overall conclusion reached through a fuzzy comprehensive evaluation model was that, out of the 10 models, LR performed best while GPC performed worst. Figure 3 shows the reliability of the different models.

Conclusions and suggestions
Machine learning and deep learning methods have become widespread across many fields and have surpassed human performance in certain tasks. In mine MS monitoring, they have reduced the engineering burden of manually separating useful rock fracturing signals and have shown clear superiority over traditional and empirical methods. Machine learning and deep learning are promising ways to model the nonlinear relationship between various mine seismic events, even in complex underground conditions. These models are feasible and convenient to use because they do not need any prior understanding of the input/output variables and require less computation time. However, there are still various ways in which existing models can be improved.
Due to the complex underground geological environment, the recording system captures different signals that are similar in nature to genuine MS signals. Commonly applied machine learning models are binary classifiers which only distinguish MS events from non-MS events. A binary classifier can therefore be upgraded to a multi-class model that additionally separates the collected events into further subclasses such as blasting events, vehicle noise, electromagnetic interference and human-made sounds, which would substantially reduce manual inspection in real-time monitoring.
Most of the research on classifying events focuses on studying the signal characteristics and separating useful signals from non-useful ones. Few studies both classify useful events and automatically pick the arrival time, which is the crucial parameter for locating the hypocenter. There is still room for such improvement using machine learning. Extraction of genuine MS signals is the key factor which gives the necessary information about the occurrence of rock burst. However, present research only focuses on separating rock fracturing signals from other signals. The main purpose of collecting MS signals is to find the rock burst occurrence time and the location of the fracturing; a subsequent step could therefore be to create a model that finds the relationship between MS signals and rock burst occurrence.
Most of the machine learning algorithms that have been used in mine microseismicity are shallow machine learning models such as logistic regression, support vector machine, decision tree, random forest etc. Even though these models are convenient and computationally fast, they can only capture simple relations between useful and non-useful events, and manual feature extraction is still necessary before training the model. Hence, deep learning should be applied further, since it extracts the features automatically, and models can be trained with large datasets. Current implementations of deep learning for classifying MS signals use only the convolutional neural network (CNN), which has achieved good results. However, other algorithms such as the recurrent neural network (RNN) and long short-term memory (LSTM) can be employed, which may result in better accuracy. The advantage of the RNN is that it can enhance the early warning system in real time. Owing to its architecture, it can adapt to a highly dynamic seismic source inside a mine. Even if mine tremor properties change over time, a system trained with data from past campaigns should be able to provide efficient monitoring. Combinations of machine learning models such as the generative adversarial network (GAN) and random forest have already been applied to real-time seismic signal discrimination of natural earthquakes. The benefit of such a model is that it can use waveform data directly, while the output error rate is negligible. A similar model can be anticipated for real-time signal detection in mine microseismic monitoring.
Moreover, CNN has shown great success in single-label image classification. A combination of two architectures, such as CNN-RNN, might be quite effective for multi-label image classification. It could be useful for identifying multi-label mine seismic waveform images of different events, because waveform image data can be fed directly into deep learning methods; this remains a topic for future research. Data mining is also worth exploring for classifying microseismic events in future experiments, because it extracts knowledge from extensive data.
In a nutshell, machine learning and deep learning have shown great significance and achievement in microseismic monitoring research, although research using these methods is still in its infancy. Considerable further success can be expected in addressing the issue if more advanced algorithms are involved.

Figure 3. Classification accuracies for ten machine learning models