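Abstract: This paper applies cluster analysis to corporate management. Cluster analysis is a technique suited to our research question because it lets the user uncover the natural underlying structure of a complex data set. In doing so, we can distinguish types of firms, their particular board organization, and their quality levels.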
at x_i arises from it using the maximum a posteriori (MAP) principle.
The MAP principle is given by,
where:
The EM algorithm seeks maximum likelihood estimates by iterating between an expectation step and a maximization step.
The mth iteration of the Expectation step is given by,
and the maximum likelihood estimate of the mth iteration (denoted by m) is updated using the conditional probabilities as conditional mixing weights. This leads to the Maximization step as given by,
where:
As in K-means clustering, different initial parameter values may lead to different local maxima of the likelihood function. To obtain either the global maximum or a local maximum close to it, for each k (where k = 2 to 35) we repeat the algorithm 500 times, each time from a different random set of initial parameters. The optimal solution for each k is the cluster solution that attains the highest maximized likelihood.
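The restart strategy described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a univariate Gaussian mixture, since the mixture family is not stated in this excerpt. The names `em_fit` and `best_of_restarts`, the fixed iteration count `n_iter`, and the variance floor `var_floor` are hypothetical choices made for the example. Each point is assigned to the component with the highest posterior probability, which is the MAP rule.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def e_step(data, weights, means, variances):
    """Return posterior membership probabilities and the log-likelihood."""
    posteriors, log_lik = [], 0.0
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint) or 1e-300  # guard against numerical underflow
        posteriors.append([j / total for j in joint])
        log_lik += math.log(total)
    return posteriors, log_lik


def em_fit(data, k, rng, n_iter=100, var_floor=1e-6):
    """Run EM once from a random start (assumed: 1-D Gaussian components)."""
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in data) / n, var_floor)
    weights = [1.0 / k] * k          # equal initial mixing weights
    means = rng.sample(data, k)      # initial means drawn from the data
    variances = [grand_var] * k
    for _ in range(n_iter):
        post, _ = e_step(data, weights, means, variances)
        # M-step: posteriors act as conditional mixing weights.
        for j in range(k):
            nj = sum(p[j] for p in post) or 1e-300
            weights[j] = nj / n
            means[j] = sum(p[j] * x for p, x in zip(post, data)) / nj
            spread = sum(p[j] * (x - means[j]) ** 2 for p, x in zip(post, data)) / nj
            variances[j] = max(spread, var_floor)  # floor avoids collapse to zero
    post, log_lik = e_step(data, weights, means, variances)
    labels = [p.index(max(p)) for p in post]  # MAP assignment
    return {"log_lik": log_lik, "weights": weights, "means": means,
            "variances": variances, "labels": labels}


def best_of_restarts(data, k, n_starts=500, seed=0):
    """Keep the run with the highest log-likelihood over random restarts."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        fit = em_fit(data, k, rng)
        if best is None or fit["log_lik"] > best["log_lik"]:
            best = fit
    return best


if __name__ == "__main__":
    # Two well-separated toy groups centred near 0 and 10.
    toy = [i * 0.1 for i in range(-10, 11)] + [10 + i * 0.1 for i in range(-10, 11)]
    result = best_of_restarts(toy, k=2, n_starts=20)
    print(sorted(result["means"]), result["log_lik"])
```

In this sketch the outer loop over k = 2 to 35 is left out; it would call `best_of_restarts` once per k and store the best fit for each. A fixed iteration count stands in for a convergence test to keep the example short.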
1.1.1.1.2. Cluster Validation
As cluster analysis is an unsupervised technique, cluster validation is a necessary step to evaluate results of cluster analysis in an objective and quantitative manner.
The main validation objectives are:
1. Determination of clustering tendency;
2. Determination of the number of clusters;
3. Evaluation of how well a clustering result represents the natural group structure underlying the data, based on information intrinsic to the data alone (i.e. internal validation) (Handl, Knowles, & Kell, 2005);
4. Evaluation of clustering results by comparison with known class labels that correspond to the natural group structure underlying the data (i.e. external validation) (Handl, Knowles, & Kell, 2005).
Because clustering techniques are known to find clusters even when no underlying cluster structure exists, objective 1 is fundamental to cluster analysis. Objective 2 is imperative because the number of clusters is an essential parameter in two clustering techniques that we employ. To the best of our knowledge, this is the first time a cluster analysis has been performed on the market for directors. Hence, there are no established class labels that correspond to the natural cluster structure, and we carry out the evaluation using internal validation measures only.
Assessment of Clustering Tendency
Unlike the other validation steps, the assessment of clustering tendency is carried out before the data are actually clustered. In our study, we use self-organizing maps (SOMs) to assess the clustering tendency of our data. The SOM Toolbox for Matlab (Vesanto et al., 1999) is employed for SOM training and visualization. An SOM consists of neurons organized on a regular low-dimensional grid. Each neuron is represented by a weight vector of the same dimension as the input vectors. Adjacent neurons are connected by a neighbourhood relation, which dictates the topology, or structure, of the map. The SOM training algorithm moves the weight vectors so that neurons with similar weight vectors are grouped together on the map. The SOM is visualized through the U-matrix. By visualizing distances between neighbouring map units, U-matrix allows the creation of