
Computer Science: Decision Tree Pruning Techniques

Author: www.51lunwen.org · Type: Assignment · Published: 2013-09-17 · Editor: yangcheng

Word count: 1444 · Paper ID: org201309131704096475 · Language: English · Region: China

Keywords: computer science, decision tree pruning techniques, decision

Abstract: Decision tree construction uses an attribute selection measure to choose the attribute that best partitions the tuples into distinct classes; this measure determines the topology among the attributes. The key step in constructing a decision tree is attribute splitting: at a given node, different branches are created according to the distinct values of a chosen attribute, with the goal of making each resulting subset as "pure" as possible.

In general, or at least with high probability, the smaller the tree, the stronger its predictive power. The key to constructing the smallest possible decision tree is the attribute selection measure, so a heuristic strategy must be used to choose good logical tests or attributes. The right choice of attribute depends on an impurity measure over the example subsets. Impurity measures include information gain, gain ratio, the Gini index, distance measure, J-measure, G statistic, χ² statistic, weight of evidence, minimum description length (MDL), the orthogonality measure, relevance, and Relief. Different measures behave differently, especially on multi-valued attributes. [7]
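The first two impurity measures listed above can be computed directly. The following is a minimal sketch of information gain and the Gini index for a candidate split; the node contents and the binary split are made-up illustrations, not data from the text:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, subsets):
    """Entropy of the parent node minus the weighted entropy of the
    subsets produced by a candidate split."""
    n = len(parent)
    return entropy(parent) - sum(len(s) / n * entropy(s) for s in subsets)

# Hypothetical node of 10 tuples split on a made-up binary attribute
parent = ["yes"] * 6 + ["no"] * 4
subsets = [["yes"] * 5 + ["no"], ["yes"] + ["no"] * 3]
print(round(information_gain(parent, subsets), 3))  # 0.256
print(round(gini(parent), 2))                       # 0.48
```

The attribute (or split) with the highest gain, or the lowest resulting impurity, would be chosen at each node.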


When a decision tree is built, many branches reflect anomalies in the training data caused by noise and outliers. To handle this overfitting of the data, tree pruning methods cut away the least reliable branches, reducing the size of the tree. This yields faster classification and improves the tree's ability to classify correctly on data independent of the training set. Two commonly used pruning approaches are described below: pre-pruning and post-pruning.


Pre-pruning "prunes" the tree by halting its construction early, according to certain rules, before the training set is completely and correctly classified. Once construction stops at a node, the node becomes a leaf, which may hold the most frequent class among the subset of samples at that node, or the probability distribution of those samples. During construction, measures such as the χ² statistic or information gain are used to assess the quality of each split; if partitioning the samples at a node would yield a value below a predetermined threshold, further division of that subset stops. This approach is relatively simple and efficient, and is suited to large-scale problems. [8]
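Pre-pruning as described amounts to a stopping test at each node. The sketch below, with assumed rule names and threshold values (`MIN_GAIN`, `MIN_SAMPLES` are illustrative, not from the text), shows how a node is turned into a leaf holding its most frequent class:

```python
from collections import Counter

# Assumed stopping rules; both names and values are illustrative.
MIN_GAIN = 0.01     # the "predetermined threshold" from the text
MIN_SAMPLES = 5     # an additional common stopping rule

def grow_or_stop(labels, best_gain):
    """Decide whether a node keeps splitting or is pre-pruned into a
    leaf. `labels` are the class labels reaching the node; `best_gain`
    is the score of the best candidate split (e.g. information gain)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(labels) < MIN_SAMPLES or best_gain < MIN_GAIN:
        return ("leaf", majority)      # stop early: node becomes a leaf
    return ("split", majority)         # keep growing this branch

print(grow_or_stop(["yes", "yes", "no"], 0.40))        # ('leaf', 'yes')
print(grow_or_stop(["yes"] * 6 + ["no"] * 4, 0.002))   # ('leaf', 'yes')
print(grow_or_stop(["yes"] * 6 + ["no"] * 4, 0.300))   # ('split', 'yes')
```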


Post-pruning removes unreliable branches from a "fully grown" tree: by deleting a node's branches, the node is cut back to a leaf. Using the training samples or a separate pruning set, it tests each decision subtree's prediction accuracy on the target variable and computes the corresponding error rate. The user can set a maximum allowable error rate in advance: when pruning reaches a depth at which the computed error rate exceeds this maximum, pruning stops; otherwise it continues. The result is the decision tree with the smallest expected error rate. [8]
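One common realisation of the post-pruning idea above is reduced-error pruning against a held-out pruning set. The sketch below uses a hypothetical tuple-based tree representation (not from the text) and, working bottom-up, replaces a subtree with its majority-class leaf whenever the leaf does no worse on the pruning set:

```python
from collections import Counter

# A tree is either ("leaf", cls) or ("node", attr, {value: subtree}).

def classify(tree, x):
    """Route example x (a dict of attribute values) down to a leaf."""
    while tree[0] == "node":
        _, attr, branches = tree
        tree = branches[x[attr]]
    return tree[1]

def errors(tree, data):
    """Count misclassified (x, y) pairs in the pruning set."""
    return sum(1 for x, y in data if classify(tree, x) != y)

def prune(tree, data):
    """Reduced-error pruning: collapse a subtree into a leaf whenever
    the leaf's error on the pruning set is no higher."""
    if tree[0] == "leaf" or not data:
        return tree
    _, attr, branches = tree
    pruned = ("node", attr, {
        v: prune(sub, [(x, y) for x, y in data if x[attr] == v])
        for v, sub in branches.items()
    })
    majority = Counter(y for _, y in data).most_common(1)[0][0]
    leaf = ("leaf", majority)
    return leaf if errors(leaf, data) <= errors(pruned, data) else pruned

# Toy example: a branch the pruning set does not support gets cut
tree = ("node", "a", {0: ("leaf", "no"), 1: ("leaf", "yes")})
pruning_set = [({"a": 0}, "yes"), ({"a": 1}, "yes"), ({"a": 0}, "yes")]
print(prune(tree, pruning_set))  # ('leaf', 'yes')
```

A real implementation would compare against the user's maximum allowable error rate rather than a strict "no worse" test, but the bottom-up structure is the same.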

Good scalability in a decision tree algorithm means the algorithm can handle large amounts of data, or can accelerate the data mining process, quickly and accurately discovering the main classification rules hidden in a large data set. Scalability work includes both strengthening and extending the decision tree algorithm. The main reasons for studying the scalability of decision tree algorithms are the foll…

