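English translation of: "Preparing modeling data means taking raw data that has not yet been integrated and consolidating it into the data sets that subsequent modeling requires; the variables in these data sets are the independent variables used later in modeling."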

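Preparing modeling data means taking raw data that has not yet been integrated and consolidating it into the various data sets that subsequent modeling requires; the variables in these data sets are the independent variables used later in modeling. The preparation process may be repeated as the model's needs change, and its steps follow no fixed order. The main tasks are selecting the variables and the amount of data, standardizing the relevant variables as the model requires, and handling missing values and outliers. Data preparation may be the most time-consuming step in data mining and can account for more than half of the work in the whole process. Its main workflow is: selecting data, cleaning data, constructing data, integrating data, formatting data, and preparing the data sets.

In the model-building stage, the first task is to select the variables that take part in modeling. This choice is critical: if too many variables are selected and many of them are irrelevant, the model's performance suffers; if too few are selected, they cannot fully capture the attributes that affect the dependent variable, which also undermines the model's validity. The second task is to analyze the correlation between the independent variables and the dependent variable, as well as the correlations among the independent variables themselves. An independent variable that shows no correlation with the dependent variable is irrelevant and can be removed outright. If independent variables are strongly correlated with one another, modeling them directly tends to cause overfitting; in that case, either apply principal component analysis to reduce dimensionality and generate new composite indicators that are only weakly correlated, or drop the similar variables. The final task is to choose one or more data mining techniques and adjust their parameters repeatedly during modeling so that the model performs as well as possible. Usually several data mining algorithms are used to model the data, and the results trained by each are then compared for accuracy and effectiveness in order to pick the most suitable one.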
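As a rough illustration of the cleaning tasks named above (handling missing values, handling outliers, and standardizing a variable), here is a minimal Python sketch using only the standard library. The median imputation, the 1.5 x IQR clipping rule, the z-score scaling, and the `raw_income` sample are all assumptions chosen for the example; the passage does not prescribe any particular method.

```python
import statistics


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [min(max(v, low), high) for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # a constant column has no spread to scale
    return [(v - mean) / std for v in values]


def prepare_column(values):
    """Apply the three steps in one fixed order (the order is an assumption)."""
    return standardize(clip_outliers(fill_missing(values)))


# Hypothetical column with one missing entry and one extreme value.
raw_income = [52.0, 48.0, None, 50.0, 51.0, 49.0, 400.0, 47.0]
prepared = prepare_column(raw_income)
```

Because the passage notes that preparation has no fixed order and may be repeated, the three helpers are kept separate so they can be recombined or rerun as a model's requirements change.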
Result (English) 1:
The preparation of modeling data refers to the process of forming the various data sets required for subsequent modeling, through integration processing, from original data that has not been integrated. The variables of these data sets are the independent variables required for subsequent modeling. The data preparation process may be repeated according to the needs of the model, and it follows no prescribed order. The main work of data preparation includes selecting the data variables and data volume, standardizing the relevant variables according to model requirements, and handling missing values and outliers. The preparation of modeling data may be the most time-consuming step in the data mining process, and may even account for more than half of the workload of the entire process. The main workflow of data preparation is: selecting data, cleaning data, constructing data, consolidating data, formatting data, and preparing data sets.

In establishing the data model, the first step is to select the variables involved in the modeling. The selection of variables is very important. If too many variables are selected and many of them are irrelevant, the effect of the model will be weakened. Conversely, if there are too few variables, they will not comprehensively reflect the various attributes that affect the dependent variable, which also affects the validity of the model. The second step is to analyze the correlation between the independent variables and the dependent variable, as well as the correlation among the independent variables, to ensure that the independent variables and the dependent variable are correlated; a variable with no correlation is irrelevant and can be eliminated directly. If there is a strong correlation among the independent variables, direct modeling is prone to overfitting. Therefore, when independent variables are strongly correlated, principal component analysis can be used to reduce the dimensionality and generate new comprehensive indices with little correlation; alternatively, similar variables can be eliminated. Finally, one or more data mining techniques are selected, and the parameters are continuously adjusted during modeling in order to optimize the model. It is usually necessary to model the data with a variety of data mining algorithms, and finally to compare the accuracy and effectiveness of the results trained by each algorithm, so as to select the most suitable one.
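The two correlation checks described in this passage can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it uses the Pearson coefficient, the thresholds `min_target_corr=0.3` and `max_pair_corr=0.95` are arbitrary, the toy variables are hypothetical, and it handles strongly correlated variables by dropping the similar ones rather than by principal component analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def screen_variables(features, target, min_target_corr=0.3, max_pair_corr=0.95):
    """Return the names of the variables kept after both correlation checks."""
    # Check 1: keep variables whose |correlation| with the target is large enough.
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    candidates = [name for name in features if relevance[name] >= min_target_corr]
    # Check 2: visit candidates from most to least relevant and skip any that is
    # strongly correlated with a variable that has already been kept.
    kept = []
    for name in sorted(candidates, key=relevance.get, reverse=True):
        if all(abs(pearson(features[name], features[k])) < max_pair_corr for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: "height_in" repeats "height_cm" in other units,
# and "noise" is unrelated to the target.
features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0, 200.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8, 78.7],
    "shoe_size": [36.0, 39.0, 38.0, 43.0, 42.0, 46.0],
    "noise": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0],
}
target = [50.0, 58.0, 66.0, 77.0, 83.0, 95.0]
selected = screen_variables(features, target)
```

In this toy data the unrelated variable fails the first check and only one of the two height columns survives the second; real thresholds would need to be chosen for the data at hand.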
Result (English) 2:
The preparation of modeling data refers to the process of forming, through integration processing, the various data sets required for subsequent modeling from raw data that have not been consolidated; the variables of these data sets are the independent variables required for subsequent modeling. The process of data preparation can be repeated as the model requires and is not sequential. The main task of data preparation involves selecting data variables and amounts of data, standardizing the related variables according to model requirements, and handling both missing values and outliers. The preparation of modeling data can be one of the most time-consuming steps in the data mining process, and may even account for more than half of the overall workload. The main workflow of data preparation is selecting data, cleaning data, constructing data, consolidating data, formatting data, and preparing data sets.

In the model establishment stage, the first task is to choose the variables involved in data modeling. Variable selection is critical: if too many variables are selected and many of them are unrelated, the effect of the model is weakened; on the contrary, if there are too few variables, they cannot fully reflect the various attributes that affect the dependent variable, which likewise harms the effectiveness of the model. The second task is to analyze the correlation between the independent variables and the dependent variable, as well as the correlation among the independent variables, to make sure that the independent variables and the dependent variable are related; a variable with no correlation is unrelated and can be eliminated directly. If there is a strong correlation among the independent variables, direct modeling is prone to overfitting, so for strongly correlated independent variables, principal component analysis can be used to reduce the dimensionality, producing new comprehensive indices with little correlation, or similar variables can be eliminated. Finally, one or more data mining techniques are selected, and the parameters are constantly adjusted in the process of data modeling, with the aim of bringing the model to its best performance. It is often necessary to model the data using a variety of data mining algorithms, and finally to compare the results trained by the various algorithms for accuracy and effectiveness, so as to select the most suitable algorithm.
Result (English) 3:
The preparation of modeling data refers to the process of forming various data sets for subsequent modeling by integrating original data that has not yet been integrated. The variables of these data sets are the independent variables for subsequent modeling. The process of data preparation may be repeated due to the needs of the model, and there is no fixed order. The main work of data preparation includes the selection of data variables and data volume, the standardized processing of related variables according to the model requirements, and the processing of missing values and outliers. The preparation of modeling data may be the most time-consuming step in the process of data mining, and may even account for more than half of the workload of the whole data mining process. The main workflow of data preparation is: selecting data, cleaning data, constructing data, integrating data, formatting data, and preparing data sets.

In the data model building stage, the first step is to select the variables that participate in the data modeling. The selection of variables is critical. If too many variables are selected and many of them are irrelevant, the effect of the model will be weakened. On the contrary, if too few variables are selected, they cannot fully reflect the attributes that affect the dependent variable, which also affects the effectiveness of the model. Secondly, it is necessary to analyze the correlation between the independent variables and the dependent variable, as well as the correlation among the independent variables, so as to ensure that the independent variables and the dependent variable are related. A variable with no correlation is irrelevant and can be directly eliminated. If there is a strong correlation among the independent variables, direct modeling easily leads to overfitting. Therefore, if there is a strong correlation among the independent variables, we can use principal component analysis to reduce the dimensionality and generate new comprehensive indicators with little correlation, or eliminate similar variables. Finally, one or more data mining technologies are selected, and the parameters are adjusted continuously in the process of data modeling so as to optimize the model. Usually, we need to use a variety of data mining algorithms to model the data, and finally compare the accuracy and effectiveness of the training results of the various algorithms, so as to select the most suitable algorithm.
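The final step, training several algorithms and keeping the one that performs best, might look like the following minimal Python sketch. Everything in it is an assumption made for illustration: the three candidate models (a mean baseline, a least-squares line, and a 1-nearest-neighbour rule), the use of mean squared error on a held-out validation set as the comparison criterion, and the toy data. Parameter tuning is omitted for brevity.

```python
def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest(xs, ys):
    """1-nearest-neighbour: copy the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_model(candidates, train, valid):
    """Fit each candidate on train; return (best_name, scores) by validation MSE."""
    scores = {name: mse(fit(*train), *valid) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical data that roughly follows y = 2x + 1.
train = ([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.9, 5.2, 6.8, 9.1, 11.0])
valid = ([0.5, 2.5, 4.5], [2.0, 6.0, 10.0])
candidates = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
best_name, scores = select_model(candidates, train, valid)
```

Scoring on data the models were not trained on keeps the comparison from simply rewarding whichever algorithm memorizes the training set best.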
