In a Spark big data cluster environment, data collection is performed by a large number of nodes deployed by the collection unit, and it forms the basis of network data security monitoring. Once collected, the data flows are stored securely for use in security monitoring. The data flows collected in a Spark environment include both structured and unstructured data. The collection nodes must be distributed across a complex network. These decentralized collection nodes are managed centrally through a parallelized node management strategy, which improves both the effectiveness and the efficiency of network data collection.
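The idea of managing scattered collection nodes centrally and in parallel can be illustrated with a minimal sketch. It is not the system described here and does not use Spark itself; it uses a plain Python thread pool as a stand-in for the parallelized management strategy. The names `Record`, `collect_from_node`, and `collect_all`, as well as the sample payloads, are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Record:
    node_id: str
    structured: bool  # True for field-based data, False for raw payloads
    payload: object


def collect_from_node(node_id: str) -> list[Record]:
    """Hypothetical stand-in for one collection node; returns canned data."""
    return [
        Record(node_id, True, {"src": "10.0.0.1", "bytes": 512}),
        Record(node_id, False, b"raw packet bytes"),
    ]


def collect_all(node_ids: list[str], max_workers: int = 4) -> dict[str, list[Record]]:
    """Poll every node in parallel from one place and group results by type."""
    grouped: dict[str, list[Record]] = {"structured": [], "unstructured": []}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the per-node calls concurrently and yields results
        # in the same order as node_ids.
        for records in pool.map(collect_from_node, node_ids):
            for rec in records:
                key = "structured" if rec.structured else "unstructured"
                grouped[key].append(rec)
    return grouped
```

Calling `collect_all(["n1", "n2", "n3"])` returns one dictionary holding the structured and unstructured records from all three nodes, which is the point at which a real deployment would hand the data to secure storage.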