Which standard classification algorithm has more stable performance for imbalanced network traffic data?

Abstract Most standard classification algorithms are difficult to effectively learn and predict from imbalanced network traffic data, which usually leads to lower classification accuracy. To analyze the influence of imbalanced network traffic data on the performance of standard classification algori...
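
The full abstract in this record describes scoring each classifier with a combined metric, AFG, built from AUC, F-measure and G-mean, and then judging stability with the coefficient of variation (CV) across data sets of varying Imbalance Ratio. The record does not give the AFG formula, so the minimal sketch below assumes an unweighted mean of the three metrics. It also assumes the population standard deviation for CV. The classifier names and scores are hypothetical, not results from the article.

```python
from statistics import mean, pstdev


def afg(auc, f_measure, g_mean):
    # ASSUMPTION: AFG taken as the unweighted mean of the three metrics;
    # the catalog record does not state the exact formula.
    return (auc + f_measure + g_mean) / 3.0


def coefficient_of_variation(scores):
    # CV = standard deviation / mean; a lower CV means more stable scores.
    # ASSUMPTION: population standard deviation (pstdev) is used here.
    m = mean(scores)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(scores) / m


# HYPOTHETICAL (AUC, F-measure, G-mean) triples, one per imbalance ratio.
scores_by_ir = {
    "clf_a": [(0.90, 0.88, 0.89), (0.89, 0.87, 0.88), (0.90, 0.86, 0.88)],
    "clf_b": [(0.95, 0.90, 0.92), (0.80, 0.60, 0.70), (0.70, 0.40, 0.55)],
}

# One AFG value per imbalance ratio, then one CV per classifier.
cv_by_clf = {
    name: coefficient_of_variation([afg(*triple) for triple in triples])
    for name, triples in scores_by_ir.items()
}

# The classifier with the smallest CV is the most stable under this sketch.
most_stable = min(cv_by_clf, key=cv_by_clf.get)
print(most_stable, cv_by_clf)
```

With these made-up scores, the classifier whose AFG barely moves as the imbalance ratio changes gets the smaller CV, which is the sense of "stable" the abstract uses.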

Author:	Zheng, Ming [author]; Ma, Kai; Wang, Fei; Hu, Xiaowen; Yu, Qingying; Guo, Liangmin; Chen, Fulong

Format:	E-Article
Language:	English

Published:	2023

Keywords:	Imbalanced network traffic data; Data augmentation algorithms; Standard classification algorithms; Stable classification performance

Note:	© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Parent work:	Contained in: Soft Computing - Springer-Verlag, 2003, 28(2023), no. 1, 26 Oct., pages 217-234
Parent work:	volume:28 ; year:2023 ; number:1 ; day:26 ; month:10 ; pages:217-234

Links:	Full text

DOI / URN:	10.1007/s00500-023-09331-1

Catalog ID:	SPR054258472

Internal format


LEADER	01000naa a22002652 4500
001	SPR054258472
003	DE-627
005	20240105064714.0
007	cr uuu---uuuuu
008	240105s2023 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1007/s00500-023-09331-1 \|2 doi
035			\|a (DE-627)SPR054258472
035			\|a (SPR)s00500-023-09331-1-e
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Zheng, Ming \|e verfasserin \|0 (orcid)0000-0001-9001-0859 \|4 aut
245	1	0	\|a Which standard classification algorithm has more stable performance for imbalanced network traffic data?
264		1	\|c 2023
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
520			\|a Abstract Most standard classification algorithms are difficult to effectively learn and predict from imbalanced network traffic data, which usually leads to lower classification accuracy. To analyze the influence of imbalanced network traffic data on the performance of standard classification algorithms, the imbalanced data augmentation algorithms are first designed to obtain the imbalanced network traffic data set with gradually varying Imbalance Ratio (IR) and belonging to the same distribution. Then, to obtain more objective classification result and simplify the evaluation process, the evaluation metric AFG is used to evaluate the classification performance of standard classification algorithms based on area under the receiver operating characteristic curve (AUC), F-measure and G-mean. Finally, based on AFG and coefficient of variation (CV), performance stability of standard classification algorithms on imbalanced network traffic data is obtained. Experiments of eight widely used standard classification algorithms on 25 different imbalanced network traffic data demonstrate that the classification performance of GNB, RF and DT is unstable, while BNB, KNN, LR, GBDT, and SVC are relatively stable and not susceptible to imbalanced data. Especially, the KNN has the most stable classification performance. Also, the results are statistically confirmed by Friedman and Nemenyi post hoc statistical tests.
650		4	\|a Imbalanced network traffic data \|7 (dpeaa)DE-He213
650		4	\|a Data augmentation algorithms \|7 (dpeaa)DE-He213
650		4	\|a Standard classification algorithms \|7 (dpeaa)DE-He213
650		4	\|a Stable classification performance \|7 (dpeaa)DE-He213
700	1		\|a Ma, Kai \|4 aut
700	1		\|a Wang, Fei \|4 aut
700	1		\|a Hu, Xiaowen \|4 aut
700	1		\|a Yu, Qingying \|4 aut
700	1		\|a Guo, Liangmin \|4 aut
700	1		\|a Chen, Fulong \|4 aut
773	0	8	\|i Enthalten in \|t Soft Computing \|d Springer-Verlag, 2003 \|g 28(2023), 1 vom: 26. Okt., Seite 217-234 \|w (DE-627)SPR006469531 \|7 nnns
773	1	8	\|g volume:28 \|g year:2023 \|g number:1 \|g day:26 \|g month:10 \|g pages:217-234
856	4	0	\|u https://dx.doi.org/10.1007/s00500-023-09331-1 \|z lizenzpflichtig \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_SPRINGER
951			\|a AR
952			\|d 28 \|j 2023 \|e 1 \|b 26 \|c 10 \|h 217-234

Index fields

author_variant	m z mz k m km f w fw x h xh q y qy l g lg f c fc
matchkey_str	zhengmingmakaiwangfeihuxiaowenyuqingying:2023----:hcsadrcasfctoagrtmamrsalpromneoib
hierarchy_sort_str	2023
publishDate	2023
allfields	10.1007/s00500-023-09331-1 doi (DE-627)SPR054258472 (SPR)s00500-023-09331-1-e DE-627 ger DE-627 rakwb eng Zheng, Ming verfasserin (orcid)0000-0001-9001-0859 aut Which standard classification algorithm has more stable performance for imbalanced network traffic data? 2023 Text txt rdacontent Computermedien c rdamedia Online-Ressource cr rdacarrier © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. Abstract Most standard classification algorithms are difficult to effectively learn and predict from imbalanced network traffic data, which usually leads to lower classification accuracy. To analyze the influence of imbalanced network traffic data on the performance of standard classification algorithms, the imbalanced data augmentation algorithms are first designed to obtain the imbalanced network traffic data set with gradually varying Imbalance Ratio (IR) and belonging to the same distribution. Then, to obtain more objective classification result and simplify the evaluation process, the evaluation metric AFG is used to evaluate the classification performance of standard classification algorithms based on area under the receiver operating characteristic curve (AUC), F-measure and G-mean. Finally, based on AFG and coefficient of variation (CV), performance stability of standard classification algorithms on imbalanced network traffic data is obtained. 
Experiments of eight widely used standard classification algorithms on 25 different imbalanced network traffic data demonstrate that the classification performance of GNB, RF and DT is unstable, while BNB, KNN, LR, GBDT, and SVC are relatively stable and not susceptible to imbalanced data. Especially, the KNN has the most stable classification performance. Also, the results are statistically confirmed by Friedman and Nemenyi post hoc statistical tests. Imbalanced network traffic data (dpeaa)DE-He213 Data augmentation algorithms (dpeaa)DE-He213 Standard classification algorithms (dpeaa)DE-He213 Stable classification performance (dpeaa)DE-He213 Ma, Kai aut Wang, Fei aut Hu, Xiaowen aut Yu, Qingying aut Guo, Liangmin aut Chen, Fulong aut Enthalten in Soft Computing Springer-Verlag, 2003 28(2023), 1 vom: 26. Okt., Seite 217-234 (DE-627)SPR006469531 nnns volume:28 year:2023 number:1 day:26 month:10 pages:217-234 https://dx.doi.org/10.1007/s00500-023-09331-1 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_SPRINGER AR 28 2023 1 26 10 217-234
language	English
source	Enthalten in Soft Computing 28(2023), 1 vom: 26. Okt., Seite 217-234 volume:28 year:2023 number:1 day:26 month:10 pages:217-234
sourceStr	Enthalten in Soft Computing 28(2023), 1 vom: 26. Okt., Seite 217-234 volume:28 year:2023 number:1 day:26 month:10 pages:217-234
format_phy_str_mv	Article
institution	findex.gbv.de
topic_facet	Imbalanced network traffic data Data augmentation algorithms Standard classification algorithms Stable classification performance
isfreeaccess_bool	false
container_title	Soft Computing
authorswithroles_txt_mv	Zheng, Ming @@aut@@ Ma, Kai @@aut@@ Wang, Fei @@aut@@ Hu, Xiaowen @@aut@@ Yu, Qingying @@aut@@ Guo, Liangmin @@aut@@ Chen, Fulong @@aut@@
publishDateDaySort_date	2023-10-26T00:00:00Z
hierarchy_top_id	SPR006469531
id	SPR054258472
language_de	englisch
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000naa a22002652 4500</leader><controlfield tag="001">SPR054258472</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20240105064714.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">240105s2023 xx \|\|\|\|\|o 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s00500-023-09331-1</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)SPR054258472</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(SPR)s00500-023-09331-1-e</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Zheng, Ming</subfield><subfield code="e">verfasserin</subfield><subfield code="0">(orcid)0000-0001-9001-0859</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Which standard classification algorithm has more stable performance for imbalanced network traffic data?</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2023</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield 
code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Most standard classification algorithms are difficult to effectively learn and predict from imbalanced network traffic data, which usually leads to lower classification accuracy. To analyze the influence of imbalanced network traffic data on the performance of standard classification algorithms, the imbalanced data augmentation algorithms are first designed to obtain the imbalanced network traffic data set with gradually varying Imbalance Ratio (IR) and belonging to the same distribution. Then, to obtain more objective classification result and simplify the evaluation process, the evaluation metric AFG is used to evaluate the classification performance of standard classification algorithms based on area under the receiver operating characteristic curve (AUC), F-measure and G-mean. Finally, based on AFG and coefficient of variation (CV), performance stability of standard classification algorithms on imbalanced network traffic data is obtained. Experiments of eight widely used standard classification algorithms on 25 different imbalanced network traffic data demonstrate that the classification performance of GNB, RF and DT is unstable, while BNB, KNN, LR, GBDT, and SVC are relatively stable and not susceptible to imbalanced data. Especially, the KNN has the most stable classification performance. 
Also, the results are statistically confirmed by Friedman and Nemenyi post hoc statistical tests.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Imbalanced network traffic data</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Data augmentation algorithms</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Standard classification algorithms</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Stable classification performance</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ma, Kai</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wang, Fei</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Xiaowen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yu, Qingying</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Guo, Liangmin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Chen, Fulong</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Soft Computing</subfield><subfield code="d">Springer-Verlag, 2003</subfield><subfield code="g">28(2023), 1 vom: 26. 
Okt., Seite 217-234</subfield><subfield code="w">(DE-627)SPR006469531</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:28</subfield><subfield code="g">year:2023</subfield><subfield code="g">number:1</subfield><subfield code="g">day:26</subfield><subfield code="g">month:10</subfield><subfield code="g">pages:217-234</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://dx.doi.org/10.1007/s00500-023-09331-1</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_SPRINGER</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">28</subfield><subfield code="j">2023</subfield><subfield code="e">1</subfield><subfield code="b">26</subfield><subfield code="c">10</subfield><subfield code="h">217-234</subfield></datafield></record></collection>
author	Zheng, Ming
spellingShingle	Zheng, Ming misc Imbalanced network traffic data misc Data augmentation algorithms misc Standard classification algorithms misc Stable classification performance Which standard classification algorithm has more stable performance for imbalanced network traffic data?
authorStr	Zheng, Ming
ppnlink_with_tag_str_mv	@@773@@(DE-627)SPR006469531
format	electronic Article
delete_txt_mv	keep
author_role	aut aut aut aut aut aut aut
collection	springer
remote_str	true
illustrated	Not Illustrated
topic_title	Which standard classification algorithm has more stable performance for imbalanced network traffic data? Imbalanced network traffic data (dpeaa)DE-He213 Data augmentation algorithms (dpeaa)DE-He213 Standard classification algorithms (dpeaa)DE-He213 Stable classification performance (dpeaa)DE-He213
topic	misc Imbalanced network traffic data misc Data augmentation algorithms misc Standard classification algorithms misc Stable classification performance
topic_unstemmed	misc Imbalanced network traffic data misc Data augmentation algorithms misc Standard classification algorithms misc Stable classification performance
topic_browse	misc Imbalanced network traffic data misc Data augmentation algorithms misc Standard classification algorithms misc Stable classification performance
format_facet	Elektronische Aufsätze Aufsätze Elektronische Ressource
format_main_str_mv	Text Zeitschrift/Artikel
carriertype_str_mv	cr
hierarchy_parent_title	Soft Computing
hierarchy_parent_id	SPR006469531
hierarchy_top_title	Soft Computing
isfreeaccess_txt	false
familylinks_str_mv	(DE-627)SPR006469531
title	Which standard classification algorithm has more stable performance for imbalanced network traffic data?
ctrlnum	(DE-627)SPR054258472 (SPR)s00500-023-09331-1-e
title_full	Which standard classification algorithm has more stable performance for imbalanced network traffic data?
author_sort	Zheng, Ming
journal	Soft Computing
journalStr	Soft Computing
lang_code	eng
isOA_bool	false
recordtype	marc
publishDateSort	2023
contenttype_str_mv	txt
container_start_page	217
author_browse	Zheng, Ming Ma, Kai Wang, Fei Hu, Xiaowen Yu, Qingying Guo, Liangmin Chen, Fulong
container_volume	28
format_se	Elektronische Aufsätze
author-letter	Zheng, Ming
doi_str_mv	10.1007/s00500-023-09331-1
normlink	(ORCID)0000-0001-9001-0859
normlink_prefix_str_mv	(orcid)0000-0001-9001-0859
title_sort	which standard classification algorithm has more stable performance for imbalanced network traffic data?
title_auth	Which standard classification algorithm has more stable performance for imbalanced network traffic data?
abstract	Abstract Most standard classification algorithms are difficult to effectively learn and predict from imbalanced network traffic data, which usually leads to lower classification accuracy. To analyze the influence of imbalanced network traffic data on the performance of standard classification algorithms, the imbalanced data augmentation algorithms are first designed to obtain the imbalanced network traffic data set with gradually varying Imbalance Ratio (IR) and belonging to the same distribution. Then, to obtain more objective classification result and simplify the evaluation process, the evaluation metric AFG is used to evaluate the classification performance of standard classification algorithms based on area under the receiver operating characteristic curve (AUC), F-measure and G-mean. Finally, based on AFG and coefficient of variation (CV), performance stability of standard classification algorithms on imbalanced network traffic data is obtained. Experiments of eight widely used standard classification algorithms on 25 different imbalanced network traffic data demonstrate that the classification performance of GNB, RF and DT is unstable, while BNB, KNN, LR, GBDT, and SVC are relatively stable and not susceptible to imbalanced data. Especially, the KNN has the most stable classification performance. Also, the results are statistically confirmed by Friedman and Nemenyi post hoc statistical tests. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
collection_details	GBV_USEFLAG_A SYSFLAG_A GBV_SPRINGER
container_issue	1
title_short	Which standard classification algorithm has more stable performance for imbalanced network traffic data?
url	https://dx.doi.org/10.1007/s00500-023-09331-1
remote_bool	true
author2	Ma, Kai Wang, Fei Hu, Xiaowen Yu, Qingying Guo, Liangmin Chen, Fulong
author2Str	Ma, Kai Wang, Fei Hu, Xiaowen Yu, Qingying Guo, Liangmin Chen, Fulong
ppnlink	SPR006469531
mediatype_str_mv	c
isOA_txt	false
hochschulschrift_bool	false
doi_str	10.1007/s00500-023-09331-1
up_date	2024-07-04T00:43:53.837Z
_version_	1803607168870514688
fullrecord_marcxml	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000naa a22002652 4500</leader><controlfield tag="001">SPR054258472</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20240105064714.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">240105s2023 xx \|\|\|\|\|o 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s00500-023-09331-1</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)SPR054258472</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(SPR)s00500-023-09331-1-e</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Zheng, Ming</subfield><subfield code="e">verfasserin</subfield><subfield code="0">(orcid)0000-0001-9001-0859</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Which standard classification algorithm has more stable performance for imbalanced network traffic data?</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2023</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield 
code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Most standard classification algorithms are difficult to effectively learn and predict from imbalanced network traffic data, which usually leads to lower classification accuracy. To analyze the influence of imbalanced network traffic data on the performance of standard classification algorithms, the imbalanced data augmentation algorithms are first designed to obtain the imbalanced network traffic data set with gradually varying Imbalance Ratio (IR) and belonging to the same distribution. Then, to obtain more objective classification result and simplify the evaluation process, the evaluation metric AFG is used to evaluate the classification performance of standard classification algorithms based on area under the receiver operating characteristic curve (AUC), F-measure and G-mean. Finally, based on AFG and coefficient of variation (CV), performance stability of standard classification algorithms on imbalanced network traffic data is obtained. Experiments of eight widely used standard classification algorithms on 25 different imbalanced network traffic data demonstrate that the classification performance of GNB, RF and DT is unstable, while BNB, KNN, LR, GBDT, and SVC are relatively stable and not susceptible to imbalanced data. Especially, the KNN has the most stable classification performance. 
Also, the results are statistically confirmed by Friedman and Nemenyi post hoc statistical tests.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Imbalanced network traffic data</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Data augmentation algorithms</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Standard classification algorithms</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Stable classification performance</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ma, Kai</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wang, Fei</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Xiaowen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yu, Qingying</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Guo, Liangmin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Chen, Fulong</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Soft Computing</subfield><subfield code="d">Springer-Verlag, 2003</subfield><subfield code="g">28(2023), 1 vom: 26. 
Okt., Seite 217-234</subfield><subfield code="w">(DE-627)SPR006469531</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:28</subfield><subfield code="g">year:2023</subfield><subfield code="g">number:1</subfield><subfield code="g">day:26</subfield><subfield code="g">month:10</subfield><subfield code="g">pages:217-234</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://dx.doi.org/10.1007/s00500-023-09331-1</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_SPRINGER</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">28</subfield><subfield code="j">2023</subfield><subfield code="e">1</subfield><subfield code="b">26</subfield><subfield code="c">10</subfield><subfield code="h">217-234</subfield></datafield></record></collection>
score	7.399643
