Constrained Clustering: General Pairwise and Cardinality Constraints
In this work, we study constrained clustering, where constraints are utilized to guide the clustering process. In existing works, two categories of constraints have been widely explored, namely pairwise and cardinality constraints. Pairwise constraints enforce the cluster labels of two instances to...
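The two constraint categories named in the abstract can be made concrete with a small check on a candidate label assignment. The sketch below is only an illustration and is not the authors' ADMM-based method: the helper names `violated_pairwise` and `cardinality_gap`, the toy labels, and the target sizes are all hypothetical.

```python
from collections import Counter

# Illustrative helpers (hypothetical names, not taken from the paper).

def violated_pairwise(labels, must_link, cannot_link):
    """Return the must-link and cannot-link pairs that `labels` violates."""
    # A must-link pair is violated when the two instances get different labels.
    bad_ml = [(i, j) for i, j in must_link if labels[i] != labels[j]]
    # A cannot-link pair is violated when the two instances share a label.
    bad_cl = [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return bad_ml, bad_cl

def cardinality_gap(labels, target_sizes):
    """Per-cluster difference between the actual and the target cluster size."""
    counts = Counter(labels)
    return {k: counts.get(k, 0) - size for k, size in target_sizes.items()}

# Toy data: six instances assigned to two clusters.
labels = [0, 0, 1, 1, 1, 0]
must_link = [(0, 1), (2, 5)]
cannot_link = [(0, 2), (3, 4)]
target_sizes = {0: 3, 1: 3}

bad_ml, bad_cl = violated_pairwise(labels, must_link, cannot_link)
gap = cardinality_gap(labels, target_sizes)
print(bad_ml, bad_cl, gap)
```

A constrained clustering model such as the one described here would search for an assignment in which both violation lists are empty and every gap is zero (or small, if the cardinality constraints are treated as soft).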
Detailed description
Authors: | Adel Bibi [author] Ali Alqahtani [author] Bernard Ghanem [author] |
---|---|
Format: | E-article |
Language: | English |
Published: | 2023 |
Parent work: | In: IEEE Access - IEEE, 2014, 11(2023), pages 5824-5836 |
Parent work: | volume:11 ; year:2023 ; pages:5824-5836 |
DOI / URN: | 10.1109/ACCESS.2023.3236608 |
Catalog ID: | DOAJ086270028 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | DOAJ086270028 | ||
003 | DE-627 | ||
005 | 20230502204101.0 | ||
007 | cr uuu---uuuuu | ||
008 | 230311s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1109/ACCESS.2023.3236608 |2 doi | |
035 | |a (DE-627)DOAJ086270028 | ||
035 | |a (DE-599)DOAJ97bf3ee149864caeb81906d8dbc4193f | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
050 | 0 | |a TK1-9971 | |
100 | 0 | |a Adel Bibi |e verfasserin |4 aut | |
245 | 1 | 0 | |a Constrained Clustering: General Pairwise and Cardinality Constraints |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
520 | |a In this work, we study constrained clustering, where constraints are utilized to guide the clustering process. In existing works, two categories of constraints have been widely explored, namely pairwise and cardinality constraints. Pairwise constraints enforce the cluster labels of two instances to be the same (must-link constraints) or different (cannot-link constraints). Cardinality constraints encourage cluster sizes to satisfy a user-specified distribution. However, most existing constrained clustering models can only utilize one category of constraints at a time. In this paper, we incorporate the above two categories into a unified clustering model starting with the integer program formulation of the standard K-means. As these two categories provide useful information at different levels, utilizing both of them is expected to allow for better clustering performance. However, the optimization is difficult due to the binary and quadratic constraints in the proposed unified formulation. To alleviate this difficulty, we utilize two techniques: the first equivalently replaces the binary constraints by the intersection of two continuous constraints; the second transforms the quadratic constraints into bi-linear constraints by introducing extra variables. Then we derive an equivalent continuous reformulation with simple constraints, which can be efficiently solved by the Alternating Direction Method of Multipliers (ADMM) algorithm. Extensive experiments on both synthetic and real data demonstrate: 1) when utilizing a single category of constraint, the proposed model is superior to or competitive with state-of-the-art constrained clustering models, and 2) when utilizing both categories of constraints jointly, the proposed model shows better performance than the case of the single category. The experimental results show that the proposed method exploits the constraints to achieve perfect clustering performance, with clustering improved by $2$-$5\%$ in classical clustering metrics, e.g., Adjusted Rand Index (ARI), Mirkin’s Index (MI), and Huber’s Index (HI), outperforming all compared-against methods across the board. Moreover, we show that our method is robust to initialization. | ||
650 | 4 | |a Constrained clustering | |
650 | 4 | |a K-means | |
650 | 4 | |a pairwise | |
650 | 4 | |a cardinality constraints | |
653 | 0 | |a Electrical engineering. Electronics. Nuclear engineering | |
700 | 0 | |a Ali Alqahtani |e verfasserin |4 aut | |
700 | 0 | |a Bernard Ghanem |e verfasserin |4 aut | |
773 | 0 | 8 | |i In |t IEEE Access |d IEEE, 2014 |g 11(2023), Seite 5824-5836 |w (DE-627)728440385 |w (DE-600)2687964-5 |x 21693536 |7 nnns |
773 | 1 | 8 | |g volume:11 |g year:2023 |g pages:5824-5836 |
856 | 4 | 0 | |u https://doi.org/10.1109/ACCESS.2023.3236608 |z kostenfrei |
856 | 4 | 0 | |u https://doaj.org/article/97bf3ee149864caeb81906d8dbc4193f |z kostenfrei |
856 | 4 | 0 | |u https://ieeexplore.ieee.org/document/10015761/ |z kostenfrei |
856 | 4 | 2 | |u https://doaj.org/toc/2169-3536 |y Journal toc |z kostenfrei |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_DOAJ | ||
912 | |a SSG-OLC-PHA | ||
912 | |a GBV_ILN_11 | ||
912 | |a GBV_ILN_20 | ||
912 | |a GBV_ILN_22 | ||
912 | |a GBV_ILN_23 | ||
912 | |a GBV_ILN_24 | ||
912 | |a GBV_ILN_31 | ||
912 | |a GBV_ILN_39 | ||
912 | |a GBV_ILN_40 | ||
912 | |a GBV_ILN_60 | ||
912 | |a GBV_ILN_62 | ||
912 | |a GBV_ILN_63 | ||
912 | |a GBV_ILN_65 | ||
912 | |a GBV_ILN_69 | ||
912 | |a GBV_ILN_70 | ||
912 | |a GBV_ILN_73 | ||
912 | |a GBV_ILN_95 | ||
912 | |a GBV_ILN_105 | ||
912 | |a GBV_ILN_110 | ||
912 | |a GBV_ILN_151 | ||
912 | |a GBV_ILN_161 | ||
912 | |a GBV_ILN_170 | ||
912 | |a GBV_ILN_213 | ||
912 | |a GBV_ILN_230 | ||
912 | |a GBV_ILN_285 | ||
912 | |a GBV_ILN_293 | ||
912 | |a GBV_ILN_370 | ||
912 | |a GBV_ILN_602 | ||
912 | |a GBV_ILN_2014 | ||
912 | |a GBV_ILN_4012 | ||
912 | |a GBV_ILN_4037 | ||
912 | |a GBV_ILN_4112 | ||
912 | |a GBV_ILN_4125 | ||
912 | |a GBV_ILN_4126 | ||
912 | |a GBV_ILN_4249 | ||
912 | |a GBV_ILN_4305 | ||
912 | |a GBV_ILN_4306 | ||
912 | |a GBV_ILN_4307 | ||
912 | |a GBV_ILN_4313 | ||
912 | |a GBV_ILN_4322 | ||
912 | |a GBV_ILN_4323 | ||
912 | |a GBV_ILN_4324 | ||
912 | |a GBV_ILN_4325 | ||
912 | |a GBV_ILN_4335 | ||
912 | |a GBV_ILN_4338 | ||
912 | |a GBV_ILN_4367 | ||
912 | |a GBV_ILN_4700 | ||
951 | |a AR | ||
952 | |d 11 |j 2023 |h 5824-5836 |
author_variant |
a b ab a a aa b g bg |
---|---|
matchkey_str |
article:21693536:2023----::osriecutrngnrlariencri |
hierarchy_sort_str |
2023 |
callnumber-subject-code |
TK |
publishDate |
2023 |
language |
English |
source |
In IEEE Access 11(2023), Seite 5824-5836 volume:11 year:2023 pages:5824-5836 |
sourceStr |
In IEEE Access 11(2023), Seite 5824-5836 volume:11 year:2023 pages:5824-5836 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Constrained clustering K-means pairwise cardinality constraints Electrical engineering. Electronics. Nuclear engineering |
isfreeaccess_bool |
true |
container_title |
IEEE Access |
authorswithroles_txt_mv |
Adel Bibi @@aut@@ Ali Alqahtani @@aut@@ Bernard Ghanem @@aut@@ |
publishDateDaySort_date |
2023-01-01T00:00:00Z |
hierarchy_top_id |
728440385 |
id |
DOAJ086270028 |
language_de |
englisch |
|
callnumber-first |
T - Technology |
author |
Adel Bibi |
spellingShingle |
Adel Bibi misc TK1-9971 misc Constrained clustering misc K-means misc pairwise misc cardinality constraints misc Electrical engineering. Electronics. Nuclear engineering Constrained Clustering: General Pairwise and Cardinality Constraints |
authorStr |
Adel Bibi |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)728440385 |
format |
electronic Article |
delete_txt_mv |
keep |
author_role |
aut aut aut |
collection |
DOAJ |
remote_str |
true |
callnumber-label |
TK1-9971 |
illustrated |
Not Illustrated |
issn |
21693536 |
topic_title |
TK1-9971 Constrained Clustering: General Pairwise and Cardinality Constraints Constrained clustering K-means pairwise cardinality constraints |
topic |
misc TK1-9971 misc Constrained clustering misc K-means misc pairwise misc cardinality constraints misc Electrical engineering. Electronics. Nuclear engineering |
format_facet |
Elektronische Aufsätze Aufsätze Elektronische Ressource |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
cr |
hierarchy_parent_title |
IEEE Access |
hierarchy_parent_id |
728440385 |
hierarchy_top_title |
IEEE Access |
isfreeaccess_txt |
true |
familylinks_str_mv |
(DE-627)728440385 (DE-600)2687964-5 |
title |
Constrained Clustering: General Pairwise and Cardinality Constraints |
ctrlnum |
(DE-627)DOAJ086270028 (DE-599)DOAJ97bf3ee149864caeb81906d8dbc4193f |
title_full |
Constrained Clustering: General Pairwise and Cardinality Constraints |
author_sort |
Adel Bibi |
journal |
IEEE Access |
journalStr |
IEEE Access |
callnumber-first-code |
T |
lang_code |
eng |
isOA_bool |
true |
recordtype |
marc |
publishDateSort |
2023 |
contenttype_str_mv |
txt |
container_start_page |
5824 |
author_browse |
Adel Bibi Ali Alqahtani Bernard Ghanem |
container_volume |
11 |
class |
TK1-9971 |
format_se |
Elektronische Aufsätze |
author-letter |
Adel Bibi |
doi_str_mv |
10.1109/ACCESS.2023.3236608 |
author2-role |
verfasserin |
title_sort |
constrained clustering: general pairwise and cardinality constraints |
callnumber |
TK1-9971 |
title_auth |
Constrained Clustering: General Pairwise and Cardinality Constraints |
abstract |
In this work, we study constrained clustering, where constraints are utilized to guide the clustering process. In existing works, two categories of constraints have been widely explored, namely pairwise and cardinality constraints. Pairwise constraints enforce the cluster labels of two instances to be the same (must-link constraints) or different (cannot-link constraints). Cardinality constraints encourage cluster sizes to satisfy a user-specified distribution. However, most existing constrained clustering models can only utilize one category of constraints at a time. In this paper, we enforce the above two categories into a unified clustering model starting with the integer program formulation of the standard K-means. As these two categories provide useful information at different levels, utilizing both of them is expected to allow for better clustering performance. However, the optimization is difficult due to the binary and quadratic constraints in the proposed unified formulation. To alleviate this difficulty, we utilize two techniques: equivalently replacing the binary constraints by the intersection of two continuous constraints; the other is transforming the quadratic constraints into bi-linear constraints by introducing extra variables. Then we derive an equivalent continuous reformulation with simple constraints, which can be efficiently solved by Alternating Direction Method of Multipliers (ADMM) algorithm. Extensive experiments on both synthetic and real data demonstrate: 1) when utilizing a single category of constraint, the proposed model is superior to or competitive with state-of-the-art constrained clustering models, and 2) when utilizing both categories of constraints jointly, the proposed model shows better performance than the case of the single category. 
The experimental results show that the proposed method exploits the constraints to achieve perfect clustering performance, with improvements of 2-5% in classical clustering metrics, e.g., Adjusted Rand Index (ARI), Mirkin’s Index (MI), and Hubert’s Index (HI), outperforming all compared methods across the board. Moreover, we show that our method is robust to initialization. |
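As a reading aid for the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that the "two continuous constraints" are the unit box and a sphere centered at 1/2 (a common way to express binary vectors; the record itself does not say which sets are used), and that cluster labels are stored as a one-hot assignment matrix. All function names are hypothetical.

```python
import numpy as np

def in_box_and_sphere(x, tol=1e-9):
    """Assumed reformulation: x is binary iff x lies in [0,1]^n AND
    on the sphere ||x - 1/2||^2 = n/4 (both sets are continuous)."""
    x = np.asarray(x, dtype=float)
    in_box = np.all((x >= -tol) & (x <= 1 + tol))
    on_sphere = abs(np.sum((x - 0.5) ** 2) - x.size / 4.0) <= tol
    return bool(in_box and on_sphere)

def one_hot(labels, k):
    """Build an n-by-k assignment matrix from integer cluster labels."""
    X = np.zeros((len(labels), k))
    X[np.arange(len(labels)), labels] = 1.0
    return X

def constraint_report(X, must_link, cannot_link, sizes):
    """Check pairwise and cardinality constraints on a one-hot matrix X."""
    G = X @ X.T  # G[i, j] == 1 exactly when i and j share a cluster
    return {
        "must_link_ok": all(G[i, j] == 1 for i, j in must_link),
        "cannot_link_ok": all(G[i, j] == 0 for i, j in cannot_link),
        # Column sums of X are the cluster sizes.
        "cardinality_ok": bool(
            np.array_equal(X.sum(axis=0), np.asarray(sizes, dtype=float))
        ),
    }

X = one_hot([0, 0, 1, 1, 2], k=3)
report = constraint_report(
    X, must_link=[(0, 1)], cannot_link=[(1, 2)], sizes=[2, 2, 1]
)
```

The product `X @ X.T` is quadratic in the assignment variables, which illustrates why pairwise constraints are the "quadratic constraints" the abstract mentions; the sketch only verifies a given labeling and does not perform the ADMM optimization.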
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_DOAJ SSG-OLC-PHA GBV_ILN_11 GBV_ILN_20 GBV_ILN_22 GBV_ILN_23 GBV_ILN_24 GBV_ILN_31 GBV_ILN_39 GBV_ILN_40 GBV_ILN_60 GBV_ILN_62 GBV_ILN_63 GBV_ILN_65 GBV_ILN_69 GBV_ILN_70 GBV_ILN_73 GBV_ILN_95 GBV_ILN_105 GBV_ILN_110 GBV_ILN_151 GBV_ILN_161 GBV_ILN_170 GBV_ILN_213 GBV_ILN_230 GBV_ILN_285 GBV_ILN_293 GBV_ILN_370 GBV_ILN_602 GBV_ILN_2014 GBV_ILN_4012 GBV_ILN_4037 GBV_ILN_4112 GBV_ILN_4125 GBV_ILN_4126 GBV_ILN_4249 GBV_ILN_4305 GBV_ILN_4306 GBV_ILN_4307 GBV_ILN_4313 GBV_ILN_4322 GBV_ILN_4323 GBV_ILN_4324 GBV_ILN_4325 GBV_ILN_4335 GBV_ILN_4338 GBV_ILN_4367 GBV_ILN_4700 |
title_short |
Constrained Clustering: General Pairwise and Cardinality Constraints |
url |
https://doi.org/10.1109/ACCESS.2023.3236608 https://doaj.org/article/97bf3ee149864caeb81906d8dbc4193f https://ieeexplore.ieee.org/document/10015761/ https://doaj.org/toc/2169-3536 |
remote_bool |
true |
author2 |
Ali Alqahtani Bernard Ghanem |
author2Str |
Ali Alqahtani Bernard Ghanem |
ppnlink |
728440385 |
callnumber-subject |
TK - Electrical and Nuclear Engineering |
mediatype_str_mv |
c |
isOA_txt |
true |
hochschulschrift_bool |
false |
doi_str |
10.1109/ACCESS.2023.3236608 |
callnumber-a |
TK1-9971 |
up_date |
2024-07-03T19:40:25.986Z |
_version_ |
1803588076558090241 |