Understanding a bag of words by conceptual labeling with prior weights
Abstract: In many natural language processing tasks, e.g., text classification or information extraction, the weighted bag-of-words model is widely used to represent the semantics of text, where the importance of each word is quantified by its weight. However, it is still difficult for machines to understand a weighted bag of words (WBoW) without explicit explanations, which seriously limits its application in downstream tasks. To make a machine better understand a WBoW, we introduce the task of conceptual labeling, which aims at generating the minimum number of concepts as labels to explicitly represent and explain the semantics of a WBoW. Specifically, we first propose three principles for label generation and then model each principle as an objective function. To satisfy the three principles simultaneously, a multi-objective optimization problem is solved. In our framework, a taxonomy (i.e., Microsoft Concept Graph) is used to provide high-quality candidate concepts, and a corresponding search algorithm is proposed to derive the optimal solution (i.e., a small set of proper concepts as labels). Furthermore, two pruning strategies are also proposed to reduce the search space and improve the performance. Our experiments and results prove that the proposed method is capable of generating proper labels for WBoWs. Besides, we also apply the generated labels to the task of text classification and observe an increase in performance, which further justifies the effectiveness of our conceptual labeling framework.
Detailed description

Author: Jiang, Haiyun [author]
Format: Article
Language: English
Published: 2020
Subject headings: Conceptual labeling; Microsoft concept graph; Weighted bag of words; Multi-objective optimization; Concept pruning
Note: © Springer Science+Business Media, LLC, part of Springer Nature 2020
Contained in: World wide web - Springer US, 1998, 23(2020), 4, dated 14 Apr., pages 2429-2447
Contained in: volume:23 ; year:2020 ; number:4 ; day:14 ; month:04 ; pages:2429-2447
DOI / URN: 10.1007/s11280-020-00806-x
Catalog ID: OLC2062252749
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2062252749 | ||
003 | DE-627 | ||
005 | 20230504153307.0 | ||
007 | tu | ||
008 | 200820s2020 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s11280-020-00806-x |2 doi | |
035 | |a (DE-627)OLC2062252749 | ||
035 | |a (DE-He213)s11280-020-00806-x-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 004 |q VZ |
084 | |a 24,1 |2 ssgn | ||
084 | |a 54.84$jWebmanagement |2 bkl | ||
084 | |a 06.74$jInformationssysteme |2 bkl | ||
100 | 1 | |a Jiang, Haiyun |e verfasserin |4 aut | |
245 | 1 | 0 | |a Understanding a bag of words by conceptual labeling with prior weights |
264 | 1 | |c 2020 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © Springer Science+Business Media, LLC, part of Springer Nature 2020 | ||
520 | |a Abstract In many natural language processing tasks, e.g., text classification or information extraction, the weighted bag-of-words model is widely used to represent the semantics of text, where the importance of each word is quantified by its weight. However, it is still difficult for machines to understand a weighted bag of words (WBoW) without explicit explanations, which seriously limits its application in downstream tasks. To make a machine better understand a WBoW, we introduce the task of conceptual labeling, which aims at generating the minimum number of concepts as labels to explicitly represent and explain the semantics of a WBoW. Specifically, we first propose three principles for label generation and then model each principle as an objective function. To satisfy the three principles simultaneously, a multi-objective optimization problem is solved. In our framework, a taxonomy (i.e., Microsoft Concept Graph) is used to provide high-quality candidate concepts, and a corresponding search algorithm is proposed to derive the optimal solution (i.e., a small set of proper concepts as labels). Furthermore, two pruning strategies are also proposed to reduce the search space and improve the performance. Our experiments and results prove that the proposed method is capable of generating proper labels for WBoWs. Besides, we also apply the generated labels to the task of text classification and observe an increase in performance, which further justifies the effectiveness of our conceptual labeling framework. | ||
650 | 4 | |a Conceptual labeling | |
650 | 4 | |a Microsoft concept graph | |
650 | 4 | |a Weighted bag of words | |
650 | 4 | |a Multi-objective optimization | |
650 | 4 | |a Concept pruning | |
700 | 1 | |a Yang, Deqing |4 aut | |
700 | 1 | |a Xiao, Yanghua |4 aut | |
700 | 1 | |a Wang, Wei |4 aut | |
773 | 0 | 8 | |i Enthalten in |t World wide web |d Springer US, 1998 |g 23(2020), 4 vom: 14. Apr., Seite 2429-2447 |w (DE-627)301184976 |w (DE-600)1485096-5 |w (DE-576)9301184974 |x 1386-145X |7 nnns |
773 | 1 | 8 | |g volume:23 |g year:2020 |g number:4 |g day:14 |g month:04 |g pages:2429-2447 |
856 | 4 | 1 | |u https://doi.org/10.1007/s11280-020-00806-x |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-BUB | ||
912 | |a SSG-OLC-MAT | ||
912 | |a SSG-OPC-BBI | ||
936 | b | k | |a 54.84$jWebmanagement |q VZ |0 475288947 |0 (DE-625)475288947 |
936 | b | k | |a 06.74$jInformationssysteme |q VZ |0 106415212 |0 (DE-625)106415212 |
951 | |a AR | ||
952 | |d 23 |j 2020 |e 4 |b 14 |c 04 |h 2429-2447 |
author_variant |
h j hj d y dy y x yx w w ww |
matchkey_str |
article:1386145X:2020----::nesadnaaowrsyocpulaeig |
hierarchy_sort_str |
2020 |
bklnumber |
54.84$jWebmanagement 06.74$jInformationssysteme |
publishDate |
2020 |
allfields |
10.1007/s11280-020-00806-x doi (DE-627)OLC2062252749 (DE-He213)s11280-020-00806-x-p DE-627 ger DE-627 rakwb eng 004 VZ 24,1 ssgn 54.84$jWebmanagement bkl 06.74$jInformationssysteme bkl Jiang, Haiyun verfasserin aut Understanding a bag of words by conceptual labeling with prior weights 2020 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Springer Science+Business Media, LLC, part of Springer Nature 2020 Abstract In many natural language processing tasks, e.g., text classification or information extraction, the weighted bag-of-words model is widely used to represent the semantics of text, where the importance of each word is quantified by its weight. However, it is still difficult for machines to understand a weighted bag of words (WBoW) without explicit explanations, which seriously limits its application in downstream tasks. To make a machine better understand a WBoW, we introduce the task of conceptual labeling, which aims at generating the minimum number of concepts as labels to explicitly represent and explain the semantics of a WBoW. Specifically, we first propose three principles for label generation and then model each principle as an objective function. To satisfy the three principles simultaneously, a multi-objective optimization problem is solved. In our framework, a taxonomy (i.e., Microsoft Concept Graph) is used to provide high-quality candidate concepts, and a corresponding search algorithm is proposed to derive the optimal solution (i.e., a small set of proper concepts as labels). Furthermore, two pruning strategies are also proposed to reduce the search space and improve the performance. Our experiments and results prove that the proposed method is capable of generating proper labels for WBoWs. Besides, we also apply the generated labels to the task of text classification and observe an increase in performance, which further justifies the effectiveness of our conceptual labeling framework. 
Conceptual labeling Microsoft concept graph Weighted bag of words Multi-objective optimization Concept pruning Yang, Deqing aut Xiao, Yanghua aut Wang, Wei aut Enthalten in World wide web Springer US, 1998 23(2020), 4 vom: 14. Apr., Seite 2429-2447 (DE-627)301184976 (DE-600)1485096-5 (DE-576)9301184974 1386-145X nnns volume:23 year:2020 number:4 day:14 month:04 pages:2429-2447 https://doi.org/10.1007/s11280-020-00806-x lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OLC-MAT SSG-OPC-BBI 54.84$jWebmanagement VZ 475288947 (DE-625)475288947 06.74$jInformationssysteme VZ 106415212 (DE-625)106415212 AR 23 2020 4 14 04 2429-2447 |
language |
English |
source |
Enthalten in World wide web 23(2020), 4 vom: 14. Apr., Seite 2429-2447 volume:23 year:2020 number:4 day:14 month:04 pages:2429-2447 |
sourceStr |
Enthalten in World wide web 23(2020), 4 vom: 14. Apr., Seite 2429-2447 volume:23 year:2020 number:4 day:14 month:04 pages:2429-2447 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Conceptual labeling Microsoft concept graph Weighted bag of words Multi-objective optimization Concept pruning |
dewey-raw |
004 |
isfreeaccess_bool |
false |
container_title |
World wide web |
authorswithroles_txt_mv |
Jiang, Haiyun @@aut@@ Yang, Deqing @@aut@@ Xiao, Yanghua @@aut@@ Wang, Wei @@aut@@ |
publishDateDaySort_date |
2020-04-14T00:00:00Z |
hierarchy_top_id |
301184976 |
dewey-sort |
14 |
id |
OLC2062252749 |
language_de |
englisch |
author |
Jiang, Haiyun |
spellingShingle |
Jiang, Haiyun ddc 004 ssgn 24,1 bkl 54.84$jWebmanagement bkl 06.74$jInformationssysteme misc Conceptual labeling misc Microsoft concept graph misc Weighted bag of words misc Multi-objective optimization misc Concept pruning Understanding a bag of words by conceptual labeling with prior weights |
authorStr |
Jiang, Haiyun |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)301184976 |
format |
Article |
dewey-ones |
004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut aut aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
1386-145X |
topic_title |
004 VZ 24,1 ssgn 54.84$jWebmanagement bkl 06.74$jInformationssysteme bkl Understanding a bag of words by conceptual labeling with prior weights Conceptual labeling Microsoft concept graph Weighted bag of words Multi-objective optimization Concept pruning |
topic |
ddc 004 ssgn 24,1 bkl 54.84$jWebmanagement bkl 06.74$jInformationssysteme misc Conceptual labeling misc Microsoft concept graph misc Weighted bag of words misc Multi-objective optimization misc Concept pruning |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
World wide web |
hierarchy_parent_id |
301184976 |
dewey-tens |
000 - Computer science, knowledge & systems |
hierarchy_top_title |
World wide web |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)301184976 (DE-600)1485096-5 (DE-576)9301184974 |
title |
Understanding a bag of words by conceptual labeling with prior weights |
ctrlnum |
(DE-627)OLC2062252749 (DE-He213)s11280-020-00806-x-p |
title_full |
Understanding a bag of words by conceptual labeling with prior weights |
author_sort |
Jiang, Haiyun |
journal |
World wide web |
journalStr |
World wide web |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2020 |
contenttype_str_mv |
txt |
container_start_page |
2429 |
author_browse |
Jiang, Haiyun Yang, Deqing Xiao, Yanghua Wang, Wei |
container_volume |
23 |
class |
004 VZ 24,1 ssgn 54.84$jWebmanagement bkl 06.74$jInformationssysteme bkl |
format_se |
Aufsätze |
author-letter |
Jiang, Haiyun |
doi_str_mv |
10.1007/s11280-020-00806-x |
normlink |
475288947 106415212 |
normlink_prefix_str_mv |
475288947 (DE-625)475288947 106415212 (DE-625)106415212 |
dewey-full |
004 |
title_sort |
understanding a bag of words by conceptual labeling with prior weights |
title_auth |
Understanding a bag of words by conceptual labeling with prior weights |
abstract |
Abstract In many natural language processing tasks, e.g., text classification or information extraction, the weighted bag-of-words model is widely used to represent the semantics of text, where the importance of each word is quantified by its weight. However, it is still difficult for machines to understand a weighted bag of words (WBoW) without explicit explanations, which seriously limits its application in downstream tasks. To make a machine better understand a WBoW, we introduce the task of conceptual labeling, which aims at generating the minimum number of concepts as labels to explicitly represent and explain the semantics of a WBoW. Specifically, we first propose three principles for label generation and then model each principle as an objective function. To satisfy the three principles simultaneously, a multi-objective optimization problem is solved. In our framework, a taxonomy (i.e., Microsoft Concept Graph) is used to provide high-quality candidate concepts, and a corresponding search algorithm is proposed to derive the optimal solution (i.e., a small set of proper concepts as labels). Furthermore, two pruning strategies are also proposed to reduce the search space and improve the performance. Our experiments and results prove that the proposed method is capable of generating proper labels for WBoWs. Besides, we also apply the generated labels to the task of text classification and observe an increase in performance, which further justifies the effectiveness of our conceptual labeling framework. © Springer Science+Business Media, LLC, part of Springer Nature 2020 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OLC-MAT SSG-OPC-BBI |
container_issue |
4 |
title_short |
Understanding a bag of words by conceptual labeling with prior weights |
url |
https://doi.org/10.1007/s11280-020-00806-x |
remote_bool |
false |
author2 |
Yang, Deqing; Xiao, Yanghua; Wang, Wei
author2Str |
Yang, Deqing; Xiao, Yanghua; Wang, Wei
ppnlink |
301184976 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s11280-020-00806-x |
up_date |
2024-07-03T14:22:16.564Z |
_version_ |
1803568059851472896 |
|