Important citation identification by exploiting the syntactic and contextual information of citations
Abstract: Citations are not equally important. Researchers presented different models and techniques to identify important citations. However, the features used in these works are relatively limited, so they cannot achieve good recognition performance. This paper proposed a new machine learning framew...
Detailed description
Author: |
Wang, Mingyang [author] |
---|
Format: |
Article |
---|---|
Language: |
English |
Published: |
2020 |
---|
Keywords: |
Importance citation identification |
---|
Note: |
© Akadémiai Kiadó, Budapest, Hungary 2020 |
---|
Parent work: |
Contained in: Scientometrics - Springer International Publishing, 1978, 125(2020), no. 3, 2 Sept., pages 2109-2129 |
---|---|
Parent work: |
volume:125 ; year:2020 ; number:3 ; day:02 ; month:09 ; pages:2109-2129 |
Links: |
---|
DOI / URN: |
10.1007/s11192-020-03677-1 |
---|
Catalog ID: |
OLC2121537813 |
---|
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | OLC2121537813 | ||
003 | DE-627 | ||
005 | 20230504184713.0 | ||
007 | tu | ||
008 | 230504s2020 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s11192-020-03677-1 |2 doi | |
035 | |a (DE-627)OLC2121537813 | ||
035 | |a (DE-He213)s11192-020-03677-1-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 050 |a 370 |q VZ |
084 | |a 11 |2 ssgn | ||
100 | 1 | |a Wang, Mingyang |e verfasserin |4 aut | |
245 | 1 | 0 | |a Important citation identification by exploiting the syntactic and contextual information of citations |
264 | 1 | |c 2020 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © Akadémiai Kiadó, Budapest, Hungary 2020 | ||
520 | |a Abstract Citations are not equally important. Researchers presented different models and techniques to identify important citations. However, the features used in these work are relatively limited, so they cannot achieve good recognition performance. This paper proposed a new machine learning framework to distinguish important and non-important citations by examining the syntactic and contextual information of citations. Among them, syntactic features reflect the statistical perspective characteristics brought by citation behavior, such as the cited frequency and citation position of the cited article in the citing ones. Contextual features reflect the semantic content characteristics brought by citations, such as the intent and polarity of citations. Three feature selection algorithms, Pearson correlation coefficient, relief-F and entropy weight method, were used to calculate the contribution of each index on distinguishing different kinds of citations. On this basis, key features that can better identify the important citations were screened out. Three classifiers of support vector machine, KNN and random forest were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. It showed that the framework proposed in this paper can achieve better classification performance compared with contemporary state-of-the-art research. The syntactic and contextual features of citation are of great value in identifying important citations. | ||
650 | 4 | |a Importance citation identification | |
650 | 4 | |a Binary citation classification | |
650 | 4 | |a Syntactic characteristics | |
650 | 4 | |a Contextual characteristics | |
700 | 1 | |a Zhang, Jiaqi |4 aut | |
700 | 1 | |a Jiao, Shijia |4 aut | |
700 | 1 | |a Zhang, Xiangrong |4 aut | |
700 | 1 | |a Zhu, Na |4 aut | |
700 | 1 | |a Chen, Guangsheng |0 (orcid)0000-0003-0525-6120 |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Scientometrics |d Springer International Publishing, 1978 |g 125(2020), 3 vom: 02. Sept., Seite 2109-2129 |w (DE-627)13005352X |w (DE-600)435652-4 |w (DE-576)015591697 |x 0138-9130 |7 nnns |
773 | 1 | 8 | |g volume:125 |g year:2020 |g number:3 |g day:02 |g month:09 |g pages:2109-2129 |
856 | 4 | 1 | |u https://doi.org/10.1007/s11192-020-03677-1 |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-BUB | ||
912 | |a SSG-OLC-HSW | ||
912 | |a SSG-OPC-BBI | ||
912 | |a GBV_ILN_4012 | ||
951 | |a AR | ||
952 | |d 125 |j 2020 |e 3 |b 02 |c 09 |h 2109-2129 |
author_variant |
m w mw j z jz s j sj x z xz n z nz g c gc |
---|---|
matchkey_str |
article:01389130:2020----::motncttoietfctobepotnteytciadotxu |
hierarchy_sort_str |
2020 |
publishDate |
2020 |
allfields |
10.1007/s11192-020-03677-1 doi (DE-627)OLC2121537813 (DE-He213)s11192-020-03677-1-p DE-627 ger DE-627 rakwb eng 050 370 VZ 11 ssgn Wang, Mingyang verfasserin aut Important citation identification by exploiting the syntactic and contextual information of citations 2020 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Akadémiai Kiadó, Budapest, Hungary 2020 Abstract Citations are not equally important. Researchers presented different models and techniques to identify important citations. However, the features used in these work are relatively limited, so they cannot achieve good recognition performance. This paper proposed a new machine learning framework to distinguish important and non-important citations by examining the syntactic and contextual information of citations. Among them, syntactic features reflect the statistical perspective characteristics brought by citation behavior, such as the cited frequency and citation position of the cited article in the citing ones. Contextual features reflect the semantic content characteristics brought by citations, such as the intent and polarity of citations. Three feature selection algorithms, Pearson correlation coefficient, relief-F and entropy weight method, were used to calculate the contribution of each index on distinguishing different kinds of citations. On this basis, key features that can better identify the important citations were screened out. Three classifiers of support vector machine, KNN and random forest were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. It showed that the framework proposed in this paper can achieve better classification performance compared with contemporary state-of-the-art research. The syntactic and contextual features of citation are of great value in identifying important citations. 
Importance citation identification Binary citation classification Syntactic characteristics Contextual characteristics Zhang, Jiaqi aut Jiao, Shijia aut Zhang, Xiangrong aut Zhu, Na aut Chen, Guangsheng (orcid)0000-0003-0525-6120 aut Enthalten in Scientometrics Springer International Publishing, 1978 125(2020), 3 vom: 02. Sept., Seite 2109-2129 (DE-627)13005352X (DE-600)435652-4 (DE-576)015591697 0138-9130 nnns volume:125 year:2020 number:3 day:02 month:09 pages:2109-2129 https://doi.org/10.1007/s11192-020-03677-1 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OLC-HSW SSG-OPC-BBI GBV_ILN_4012 AR 125 2020 3 02 09 2109-2129 |
language |
English |
source |
Enthalten in Scientometrics 125(2020), 3 vom: 02. Sept., Seite 2109-2129 volume:125 year:2020 number:3 day:02 month:09 pages:2109-2129 |
sourceStr |
Enthalten in Scientometrics 125(2020), 3 vom: 02. Sept., Seite 2109-2129 volume:125 year:2020 number:3 day:02 month:09 pages:2109-2129 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Importance citation identification Binary citation classification Syntactic characteristics Contextual characteristics |
dewey-raw |
050 |
isfreeaccess_bool |
false |
container_title |
Scientometrics |
authorswithroles_txt_mv |
Wang, Mingyang @@aut@@ Zhang, Jiaqi @@aut@@ Jiao, Shijia @@aut@@ Zhang, Xiangrong @@aut@@ Zhu, Na @@aut@@ Chen, Guangsheng @@aut@@ |
publishDateDaySort_date |
2020-09-02T00:00:00Z |
hierarchy_top_id |
13005352X |
dewey-sort |
250 |
id |
OLC2121537813 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000naa a22002652 4500</leader><controlfield tag="001">OLC2121537813</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230504184713.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">230504s2020 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11192-020-03677-1</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2121537813</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11192-020-03677-1-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">050</subfield><subfield code="a">370</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">11</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Wang, Mingyang</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Important citation identification by exploiting the syntactic and contextual information of citations</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2020</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield 
code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Akadémiai Kiadó, Budapest, Hungary 2020</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Citations are not equally important. Researchers presented different models and techniques to identify important citations. However, the features used in these work are relatively limited, so they cannot achieve good recognition performance. This paper proposed a new machine learning framework to distinguish important and non-important citations by examining the syntactic and contextual information of citations. Among them, syntactic features reflect the statistical perspective characteristics brought by citation behavior, such as the cited frequency and citation position of the cited article in the citing ones. Contextual features reflect the semantic content characteristics brought by citations, such as the intent and polarity of citations. Three feature selection algorithms, Pearson correlation coefficient, relief-F and entropy weight method, were used to calculate the contribution of each index on distinguishing different kinds of citations. On this basis, key features that can better identify the important citations were screened out. Three classifiers of support vector machine, KNN and random forest were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. It showed that the framework proposed in this paper can achieve better classification performance compared with contemporary state-of-the-art research. 
The syntactic and contextual features of citation are of great value in identifying important citations.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Importance citation identification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Binary citation classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Syntactic characteristics</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Contextual characteristics</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhang, Jiaqi</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Jiao, Shijia</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhang, Xiangrong</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhu, Na</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Chen, Guangsheng</subfield><subfield code="0">(orcid)0000-0003-0525-6120</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Scientometrics</subfield><subfield code="d">Springer International Publishing, 1978</subfield><subfield code="g">125(2020), 3 vom: 02. 
Sept., Seite 2109-2129</subfield><subfield code="w">(DE-627)13005352X</subfield><subfield code="w">(DE-600)435652-4</subfield><subfield code="w">(DE-576)015591697</subfield><subfield code="x">0138-9130</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:125</subfield><subfield code="g">year:2020</subfield><subfield code="g">number:3</subfield><subfield code="g">day:02</subfield><subfield code="g">month:09</subfield><subfield code="g">pages:2109-2129</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11192-020-03677-1</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-HSW</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">125</subfield><subfield code="j">2020</subfield><subfield code="e">3</subfield><subfield code="b">02</subfield><subfield code="c">09</subfield><subfield code="h">2109-2129</subfield></datafield></record></collection>
|
author |
Wang, Mingyang |
spellingShingle |
Wang, Mingyang ddc 050 ssgn 11 misc Importance citation identification misc Binary citation classification misc Syntactic characteristics misc Contextual characteristics Important citation identification by exploiting the syntactic and contextual information of citations |
authorStr |
Wang, Mingyang |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)13005352X |
format |
Article |
dewey-ones |
050 - General serial publications 370 - Education |
delete_txt_mv |
keep |
author_role |
aut aut aut aut aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
0138-9130 |
topic_title |
050 370 VZ 11 ssgn Important citation identification by exploiting the syntactic and contextual information of citations Importance citation identification Binary citation classification Syntactic characteristics Contextual characteristics |
topic |
ddc 050 ssgn 11 misc Importance citation identification misc Binary citation classification misc Syntactic characteristics misc Contextual characteristics |
topic_unstemmed |
ddc 050 ssgn 11 misc Importance citation identification misc Binary citation classification misc Syntactic characteristics misc Contextual characteristics |
topic_browse |
ddc 050 ssgn 11 misc Importance citation identification misc Binary citation classification misc Syntactic characteristics misc Contextual characteristics |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Scientometrics |
hierarchy_parent_id |
13005352X |
dewey-tens |
050 - Magazines, journals & serials 370 - Education |
hierarchy_top_title |
Scientometrics |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)13005352X (DE-600)435652-4 (DE-576)015591697 |
title |
Important citation identification by exploiting the syntactic and contextual information of citations |
ctrlnum |
(DE-627)OLC2121537813 (DE-He213)s11192-020-03677-1-p |
title_full |
Important citation identification by exploiting the syntactic and contextual information of citations |
author_sort |
Wang, Mingyang |
journal |
Scientometrics |
journalStr |
Scientometrics |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works 300 - Social sciences |
recordtype |
marc |
publishDateSort |
2020 |
contenttype_str_mv |
txt |
container_start_page |
2109 |
author_browse |
Wang, Mingyang Zhang, Jiaqi Jiao, Shijia Zhang, Xiangrong Zhu, Na Chen, Guangsheng |
container_volume |
125 |
class |
050 370 VZ 11 ssgn |
format_se |
Aufsätze |
author-letter |
Wang, Mingyang |
doi_str_mv |
10.1007/s11192-020-03677-1 |
normlink |
(ORCID)0000-0003-0525-6120 |
normlink_prefix_str_mv |
(orcid)0000-0003-0525-6120 |
dewey-full |
050 370 |
title_sort |
important citation identification by exploiting the syntactic and contextual information of citations |
title_auth |
Important citation identification by exploiting the syntactic and contextual information of citations |
abstract |
Abstract Citations are not equally important. Researchers presented different models and techniques to identify important citations. However, the features used in these work are relatively limited, so they cannot achieve good recognition performance. This paper proposed a new machine learning framework to distinguish important and non-important citations by examining the syntactic and contextual information of citations. Among them, syntactic features reflect the statistical perspective characteristics brought by citation behavior, such as the cited frequency and citation position of the cited article in the citing ones. Contextual features reflect the semantic content characteristics brought by citations, such as the intent and polarity of citations. Three feature selection algorithms, Pearson correlation coefficient, relief-F and entropy weight method, were used to calculate the contribution of each index on distinguishing different kinds of citations. On this basis, key features that can better identify the important citations were screened out. Three classifiers of support vector machine, KNN and random forest were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. It showed that the framework proposed in this paper can achieve better classification performance compared with contemporary state-of-the-art research. The syntactic and contextual features of citation are of great value in identifying important citations. © Akadémiai Kiadó, Budapest, Hungary 2020 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OLC-HSW SSG-OPC-BBI GBV_ILN_4012 |
container_issue |
3 |
title_short |
Important citation identification by exploiting the syntactic and contextual information of citations |
url |
https://doi.org/10.1007/s11192-020-03677-1 |
remote_bool |
false |
author2 |
Zhang, Jiaqi; Jiao, Shijia; Zhang, Xiangrong; Zhu, Na; Chen, Guangsheng |
author2Str |
Zhang, Jiaqi; Jiao, Shijia; Zhang, Xiangrong; Zhu, Na; Chen, Guangsheng |
ppnlink |
13005352X |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s11192-020-03677-1 |
up_date |
2024-07-04T07:15:30.444Z |
_version_ |
1803631806848696321 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000naa a22002652 4500</leader><controlfield tag="001">OLC2121537813</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230504184713.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">230504s2020 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11192-020-03677-1</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2121537813</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11192-020-03677-1-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">050</subfield><subfield code="a">370</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">11</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Wang, Mingyang</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Important citation identification by exploiting the syntactic and contextual information of citations</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2020</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield 
code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Akadémiai Kiadó, Budapest, Hungary 2020</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Citations are not equally important. Researchers presented different models and techniques to identify important citations. However, the features used in these work are relatively limited, so they cannot achieve good recognition performance. This paper proposed a new machine learning framework to distinguish important and non-important citations by examining the syntactic and contextual information of citations. Among them, syntactic features reflect the statistical perspective characteristics brought by citation behavior, such as the cited frequency and citation position of the cited article in the citing ones. Contextual features reflect the semantic content characteristics brought by citations, such as the intent and polarity of citations. Three feature selection algorithms, Pearson correlation coefficient, relief-F and entropy weight method, were used to calculate the contribution of each index on distinguishing different kinds of citations. On this basis, key features that can better identify the important citations were screened out. Three classifiers of support vector machine, KNN and random forest were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. It showed that the framework proposed in this paper can achieve better classification performance compared with contemporary state-of-the-art research. 
The syntactic and contextual features of citation are of great value in identifying important citations.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Importance citation identification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Binary citation classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Syntactic characteristics</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Contextual characteristics</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhang, Jiaqi</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Jiao, Shijia</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhang, Xiangrong</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhu, Na</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Chen, Guangsheng</subfield><subfield code="0">(orcid)0000-0003-0525-6120</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Scientometrics</subfield><subfield code="d">Springer International Publishing, 1978</subfield><subfield code="g">125(2020), 3 vom: 02. 
Sept., Seite 2109-2129</subfield><subfield code="w">(DE-627)13005352X</subfield><subfield code="w">(DE-600)435652-4</subfield><subfield code="w">(DE-576)015591697</subfield><subfield code="x">0138-9130</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:125</subfield><subfield code="g">year:2020</subfield><subfield code="g">number:3</subfield><subfield code="g">day:02</subfield><subfield code="g">month:09</subfield><subfield code="g">pages:2109-2129</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11192-020-03677-1</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-HSW</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">125</subfield><subfield code="j">2020</subfield><subfield code="e">3</subfield><subfield code="b">02</subfield><subfield code="c">09</subfield><subfield code="h">2109-2129</subfield></datafield></record></collection>
|
score |
7.401184 |