DeepPatent: patent classification with convolutional neural networks and word embedding
Abstract: Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN)...
Detailed description
Author: | Li, Shaobo [author] |
---|---|
Format: | Article |
Language: | English |
Published: | 2018 |
Keywords: | Patent classification; Text classification; Convolutional neural network; Machine learning; Word embedding |
Note: | © Akadémiai Kiadó, Budapest, Hungary 2018 |
Host item: | Contained in: Scientometrics - Springer International Publishing, 1978, 117(2018), no. 2, 6 Sept., pages 721-744 |
Host item: | volume:117 ; year:2018 ; number:2 ; day:06 ; month:09 ; pages:721-744 |
Links: | https://doi.org/10.1007/s11192-018-2905-5 |
DOI / URN: | 10.1007/s11192-018-2905-5 |
Catalog ID: | OLC2033220214 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2033220214 | ||
003 | DE-627 | ||
005 | 20230504042144.0 | ||
007 | tu | ||
008 | 200819s2018 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s11192-018-2905-5 |2 doi | |
035 | |a (DE-627)OLC2033220214 | ||
035 | |a (DE-He213)s11192-018-2905-5-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 050 |a 370 |q VZ |
084 | |a 11 |2 ssgn | ||
100 | 1 | |a Li, Shaobo |e verfasserin |4 aut | |
245 | 1 | 0 | |a DeepPatent: patent classification with convolutional neural networks and word embedding |
264 | 1 | |c 2018 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © Akadémiai Kiadó, Budapest, Hungary 2018 | ||
520 | |a Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. Our algorithms achieved a precision of 73.88%. | ||
650 | 4 | |a Patent classification | |
650 | 4 | |a Text classification | |
650 | 4 | |a Convolutional neural network | |
650 | 4 | |a Machine learning | |
650 | 4 | |a Word embedding | |
700 | 1 | |a Hu, Jie |4 aut | |
700 | 1 | |a Cui, Yuxin |4 aut | |
700 | 1 | |a Hu, Jianjun |0 (orcid)0000-0002-8725-6660 |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Scientometrics |d Springer International Publishing, 1978 |g 117(2018), 2 vom: 06. Sept., Seite 721-744 |w (DE-627)13005352X |w (DE-600)435652-4 |w (DE-576)015591697 |x 0138-9130 |7 nnns |
773 | 1 | 8 | |g volume:117 |g year:2018 |g number:2 |g day:06 |g month:09 |g pages:721-744 |
856 | 4 | 1 | |u https://doi.org/10.1007/s11192-018-2905-5 |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-BUB | ||
912 | |a SSG-OLC-HSW | ||
912 | |a SSG-OPC-BBI | ||
912 | |a GBV_ILN_4012 | ||
951 | |a AR | ||
952 | |d 117 |j 2018 |e 2 |b 06 |c 09 |h 721-744 |
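The MARC fields listed above can be read programmatically from the record's MARCXML form. The following is a minimal sketch using only the Python standard library, assuming an abbreviated copy of this record as input. The names `SAMPLE`, `subfields`, and `summarize` are hypothetical helpers introduced for illustration and are not part of any catalog API.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML (MARC21 "slim" schema).
NS = {"m": "http://www.loc.gov/MARC21/slim"}

# Abbreviated copy of the record above: only a few fields are kept for brevity.
SAMPLE = (
    '<collection xmlns="http://www.loc.gov/MARC21/slim"><record>'
    '<controlfield tag="001">OLC2033220214</controlfield>'
    '<datafield tag="024" ind1="7" ind2=" ">'
    '<subfield code="a">10.1007/s11192-018-2905-5</subfield>'
    '<subfield code="2">doi</subfield></datafield>'
    '<datafield tag="100" ind1="1" ind2=" ">'
    '<subfield code="a">Li, Shaobo</subfield>'
    '<subfield code="4">aut</subfield></datafield>'
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">DeepPatent: patent classification with '
    'convolutional neural networks and word embedding</subfield></datafield>'
    '<datafield tag="650" ind1=" " ind2="4">'
    '<subfield code="a">Patent classification</subfield></datafield>'
    '<datafield tag="650" ind1=" " ind2="4">'
    '<subfield code="a">Word embedding</subfield></datafield>'
    '<datafield tag="700" ind1="1" ind2=" ">'
    '<subfield code="a">Hu, Jie</subfield></datafield>'
    '<datafield tag="700" ind1="1" ind2=" ">'
    '<subfield code="a">Cui, Yuxin</subfield></datafield>'
    '<datafield tag="700" ind1="1" ind2=" ">'
    '<subfield code="a">Hu, Jianjun</subfield></datafield>'
    '</record></collection>'
)


def subfields(record, tag, code):
    """Return the text of every subfield `code` in every datafield `tag`."""
    path = f"m:datafield[@tag='{tag}']/m:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, NS)]


def summarize(xml_text):
    """Collect identifier, DOI, title, authors and subjects of the first record."""
    root = ET.fromstring(xml_text)
    record = root.find("m:record", NS)
    return {
        "id": record.findtext("m:controlfield[@tag='001']", namespaces=NS),
        "doi": subfields(record, "024", "a"),
        "title": subfields(record, "245", "a"),
        # 100 holds the first author, 700 the additional authors.
        "authors": subfields(record, "100", "a") + subfields(record, "700", "a"),
        "subjects": subfields(record, "650", "a"),
    }


summary = summarize(SAMPLE)
print(summary["authors"])
```

Repeated tags such as 650 and 700 are why the helper returns lists rather than single values; a field that occurs once simply yields a one-element list.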
author_variant |
s l sl j h jh y c yc j h jh |
---|---|
matchkey_str |
article:01389130:2018----::epaetaetlsiiainihovltoanuant |
hierarchy_sort_str |
2018 |
publishDate |
2018 |
allfields |
10.1007/s11192-018-2905-5 doi (DE-627)OLC2033220214 (DE-He213)s11192-018-2905-5-p DE-627 ger DE-627 rakwb eng 050 370 VZ 11 ssgn Li, Shaobo verfasserin aut DeepPatent: patent classification with convolutional neural networks and word embedding 2018 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Akadémiai Kiadó, Budapest, Hungary 2018 Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. Our algorithms achieved a precision of 73.88%. 
Patent classification Text classification Convolutional neural network Machine learning Word embedding Hu, Jie aut Cui, Yuxin aut Hu, Jianjun (orcid)0000-0002-8725-6660 aut Enthalten in Scientometrics Springer International Publishing, 1978 117(2018), 2 vom: 06. Sept., Seite 721-744 (DE-627)13005352X (DE-600)435652-4 (DE-576)015591697 0138-9130 nnns volume:117 year:2018 number:2 day:06 month:09 pages:721-744 https://doi.org/10.1007/s11192-018-2905-5 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OLC-HSW SSG-OPC-BBI GBV_ILN_4012 AR 117 2018 2 06 09 721-744 |
language |
English |
source |
Enthalten in Scientometrics 117(2018), 2 vom: 06. Sept., Seite 721-744 volume:117 year:2018 number:2 day:06 month:09 pages:721-744 |
sourceStr |
Enthalten in Scientometrics 117(2018), 2 vom: 06. Sept., Seite 721-744 volume:117 year:2018 number:2 day:06 month:09 pages:721-744 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Patent classification Text classification Convolutional neural network Machine learning Word embedding |
dewey-raw |
050 |
isfreeaccess_bool |
false |
container_title |
Scientometrics |
authorswithroles_txt_mv |
Li, Shaobo @@aut@@ Hu, Jie @@aut@@ Cui, Yuxin @@aut@@ Hu, Jianjun @@aut@@ |
publishDateDaySort_date |
2018-09-06T00:00:00Z |
hierarchy_top_id |
13005352X |
dewey-sort |
250 |
id |
OLC2033220214 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2033220214</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230504042144.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2018 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11192-018-2905-5</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2033220214</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11192-018-2905-5-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">050</subfield><subfield code="a">370</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">11</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Li, Shaobo</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">DeepPatent: patent classification with convolutional neural networks and word embedding</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2018</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield 
code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Akadémiai Kiadó, Budapest, Hungary 2018</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. 
Our algorithms achieved a precision of 73.88%.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Patent classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Convolutional neural network</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Machine learning</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Word embedding</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Jie</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cui, Yuxin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Jianjun</subfield><subfield code="0">(orcid)0000-0002-8725-6660</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Scientometrics</subfield><subfield code="d">Springer International Publishing, 1978</subfield><subfield code="g">117(2018), 2 vom: 06. 
Sept., Seite 721-744</subfield><subfield code="w">(DE-627)13005352X</subfield><subfield code="w">(DE-600)435652-4</subfield><subfield code="w">(DE-576)015591697</subfield><subfield code="x">0138-9130</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:117</subfield><subfield code="g">year:2018</subfield><subfield code="g">number:2</subfield><subfield code="g">day:06</subfield><subfield code="g">month:09</subfield><subfield code="g">pages:721-744</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11192-018-2905-5</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-HSW</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">117</subfield><subfield code="j">2018</subfield><subfield code="e">2</subfield><subfield code="b">06</subfield><subfield code="c">09</subfield><subfield code="h">721-744</subfield></datafield></record></collection>
|
author |
Li, Shaobo |
spellingShingle |
Li, Shaobo ddc 050 ssgn 11 misc Patent classification misc Text classification misc Convolutional neural network misc Machine learning misc Word embedding DeepPatent: patent classification with convolutional neural networks and word embedding |
authorStr |
Li, Shaobo |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)13005352X |
format |
Article |
dewey-ones |
050 - General serial publications 370 - Education |
delete_txt_mv |
keep |
author_role |
aut aut aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
0138-9130 |
topic_title |
050 370 VZ 11 ssgn DeepPatent: patent classification with convolutional neural networks and word embedding Patent classification Text classification Convolutional neural network Machine learning Word embedding |
topic |
ddc 050 ssgn 11 misc Patent classification misc Text classification misc Convolutional neural network misc Machine learning misc Word embedding |
format_facet |
Articles Printed articles |
format_main_str_mv |
Text Journal/Article |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Scientometrics |
hierarchy_parent_id |
13005352X |
dewey-tens |
050 - Magazines, journals & serials 370 - Education |
hierarchy_top_title |
Scientometrics |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)13005352X (DE-600)435652-4 (DE-576)015591697 |
title |
DeepPatent: patent classification with convolutional neural networks and word embedding |
ctrlnum |
(DE-627)OLC2033220214 (DE-He213)s11192-018-2905-5-p |
title_full |
DeepPatent: patent classification with convolutional neural networks and word embedding |
author_sort |
Li, Shaobo |
journal |
Scientometrics |
journalStr |
Scientometrics |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works 300 - Social sciences |
recordtype |
marc |
publishDateSort |
2018 |
contenttype_str_mv |
txt |
container_start_page |
721 |
author_browse |
Li, Shaobo Hu, Jie Cui, Yuxin Hu, Jianjun |
container_volume |
117 |
class |
050 370 VZ 11 ssgn |
format_se |
Articles |
author-letter |
Li, Shaobo |
doi_str_mv |
10.1007/s11192-018-2905-5 |
normlink |
(ORCID)0000-0002-8725-6660 |
normlink_prefix_str_mv |
(orcid)0000-0002-8725-6660 |
dewey-full |
050 370 |
title_sort |
deeppatent: patent classification with convolutional neural networks and word embedding |
title_auth |
DeepPatent: patent classification with convolutional neural networks and word embedding |
abstract |
Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. Our algorithms achieved a precision of 73.88%. © Akadémiai Kiadó, Budapest, Hungary 2018 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OLC-HSW SSG-OPC-BBI GBV_ILN_4012 |
container_issue |
2 |
title_short |
DeepPatent: patent classification with convolutional neural networks and word embedding |
url |
https://doi.org/10.1007/s11192-018-2905-5 |
remote_bool |
false |
author2 |
Hu, Jie Cui, Yuxin Hu, Jianjun |
author2Str |
Hu, Jie Cui, Yuxin Hu, Jianjun |
ppnlink |
13005352X |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s11192-018-2905-5 |
up_date |
2024-07-03T16:11:41.977Z |
_version_ |
1803574944178634753 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2033220214</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230504042144.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2018 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11192-018-2905-5</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2033220214</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11192-018-2905-5-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">050</subfield><subfield code="a">370</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">11</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Li, Shaobo</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">DeepPatent: patent classification with convolutional neural networks and word embedding</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2018</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield 
code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Akadémiai Kiadó, Budapest, Hungary 2018</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. 
Our algorithms achieved a precision of 73.88%.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Patent classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Convolutional neural network</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Machine learning</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Word embedding</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Jie</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cui, Yuxin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Jianjun</subfield><subfield code="0">(orcid)0000-0002-8725-6660</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Scientometrics</subfield><subfield code="d">Springer International Publishing, 1978</subfield><subfield code="g">117(2018), 2 vom: 06. 
Sept., Seite 721-744</subfield><subfield code="w">(DE-627)13005352X</subfield><subfield code="w">(DE-600)435652-4</subfield><subfield code="w">(DE-576)015591697</subfield><subfield code="x">0138-9130</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:117</subfield><subfield code="g">year:2018</subfield><subfield code="g">number:2</subfield><subfield code="g">day:06</subfield><subfield code="g">month:09</subfield><subfield code="g">pages:721-744</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11192-018-2905-5</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-HSW</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">117</subfield><subfield code="j">2018</subfield><subfield code="e">2</subfield><subfield code="b">06</subfield><subfield code="c">09</subfield><subfield code="h">721-744</subfield></datafield></record></collection>
|
score |
7.4019012 |