DeepPatent: patent classification with convolutional neural networks and word embedding

Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN)...

Author:	Li, Shaobo [author]; Hu, Jie; Cui, Yuxin; Hu, Jianjun

Format:	Article
Language:	English

Published:	2018

Keywords:	Patent classification; Text classification; Convolutional neural network; Machine learning; Word embedding

Note:	© Akadémiai Kiadó, Budapest, Hungary 2018

Parent work:	Contained in: Scientometrics - Springer International Publishing, 1978, 117(2018), no. 2, 6 Sept., pp. 721-744
Parent work:	volume:117 ; year:2018 ; number:2 ; day:06 ; month:09 ; pages:721-744

Links:	Full text

DOI / URN:	10.1007/s11192-018-2905-5

Catalog ID:	OLC2033220214

Internal format


LEADER	01000caa a22002652 4500
001	OLC2033220214
003	DE-627
005	20230504042144.0
007	tu
008	200819s2018 xx ||||| 00| ||eng c
024	7		|a 10.1007/s11192-018-2905-5 |2 doi
035			|a (DE-627)OLC2033220214
035			|a (DE-He213)s11192-018-2905-5-p
040			|a DE-627 |b ger |c DE-627 |e rakwb
041			|a eng
082	0	4	|a 050 |a 370 |q VZ
084			|a 11 |2 ssgn
100	1		|a Li, Shaobo |e verfasserin |4 aut
245	1	0	|a DeepPatent: patent classification with convolutional neural networks and word embedding
264		1	|c 2018
336			|a Text |b txt |2 rdacontent
337			|a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338			|a Band |b nc |2 rdacarrier
500			|a © Akadémiai Kiadó, Budapest, Hungary 2018
520			|a Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. Our algorithms achieved a precision of 73.88%.
650		4	|a Patent classification
650		4	|a Text classification
650		4	|a Convolutional neural network
650		4	|a Machine learning
650		4	|a Word embedding
700	1		|a Hu, Jie |4 aut
700	1		|a Cui, Yuxin |4 aut
700	1		|a Hu, Jianjun |0 (orcid)0000-0002-8725-6660 |4 aut
773	0	8	|i Enthalten in |t Scientometrics |d Springer International Publishing, 1978 |g 117(2018), 2 vom: 06. Sept., Seite 721-744 |w (DE-627)13005352X |w (DE-600)435652-4 |w (DE-576)015591697 |x 0138-9130 |7 nnns
773	1	8	|g volume:117 |g year:2018 |g number:2 |g day:06 |g month:09 |g pages:721-744
856	4	1	|u https://doi.org/10.1007/s11192-018-2905-5 |z lizenzpflichtig |3 Volltext
912			|a GBV_USEFLAG_A
912			|a SYSFLAG_A
912			|a GBV_OLC
912			|a SSG-OLC-BUB
912			|a SSG-OLC-HSW
912			|a SSG-OPC-BBI
912			|a GBV_ILN_4012
951			|a AR
952			|d 117 |j 2018 |e 2 |b 06 |c 09 |h 721-744
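
The tagged lines above follow a regular shape (tag, two indicators, then subfields introduced by a pipe and a one-character code), so they can be read mechanically. The following is a minimal sketch in Python, not part of the catalog record itself. It assumes the four columns are tab-separated and that each subfield marker is a pipe, optionally preceded by a backslash, followed by the code and a space. The names `MarcField`, `parse_display_line`, and `host_item_parts` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import Dict, List, NamedTuple, Tuple


class MarcField(NamedTuple):
    """One data field from the tagged display (hypothetical helper type)."""
    tag: str
    ind1: str
    ind2: str
    subfields: List[Tuple[str, str]]  # (code, value) pairs in display order


# Assumed marker shape: optional backslash, a pipe, one code character, a space.
_SUBFIELD_MARK = re.compile(r"\\?\|(\w) ")

# Assumed shape of the structured 773 $g values, e.g. "volume:117".
_KEY_VALUE = re.compile(r"([a-z]+):(\S+)")


def parse_display_line(line: str) -> MarcField:
    """Split a 'tag<TAB>ind1<TAB>ind2<TAB>subfields' line into its parts.

    Control fields and the leader have only two columns and are rejected.
    """
    parts = line.rstrip("\n").split("\t", 3)
    if len(parts) != 4:
        raise ValueError(f"not a data-field line: {line!r}")
    tag, ind1, ind2, content = parts
    # re.split with a capturing group alternates text and captured codes:
    # ['', 'a', 'value ', '2', 'doi']
    pieces = _SUBFIELD_MARK.split(content)
    subfields = [(c, v.strip()) for c, v in zip(pieces[1::2], pieces[2::2])]
    return MarcField(tag, ind1.strip(), ind2.strip(), subfields)


def host_item_parts(field: MarcField) -> Dict[str, str]:
    """Collect 'key:value' $g subfields (as in the second 773 line) into a dict."""
    parts: Dict[str, str] = {}
    for code, value in field.subfields:
        match = _KEY_VALUE.fullmatch(value)
        if code == "g" and match:
            parts[match.group(1)] = match.group(2)
    return parts


if __name__ == "__main__":
    doi = parse_display_line("024\t7\t\t|a 10.1007/s11192-018-2905-5 |2 doi")
    print(doi.subfields)
    host = parse_display_line(
        "773\t1\t8\t|g volume:117 |g year:2018 |g number:2 "
        "|g day:06 |g month:09 |g pages:721-744"
    )
    print(host_item_parts(host))
```

A subfield value that itself contains a pipe, a letter, and a space would be split incorrectly by this sketch. The MARCXML form of the record avoids that ambiguity and is the safer input for real processing.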

Index fields

author_variant	s l sl j h jh y c yc j h jh
matchkey_str	article:01389130:2018----::epaetaetlsiiainihovltoanuant
hierarchy_sort_str	2018
publishDate	2018
allfields	10.1007/s11192-018-2905-5 doi (DE-627)OLC2033220214 (DE-He213)s11192-018-2905-5-p DE-627 ger DE-627 rakwb eng 050 370 VZ 11 ssgn Li, Shaobo verfasserin aut DeepPatent: patent classification with convolutional neural networks and word embedding 2018 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Akadémiai Kiadó, Budapest, Hungary 2018 Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. Our algorithms achieved a precision of 73.88%. 
Patent classification Text classification Convolutional neural network Machine learning Word embedding Hu, Jie aut Cui, Yuxin aut Hu, Jianjun (orcid)0000-0002-8725-6660 aut Enthalten in Scientometrics Springer International Publishing, 1978 117(2018), 2 vom: 06. Sept., Seite 721-744 (DE-627)13005352X (DE-600)435652-4 (DE-576)015591697 0138-9130 nnns volume:117 year:2018 number:2 day:06 month:09 pages:721-744 https://doi.org/10.1007/s11192-018-2905-5 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OLC-HSW SSG-OPC-BBI GBV_ILN_4012 AR 117 2018 2 06 09 721-744
language	English
source	Enthalten in Scientometrics 117(2018), 2 vom: 06. Sept., Seite 721-744 volume:117 year:2018 number:2 day:06 month:09 pages:721-744
sourceStr	Enthalten in Scientometrics 117(2018), 2 vom: 06. Sept., Seite 721-744 volume:117 year:2018 number:2 day:06 month:09 pages:721-744
format_phy_str_mv	Article
institution	findex.gbv.de
topic_facet	Patent classification Text classification Convolutional neural network Machine learning Word embedding
dewey-raw	050
isfreeaccess_bool	false
container_title	Scientometrics
authorswithroles_txt_mv	Li, Shaobo @@aut@@ Hu, Jie @@aut@@ Cui, Yuxin @@aut@@ Hu, Jianjun @@aut@@
publishDateDaySort_date	2018-09-06T00:00:00Z
hierarchy_top_id	13005352X
dewey-sort	250
id	OLC2033220214
language_de	englisch
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2033220214</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230504042144.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2018 xx \|\|\|\|\| 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11192-018-2905-5</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2033220214</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11192-018-2905-5-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">050</subfield><subfield code="a">370</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">11</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Li, Shaobo</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">DeepPatent: patent classification with convolutional neural networks and word embedding</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2018</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield 
code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Akadémiai Kiadó, Budapest, Hungary 2018</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. 
Our algorithms achieved a precision of 73.88%.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Patent classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Convolutional neural network</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Machine learning</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Word embedding</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Jie</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cui, Yuxin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Jianjun</subfield><subfield code="0">(orcid)0000-0002-8725-6660</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Scientometrics</subfield><subfield code="d">Springer International Publishing, 1978</subfield><subfield code="g">117(2018), 2 vom: 06. 
Sept., Seite 721-744</subfield><subfield code="w">(DE-627)13005352X</subfield><subfield code="w">(DE-600)435652-4</subfield><subfield code="w">(DE-576)015591697</subfield><subfield code="x">0138-9130</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:117</subfield><subfield code="g">year:2018</subfield><subfield code="g">number:2</subfield><subfield code="g">day:06</subfield><subfield code="g">month:09</subfield><subfield code="g">pages:721-744</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11192-018-2905-5</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-HSW</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">117</subfield><subfield code="j">2018</subfield><subfield code="e">2</subfield><subfield code="b">06</subfield><subfield code="c">09</subfield><subfield code="h">721-744</subfield></datafield></record></collection>
author	Li, Shaobo
spellingShingle	Li, Shaobo ddc 050 ssgn 11 misc Patent classification misc Text classification misc Convolutional neural network misc Machine learning misc Word embedding DeepPatent: patent classification with convolutional neural networks and word embedding
authorStr	Li, Shaobo
ppnlink_with_tag_str_mv	@@773@@(DE-627)13005352X
format	Article
dewey-ones	050 - General serial publications 370 - Education
delete_txt_mv	keep
author_role	aut aut aut aut
collection	OLC
remote_str	false
illustrated	Not Illustrated
issn	0138-9130
topic_title	050 370 VZ 11 ssgn DeepPatent: patent classification with convolutional neural networks and word embedding Patent classification Text classification Convolutional neural network Machine learning Word embedding
topic	ddc 050 ssgn 11 misc Patent classification misc Text classification misc Convolutional neural network misc Machine learning misc Word embedding
topic_unstemmed	ddc 050 ssgn 11 misc Patent classification misc Text classification misc Convolutional neural network misc Machine learning misc Word embedding
topic_browse	ddc 050 ssgn 11 misc Patent classification misc Text classification misc Convolutional neural network misc Machine learning misc Word embedding
format_facet	Aufsätze Gedruckte Aufsätze
format_main_str_mv	Text Zeitschrift/Artikel
carriertype_str_mv	nc
hierarchy_parent_title	Scientometrics
hierarchy_parent_id	13005352X
dewey-tens	050 - Magazines, journals & serials 370 - Education
hierarchy_top_title	Scientometrics
isfreeaccess_txt	false
familylinks_str_mv	(DE-627)13005352X (DE-600)435652-4 (DE-576)015591697
title	DeepPatent: patent classification with convolutional neural networks and word embedding
ctrlnum	(DE-627)OLC2033220214 (DE-He213)s11192-018-2905-5-p
title_full	DeepPatent: patent classification with convolutional neural networks and word embedding
author_sort	Li, Shaobo
journal	Scientometrics
journalStr	Scientometrics
lang_code	eng
isOA_bool	false
dewey-hundreds	000 - Computer science, information & general works 300 - Social sciences
recordtype	marc
publishDateSort	2018
contenttype_str_mv	txt
container_start_page	721
author_browse	Li, Shaobo Hu, Jie Cui, Yuxin Hu, Jianjun
container_volume	117
class	050 370 VZ 11 ssgn
format_se	Aufsätze
author-letter	Li, Shaobo
doi_str_mv	10.1007/s11192-018-2905-5
normlink	(ORCID)0000-0002-8725-6660
normlink_prefix_str_mv	(orcid)0000-0002-8725-6660
dewey-full	050 370
title_sort	deeppatent: patent classification with convolutional neural networks and word embedding
title_auth	DeepPatent: patent classification with convolutional neural networks and word embedding
abstract	Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. Our algorithms achieved a precision of 73.88%. © Akadémiai Kiadó, Budapest, Hungary 2018
collection_details	GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OLC-HSW SSG-OPC-BBI GBV_ILN_4012
container_issue	2
title_short	DeepPatent: patent classification with convolutional neural networks and word embedding
url	https://doi.org/10.1007/s11192-018-2905-5
remote_bool	false
author2	Hu, Jie Cui, Yuxin Hu, Jianjun
author2Str	Hu, Jie Cui, Yuxin Hu, Jianjun
ppnlink	13005352X
mediatype_str_mv	n
isOA_txt	false
hochschulschrift_bool	false
doi_str	10.1007/s11192-018-2905-5
up_date	2024-07-03T16:11:41.977Z
_version_	1803574944178634753
fullrecord_marcxml	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2033220214</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230504042144.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2018 xx \|\|\|\|\| 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11192-018-2905-5</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2033220214</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11192-018-2905-5-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">050</subfield><subfield code="a">370</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">11</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Li, Shaobo</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">DeepPatent: patent classification with convolutional neural networks and word embedding</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2018</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu 
benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Akadémiai Kiadó, Budapest, Hungary 2018</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, which has yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent with automatic feature extraction achieved a classification precision of 83.98%, which outperformed all the existing algorithms that used the same information for training. Its performance is better than the state-of-art patent classifier with a precision of 83.50%, whose performance is, however, based on 4000 characters from the description section and a lot of feature engineering while DeepPatent only used the title and abstract information. DeepPatent is further tested on USPTO-2M, a patent classification benchmark data set that we contributed with 2,000,147 records after data cleaning of 2,679,443 USA raw utility patent documents in 637 categories at the subclass level. 
Our algorithms achieved a precision of 73.88%.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Patent classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Convolutional neural network</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Machine learning</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Word embedding</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Jie</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cui, Yuxin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Jianjun</subfield><subfield code="0">(orcid)0000-0002-8725-6660</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Scientometrics</subfield><subfield code="d">Springer International Publishing, 1978</subfield><subfield code="g">117(2018), 2 vom: 06. 
Sept., Seite 721-744</subfield><subfield code="w">(DE-627)13005352X</subfield><subfield code="w">(DE-600)435652-4</subfield><subfield code="w">(DE-576)015591697</subfield><subfield code="x">0138-9130</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:117</subfield><subfield code="g">year:2018</subfield><subfield code="g">number:2</subfield><subfield code="g">day:06</subfield><subfield code="g">month:09</subfield><subfield code="g">pages:721-744</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11192-018-2905-5</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-HSW</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">117</subfield><subfield code="j">2018</subfield><subfield code="e">2</subfield><subfield code="b">06</subfield><subfield code="c">09</subfield><subfield code="h">721-744</subfield></datafield></record></collection>
