Co-occurrence pattern mining based on a biological approximation scoring matrix
Abstract: Mining co-occurrence frequency patterns from multiple sequences is a hot topic in bioinformatics. Many seemingly disorganized constituents appear repeatedly under different biological matrices, such as PAM250 and BLOSUM62, and are considered hidden frequent patterns (FPs). A hidden FP w...
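The abstract describes finding approximate occurrences of a pattern under a substitution scoring matrix, with gaps allowed between pattern characters, and validating them by backtracking. The following is only a minimal sketch of that general idea, not the co-fp-miner algorithm itself. The function names `toy_score` and `find_occurrence`, the gap and threshold parameters, and the toy DNA scores are all assumptions made for illustration (the scores are not PAM250 or BLOSUM62 values), and only replacement is modelled, not insertion or deletion.

```python
from typing import List, Optional


def toy_score(a: str, b: str) -> int:
    """Hypothetical DNA substitution score: match 2, transition 1, otherwise -1."""
    if a == b:
        return 2
    if {a, b} in ({"A", "G"}, {"C", "T"}):
        return 1
    return -1


def find_occurrence(pattern: str, sequence: str,
                    min_gap: int, max_gap: int,
                    min_score: int) -> Optional[List[int]]:
    """Return the positions of the first gapped occurrence of `pattern`
    in `sequence` whose summed substitution score is >= `min_score`,
    or None if no such occurrence exists.

    Between two consecutive pattern characters, between `min_gap` and
    `max_gap` sequence characters are skipped.
    """
    if not pattern:
        return None

    def extend(i: int, pos: int, score: int, chosen: List[int]) -> Optional[List[int]]:
        # i is the index of the next pattern character to place,
        # pos is the sequence position of the previously placed character.
        if i == len(pattern):
            return chosen if score >= min_score else None
        first = pos + 1 + min_gap
        last = min(pos + 1 + max_gap, len(sequence) - 1)
        for nxt in range(first, last + 1):
            result = extend(i + 1, nxt,
                            score + toy_score(pattern[i], sequence[nxt]),
                            chosen + [nxt])
            if result is not None:
                return result
        return None  # backtrack: no valid continuation from this position

    for start in range(len(sequence)):
        result = extend(1, start, toy_score(pattern[0], sequence[start]), [start])
        if result is not None:
            return result
    return None


# Exact occurrence with exactly one skipped character between pattern characters.
print(find_occurrence("ACG", "ATCTG", 1, 1, 6))
# Approximate occurrence: the final G is replaced by A, accepted at a lower threshold.
print(find_occurrence("ACG", "ATCTA", 1, 1, 5))
```

Lowering `min_score` admits more replaced characters, while setting it to twice the pattern length under these toy scores restricts the search to exact occurrences.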
Detailed description
Author: | Guo, Dan [author]; Yuan, Ermao [author]; Hu, Xuegang [author]; Wu, Xindong [author] |
---|---|
Format: | E-article |
Language: | English |
Published: | 2017 |
Keywords: | Co-occurrence pattern; Pattern mining; Approximate; Gap; Edit distance matrix |
Parent work: | Contained in: Pattern Analysis & Applications - Springer-Verlag, 1999, 21(2017), no. 4, 28 Feb., pp. 977-996 |
Parent work: | volume:21 ; year:2017 ; number:4 ; day:28 ; month:02 ; pages:977-996 |
Links: | Full text (licence required): https://dx.doi.org/10.1007/s10044-017-0609-8 |
DOI / URN: | 10.1007/s10044-017-0609-8 |
Catalog ID: | SPR008217564 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | SPR008217564 | ||
003 | DE-627 | ||
005 | 20201124023810.0 | ||
007 | cr uuu---uuuuu | ||
008 | 201005s2017 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1007/s10044-017-0609-8 |2 doi | |
035 | |a (DE-627)SPR008217564 | ||
035 | |a (SPR)s10044-017-0609-8-e | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Guo, Dan |e verfasserin |4 aut | |
245 | 1 | 0 | |a Co-occurrence pattern mining based on a biological approximation scoring matrix |
264 | 1 | |c 2017 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
520 | |a Abstract: Mining co-occurrence frequency patterns from multiple sequences is a hot topic in bioinformatics. Many seemingly disorganized constituents appear repeatedly under different biological matrices, such as PAM250 and BLOSUM62, and are considered hidden frequent patterns (FPs). A hidden FP that allows both gaps and flexible approximation operations (replacement, deletion or insertion) makes its true occurrences harder to discover. To effectively discover co-occurrence FPs (Co-FPs) under these conditions, we design a mining algorithm (co-fp-miner) using the following steps: (1) a biological approximation scoring matrix is designed to discover various deformations of a single FP; (2) a data-driven intersection tactic is used to generate candidate Co-FPs; (3) a deterministic Apriori-like rule is proposed to prune unnecessary Co-FPs; and (4) a backtracking matching scheme is employed to validate true Co-FPs. The co-fp-miner algorithm is a unified framework for both exact and approximate mining on multiple sequences. Experiments on DNA and protein sequences demonstrate that co-fp-miner is more efficient than its peers in terms of solutions, time and memory consumption. | ||
650 | 4 | |a Co-occurrence pattern |7 (dpeaa)DE-He213 | |
650 | 4 | |a Pattern mining |7 (dpeaa)DE-He213 | |
650 | 4 | |a Approximate |7 (dpeaa)DE-He213 | |
650 | 4 | |a Gap |7 (dpeaa)DE-He213 | |
650 | 4 | |a Edit distance matrix |7 (dpeaa)DE-He213 | |
700 | 1 | |a Yuan, Ermao |e verfasserin |4 aut | |
700 | 1 | |a Hu, Xuegang |e verfasserin |4 aut | |
700 | 1 | |a Wu, Xindong |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Pattern Analysis & Applications |d Springer-Verlag, 1999 |g 21(2017), 4 vom: 28. Feb., Seite 977-996 |w (DE-627)SPR008209189 |7 nnns |
773 | 1 | 8 | |g volume:21 |g year:2017 |g number:4 |g day:28 |g month:02 |g pages:977-996 |
856 | 4 | 0 | |u https://dx.doi.org/10.1007/s10044-017-0609-8 |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_SPRINGER | ||
951 | |a AR | ||
952 | |d 21 |j 2017 |e 4 |b 28 |c 02 |h 977-996 |
author_variant |
d g dg e y ey x h xh x w xw |
---|---|
matchkey_str |
guodanyuanermaohuxuegangwuxindong:2017----:ocurneatrmnnbsdnbooiaapoi |
hierarchy_sort_str |
2017 |
publishDate |
2017 |
language |
English |
source |
Contained in Pattern Analysis & Applications 21(2017), no. 4, 28 Feb., pp. 977-996 volume:21 year:2017 number:4 day:28 month:02 pages:977-996 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Co-occurrence pattern Pattern mining Approximate Gap Edit distance matrix |
isfreeaccess_bool |
false |
container_title |
Pattern Analysis & Applications |
authorswithroles_txt_mv |
Guo, Dan @@aut@@ Yuan, Ermao @@aut@@ Hu, Xuegang @@aut@@ Wu, Xindong @@aut@@ |
publishDateDaySort_date |
2017-02-28T00:00:00Z |
hierarchy_top_id |
SPR008209189 |
id |
SPR008217564 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">SPR008217564</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20201124023810.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">201005s2017 xx |||||o 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10044-017-0609-8</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)SPR008217564</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(SPR)s10044-017-0609-8-e</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Guo, Dan</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Co-occurrence pattern mining based on a biological approximation scoring matrix</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2017</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract 
Mining co-occurrence frequency patterns from multiple sequences is a hot topic in bioinformatics. Many seemingly disorganized constituents appear repeatedly under different biological matrices, such as PAM250 and BLOSUM62, and are considered hidden frequent patterns (FPs). A hidden FP that allows both gaps and flexible approximation operations (replacement, deletion or insertion) makes its true occurrences harder to discover. To effectively discover co-occurrence FPs (Co-FPs) under these conditions, we design a mining algorithm (co-fp-miner) using the following steps: (1) a biological approximation scoring matrix is designed to discover various deformations of a single FP; (2) a data-driven intersection tactic is used to generate candidate Co-FPs; (3) a deterministic Apriori-like rule is proposed to prune unnecessary Co-FPs; and (4) a backtracking matching scheme is employed to validate true Co-FPs. The co-fp-miner algorithm is a unified framework for both exact and approximate mining on multiple sequences.
Experiments on DNA and protein sequences demonstrate that co-fp-miner is more efficient than its peers in terms of solutions, time and memory consumption.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Co-occurrence pattern</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Pattern mining</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Approximate</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Gap</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Edit distance matrix</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yuan, Ermao</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hu, Xuegang</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wu, Xindong</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Pattern Analysis &amp; Applications</subfield><subfield code="d">Springer-Verlag, 1999</subfield><subfield code="g">21(2017), 4 vom: 28. Feb., Seite 977-996</subfield><subfield code="w">(DE-627)SPR008209189</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:21</subfield><subfield code="g">year:2017</subfield><subfield code="g">number:4</subfield><subfield code="g">day:28</subfield><subfield code="g">month:02</subfield><subfield code="g">pages:977-996</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://dx.doi.org/10.1007/s10044-017-0609-8</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_SPRINGER</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">21</subfield><subfield code="j">2017</subfield><subfield code="e">4</subfield><subfield code="b">28</subfield><subfield code="c">02</subfield><subfield code="h">977-996</subfield></datafield></record></collection>
|
author |
Guo, Dan |
spellingShingle |
Guo, Dan misc Co-occurrence pattern misc Pattern mining misc Approximate misc Gap misc Edit distance matrix Co-occurrence pattern mining based on a biological approximation scoring matrix |
authorStr |
Guo, Dan |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)SPR008209189 |
format |
electronic Article |
delete_txt_mv |
keep |
author_role |
aut aut aut aut |
collection |
springer |
remote_str |
true |
illustrated |
Not Illustrated |
topic_title |
Co-occurrence pattern mining based on a biological approximation scoring matrix Co-occurrence pattern (dpeaa)DE-He213 Pattern mining (dpeaa)DE-He213 Approximate (dpeaa)DE-He213 Gap (dpeaa)DE-He213 Edit distance matrix (dpeaa)DE-He213 |
topic |
misc Co-occurrence pattern misc Pattern mining misc Approximate misc Gap misc Edit distance matrix |
format_facet |
Elektronische Aufsätze Aufsätze Elektronische Ressource |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
cr |
hierarchy_parent_title |
Pattern Analysis & Applications |
hierarchy_parent_id |
SPR008209189 |
hierarchy_top_title |
Pattern Analysis & Applications |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)SPR008209189 |
title |
Co-occurrence pattern mining based on a biological approximation scoring matrix |
ctrlnum |
(DE-627)SPR008217564 (SPR)s10044-017-0609-8-e |
title_full |
Co-occurrence pattern mining based on a biological approximation scoring matrix |
author_sort |
Guo, Dan |
journal |
Pattern Analysis & Applications |
journalStr |
Pattern Analysis & Applications |
lang_code |
eng |
isOA_bool |
false |
recordtype |
marc |
publishDateSort |
2017 |
contenttype_str_mv |
txt |
container_start_page |
977 |
author_browse |
Guo, Dan Yuan, Ermao Hu, Xuegang Wu, Xindong |
container_volume |
21 |
format_se |
Elektronische Aufsätze |
author-letter |
Guo, Dan |
doi_str_mv |
10.1007/s10044-017-0609-8 |
author2-role |
verfasserin |
title_sort |
co-occurrence pattern mining based on a biological approximation scoring matrix |
title_auth |
Co-occurrence pattern mining based on a biological approximation scoring matrix |
abstract |
Abstract: Mining co-occurrence frequency patterns from multiple sequences is a hot topic in bioinformatics. Many seemingly disorganized constituents appear repeatedly under different biological matrices, such as PAM250 and BLOSUM62, and are considered hidden frequent patterns (FPs). A hidden FP that allows both gaps and flexible approximation operations (replacement, deletion or insertion) makes its true occurrences harder to discover. To effectively discover co-occurrence FPs (Co-FPs) under these conditions, we design a mining algorithm (co-fp-miner) using the following steps: (1) a biological approximation scoring matrix is designed to discover various deformations of a single FP; (2) a data-driven intersection tactic is used to generate candidate Co-FPs; (3) a deterministic Apriori-like rule is proposed to prune unnecessary Co-FPs; and (4) a backtracking matching scheme is employed to validate true Co-FPs. The co-fp-miner algorithm is a unified framework for both exact and approximate mining on multiple sequences. Experiments on DNA and protein sequences demonstrate that co-fp-miner is more efficient than its peers in terms of solutions, time and memory consumption. |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_SPRINGER |
container_issue |
4 |
title_short |
Co-occurrence pattern mining based on a biological approximation scoring matrix |
url |
https://dx.doi.org/10.1007/s10044-017-0609-8 |
remote_bool |
true |
author2 |
Yuan, Ermao Hu, Xuegang Wu, Xindong |
author2Str |
Yuan, Ermao Hu, Xuegang Wu, Xindong |
ppnlink |
SPR008209189 |
mediatype_str_mv |
c |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s10044-017-0609-8 |
up_date |
2024-07-03T18:01:41.456Z |
_version_ |
1803581864233926656 |
score |
7.4003353 |
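The `fullrecord` field of this record is MARCXML in the `http://www.loc.gov/MARC21/slim` namespace, so individual values can be pulled out by tag and subfield code. The snippet below is a minimal sketch using only Python's standard library. `SAMPLE` is a trimmed copy of a few fields from the record (not the complete record), and the helper name `subfields` is a hypothetical choice for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") records.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Trimmed excerpt of the record: identifier, DOI, authors and title only.
SAMPLE = (
    '<collection xmlns="http://www.loc.gov/MARC21/slim"><record>'
    '<controlfield tag="001">SPR008217564</controlfield>'
    '<datafield tag="024" ind1="7" ind2=" ">'
    '<subfield code="a">10.1007/s10044-017-0609-8</subfield>'
    '<subfield code="2">doi</subfield></datafield>'
    '<datafield tag="100" ind1="1" ind2=" ">'
    '<subfield code="a">Guo, Dan</subfield></datafield>'
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">Co-occurrence pattern mining based on a '
    'biological approximation scoring matrix</subfield></datafield>'
    '<datafield tag="700" ind1="1" ind2=" ">'
    '<subfield code="a">Yuan, Ermao</subfield></datafield>'
    '<datafield tag="700" ind1="1" ind2=" ">'
    '<subfield code="a">Hu, Xuegang</subfield></datafield>'
    '<datafield tag="700" ind1="1" ind2=" ">'
    '<subfield code="a">Wu, Xindong</subfield></datafield>'
    '</record></collection>'
)


def subfields(record: ET.Element, tag: str, code: str) -> list:
    """Return the text of every subfield `code` in every datafield `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text or "" for sf in record.findall(path, NS)]


record = ET.fromstring(SAMPLE).find("marc:record", NS)

doi = subfields(record, "024", "a")
title = subfields(record, "245", "a")
# Field 100 holds the first author, field 700 the additional authors.
authors = subfields(record, "100", "a") + subfields(record, "700", "a")

print(doi, title, authors)
```

Note that a literal `&` inside element text, as in the journal title, has to be written as `&amp;` for an XML parser to accept the record.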