The impact of JavaScript on archivability
Abstract: As web technologies evolve, web archivists work to adapt so that digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts (Ajax) that, for example, load data without a change in top-level Universal Resource Identifier (URI) or require us...
Detailed description
Author: |
Brunelle, Justin F. [author] |
---|
Format: |
Article |
---|---|
Language: |
English |
Published: |
2015 |
---|
Subjects: |
Web architecture ; Web archiving ; Digital preservation |
---|
Note: |
© Springer-Verlag Berlin Heidelberg 2015 |
---|
Parent work: |
Contained in: International journal on digital libraries - Springer Berlin Heidelberg, 1997, 17(2015), 2, 25 Jan., pages 95-117 |
---|---|
Parent work: |
volume:17 ; year:2015 ; number:2 ; day:25 ; month:01 ; pages:95-117 |
Links: |
---|
DOI / URN: |
10.1007/s00799-015-0140-8 |
---|
Catalog ID: |
OLC2051434905 |
---|
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2051434905 | ||
003 | DE-627 | ||
005 | 20230502153701.0 | ||
007 | tu | ||
008 | 200819s2015 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s00799-015-0140-8 |2 doi | |
035 | |a (DE-627)OLC2051434905 | ||
035 | |a (DE-He213)s00799-015-0140-8-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 020 |a 004 |q VZ |
084 | |a 24,1 |2 ssgn | ||
100 | 1 | |a Brunelle, Justin F. |e verfasserin |4 aut | |
245 | 1 | 0 | |a The impact of JavaScript on archivability |
264 | 1 | |c 2015 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © Springer-Verlag Berlin Heidelberg 2015 | ||
520 | |a Abstract As web technologies evolve, web archivists work to adapt so that digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts (Ajax) that, for example, load data without a change in top level Universal Resource Identifier (URI) or require user interaction (e.g., content loading via Ajax when the page has scrolled). These advances have made automating methods for capturing web pages more difficult. In an effort to understand why mementos (archived versions of live resources) in today’s archives vary in completeness and sometimes pull content from the live web, we present a study of web resources and archival tools. We used a collection of URIs shared over Twitter and a collection of URIs curated by Archive-It in our investigation. We created local archived versions of the URIs from the Twitter and Archive-It sets using WebCite, wget, and the Heritrix crawler. We found that only 4.2 % of the Twitter collection is perfectly archived by all of these tools, while 34.2 % of the Archive-It collection is perfectly archived. After studying the quality of these mementos, we identified the practice of loading resources via JavaScript (Ajax) as the source of archival difficulty. Further, we show that resources are increasing their use of JavaScript to load embedded resources. By 2012, over half (54.5 %) of pages use JavaScript to load embedded resources. The number of embedded resources loaded via JavaScript has increased by 12.0 % from 2005 to 2012. We also show that JavaScript is responsible for 33.2 % more missing resources in 2012 than in 2005. This shows that JavaScript is responsible for an increasing proportion of the embedded resources unsuccessfully loaded by mementos. JavaScript is also responsible for 52.7 % of all missing embedded resources in our study. | ||
650 | 4 | |a Web architecture | |
650 | 4 | |a Web archiving | |
650 | 4 | |a Digital preservation | |
700 | 1 | |a Kelly, Mat |4 aut | |
700 | 1 | |a Weigle, Michele C. |4 aut | |
700 | 1 | |a Nelson, Michael L. |4 aut | |
773 | 0 | 8 | |i Enthalten in |t International journal on digital libraries |d Springer Berlin Heidelberg, 1997 |g 17(2015), 2 vom: 25. Jan., Seite 95-117 |w (DE-627)223267902 |w (DE-600)1357321-4 |w (DE-576)059412127 |x 1432-5012 |7 nnns |
773 | 1 | 8 | |g volume:17 |g year:2015 |g number:2 |g day:25 |g month:01 |g pages:95-117 |
856 | 4 | 1 | |u https://doi.org/10.1007/s00799-015-0140-8 |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-MAT | ||
912 | |a SSG-OLC-BUB | ||
912 | |a SSG-OPC-BBI | ||
912 | |a GBV_ILN_11 | ||
912 | |a GBV_ILN_70 | ||
912 | |a GBV_ILN_2018 | ||
912 | |a GBV_ILN_4277 | ||
951 | |a AR | ||
952 | |d 17 |j 2015 |e 2 |b 25 |c 01 |h 95-117 |
author_variant |
j f b jf jfb m k mk m c w mc mcw m l n ml mln |
---|---|
matchkey_str |
article:14325012:2015----::hipcojvsrpoac |
hierarchy_sort_str |
2015 |
publishDate |
2015 |
language |
English |
source |
Enthalten in International journal on digital libraries 17(2015), 2 vom: 25. Jan., Seite 95-117 volume:17 year:2015 number:2 day:25 month:01 pages:95-117 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Web architecture Web archiving Digital preservation |
dewey-raw |
020 |
isfreeaccess_bool |
false |
container_title |
International journal on digital libraries |
authorswithroles_txt_mv |
Brunelle, Justin F. @@aut@@ Kelly, Mat @@aut@@ Weigle, Michele C. @@aut@@ Nelson, Michael L. @@aut@@ |
publishDateDaySort_date |
2015-01-25T00:00:00Z |
hierarchy_top_id |
223267902 |
dewey-sort |
220 |
id |
OLC2051434905 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2051434905</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230502153701.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2015 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s00799-015-0140-8</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2051434905</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s00799-015-0140-8-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">020</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Brunelle, Justin F.</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">The impact of JavaScript on archivability</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2015</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield 
code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Springer-Verlag Berlin Heidelberg 2015</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract As web technologies evolve, web archivists work to adapt so that digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts (Ajax) that, for example, load data without a change in top level Universal Resource Identifier (URI) or require user interaction (e.g., content loading via Ajax when the page has scrolled). These advances have made automating methods for capturing web pages more difficult. In an effort to understand why mementos (archived versions of live resources) in today’s archives vary in completeness and sometimes pull content from the live web, we present a study of web resources and archival tools. We used a collection of URIs shared over Twitter and a collection of URIs curated by Archive-It in our investigation. We created local archived versions of the URIs from the Twitter and Archive-It sets using WebCite, wget, and the Heritrix crawler. We found that only 4.2 % of the Twitter collection is perfectly archived by all of these tools, while 34.2 % of the Archive-It collection is perfectly archived. After studying the quality of these mementos, we identified the practice of loading resources via JavaScript (Ajax) as the source of archival difficulty. Further, we show that resources are increasing their use of JavaScript to load embedded resources. By 2012, over half (54.5 %) of pages use JavaScript to load embedded resources. The number of embedded resources loaded via JavaScript has increased by 12.0 % from 2005 to 2012. We also show that JavaScript is responsible for 33.2 % more missing resources in 2012 than in 2005. 
This shows that JavaScript is responsible for an increasing proportion of the embedded resources unsuccessfully loaded by mementos. JavaScript is also responsible for 52.7 % of all missing embedded resources in our study.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Web architecture</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Web archiving</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Digital preservation</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Kelly, Mat</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Weigle, Michele C.</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Nelson, Michael L.</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">International journal on digital libraries</subfield><subfield code="d">Springer Berlin Heidelberg, 1997</subfield><subfield code="g">17(2015), 2 vom: 25. 
Jan., Seite 95-117</subfield><subfield code="w">(DE-627)223267902</subfield><subfield code="w">(DE-600)1357321-4</subfield><subfield code="w">(DE-576)059412127</subfield><subfield code="x">1432-5012</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:17</subfield><subfield code="g">year:2015</subfield><subfield code="g">number:2</subfield><subfield code="g">day:25</subfield><subfield code="g">month:01</subfield><subfield code="g">pages:95-117</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s00799-015-0140-8</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_11</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2018</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4277</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">17</subfield><subfield code="j">2015</subfield><subfield code="e">2</subfield><subfield code="b">25</subfield><subfield code="c">01</subfield><subfield 
code="h">95-117</subfield></datafield></record></collection>
|
author |
Brunelle, Justin F. |
spellingShingle |
Brunelle, Justin F. ddc 020 ssgn 24,1 misc Web architecture misc Web archiving misc Digital preservation The impact of JavaScript on archivability |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)223267902 |
format |
Article |
dewey-ones |
020 - Library & information sciences 004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut aut aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
1432-5012 |
topic_title |
020 004 VZ 24,1 ssgn The impact of JavaScript on archivability Web architecture Web archiving Digital preservation |
topic |
ddc 020 ssgn 24,1 misc Web architecture misc Web archiving misc Digital preservation |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
International journal on digital libraries |
hierarchy_parent_id |
223267902 |
dewey-tens |
020 - Library & information sciences 000 - Computer science, knowledge & systems |
hierarchy_top_title |
International journal on digital libraries |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)223267902 (DE-600)1357321-4 (DE-576)059412127 |
title |
The impact of JavaScript on archivability |
ctrlnum |
(DE-627)OLC2051434905 (DE-He213)s00799-015-0140-8-p |
title_full |
The impact of JavaScript on archivability |
author_sort |
Brunelle, Justin F. |
journal |
International journal on digital libraries |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2015 |
contenttype_str_mv |
txt |
container_start_page |
95 |
author_browse |
Brunelle, Justin F. Kelly, Mat Weigle, Michele C. Nelson, Michael L. |
container_volume |
17 |
class |
020 004 VZ 24,1 ssgn |
format_se |
Aufsätze |
author-letter |
Brunelle, Justin F. |
doi_str_mv |
10.1007/s00799-015-0140-8 |
dewey-full |
020 004 |
title_sort |
the impact of javascript on archivability |
title_auth |
The impact of JavaScript on archivability |
abstract |
As web technologies evolve, web archivists work to adapt so that digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts (Ajax) that, for example, load data without a change in top-level Universal Resource Identifier (URI) or require user interaction (e.g., content loading via Ajax when the page has scrolled). These advances have made automating methods for capturing web pages more difficult. In an effort to understand why mementos (archived versions of live resources) in today’s archives vary in completeness and sometimes pull content from the live web, we present a study of web resources and archival tools. We used a collection of URIs shared over Twitter and a collection of URIs curated by Archive-It in our investigation. We created local archived versions of the URIs from the Twitter and Archive-It sets using WebCite, wget, and the Heritrix crawler. We found that only 4.2 % of the Twitter collection is perfectly archived by all of these tools, while 34.2 % of the Archive-It collection is perfectly archived. After studying the quality of these mementos, we identified the practice of loading resources via JavaScript (Ajax) as the source of archival difficulty. Further, we show that resources are increasing their use of JavaScript to load embedded resources. By 2012, over half (54.5 %) of pages use JavaScript to load embedded resources. The number of embedded resources loaded via JavaScript has increased by 12.0 % from 2005 to 2012. We also show that JavaScript is responsible for 33.2 % more missing resources in 2012 than in 2005. This shows that JavaScript is responsible for an increasing proportion of the embedded resources unsuccessfully loaded by mementos. JavaScript is also responsible for 52.7 % of all missing embedded resources in our study. © Springer-Verlag Berlin Heidelberg 2015 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OLC-BUB SSG-OPC-BBI GBV_ILN_11 GBV_ILN_70 GBV_ILN_2018 GBV_ILN_4277 |
container_issue |
2 |
title_short |
The impact of JavaScript on archivability |
url |
https://doi.org/10.1007/s00799-015-0140-8 |
remote_bool |
false |
author2 |
Kelly, Mat; Weigle, Michele C.; Nelson, Michael L. |
author2Str |
Kelly, Mat; Weigle, Michele C.; Nelson, Michael L. |
ppnlink |
223267902 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s00799-015-0140-8 |
up_date |
2024-07-04T04:25:37.230Z |
_version_ |
1803621118495424512 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2051434905</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230502153701.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2015 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s00799-015-0140-8</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2051434905</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s00799-015-0140-8-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">020</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Brunelle, Justin F.</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">The impact of JavaScript on archivability</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2015</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield 
code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Springer-Verlag Berlin Heidelberg 2015</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract As web technologies evolve, web archivists work to adapt so that digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts (Ajax) that, for example, load data without a change in top level Universal Resource Identifier (URI) or require user interaction (e.g., content loading via Ajax when the page has scrolled). These advances have made automating methods for capturing web pages more difficult. In an effort to understand why mementos (archived versions of live resources) in today’s archives vary in completeness and sometimes pull content from the live web, we present a study of web resources and archival tools. We used a collection of URIs shared over Twitter and a collection of URIs curated by Archive-It in our investigation. We created local archived versions of the URIs from the Twitter and Archive-It sets using WebCite, wget, and the Heritrix crawler. We found that only 4.2 % of the Twitter collection is perfectly archived by all of these tools, while 34.2 % of the Archive-It collection is perfectly archived. After studying the quality of these mementos, we identified the practice of loading resources via JavaScript (Ajax) as the source of archival difficulty. Further, we show that resources are increasing their use of JavaScript to load embedded resources. By 2012, over half (54.5 %) of pages use JavaScript to load embedded resources. The number of embedded resources loaded via JavaScript has increased by 12.0 % from 2005 to 2012. We also show that JavaScript is responsible for 33.2 % more missing resources in 2012 than in 2005. 
This shows that JavaScript is responsible for an increasing proportion of the embedded resources unsuccessfully loaded by mementos. JavaScript is also responsible for 52.7 % of all missing embedded resources in our study.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Web architecture</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Web archiving</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Digital preservation</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Kelly, Mat</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Weigle, Michele C.</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Nelson, Michael L.</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">International journal on digital libraries</subfield><subfield code="d">Springer Berlin Heidelberg, 1997</subfield><subfield code="g">17(2015), 2 vom: 25. 
Jan., Seite 95-117</subfield><subfield code="w">(DE-627)223267902</subfield><subfield code="w">(DE-600)1357321-4</subfield><subfield code="w">(DE-576)059412127</subfield><subfield code="x">1432-5012</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:17</subfield><subfield code="g">year:2015</subfield><subfield code="g">number:2</subfield><subfield code="g">day:25</subfield><subfield code="g">month:01</subfield><subfield code="g">pages:95-117</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s00799-015-0140-8</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_11</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2018</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4277</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">17</subfield><subfield code="j">2015</subfield><subfield code="e">2</subfield><subfield code="b">25</subfield><subfield code="c">01</subfield><subfield 
code="h">95-117</subfield></datafield></record></collection>
|