Reborn digital and black box – impact of archiving processes on holdings of Web archives
DOI: https://doi.org/10.12775/AKZ.2019.008
Keywords
Web archiving, Web archives, digital sources, digital resources, Web history, reborn digital, black box
Abstract
The article examines the general characteristics of the holdings of various Web archives. Understanding the problem posed in the title seems crucial for reflecting on this new type of source and for using it in research. A user who wants to become familiar with the old Web must know what is stored in digital repositories of this kind and what characterizes their holdings. The problem is presented on two levels, related to two stages of archiving: selection and acquisition. The first aspect, theoretical in character, stems mostly from gathering sources by harvesting (with crawlers). The crawlers’ capabilities and limitations determine what will be archived and in what form. It must be noted that this can lead to a certain deformation of Web sources, so that after archiving they are not exactly what they were before. The second aspect, practical in character, is an effect of selection, i.e. all the decisions made by an archive’s employees before gathering starts. These decisions include, among others, specifying the aim and scope of archiving and choosing strategies to accomplish them. The text presents two basic strategies: mass archiving and selective archiving. An important obstacle for users of Web archives is the lack of information about selection criteria or crawlers’ logs. Holdings of the old Web can be something of a mystery, because one cannot always describe what they contain, what they lack, and why.