Internet Histories
Digital Technology, Culture and Society
Volume 7, 2023 - Issue 4
Research Articles

Sorting URLs out: seeing the web through infrastructural inversion of archival crawling

Pages 386-401 | Received 13 Apr 2023, Accepted 11 Sep 2023, Published online: 16 Sep 2023

References

  • Acker, A. (2015). How cells became records: Standardization and infrastructure in tissue culture. Archival Science, 15(1), 1–24. https://doi.org/10.1007/s10502-013-9213-x
  • Ainsworth, S. G., Nelson, M. L., & Van de Sompel, H. (2015). Only one out of five archived web pages existed as presented. Proceedings of the 26th ACM Conference on Hypertext & Social Media (pp. 257–266). https://doi.org/10.1145/2700171.2791044
  • Ben-David, A., & Amram, A. (2018). The Internet Archive and the socio-technical construction of historical facts. Internet Histories, 2(1-2), 179–201. https://doi.org/10.1080/24701475.2018.1455412
  • Bowker, G. C. (1994). Science on the run: Information management and industrial geophysics at Schlumberger, 1920-1940. MIT Press.
  • Bowker, G. C., & Star, S. L. (2000). Sorting things out: Classification and its consequences. MIT Press.
  • Brilmyer, G. (2018). Archival assemblages: Applying disability studies’ political/relational model to archival description. Archival Science, 18(2), 95–118. https://doi.org/10.1007/s10502-018-9287-6
  • Brügger, N. (2018). The archived web: Doing history in the digital age. MIT Press.
  • Brügger, N. (Ed.). (2017). Web 25: Histories from the first 25 years of the World Wide Web. Peter Lang.
  • Brügger, N., & Milligan, I. (Eds.) (2018). The Sage handbook of web history (1st ed.). SAGE.
  • Brunelle, J. F., Kelly, M., Weigle, M. C., & Nelson, M. L. (2016). The impact of JavaScript on archivability. International Journal on Digital Libraries, 17(2), 95–117. https://doi.org/10.1007/s00799-015-0140-8
  • Caswell, M., Punzalan, R., & Sangwand, T.-K. (2017). Critical archival studies: An introduction. Journal of Critical Library and Information Studies, 1(2), 1–8. https://doi.org/10.24242/jclis.v1i2.50
  • Costa, M., Gomes, D., & Silva, M. J. (2017). The evolution of web archiving. International Journal on Digital Libraries, 18(3), 191–205. https://doi.org/10.1007/s00799-016-0171-9
  • Edwards, P. N., Jackson, S. J., Bowker, G. C., & Knobel, C. P. (2007). Understanding infrastructure: Dynamics, tensions and design
  • Gilliland, A. (2011). Neutrality, social justice and the obligations of archival education and educators in the twenty-first century. Archival Science, 11(3-4), 193–209. https://doi.org/10.1007/s10502-011-9147-0
  • Hegarty, K. (2022). The invention of the archived web: Tracing the influence of library frameworks on web archiving infrastructure. Internet Histories, 6(4), 432–451. https://doi.org/10.1080/24701475.2022.2103988
  • Jackson, S. J. (2014). Rethinking repair. In T. Gillespie, P. J. Boczkowski, & K. A. Foot (Eds.), Media technologies: Essays on communication, materiality, and society. MIT Press. https://doi.org/10.7551/mitpress/9780262525374.001.0001
  • Justie, B. (2021). Little history of CAPTCHA. Internet Histories, 5(1), 30–47. https://doi.org/10.1080/24701475.2020.1831197
  • Kelly, M., Brunelle, J. F., Weigle, M. C., & Nelson, M. L. (2013). On the change in archivability of websites over time. In T. Aalberg, C. Papatheodorou, M. Dobreva, G. Tsakonas, & C. J. Farrugia (Eds.), Research and advanced technology for digital libraries (Vol. 8092, pp. 35–47). Springer. https://doi.org/10.1007/978-3-642-40501-3_5
  • Laursen, D., & Møldrup-Dalum, P. (2017). Looking back, looking forward: 10 years of development to collect, preserve, and access the Danish web. In N. Brügger (Ed.), Web 25 (pp. 207–227). Peter Lang.
  • Loukissas, Y. A. (2019). Collecting infrastructures. In All data are local: Thinking critically in a data-driven society. MIT Press. https://doi.org/10.7551/mitpress/11543.001
  • Maemura, E. (2023). All WARC and no playback: The materialities of data-centered web archives research. Big Data & Society, 10(1). https://doi.org/10.1177/20539517231163172
  • Maemura, E., Worby, N., Milligan, I., & Becker, C. (2018). If these crawls could talk: Studying and documenting web archives provenance. Journal of the Association for Information Science and Technology, 69(10), 1223–1233. https://doi.org/10.1002/asi.24048
  • Mayernik, M. S., Wallis, J. C., & Borgman, C. L. (2013). Unearthing the infrastructure: Humans and sensors in field-based scientific research. Computer Supported Cooperative Work (CSCW), 22(1), 65–101. https://doi.org/10.1007/s10606-012-9178-y
  • Milligan, I. (2019). History in the age of abundance?: How the web is transforming historical research. McGill-Queen’s University Press.
  • Nadim, T. (2016). Data labours: How the sequence databases GenBank and EMBL-bank make data. Science as Culture, 25(4), 496–519. https://doi.org/10.1080/09505431.2016.1189894
  • Ogden, J. (2022). “Everything on the internet can be saved”: Archive Team, Tumblr and the cultural significance of web archiving. Internet Histories, 6(1-2), 113–132. https://doi.org/10.1080/24701475.2021.1985835
  • Ogden, J., Halford, S., & Carr, L. (2017). Observing web archives: The case for an ethnographic study of web archiving. Proceedings of the 2017 ACM on Web Science Conference (pp. 299–308). https://doi.org/10.1145/3091478.3091506
  • Paris, B. S., Cath, C., & West, S. M. (2023). Radical infrastructure: Building beyond the failures of past imaginaries for networked communication. New Media & Society. Advance online publication. https://doi.org/10.1177/14614448231152546
  • Plantin, J.-C., Lagoze, C., Edwards, P. N., & Sandvig, C. (2018). Infrastructure studies meet platform studies in the age of Google and Facebook. New Media & Society, 20(1), 293–310. https://doi.org/10.1177/1461444816661553
  • Praetzellis, M. (2022, December 29). Identify and avoid crawler traps. Archive-It Help Center. https://support.archive-it.org/hc/en-us/articles/208332943-Identify-and-avoid-crawler-traps-
  • Punzalan, R. L., & Caswell, M. (2016). Critical directions for archival approaches to social justice. The Library Quarterly, 86(1), 25–42. https://doi.org/10.1086/684145
  • Ribes, D., & Jackson, S. J. (2013). Data bite man: The work of sustaining a long-term study. In L. Gitelman (Ed.), “Raw data” is an oxymoron (pp. 147–166). MIT Press.
  • Sandvig, C. (2013). The internet as infrastructure. In W. H. Dutton (Ed.), The Oxford handbook of Internet studies (1st ed., pp. 86–108). Oxford University Press.
  • Schostag, S., & Fønss-Jørgensen, E. (2012). Webarchiving: Legal deposit of Internet in Denmark. A curatorial perspective. Microform & Digitization Review, 41(3-4), 110–120. https://doi.org/10.1515/mir-2012-0018
  • Star, S. L., & Griesemer, J. R. (1989). Institutional ecology, ‘translations’ and boundary objects: Amateurs and professionals in Berkeley’s museum of vertebrate zoology, 1907-39. Social Studies of Science, 19(3), 387–420. https://doi.org/10.1177/030631289019003001
  • Star, S. L., & Ruhleder, K. (1996). Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research, 7(1), 111–134. https://doi.org/10.1287/isre.7.1.111
  • Summers, E. (2020). Appraisal talk in web archives. Archivaria, 89, 70–102.
  • Summers, E., & Punzalan, R. (2017). Bots, seeds and people: Web archives as infrastructure. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (pp. 821–834). https://doi.org/10.1145/2998181.2998345
  • Sundin, O. (2011). Janitors of knowledge: Constructing knowledge in the everyday life of Wikipedia editors. Journal of Documentation, 67(5), 840–862. https://doi.org/10.1108/00220411111164709
  • ten Oever, N. (2023). Norm conflict in the governance of transnational and distributed infrastructures: The case of Internet routing. Globalizations, 20(1), 184–200. https://doi.org/10.1080/14747731.2021.1953221
  • Thomer, A. K., Starks, J. R., Rayburn, A., & Lenard, M. C. (2022). Maintaining repositories, databases, and digital collections in memory institutions: An integrative review. Proceedings of the Association for Information Science and Technology, 59(1), 310–323. https://doi.org/10.1002/pra2.755
  • Thylstrup, N. B. (2018). The politics of mass digitization. MIT Press.
  • Trace, C. B. (2022). Archives, information infrastructure, and maintenance work. Digital Humanities Quarterly, 16(1).