Data Here and There: Studying Web Archives Research Infrastructures in Danish and Canadian Settings. Maemura, E. Ph.D. Thesis, November, 2021. Accepted: 2021-11-30T19:39:13Z
Web archives collections are an important site for preserving digital cultural heritage, capturing online resources and the underlying data of web pages and websites over time. This dissertation explores the data practices involved in the curation and scholarly uses of archived web data, and the emerging humanities research infrastructures that support this work. I focus on how data are created by and interact with multiple, varied, conflicting and often opaque systems from collection through to analysis and visualization, involving a range of actors and systems. Through interviews and observations in two settings centred on emerging research infrastructures from national libraries and academic libraries, I explore how web archiving processes involve technical systems that impose or manipulate data structures, as well as policies, legal restrictions, and a range of decisions and interpretations of the data made by the people involved in curation practices. My analysis is guided by the approach of infrastructure studies towards describing, documenting and contextualizing archived web data at varying scales and through their sociotechnical relationships. By studying these settings ethnographically, I seek to understand how data produced here are different from data produced there; in spite of the use of similar technologies and standard formats of data, I reveal how the particular contexts of these research infrastructures serve to shape the data found in each. I highlight the role of categories and classifications both in the construction of digital materials and the design and tailoring of computational systems. Ultimately, I argue that one cannot study, describe, or otherwise work critically with data without also understanding the classification systems and infrastructures through which data categories are developed and applied. 
It is in fact these categories of data that determine judgments and decision-making, and impact movement and legibility between systems and contexts; digital curation practices are in essence categorical work. I conclude by highlighting contributions to digital curation theory and practice and new directions for emerging work in critical data studies to consider how data categories are compatible or incompatible, and the complex sociotechnical relationships between categories, meanings and uses in data-centered algorithmic systems.
@phdthesis{maemura_data_2021,
	type = {Thesis},
	title = {Data {Here} and {There}: {Studying} {Web} {Archives} {Research} {Infrastructures} in {Danish} and {Canadian} {Settings}},
	shorttitle = {Data {Here} and {There}},
	url = {https://tspace.library.utoronto.ca/handle/1807/109294},
	abstract = {Web archives collections are an important site for preserving digital cultural heritage, capturing online resources and the underlying data of web pages and websites over time. This dissertation explores the data practices involved in the curation and scholarly uses of archived web data, and the emerging humanities research infrastructures that support this work. I focus on how data are created by and interact with multiple, varied, conflicting and often opaque systems from collection through to analysis and visualization, involving a range of actors and systems. Through interviews and observations in two settings centred on emerging research infrastructures from national libraries and academic libraries, I explore how web archiving processes involve technical systems that impose or manipulate data structures, as well as policies, legal restrictions, and a range of decisions and interpretations of the data made by the people involved in curation practices. 
My analysis is guided by the approach of infrastructure studies towards describing, documenting and contextualizing archived web data at varying scales and through their sociotechnical relationships. By studying these settings ethnographically, I seek to understand how data produced here are different from data produced there; in spite of the use of similar technologies and standard formats of data, I reveal how the particular contexts of these research infrastructures serve to shape the data found in each. I highlight the role of categories and classifications both in the construction of digital materials and the design and tailoring of computational systems. Ultimately, I argue that one cannot study, describe, or otherwise work critically with data without also understanding the classification systems and infrastructures through which data categories are developed and applied. It is in fact these categories of data that determine judgments and decision-making, and impact movement and legibility between systems and contexts; digital curation practices are in essence categorical work. I conclude by highlighting contributions to digital curation theory and practice and new directions for emerging work in critical data studies to consider how data categories are compatible or incompatible, and the complex sociotechnical relationships between categories, meanings and uses in data-centered algorithmic systems.},
	language = {en},
	urldate = {2021-12-02},
	author = {Maemura, Emily},
	school = {University of Toronto},
	month = nov,
	year = {2021},
	note = {Accepted: 2021-11-30T19:39:13Z},
}
