Care, collaboration, and service in academic data work: biocuration as ‘academia otherwise’

Pages 683-701 | Received 31 Jul 2023, Accepted 01 Feb 2024, Published online: 19 Feb 2024

ABSTRACT

This paper discusses the emergent field of biocuration, taking it as a case of academic data work. Biocurators organise, manage, and enrich the now vast quantities of data that are produced by the contemporary biosciences, but their work remains largely invisible to the scholarly communities that make use of it. Based on ethnographic engagement with the field and interviews with biocurators, and mobilising conceptual frames of care and epistemic justice, we examine how biocurators frame their data practices, arguing that biocuration can, in emphasising collaboration and care, be seen as an ‘academia otherwise’ that resists dominant narratives of scholarly excellence. At the same time this explicit framing of data work as care work involves a ‘dark side’ that elides the epistemic labour involved in it. In closing we suggest that engagement with biocuration leads us to attend to the ways in which care work constitutes technoscientific knowledge, and to the epistemic contributions it may make.

You go to a conference, and I’ve been called like some kind of gnome, that fills the database at night, and [users] didn’t know how this happened, that there was data there in the morning. It’s like oh, you’re the one! So we’re quite invisible … they don’t always see what you do, the people using your work (Speaker at a biocuration conference, 2021)

This paper engages with a case of academic data work – data labours that are located in, and serve, academic scholarship (Nadim, Citation2016). Specifically, we discuss data work in the context of the biosciences, in the form of an emergent field known as biocuration, an area of practice that involves ‘identifying, organising, correcting, annotating, standardising, and enriching biological data’ (Tang et al., Citation2019, p. 1). As the opening quote suggests, biocuration is under-recognised, to the extent that it is often invisible to its end users. It therefore forms part of a long lineage of invisibilised or devalued data work (Irani, Citation2019). In addition to its under-recognition, in discussing biocuration we will be particularly concerned with the ways in which it is framed and practised as care work. Biocurators often understand their activities through notions of service, being care-ful, and ‘helping’, and as defined by attentiveness to detail and nuance (Gabrielsen, Citation2023). In the context of contemporary academic work biocuration therefore offers an ‘academia otherwise’ in which such practices are valued over, for instance, publishing or individual ‘excellence’. However, we will also discuss how this emphasis on care involves a ‘dark side’ (Martin et al., Citation2015) that may heighten its under-recognition.

Our arguments are situated in Science and Technology Studies (STS) generally (Law, Citation2017), and in feminist and decolonial approaches to analysis of technoscience more specifically (Milan & Treré, Citation2019). This entails a rejection of the ‘positivist, transcendental empiricism and disembodiment value-freeness’ (Leurs, Citation2017, p. 133) that characterises much data studies, and an emphasis on the situated and contextual nature not just of particular data practices, but research into them. Eschewing the ‘god trick’ of imagined total objectivity (Haraway, Citation1988), scholarship is instead framed as an ‘ethical, self-reflexive and situated attempt to achieve multiple partial views on everyday life practices and experiences’ (Leurs, Citation2017, p. 137). Similarly, we draw on Milan and Treré’s notion of (the) South(s) as ‘a place of (and a proxy for) alterity, resistance, subversion, and creativity’, finding it in sites in which ‘people suffer discrimination, and/or enact resistance to injustice and oppression’ (Citation2019, p. 325). We will therefore be concerned with how inequities and injustices may be enacted in the context of biocuration, and with how such inequalities might be resisted. In particular we will reflect on questions of epistemic justice, and the ways in which particular kinds of knowledge practices are more or less valued within academic work (Ottinger, Citation2023).

We discuss these literatures in more detail in the sections that follow, beginning by discussing existing work that is related to the study before describing our conceptual frames and methods. The central section delves into our empirical engagements with biocuration and details particular aspects of the data practices it involves, including their framing through languages of care, collaboration, and service, the way in which this characterisation presents biocuration as an ‘academia otherwise’, and the ‘dark side’ of this manner of conceptualising academic data work. We do this in dialogue with existing literature and theory, situating and reflecting on the arguments that we make. We therefore close not with an extended return to the state of the art, but by offering some brief reflections on the implications of our analysis.

Data work as (invisibilised) labour

Critical data studies have explored how data is produced, used, and travels (e.g., Leonelli & Tempini, Citation2020), as well as the labour and (environmental and other) costs involved in creating, cleaning, and maintaining it (e.g., Crawford, Citation2021). The notion of data work captures the activities that comprise this labour, and the efforts that go into adding value to data (Foster et al., Citation2018). Data work thus involves ‘capturing, organizing, analyzing, and using’ particular types of information (Foster et al., Citation2018, p. 1416). Given the ubiquity of data in contemporary societies (Beaulieu & Leonelli, Citation2021), data work is present in a wide array of spaces and domains, and involves a variety of practices, from managing metadata to classification, annotation, or collation.

Key domains in which data work has been studied include the development of artificial intelligence (AI) technologies, healthcare, governance and public administration, and scientific research (Bossen, Pine, et al., Citation2019; Eubanks, Citation2017; Pine et al., Citation2022). In the context of AI Tubaro et al. (Citation2020) identify three forms of data work: ‘“artificial intelligence preparation”, “artificial intelligence verification” and “artificial intelligence impersonation”’ (p. 1). This work can involve tagging or annotating data; constructing queries or producing training data; acting as ‘AI supervisors’ who check machine-produced content or recommendations; flagging offensive content in data sets or online material; and straightforwardly impersonating AI in workflows in which ‘workers [are] hardly distinguishable from algorithms’ (p. 7). Increasingly, such data workers are relied upon to correct bias or toxicity in training data, thereby taking on – or being given – responsibility for the values embedded in emergent AI engines (Perrigo, Citation2023; Sambasivan et al., Citation2021). Such work is often carried out by cloud-based microworkers, who work via platforms such as Amazon’s Mechanical Turk in a form of low paid piecework (Gray & Suri, Citation2019; Irani, Citation2015; Citation2019). In the context of healthcare such outsourcing is less common, but it is clear that the need for data work is both opening up new health-related professions and adding new tasks – such as collecting new kinds of data or ensuring their documentation – to existing roles (Bossen, Chen, et al., Citation2019; Pine et al., Citation2022). Healthcare data – and thereby data work – has ‘grown exponentially’ over the last years (Bossen, Chen, et al., Citation2019, p. 466), and is producing novel roles such as the ‘medical scribe’ (Bossen, Pine, et al., Citation2019), as well as creating expectations of data production and management by patients (Piras, Citation2019).
Similarly, in the public sector (such as welfare systems or policing), workers are increasingly being asked to produce data for or otherwise engage with algorithmic systems (Egbert & Leese, Citation2020; Eubanks, Citation2017; Møller et al., Citation2020).

A key interest in scholarship across the varied domains in which data work is becoming central concerns how data work is credited and valued, and the ways in which its emergence may reinforce (global) patterns of exploitation and injustice. In contexts such as healthcare the framing of data workers such as medical clerks or scribes as ‘low-paid, unskilled employees’ (Bossen, Pine, et al., Citation2019, p. 78) suggests a devaluation of data work and a lack of acknowledgement of the labour it entails (cf. Green et al., Citation2023). Others have argued that digital platforms and tools that are presented as immaterial, non-human, and purely computational in fact rely on scores of human workers, often in the Global South or marginalised positionalities, and often requiring these workers to engage in exhausting or traumatising labour (Gray & Suri, Citation2019; Irani, Citation2015, Citation2019; Perrigo, Citation2023; Turkopticon, Citation2023). Research has thus explored the ways in which data tasks such as annotation are outsourced and at times rendered invisible to end users, and how workers may experience alienation and a lack of meaning with regard to their labour (Le Ludec et al., Citation2023; Miceli et al., Citation2020; Wang et al., Citation2022). Despite promises of flexibility and autonomy made by microwork platforms (through which outsourced data work is often carried out; Irani, Citation2015), such work is generally experienced as draining and as without meaning in and of itself, as well as involving a high degree of precarity (Kost et al., Citation2018; Meisner et al., Citation2022; Tubaro et al., Citation2020; Wood et al., Citation2019). Satisfaction emerges from the simple possibility of making money, and from the social relations that may exist between microworkers (Grohmann et al., Citation2022; Muldoon & Apostolidis, Citation2023). 
While data work is highly varied and exists across diverse domains (Pine et al., Citation2022), one central feature is thus that it is readily devalued or invisibilised, and lacks the prestige of other areas of data science (Irani, Citation2015).

Service science, academic data work, and the conditions of academic labour

One area where data work is central is academic research. The accessibility of large quantities of data – for example from social media, enhanced throughput of biological sequencing data, or radio telescopes – has impacted multiple areas of scholarship, and driven a need for data work and workers (Leonelli & Tempini, Citation2020). While some data work in research may be carried out by researchers themselves, it is often delegated to support staff (Plantin, Citation2019, Citation2021). An antecedent of such work is thus what has been termed service science (Gorman & Spohrer, Citation2010): technical support, interactional expertise, and invisible labour that supports the development of other areas of research (Scroggins & Pasquetto, Citation2020). There is a long history of under-recognised and under-valued technical support in the sciences (Shapin, Citation1989), and increasing attention to this form of scientific work (in particular in historical research; Doing, Citation2009; Tansey, Citation2008). The rise of ‘data-intensive’ science seems to have exacerbated the need for such labour (Scroggins & Pasquetto, Citation2020), to the extent that Ribeiro et al. (Citation2023) suggest that contemporary scientific work is marked by a ‘digitalisation paradox’:

while robotics and advanced data analytics aim at simplifying work processes by substituting them, they can also contribute to increasing their complexity in terms of number and diversity of tasks. This is because the use of digital technologies and robotics, while substituting humans in performing key routine tasks (e.g., pipetting), create other routine tasks for which they cannot substitute (i. e. mundane knowledge work with data and robots) (Ribeiro et al., Citation2023, p. 10)

Such emergent ‘routine tasks’ are numerous, from data production and processing to the interactional work of mediating between different knowledge domains, ‘housekeeping’ (such as managing metrics), or data ‘cleaning’ (Leonelli, Citation2016; Nadim, Citation2016; Plantin, Citation2021; Scroggins & Pasquetto, Citation2020; Stevens, Citation2013). Importantly, data work in the context of academic research is often understood as meaningful and valuable in a way that it is not in other settings. Even when it is experienced as repetitive or tedious – for instance as a ‘factory line’ of data processing (Plantin, Citation2021) – data workers in academic contexts feel ‘knowledge and pride’ (Plantin, Citation2021, p. 8) regarding their activities, and such ‘data labours’ (Nadim, Citation2016) are repeatedly framed as central to knowledge production (Plantin, Citation2019, Citation2021; Wu et al., Citation2020). At the same time it is also clear that such work is at best under-recognised, and at worst entirely erased from view. Just as with other forms of service science, which is, as Ribeiro et al. note, ‘likely to be overlooked in appraisals’ (Citation2023, p. 12), the work of data processors ‘remains … unacknowledged, despite the essential work they provide to create trust in datasets’ (Plantin, Citation2021, p. 9).

Alongside the rise of data-centric research, the conditions of academic labour more widely are shifting. Academic work is becoming marked by increased competition, greater use of short-term contracts, and heightened precarity (Bamber et al., Citation2017; Courtois & O’Keefe, Citation2015; Norkus et al., Citation2016). An ‘academic precariat’ now forms the majority of university labour (Ullrich, Citation2019). Ideas of ‘excellence’ are often central to success in these competitive academic environments, in which excellence implies individual excellence through a specific set of characteristics or behaviours: internationalisation, research publications, the acquisition of third-party funding (Herschberg et al., Citation2018; Lund, Citation2015). While data work is central to contemporary academic knowledge production, its frequent invisibility and devaluation mean that its status in relation to these dynamics remains unclear.

Biocuration as academic data work

Biocuration can be understood as one instantiation of the rise of service science, and of the increasing necessity of data workers to the functioning of contemporary research (Nadim, Citation2016; Plantin, Citation2019, Citation2021). Whilst the landscape of biocuration includes industry actors, as a field it is primarily located in academia (Strasser, Citation2019). Biocurators, who often have PhDs and postdoctoral experience in relevant lab sciences, organise the now vast quantities of data that are produced by the contemporary biosciences (particularly in the so-called ‘omics’ disciplines, such as genomics or proteomics). Biocuration work may include reading scientific articles and extracting useful information from them; inputting such information into databases; adding metadata and annotating information in databases; creating taxonomies (‘ontologies’); developing digital tools (such as research infrastructures or AI tools for text-mining); and organising as collectives to do these things (for instance by raising funding or by ensuring standardisation across biocuration in different sites) (Tang et al., Citation2019). These practices are spread across a range of digital platforms, tools, and types of encounter, from the databases themselves (particularly well known ones include FlyBase or the Protein Data Bank) to both highly specific tools (such as the ontology editor Protégé) and generic communication and project management software and sites such as Slack or GitHub. Whilst biocuration thus includes a range of different data practices, its ‘primary role … is to extract knowledge from biological data and convert it into a structured, computable form via manual, semi-automated and automated methods’ (Quaglia et al., Citation2022).

It is difficult to convey the degree of precarity under which biocurators – who are, again, highly skilled workers with at least PhD level training – generally work. Biocuration is currently poorly institutionalised (with few permanent positions, even at senior levels, or educational programmes in the field) and precariously funded: financing for databases and their curation is a perennial concern (Bellen et al., Citation2021; Check Hayden, Citation2016; Imker, Citation2020), exacerbated by the frequent invisibility of biocurators to the bioscience researchers who use their work (as exemplified by the quote that heads this paper). This means, for example, that biocurators almost entirely work on short-term contracts and that there is no clear career pathway along which they can expect to progress (Quaglia et al., Citation2022; Vita et al., Citation2023). Biocurators go by a range of job titles (from ‘data wrangler’ to ‘ontologist’ or simply ‘[bio]curator’), often work remotely, and predominantly identify as women (Quaglia et al., Citation2022). Indeed, Gabrielsen (Citation2023) has recently argued that the field is highly gendered, in part because of the opportunities it offers to women who wish to leave bench science in order to meet care responsibilities, and in part because of the service ethos that it espouses (Leonelli, Citation2016).Footnote1 Whilst there are several key global centres of biocuration – such as EMBL-EBI (the European Molecular Biology Laboratory’s European Bioinformatics Institute) and the NCBI (the US National Center for Biotechnology Information) – many curators work in isolated positions, for instance as the single curator for a project, lab group, or curated resource. The relative invisibility of the field (and thereby its lack of recognition and financial support) is a central concern within it.
While biocuration results in the creation and maintenance of a central infrastructure for contemporary bioscience – the resources that lab scientists use to access existing knowledge concerning the sequences, structures, and functions of biological entities such as genes or proteins – it does so in a manner that invisibilises the human labour involved (cf. Plantin, Citation2019). Indeed, if bioscientists do not think it is ‘gnomes’ that fill the databases (as in the opening quote), they often assume that these computable resources are in fact entirely populated through automated methods, erasing human curation work in a manner similar to ‘fauxtomation’ in other AI tools (Taylor, Citation2018).

Conceptual frames and methods

In this paper we explore biocuration as a case of academic data work, examining how those working in the field frame and understand it. We do this through the lens of care, on the one hand, and epistemic justice, on the other. Data work in academic contexts is frequently understood as care work (Nadim, Citation2016; Pinel et al., Citation2020; Wu et al., Citation2020), while its lack of recognition also lends itself to a care approach. In line with Maria Puig de la Bellacasa’s call for attention to the ‘petty doings of things’ (Citation2011, p. 92), care can act as ‘a signifier of devalued ordinary labours that are crucial for getting us through the day’ (Puig de la Bellacasa, Citation2011, p. 93). Engaging with data work through care is thus to seek to render visible that which is mundane, taken-for-granted, and unimpressive, and to explore how such ‘petty doings’ sustain wider infrastructures and worlds. Engagement with care also brings a normative or ethical focus to research and practice. Whilst logics and ethics of care are impossible to define, given that care is always local and bricolaged (Mol et al., Citation2010), they imply something about nurturing liveability (Tronto, Citation1998), something about justice (Tacheva, Citation2022), something about modesty and contingency (Law, Citation2021). They also imply a sensitivity to what has been called the ‘dark side’ of care (Martin et al., Citation2015): ‘the violence committed in its name’ (Martin et al., Citation2015, p. 627) and the ways in which the creation of more liveable worlds for some can involve harsher conditions for others. A concern for care will therefore involve not just noticing and acknowledging care practices, but observing their ‘dark sides’ and seeking to enable more care-ful data work (cf. Baker & Karasti, Citation2018; Meng et al., Citation2019).

We therefore seek to draw attention to, and explore, neglected but essential data practices within the work of biocuration, and to engage with questions of justice and liveability in academic contexts. We do this through attention to epistemic justice (Ottinger, Citation2023), and in particular by mobilising Milan and Treré’s (Citation2019) notion of the South(s) in order to attend to discrimination, injustice, and oppression in the context of data work. As noted, for Milan and Treré the South(s) is any site where ‘people suffer discrimination, and/or enact resistance to injustice and oppression’ (Citation2019, p. 325). Their work calls us to look for ‘epistemic injustices’ with regard to whose knowledge practices are valued, and whose not, and to ‘promote a reparation to the cognitive injustice … that fails to recognize nonmainstream ways of knowing the world through data’ (Milan & Treré, Citation2019, p. 329). It thus renders us attentive to how different epistemic practices may be more or less valued, and encourages us to consider how certain ways of knowing are being elided or devalued in discussions of data (in this case in the context of biocuration) and to query what epistemic novelty and value entail. Doing so, Milan and Treré suggest, may help diversify understandings of data in society.

The empirical work on which this paper draws engages with biocuration as a case of data-oriented service science, exploring the field in terms of the data practices involved in it and the ways in which these relate to academic lives and careers – the ‘epistemic living spaces’ (Felt & Fochler, Citation2012) of biocurators. This research involves ongoing ethnographic engagement with the biocuration community, in particular with the International Society for Biocuration (ISB)Footnote2 and participant observation in its events, publications, and discussions. In addition we draw on 15 semi-structured interviews with individuals working in biocuration, recruited via discussion with ISB committee members and through snowball sampling. Interviewees were based in Europe, North America, and Africa, and worked either in academic institutions (n = 10) or in bioscience companies closely connected to academic research (n = 5). In line with the dominance of women in biocuration, we spoke to two men and thirteen women.Footnote3 The interviews lasted approximately an hour and covered interviewees’ career trajectories and experiences of biocuration and their accounts of its nature, history, and future. Overall we take an ethnographic approach to this material, following ‘flows’ (Markham & Gammelby, Citation2018) and ‘resonances’ (Miller, Citation2015) that speak to our research interests, as well as being open to emergent patterns (Timmermans & Tavory, Citation2012). In this paper we describe themes that emerge from the empirical material with regard to care, invisibilisation of data work, and epistemic justice.

Data work in biocuration

In this section we discuss our empirical material, developing an argument that biocuration can, in emphasising collaboration and care, be seen as an ‘academia otherwise’ that resists dominant narratives of scholarly excellence. At the same time this explicit framing of data work as care work involves a ‘dark side’ that may elide the epistemic labour involved in it.

Biocuration as care work

As Plantin notes, data work in science is characterised by a central tension: on the one hand labours of ‘care, maintenance, and repair’ are essential to ‘the dissemination, archiving, and reuse of data’ and thereby to the functioning of the (bio)sciences, whilst on the other there has long been a ‘lack of appropriate acknowledgement or reward’ for this work (Citation2021, p. 2). This tension is heightened in the context of biocuration, perhaps especially so because of the way in which the work it involves is consistently framed as relating to care and service by biocurators themselves (cf. Gabrielsen, Citation2023; Leonelli, Citation2016). Biocurators frame their data work as a care practice in a number of ways.

First, it was clear that in carrying out biocuration one needed to be care-ful. Biocurators spoke about being ‘detail-oriented’ or ‘liking putting things into ordered structures’. Whilst it was clear that biocuration could be learned – and while education and student involvement are key to the field – there was often a sense that this carefulness was connected to one’s personality or characteristics. You could identify a ‘born curator’, one interviewee said, if there was a ‘picture on the wall that's crooked and they walk around and straighten it out’ (INT 1). Similarly, another spoke of the ‘traits’ that were useful in biocuration work, and the pleasure they experienced from using these:

If you're very picky about how things go and where things go, you know, those are very good traits. And so I am all those things. … [biocuration] brings out in me those traits of being very nitpicky about it, being precise and exact. I feel like it really exploits my quirky traits and I think a lot of biocurators feel that way (INT 8)

Not everyone is ‘meticulous’ to the extent that is needed in biocuration, interviewees pointed out, or could cope with ‘sitting alone and looking at data for hours’ (INT 8). Biocuration required deep attention to the nuances of academic articles or datasets, and care with regard to how one read or engaged with these. As the above quote further suggests, those we spoke to also referred to the affects of curation – to its satisfactions and the ways in which they found it profoundly rewarding. It was ‘fun’ or ‘exciting’ (as well as, at times, frustrating, particularly with regard to its under-recognition or under-funding). It was thus framed as an affective practice, one that in at least some cases resonates with particular personalities or traits (being someone ‘who actually likes spending hours and hours and hours hunched over a laptop’; INT 5). A similar sense of personality came through in discussions of biocuration as a service, in that this was a framing of the field that was presented as something that came more naturally to some people than others. This was not a place for individuals with ‘big egos’, one curator said.Footnote4 Instead biocuration was about ‘helping’ or ‘supporting’ other researchers. Such support was expressed both in the creation of the resources themselves – the way in which ‘having these dedicated resources make it easier for researchers … to collect all the information they could need to plan experiments … [and] to keep up with new findings and literature’ (INT 2) – and in interactions with users and the disciplinary or field-specific ‘communities’ that resources developed around. As one biocurator explained:

a lot of [our work] is engaging with the community. We do a lot of outreach to support the people in our community. Are we satisfying the things that you need to use the database for? How can we better serve you to enable you to do your science? (INT 7)

Such support often included encouraging (and helping) researchers to prepare their publications and datasets in a way that rendered them more accessible to curation, as well as responding to requests or feedback regarding a resource. As Sabina Leonelli notes, then, (bio)curators espouse a ‘service ethos’, in which there is a shared sense of a ‘professional duty to serve the user community as best as they can’ (Citation2016, p. 35). The data work of biocuration is presented less as an end in itself, and more as a practice of care, attention, and service that assists the work of others.

It is not just the work of individual researchers that is supported by this service orientation within biocuration, but knowledge production as a whole. A final way in which care is integral to biocuration is in its framing as a vital resource that infrastructures the contemporary biosciences and where one can, by caring for best practice in curation, care for science and the furtherance of knowledge more generally. In some cases this is tied to medical research and to the potential of good data management to aid this (for instance in the context of rare disease research, where curation might result in tangible benefits for patients), but there was also a more general sense that biocuration is a ‘public service’ (INT 8) that furthers science. As one curator explained:

I think [researchers] are more empowered if they understand [biocuration]. Because if they understand how things are structured and have a better knowledge then they can utilise it … users could make better use of data (INT 13)

For INT 13, the work of biocurators allows biomedical researchers to maximise exploitation of data. For both this interviewee and others in the biocuration community, recognition for their work is thus urgent because it enables better science.

In sum, biocuration is presented as a care-ful practice about which curators care deeply (and which is entangled with affect), and through which they care for knowledge production and the public good. Caring for data thus simultaneously involves care for a number of other things: science, a biomedical research community, but also oneself (with regard to the pleasures of curation).

Biocuration as academia otherwise

As previously discussed, contemporary academia is characterised by increased precarity and competition, and by a focus on individual excellence demonstrated by, for instance, high profile research publications (Lund, Citation2015). In prioritising care, and explicitly framing its activities as oriented to service, biocuration stands in stark contrast to these mainstream norms. In this section we want to highlight some of these differences and to discuss the way in which biocuration is framed as a different approach to doing science, to the extent that we might think of it as an ‘academia otherwise’ – a scholarly space in which different norms apply and different behaviours are celebrated.

These differences are often pointed out by biocurators themselves, who, alongside emphasising care and service, talk about the ways in which biocuration is a community marked by collaboration and openness rather than competition and individual gain. Indeed, this was one way that some curators explained their trajectory into the field:

I thought it was interesting to go in the field of public databases because this was more my view of science, that you're doing something that helps the entire community, you know? So your contribution is towards enhancing the science that is done as opposed to, you know, trying to re-prove something and publish it in a better journal. Which for me, it's an aspect of science that I find still very frustrating (INT 11)

INT 11 was interested in working in a way that ‘helps the entire community’ rather than simply trying to ‘publish in a better journal’ than other researchers. For this interviewee and others, biocuration offered welcome relief from mainstream academia, where they had experienced dynamics such as ‘the PI putting all the postdocs in competition against each other’ (INT 11) or where ‘you don't put everything out there because your ideas could be stolen’ (INT 9). In contrast, biocuration was ‘friendly and respectful and helpful … it's a team of helpers’ (INT 8). Indeed, (interdisciplinary) collaboration is in many ways intrinsic to biocuration work, in that it often involves interactions between computer scientists and programmers, curators, and bioscientists. Larger databases and resources involve extensive global teams and collaborations, and tools such as Slack or Zoom were often mentioned as being key to teamwork and collaboration, such that even biocurators working remotely or in smaller groups were able to participate in creating a sense of digitally mediated community around their work. Such community was experienced as generous and helpful, in a manner that again could be contrasted to other experiences of science:

in America in our PhDs we’re just kind of told … we’ve got to figure it out ourselves, you’re like thrown into the deep end, and either you sink or swim. And it was really nice to be in a community that's like, no, I won't let you sink. Here, this is what you can do to swim. So I really found that rewarding (INT 6)

Such experiences were often linked to the affects of biocuration mentioned above, in that enjoyment of biocuration work was linked to the data practices themselves – the pleasures of organising and ordering complex material – but also to the mode of working and the other people involved in it. Biocurators ‘want to help each other, like when one person reads a paper, and they don't understand it, they don't feel bad asking someone else for help. Everyone enjoys helping each other’ (INT 8).

Biocuration differs from wider academia (and specifically the biosciences, in which many curators have trained up to postdoctoral level) in other ways. As already noted, it is a gendered community, with the majority of curators identifying as women (Gabrielsen, Citation2023; Quaglia et al., Citation2022). At least part of the reason for this seems to be the flexibility offered by curation work, in contrast to bench science:

[in our team] we get a lot of people who got their PhD, they worked for a while and they decided it wasn't for them. And sometimes it's because they just don't enjoy it as much as they wanted to enjoy their life. And sometimes it's because they want to stay at home. Or they have children and they need something more flexible. Or they have a special needs child, and they really need to be more flexible (INT 8)

Remote working and flexible hours are possible in biocuration to a degree that is not the case in other scientific work, where essentially ‘you need to come when your cells are ready’ (INT 11), and the rest of your life is in service to the temporalities of the lab.Footnote5 Biocuration was thus framed as an academic space in which different ways of doing science were possible, both with regard to the practicalities – how and when one worked – and the logics that animated scholarship. Similarly, it is a space where diversity and difference are more visible than in other areas of science. Alongside the work of the ISB’s highly active Equity, Diversity, and Inclusion committee,Footnote6 interviewees spoke of disciplinary diversity and of the roundabout career routes through which they had come to biocuration. Unlike much lab science, biocuration allowed for surprises, for career breaks, for simply not liking some aspects of research, or for ‘falling into’ the field.

In sum, interviewees presented biocuration as a different kind of academic space – one that was distinguished from mainstream research culture in its emphasis on openness, collaboration, help rather than competition, and flexibility, as well as by the care and maintenance work that biocuration involved.

The dark side of care: biocuration and epistemic justice

Biocurators’ accounts thus suggest that biocuration is, as a field, characterised by a different set of logics and possibilities from the majority of academia. What is celebrated within this community of academic data workers is less individual achievement through markers such as high-profile publications, and more care-ful and collaborative data work as demonstrated by the successful production and maintenance of useful, and used, resources such as databases (and the languages – ontologies – through which they are organised). Carefulness and helpfulness are key qualities. This is in stark contrast to what tends to be rewarded through contemporary notions of excellence, where individual success and productivity (in the form of high-profile journal articles and third-party funds) are central (Lund, Citation2015). In this section we consider further what this ‘academia otherwise’ means in practice for biocurators, and what ‘dark sides’ there might be to an epistemic community that prioritises care, collaboration, and service in working with data. In particular we return to Milan and Treré’s (Citation2019) ‘epistemological, ontological, and ethical program’ (p. 322) of big data from the South(s). Whilst the focus of their agenda is datafication rather than data work specifically, their engagement with questions of epistemic justice is useful in considering some of the results of the centrality of care to biocuration.

In starting we can echo an argument made by Ane Møller Gabrielsen (Citation2023): that the danger of emphasising the ‘service ethos’ (Leonelli, Citation2016) of data work is that it downplays or elides the epistemic work, and novelty, involved in it. As Gabrielsen writes:

By attracting women with childcare responsibilities and by communicating a ‘feminine ethos’ which traditionally is seen as incompatible with scientific work, biocuration becomes both materially and discursively gendered in a way that places the field at the bottom of both the scientific and the organisational hierarchies of the data-centric biosciences (Gabrielsen, Citation2023, p. 19)

Aside from the gendered dimensions of biocuration, and the ways in which this situates it at ‘the bottom of both the scientific and the organisational hierarchies’ (Gabrielsen, Citation2023, p. 19), celebrating care means, in the current academic climate, that this form of labour is too readily understood as not novel, not excellent, not requiring specialist and hard-won expertise. This view indeed seems to lie behind decades of under-funding of databases and similar resources, and the current challenges that biocuration faces as a field with regard to recognition and institutionalisation. As both Bruno Strasser (Citation2019) and Hallam Stevens (Citation2013) have shown, the work of developing the earliest precursors of contemporary databases was often seen as merely ‘clerical’ (Strasser, Citation2019, p. 141) or ‘the trivial process of reading old journal articles and typing in the sequences’ (Stevens, Citation2013, p. 152). It ‘simply did not fit within the standard categories of science funding’ (Strasser, Citation2019, p. 142). These struggles continue within biocuration today. There are, as we have already seen, issues with visibility: users tend to assume that databases are populated through automation, or simply not consider how they come into being at all, and there is thus a corresponding lack of attention to the need for funding databases (and curators). As one biocurator said of their past work as a lab scientist:

I had never heard of biocuration. I had used databases loads. But I’d never really reflected on who maintains it and how does stuff get in there (INT 13)

At the same time (and relatedly), there is a lack of recognition of the kind of epistemic work that biocuration entails, leading biocurators to experience precarity as academic data workers. Challenges with regard to funding, lack of career progression and permanent posts, and recruitment into the field (all of which were central issues for interviewees) were framed as exacerbated by ambiguity concerning the nature of biocuration: was it research? Were biocurators really scientists? One interviewee spoke of a moment when a funder did not renew a key grant because ‘they said, you’re not doing research, [and] we fund research’ (INT 3), whilst another noted that a ‘lot of [user] communities don’t think of their database providers as true scientists’ (INT 7). In line with a much wider devaluation of care work of all kinds (Mol et al., Citation2010), such assumptions conflate caring for data with a lack of epistemic novelty or value, and separate knowledge production from its management. One ‘dark side’ of care and service is thus that such practices, in the context of scientific data and academic work, are implicitly understood as incompatible with work that is epistemically novel or generative, and therefore not rewarded within funding, recruitment, or promotion processes. Indeed, many of those we spoke to struggled with uncertainty about whether their contract would be extended, or whether they would find further funding, in a particularly extreme version of the academic precarity that is present more widely (Ullrich, Citation2019).

Is this implicit downgrading of the epistemic value of biocuration (and similar kinds of academic data work) justified? In fact it seems clear, both from this research and from other studies of data work in academia (e.g., Leonelli, Citation2016; Stevens, Citation2013), that biocuration plays a hugely important role in constituting the knowledge and intellectual resources of contemporary biosciences. As one interviewee said, ‘modern science will be built on these databases, so the data better be good’ (INT 5). One example is the way in which biocuration involves categorising and organising knowledge through standardised languages (ontologies). As this interviewee explains:

What we do as curators is also taking care not only of doing curation itself and the resources [the databases], but also of drafting or working on ontologies, so biomedical ontologies, that are the standards that you use to annotate in a specific field. Because we need to have standardised information (INT 2)

Organising, labelling, annotating, standardising: such activities can readily be understood in terms of the kinds of care-ful, ‘nitpicky’, service-oriented labours described above. But they simultaneously define how biological knowledge is accessible and thinkable, and are as influential in constituting it as the work of the lab scientists who produce the data with which these activities are concerned. It seems strange to class this work as lacking epistemic novelty, or as ‘not research’ (as in the quote from INT 3 above). Instead we might choose to understand this form of academic work as comprising a different kind of epistemic activity, but one that is equally valid and important. In the context of Milan and Treré’s (Citation2019) arguments regarding data practices ‘from the South’, we can thus view biocuration practice as embodying ‘novel epistemologies’ (p. 328) that have the potential to create useful knowledge and to further bioscience scholarship, but that differ from mainstream academic logics regarding the nature of epistemic novelty and value, and how this is demonstrated. Such practices ‘have the ability to embed and embody prefigurative realities capable of producing change’ (p. 329), and, indeed, we have seen this with regard to the ‘academia otherwise’ represented by biocuration. While one dark side of care in this context may be its ready devaluation, reframing care-oriented data work as enacting a different kind of data (and research) imaginary therefore also offers one means of intervening in debates regarding the valuation of academic work. Seeking epistemic justice will thus mean acknowledging the value of diverse epistemic practices, not just those that are currently visible within mainstream academic logics regarding the nature of excellence.

Concluding discussion

In the preceding sections we have described some of the practices and meanings involved in biocuration, taking this nascent field as an instance of data work within academia. We have argued that biocuration’s framing through ideas of care and service involves a ‘dark side’ (Martin et al., Citation2015) in that these practices are (within mainstream academia) understood as lacking in epistemic novelty and are therefore poorly recognised and rewarded. In contrast, a concern for epistemic justice and the value of epistemic diversity (Milan & Treré, Citation2019) leads us to attend to the ways in which care work constitutes scientific knowledge, and to the epistemic contributions it makes. In line with Puig de la Bellacasa’s (Citation2011) call for attention to ‘matters of care’, reframing the ‘petty doings’ of data work in biocuration as epistemically significant might help to acknowledge – within both scholarship and science policy – the extent to which these practices create and sustain the biosciences. In this case, at least, data work is simultaneously care work and epistemic work.

What is the significance of these findings? On the one hand we can observe a parallel with many existing studies of data work in which practices of data annotation, maintenance, and organisation are invisibilised and devalued (Gray & Suri, Citation2019; Green et al., Citation2023; Irani, Citation2015; Citation2019; Sambasivan et al., Citation2021). Despite key differences between biocuration and AI annotation and other forms of data-oriented microwork in industry contexts (not least with regard to how meaningful that work is experienced as being; Tubaro et al., Citation2020), we believe that it is significant that data work in academia may also be under-recognised and poorly rewarded, and that such labour is rendered subaltern in scholarly as well as private sector spaces. Indeed, engaging with biocuration suggests one reason why this may be the case: in being framed as care work, a service, or maintenance, data work may be stripped of claims to epistemic novelty or innovation (cf. Sambasivan et al., Citation2021).

But the findings also speak to current discussion regarding the nature of contemporary academic work, and what is valued in it. In a wider academic context that prioritises ‘excellence’ in the form of individual achievement and productivity, biocuration is a scholarly community that emphasises collaboration and care, and is, in essence, penalised for doing so. Biocuration’s struggle for recognition and financial support thus demonstrates a continuing under-recognition of epistemic diversity within mainstream research, as well as an implicit coding of service or care work as epistemically empty (‘not research’, as one funder communicated). While it is certainly not the only example of service or care work being under-recognised in contemporary systems of academic reward (Lund, Citation2015), it offers a particularly extreme example of how essential labours may be erased from view. ‘What is the specificity of biocuration in today’s academia?’ asked one of the reviewers of this paper. ‘Whose labor is not invisible? Think about me writing this review: it is nine o’clock in the evening, I will not get any monetary compensation, nor reward or satisfaction of doing this.’ The answer, of course, is that while any particular reviewer may not be compensated for their efforts (aside from the intrinsic interest or satisfaction that many get from participating in peer review), other academics know that peer review exists. In the case of biocuration we have the work of an entire professional community rendered invisible, such that users of bioscience databases and similar resources ‘never really reflect’ on how those infrastructures are brought into being (to use the words of INT 13, quoted above). 
Surfacing the epistemic value of biocuration as an example of academic data work therefore encourages all of us working within the academy to acknowledge epistemic diversity, and to identify ways of rewarding the kind of care-oriented data work that biocuration (and other invisibilised scholarly practices) involves. This might mean, for example, expecting, funding, and rewarding data work and other forms of academic care work within academic evaluations, as well as dedicating sustained resources to the often-invisible infrastructures – such as databases – on which research relies.

At the same time, foregrounding the data labours of biocuration may act as an intervention into those same academic cultures, in that it offers an ‘imaginary’ – a concrete set of practices that might act as a form of ‘intervention and transformation of the established order’ (Milan & Treré, Citation2019, p. 329) – that we have termed ‘academia otherwise’. In describing their work biocurators emphasise care, collaboration, and service, presenting it as a different kind of academic space to the mainstream. Their account therefore encourages us to reflect on what might be gained from wider uptake of this ‘academia otherwise’. What would contemporary research cultures look like if we prioritised (and rewarded) service, care, and successful collaboration? How might recruitment, promotion, and reward processes change, and what could this do with regard to the kinds of bodies and identities able to access academic careers? Similarly, and in the context of critical data studies more generally, we might consider what framing data work as a knowledge practice could do for discussions of data justice. What would it mean to recognise that data production and management, however mundane, co-constitute the technologies and knowledges that emerge from data use? To what extent could re-framing data work as epistemic labour bring into being worlds where such work is better rewarded? At the very least, the case of biocuration encourages us to engage with other instances in which data, care, and epistemic practices are intertwined.

Acknowledgments

We are grateful to all those who spoke to us as part of this research, and in particular to the International Society for Biocuration for their support. We would like to thank Ruth Lovering and Charles Tapley Hoyt for their comments on earlier versions of this manuscript, as well as the reviewers and editors for their feedback. We also want to acknowledge our immediate colleagues, whose work supports ours in multiple ways: Andrea Schikowitz, Ariadne Avkıran, Bao-Chau Pham, Elaine Goldberg, Esther Dessewffy, Fredy Mora Gámez, and Kathleen Gregory.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Sarah R. Davies

Sarah R. Davies is Professor of Technosciences, Materiality, and Digital Cultures at the Department of Science and Technology Studies, University of Vienna. Her current work explores the intersection between digital and epistemic practices and forms of life.

Constantin Holmer

Constantin Holmer is a Masters student at the Department of Science and Technology Studies, University of Vienna, and is currently completing a research thesis on environmental controversy.

Notes

1 Beyond the gendered nature of biocuration, which we will return to, the demographics of the field (and our interviewees) are not a focus of this article – not least because there is relatively little knowledge regarding this. Those interested can refer to Quaglia et al. (Citation2022), which offers the most up-to-date survey of the biocuration community.

3 It is not in the scope of this paper to discuss geographical or gender differences between interviewees’ accounts, in particular given that we are drawing on ethnographic engagement with the biocuration community as a whole and are seeking to characterise central dynamics within this. However, it is worth noting that our engagement with the field so far suggests that the geographies of biocuration also involve patterns familiar from other studies of data work, such as outsourcing to lower-income countries, and that this should be a direction for future research.

4 It is important not to overemphasise the framing of these skills as innate characteristics of certain individuals. This was certainly one way that biocurators talked about their trajectories into biocuration and the work itself, but it runs in parallel with arguments that biocuration can be taught, and that it should, in fact, become a greater part of the work of the researchers who produce the data and knowledge that is curated. The notion of ‘community curation’, asking data-producing researchers to assist with curation efforts, is particularly important in the field (Arnaboldi et al., Citation2020).

5 This also meant that a degree of geographical flexibility was possible, at least for some biocurators – offering the possibility to, for instance, live in a different country or region than one’s employment, based on personal preference or family responsibilities.

References

  • Arnaboldi, V., Raciti, D., Van Auken, K., Chan, J. N., Müller, H.-M., & Sternberg, P. W. (2020). Text mining meets community curation: A newly designed curation platform to improve author experience and participation at WormBase. Database, 2020, baaa006. https://doi.org/10.1093/database/baaa006
  • Baker, K. S., & Karasti, H. (2018). Data care and its politics: Designing for local collective data management as a neglected thing. In Proceedings of the 15th Participatory Design Conference: Full Papers (Vol. 1, pp. 1–12). https://doi.org/10.1145/3210586.3210587
  • Bamber, M., Allen-Collinson, J., & McCormack, J. (2017). Occupational limbo, transitional liminality and permanent liminality: New conceptual distinctions. Human Relations, 70, 1514–1537. https://doi.org/10.1177/0018726717706535
  • Beaulieu, A., & Leonelli, S. (2021). Data and society: A critical introduction. SAGE.
  • Bellen, H. J., Hubbard, E. J. A., Lehmann, R., Madhani, H. D., Solnica-Krezel, L., & Southard-Smith, E. M. (2021). Model organism databases are in jeopardy. Development, 148(19), dev200193. https://doi.org/10.1242/dev.200193
  • Bossen, C., Chen, Y., & Pine, K. H. (2019). The emergence of new data work occupations in healthcare: The case of medical scribes. International Journal of Medical Informatics, 123, 76–83. https://doi.org/10.1016/j.ijmedinf.2019.01.001
  • Bossen, C., Pine, K. H., Cabitza, F., Ellingsen, G., & Piras, E. M. (2019). Data work in healthcare: An Introduction. Health Informatics Journal, 25(3), 465–474. https://doi.org/10.1177/1460458219864730
  • Check Hayden, E. (2016). Funding for model-organism databases in trouble. Nature, nature.2016.20134. https://doi.org/10.1038/nature.2016.20134
  • Courtois, A., & O’Keefe, T. (2015). Precarity in the ivory cage: Neoliberalism and casualisation of work in the Irish higher education sector. Journal for Critical Education Policy Studies, 13(1), 43–66.
  • Crawford, K. (2021). Atlas of AI: Power, politics, and the planetary costs of artificial intelligence. Yale University Press.
  • Doing, P. (2009). Velvet revolution at the synchrotron: Biology, physics, and change in science. MIT Press.
  • Egbert, S., & Leese, M. (2020). Criminal futures: Predictive policing and everyday police work (1st ed.). Routledge. https://doi.org/10.4324/9780429328732
  • Eubanks, V. (2017). Automating inequality: How high-tech tools profile, police, and punish the poor (1st ed.). St. Martin’s Press.
  • Felt, U., & Fochler, M. (2012). Re-ordering epistemic living spaces: On the tacit governance effects of the public communication of science. In S. Rödder, M. Franzen, & P. Weingart (Eds.), The sciences’ media connection – public communication and its repercussions (pp. 133–154). Springer Netherlands.
  • Foster, J., Mcleod, J., Nolin, J., & Greifeneder, E. (2018). Data work in context: Value, risks, and governance. Journal of the Association for Information Science and Technology, 69(12), 1414–1427. https://doi.org/10.1002/asi.24105
  • Gabrielsen, A. M. (2023). Gendering data care: Curators, care, and computers in data-centric biology. Science as Culture, 0(0), 1–25. https://doi.org/10.1080/09505431.2023.2260830
  • Gorman, M. E., & Spohrer, J. (2010). Service science: A new expertise for managing sociotechnical systems. In M. E. Gorman (Ed.), Trading zones and interactional expertise: Creating new kinds of collaboration. The MIT Press. https://doi.org/10.7551/mitpress/8351.003.0007
  • Gray, M. L., & Suri, S. (2019). Ghost work: How to stop Silicon Valley from building a new global underclass. Houghton Mifflin Harcourt.
  • Green, S., Hillersdal, L., Holt, J., Hoeyer, K., & Wadmann, S. (2023). The practical ethics of repurposing health data: How to acknowledge invisible data work and the need for prioritization. Medicine, Health Care and Philosophy, 26(1), 119–132. https://doi.org/10.1007/s11019-022-10128-6
  • Grohmann, R., Govari Nunes, C., & Da Rosa Amaral, A. (2022). Click farm platforms and informal work in Brazil (34th ed.). Southern Centre for Inequality Studies. https://doi.org/10.54223/uniwitwatersrand-10539-33453.
  • Haraway, D. (1988). Situated knowledges: The science question in feminism and the privilege of partial perspective. Feminist Studies, 14(3), 575. https://doi.org/10.2307/3178066
  • Herschberg, C., Benschop, Y., & van den Brink, M. (2018). Selecting early-career researchers: The influence of discourses of internationalisation and excellence on formal and applied selection criteria in academia. Higher Education, 76(5), 807–825. https://doi.org/10.1007/s10734-018-0237-2
  • Imker, H. J. (2020). Who bears the burden of long-lived molecular biology databases? Data Science Journal, 19(1), 8. https://doi.org/10.5334/dsj-2020-008
  • Irani, L. (2015). The cultural work of microwork. New Media & Society, 17(5), 720–739. https://doi.org/10.1177/1461444813511926
  • Irani, L. (2019). Justice for data janitors. In S. Marcus & C. Zaloom (Eds.), Think in public (pp. 23–40). Columbia University Press. https://doi.org/10.7312/marc19008-003.
  • Kost, D., Fieseler, C., & Wong, S. I. (2018). Finding meaning in a hopeless place? The construction of meaningfulness in digital microwork. Computers in Human Behavior, 82, 101–110. https://doi.org/10.1016/j.chb.2018.01.002
  • Law, J. (2017). STS as method. In U. Felt, R. Fouché, C. Miller, & L. Smith-Doerr (Eds.), The handbook of science and technology studies (pp. 31–57). MIT Press.
  • Law, J. (2021). From after method to care-ful research. http://heterogeneities.net/publications/Law2021FromAfterMethodToCare-fulResearch.pdf
  • Le Ludec, C., Cornet, M., & Casilli, A. (2023). The problem with annotation. Human labour and outsourcing between France and Madagascar. Big Data & Society, 10(2), 20539517231188723. https://doi.org/10.1177/20539517231188723
  • Leonelli, S. (2016). Data-centric biology: A philosophical study. The University of Chicago Press.
  • Leonelli, S., & Tempini, N. (Eds.). (2020). Data journeys in the sciences. Springer International Publishing. https://doi.org/10.1007/978-3-030-37177-7
  • Leurs, K. (2017). Feminist data studies: Using digital methods for ethical, reflexive and situated socio-cultural research. Feminist Review, 115(1), 130–154. https://doi.org/10.1057/s41305-017-0043-1
  • Lund, R. W. B. (2015). Doing the ideal academic—gender, excellence and changing academia. Aalto University. https://aaltodoc.aalto.fi:443/handle/123456789/17846.
  • Markham, A. N., & Gammelby, A. K. (2018). Moving through digital flows: An epistemological and practical approach. In U. Flick (Ed.), The SAGE handbook of qualitative data collection (pp. 451–465). SAGE Publications Ltd. https://doi.org/10.4135/9781526416070.n29
  • Martin, A., Myers, N., & Viseu, A. (2015). The politics of care in technoscience. Social Studies of Science, 45(5), 625–641. https://doi.org/10.1177/0306312715602073
  • Meisner, C., Duffy, B. E., & Ziewitz, M. (2022). The labor of search engine evaluation: Making algorithms more human or humans more algorithmic? New Media & Society, 26, 1018–1033. https://doi.org/10.1177/14614448211063860
  • Meng, A., DiSalvo, C., & Zegura, E. (2019). Collaborative data work towards a caring democracy. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), 1–23. https://doi.org/10.1145/3359144
  • Miceli, M., Schuessler, M., & Yang, T. (2020). Between subjectivity and imposition: Power dynamics in data annotation for computer vision. Proceedings of the ACM on Human-Computer Interaction, 4(CSCW2), 1–25. https://doi.org/10.1145/3415186
  • Milan, S., & Treré, E. (2019). Big data from the south(s): Beyond data universalism. Television & New Media, 20(4), 319–335. https://doi.org/10.1177/1527476419837739
  • Miller, V. (2015). Resonance as a social phenomenon. Sociological Research Online, 20(2), 58–70. https://doi.org/10.5153/sro.3557
  • Mol, A. M., Moser, I., & Pols, J. (2010). Care in practice: On tinkering in clinics, homes and farms. Transcript Verlag.
  • Møller, N. H., Shklovski, I., & Hildebrandt, T. T. (2020). Shifting concepts of value: Designing algorithmic decision-support systems for public services. In Proceedings of the 11th Nordic Conference on Human-Computer Interaction: Shaping Experiences, Shaping Society (pp. 1–12). Association for Computing Machinery. https://doi.org/10.1145/3419249.3420149
  • Muldoon, J., & Apostolidis, P. (2023). ‘Neither work nor leisure’: Motivations of microworkers in the United Kingdom on three digital platforms. New Media & Society, 14614448231183942. https://doi.org/10.1177/14614448231183942
  • Nadim, T. (2016). Data labours: How the sequence databases GenBank and EMBL-bank make data. Science as Culture, 25(4), 496–519. https://doi.org/10.1080/09505431.2016.1189894
  • Norkus, M., Besio, C., & Baur, N. (2016). Effects of project-based research work on the career paths of young academics. Work Organisation, Labour and Globalisation, 10(2), 9–26. https://doi.org/10.13169/workorgalaboglob.10.2.0009
  • Ottinger, G. (2023). Responsible epistemic innovation: How combatting epistemic injustice advances responsible innovation (and vice versa). Journal of Responsible Innovation, 10(1), 1–19. https://doi.org/10.1080/23299460.2022.2054306
  • Perrigo, B. (2023, January 18). Exclusive: The $2 Per Hour Workers Who Made ChatGPT Safer. Time. https://time.com/6247678/openai-chatgpt-kenya-workers/.
  • Pine, K., Bossen, C., Holten Møller, N., Miceli, M., Lu, A. J., Chen, Y., Horgan, L., Su, Z., Neff, G., & Mazmanian, M. (2022). Investigating data work across domains. In New Perspectives on the Work of Creating Data. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (pp. 1–6). Association for Computing Machinery. https://doi.org/10.1145/3491101.3503724
  • Pinel, C., Prainsack, B., & McKevitt, C. (2020). Caring for data: Value creation in a data-intensive research laboratory. Social Studies of Science, 50(2), 175–197. https://doi.org/10.1177/0306312720906567
  • Piras, E. M. (2019). Beyond self-tracking: Exploring and unpacking four emerging labels of patient data work. Health Informatics Journal, 25(3), 598–607. https://doi.org/10.1177/1460458219833121
  • Plantin, J.-C. (2019). Data cleaners for pristine datasets: Visibility and invisibility of data processors in social science. Science, Technology, & Human Values, 44(1), 52–73. https://doi.org/10.1177/0162243918781268
  • Plantin, J.-C. (2021). The data archive as factory: Alienation and resistance of data processors. Big Data & Society, 8(1), 205395172110075. https://doi.org/10.1177/20539517211007510
  • Puig de la Bellacasa, M. (2011). Matters of care in technoscience: Assembling neglected things. Social Studies of Science, 41(1), 85–106. https://doi.org/10.1177/0306312710380301
  • Quaglia, F., Balakrishnan, R., Bello, S. M., & Vasilevsky, N. (2022). Conference report: Biocuration 2021 virtual conference. Database, 2022, baac027. https://doi.org/10.1093/database/baac027
  • Ribeiro, B., Meckin, R., Balmer, A., & Shapira, P. (2023). The digitalisation paradox of everyday scientific labour: How mundane knowledge work is amplified and diversified in the biosciences. Research Policy, 52(1), 104607. https://doi.org/10.1016/j.respol.2022.104607
  • Sambasivan, N., Kapania, S., Highfill, H., Akrong, D., Paritosh, P., & Aroyo, L. M. (2021). “Everyone wants to do the model work, not the data work”: Data cascades in high-stakes AI. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1–15). Association for Computing Machinery. https://doi.org/10.1145/3411764.3445518
  • Scroggins, M. J., & Pasquetto, I. V. (2020). Labor out of place: On the varieties and valences of (In)visible labor in data-intensive science. Engaging Science, Technology, and Society, 6, 111–132. https://doi.org/10.17351/ests2020.341
  • Shapin, S. (1989). The invisible technician. American Scientist, 77(6), 554–563.
  • Stevens, H. (2013). Life out of sequence: A data-driven history of bioinformatics. University of Chicago Press.
  • Strasser, B. J. (2019). Collecting experiments: Making big data biology. University of Chicago Press.
  • Tacheva, Z. (2022). Taking a critical look at the critical turn in data science: From “data feminism” to transnational feminist data science. Big Data & Society, 9(2), 205395172211129. https://doi.org/10.1177/20539517221112901
  • Tang, Y. A., Pichler, K., Füllgrabe, A., Lomax, J., Malone, J., Munoz-Torres, M. C., Vasant, D. V., Williams, E., & Haendel, M. (2019). Ten quick tips for biocuration. PLoS Computational Biology, 15(5). https://doi.org/10.1371/journal.pcbi.1006906
  • Tansey, E. M. (2008). Keeping the culture alive: The laboratory technician in mid-twentieth-century British medical research. Notes and Records of the Royal Society, 62(1), 77–95. https://doi.org/10.1098/rsnr.2007.0035
  • Taylor, A. (2018). The automation charade. Logic(s) Magazine. https://logicmag.io/failure/the-automation-charade/
  • Timmermans, S., & Tavory, I. (2012). Theory construction in qualitative research: From grounded theory to abductive analysis. Sociological Theory, 30(3), 167–186. https://doi.org/10.1177/0735275112457914
  • Tronto, J. (1998). An ethic of care. Generations: Journal of the American Society on Aging, 22(3), 15–20.
  • Tubaro, P., Casilli, A. A., & Coville, M. (2020). The trainer, the verifier, the imitator: Three ways in which human platform workers support artificial intelligence. Big Data & Society, 7(1), 205395172091977. https://doi.org/10.1177/2053951720919776
  • Turkopticon. (2023). Beware the hype: ChatGPT didn’t replace human data annotators. https://4sonline.org/news_manager.php?page=31554
  • Ullrich, P. (2019). In itself but not yet for itself—organising the new academic precariat. In W. Baier, E. Canepa, & H. Golemis (Eds.), The radical left in Europe: Rediscovering hope (pp. 155–168). Merlin Press.
  • Vita, R., Aspromonte, M. C., Bello, S. M., Harris, N. L., Caufield, J. H., Haendel, M., Hoyt, C. T., Quaglia, F., Mujambere, J., Panossian, S. P., Reddy, T. B. K., Tuli, M. A., Khodiyar, V. K., & Vasilevsky, N. (2023). Careers in biocuration: 2023 workshop report. https://zenodo.org/records/10246586
  • Wang, D., Prabhat, S., & Sambasivan, N. (2022). Whose AI dream? In search of the aspiration in data annotation. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (pp. 1–16). Association for Computing Machinery. https://doi.org/10.1145/3491102.3502121
  • Wood, A. J., Graham, M., Lehdonvirta, V., & Hjorth, I. (2019). Networked but commodified: The (dis)embeddedness of digital labour in the gig economy. Sociology, 53(5), 931–950. https://doi.org/10.1177/0038038519828906
  • Wu, C.-L., Ha, J.-O., & Tsuge, A. (2020). Data reporting as care infrastructure: Assembling ART registries in Japan, Taiwan, and South Korea. East Asian Science, Technology and Society: An International Journal, 14(1), 35–59. https://doi.org/10.1215/18752160-8233676