Audio forensics behind the Iron Curtain: from raw sounds to expert testimony

Pages 187-208 | Received 02 Feb 2023, Accepted 29 Jun 2023, Published online: 27 Jul 2023

ABSTRACT

This essay investigates the construction of forensic audio expertise in the legal and security system of Communist Czechoslovakia and shows that the contested nature of speaker identification and sound-based objectivity contributed to the formulation of probabilistic claims in forensics. It explores the practices of the Department of Fonoscopy: a unique research laboratory of audio forensics that systematically examined the spectrographic, linguistic, and auditory means of sound analysis for the purpose of identifying unknown voices and environments in audio recordings. Bringing together the notions of “forensic cultures” and “sonic skills”, this article addresses the scientific, cultural, and political underpinning of the nascent field of audio expertise as well as the changing status of sound-based knowledge and forms of representation in forensics. In establishing fonoscopic expertise before the court and in the broader praxis of police investigation, the idea of vocal fingerprints and the use of sound visualisation technologies became instrumental. This essay pays special attention to the dynamics of the intricate process in which acoustic “raw material” (from anonymous calls, wiretapped phone lines, recorded conversations, or police interrogation rooms) was transformed into different kinds of legal and criminalistic evidence in the service of the totalitarian surveillance state.

“The police will become deaf!” announced one news headline in 2010, when Jan Málek, a veteran of Czechoslovak forensic audio analysis, was retiring from the Prague Institute of Criminalistics after forty years of service.Footnote1 Despite examining thousands of sound recordings and providing numerous expert reviews over his long police career, Málek’s work rarely made the news. Yet in the 1990s and early 2000s, the media did cover his field of expertise – “fonoscopy” – due to several famous criminal cases involving voice analysis. When the former US Secretary of State Madeleine Albright was threatened while visiting the city of Brno, Málek was able to identify the suspect within two hours of analysing the telephone call. News outlets also recalled his contribution to the identification of the so-called cyanide killer as well as a bomb blackmailer in the Prague subway. Málek’s professional work during the Cold War era, however, was not remembered on this occasion, and neither was the very existence of the specialised voice identification programme he launched at the Institute of Criminalistics in the mid-1970s.

Málek was a founding member of the police criminal department of audio analysis known as the Department of Fonoscopy, established in 1975 as a special section of the Institute of Criminalistics in Prague.Footnote2 The term fonoscopy,Footnote3 still in use today, denoted a specific kind of sound-based expertise, which emerged from and was constantly shaped by interactions between socialist police, courtrooms, scientific institutions, and audio and sound visualisation technologies. In what follows, I will examine the construction of forensic audio expertise in Communist Czechoslovakia against the backdrop of research in the history and social studies of science that attends to the role of bodily skills and forms of representation in the production of knowledge. To effectively grasp the relationship between audio technologies and embodied practices, I employ the notion of “sonic skills”; in analysing the construction of sound-based objectivity in forensics, I will use the STS concept of “inscription”. The investigation of Czechoslovak fonoscopy, as this paper will show, offers unique and in-depth insight into the process of establishing sound-based forensic expertise by means of a broad distribution of sonic skills and a negotiation with existing notions of objectivity in forensics. I will argue that inconclusiveness of the spectrographic images of the voice and a continuous reliance on “professional audition” in sound analysis directly contributed to the rise of probabilistic claims in forensic science more generally. By bringing together the notions of “forensic cultures” and “sonic skills”, the paper demonstrates a specificity of evidence-making in audio forensics, which prefigured later debates concerning the performance of scientific expertise in the courtroom.

This study is based on both published and unpublished primary sources, including criminological journals and handbooks, an instruction film, and archival documents from the archive of the Department of Fonoscopy in Prague and the Security Services Archive. While conducting research for this paper, I also benefited from conversations with Málek and his colleague from the Department of Fonoscopy, Václava Musilová.

Audio expertise behind the Iron Curtain

When historians examine socialist expertise, it is usually in relation to technocratic totalitarian governance and centrally planned economics. Although expertise is political by definition, the exact nature of its involvement with the state differs considerably depending on the field. The older view that expertise developed in Communist states was simply an extension of party politics is untenable.Footnote4 Several recent studies by historians on the relationship between expert knowledge and the technocratic regime of the Czechoslovak state show a complex network of interactions and feedback loops between the state and specialised knowledge in domains of law, urbanism, psychiatric care, corporate management, and environmental protection (Kopeček 2019; Sommer, Spurný, and Mrňka 2019). However, the connections between expertise involved in, and fashioned through, practices of surveillance and forensics have not yet been examined. During the Cold War, forensics was closely affiliated with government institutions both in the East and in the West, and despite differences in the legal and security frameworks, its defining features, including the use of specialised language and standardised procedures, were quite similar.

Beginning in the nineteenth century, Czechoslovak forensics and biometrics were firmly embedded in international criminalistic networks. Although institutional ties with Western forensic experts were suspended during the Cold War era, when international cooperation was limited to the countries of the Eastern Bloc, forensics never became completely detached from developments in Western countries. The extent and exact nature of this relationship remain to be addressed by historians, although recent studies indicate that both know-how and technological equipment travelled across ideological borders more often than we think (Badenoch, Fickers, and Henrich-Franke 2013; Bijsterveld 2021; Iacob et al. 2018). When audio forensics was established in Czechoslovakia in 1975, it drew heavily on existing research of voiceprints in the United States, combined with up-to-date knowledge in phonetics, linguistics, and electroacoustics.Footnote5 Over two decades, the Department of Fonoscopy functioned as a unique sound laboratory, developing and testing a novel form of expert knowledge on the human voice and acoustic traces while similar investigations were carried out in the United Kingdom, the Netherlands, and the US (Broeders and Rietveld 1995; French 2017; Nolan 1983, 1991).Footnote6 Although fonoscopic expertise differed from the eavesdropping activities of the Czechoslovak secret police (Státní bezpečnost [StB]), much of its expertise, mostly in connection to voice identification methods, was subject to a certain degree of secrecy and reluctantly shared even with other Eastern Bloc countries, despite numerous official proclamations.

Czechoslovak fonoscopy cooperated with audio forensic departments in the German Democratic Republic and Poland, but the scope of collaboration with other socialist states, including the Soviet Union, was very limited. As I have shown elsewhere, the Polish fonoskopia department founded in Warsaw as early as 1963 served as the main reference point when the Czechoslovak laboratory was established over a decade later.Footnote7 Cooperation with Polish fonoscopy was realised through research stays and occasional meetings between 1975 and 1990, while Polish forensic experts provided much needed technological as well as methodological advice, especially in the beginning. Both departments focused on three main areas of analysis: (1) speaker identification, (2) the identification of indoor and outdoor environments, and (3) the inspection of the authenticity of sound recordings. Apart from the expert exchange with Poland, there is also evidence that Málek visited the Berlin Kriminalistische Akustik department in 1981 and met with Christian Koristka, the leader of the research programme at Humboldt University.Footnote8 Despite differences in organisation, technological equipment, and the level of standardisation,Footnote9 the three departments shared basic methodological approaches and premises that combined spectral, aural, and linguistic means of analysis.

The logic of forensics: between eavesdropping and expert science

In the 1970s, the technical section of the Czechoslovak intelligence services envisioned a project that exceeded the scope of the regular eavesdropping activities performed by the StB. For more than a decade, intelligence hoped to invent a workable way for the secret police to listen to the sounds of a typewriter and, based on the acoustic traces of typing, determine what was being written.Footnote10 The mission was never completed, but it represents an interesting borderline case requiring different kinds of sound-based expertise. In many cases, the Department of Fonoscopy collaborated rather closely with the StB, especially in identifying unknown or familiar voices in recordings; but although both relied on listening in their daily practice, they treated sound in fundamentally different ways. While the secret police listened in order to obtain useful information about and from the individuals it wiretapped, fonoscopic expertise required advanced knowledge of phonetics, electroacoustics, and linguistics to pin down the voice.

The StB was part of the National Security Corps (Sbor národní bezpečnosti [SNB]) along with Public Security (Veřejná bezpečnost [VB]) – the standard police forces, which housed the Institute of Criminalistics and, by extension, the Department of Fonoscopy. Unlike the small expert fonoscopy lab, the StB had a large staff at its disposal to perform all kinds of intelligence, counterintelligence, and surveillance activities. Similar to the East German Stasi and other Cold War intelligence agencies, the StB was constantly eavesdropping on Czechoslovak citizens, foreign diplomats, scientists, and tourists, either through wiretapped phone lines or bugged locations.Footnote11 Although neither its human nor technological resources were unlimited, and its wiretapping activities were subjected to regulations, its staff members were able to listen to a large number of private conversations every day, which were recorded and transcribed by trained staff.Footnote12 In its practices of acoustic surveillance, the StB was primarily interested in the information exchanged in the recorded conversations and in the best technological means of eavesdropping on people in private apartments, hotel rooms, restaurants, or public space, with the last posing a significant technological challenge.Footnote13

The logic of forensic audio expertise was rooted in the history of Czechoslovak criminalistics. New departments dealing with criminalistic technology and expertise were gradually established within the structure of the Czechoslovak police beginning in 1945. A technical “T” department was created in 1947, which included mechanoscopic and ballistic expertise, photographic documentation, and a small chemical laboratory; and a separate “I” (identification) department was founded that same year, which performed dactyloscopic (fingerprint) examinations and biological analyses.Footnote14 The Institute of Criminalistics, within which the Department of Fonoscopy was later established, was created in 1958 (Pješčak 1976), and its function was defined as follows:

“It carries out expert opinions in the field of technical criminology for all bodies of the Ministry of the Interior at the request of prosecutors’ offices and the courts, as well as all identification work from the point of view of dactyloscopy. It sends its staff to the places where the crime was committed (…). It generalises the scientific and technical work of the VB with the help of published textbooks and the journal Kriminalistický sborník. It introduces and promotes the use of scientific and technical means in the fight against crime. It organises the forensic cabinet of the VB administration, and maintains liaison with scientific institutions in the field of forensics”.Footnote15

All these tasks, including the promotion of science and technology, cooperation with state and police institutions, identification work, and the publication of results, were instrumental in creating forensic expertise in Communist Czechoslovakia. The emphasis on “scientific means” was also critical for the later development of audio forensic expertise within the institute. The “scientification” of society, especially regarding methods of governance and the involvement of experts in decision-making, was one of the defining features of the so-called normalisation period (1969–1989), which was initially marked by increased economic growth and personal consumption (Činátl, Mervart, and Najbert 2018; Fulbrook 2009; Kolář and Pullman 2016). Whereas historians have paid attention mostly to the involvement of experts in economics, agriculture, urban planning, and corporate management (Kopeček 2019; Sommer, Spurný, and Mrňka 2019), the tendency towards greater involvement of “science” can also be observed in surveillance, criminalistics, and the military. The field of fonoscopy was a prime example of the scientific transformation of criminalistics, as it introduced into forensics (and further refined) up-to-date electroacoustic, phonetic, and linguistic methods and helped to establish sound-based knowledge in the Czechoslovak criminal justice system.

Apart from the cooperation with the StB and with counter-intelligence services of the Ministry of the Interior, the fonoscopy lab analysed material for criminal police departments (SNB, FKÚ) all over Czechoslovakia, wrote expert reviews for court trials, and provided expert advice wherever it was needed. The daily workload of the department mostly involved the identification of criminal offenders, blackmailers, anonymous callers, murderers, or frauds. The administrative book of the Department of Fonoscopy suggests that between 1975 and 1989 only about twenty percent of audio analysis was directly performed for the StB. Although it was not until 1992 that the first database of speakers was created at the Prague Criminological Institute (Rak, Matyáš, and Říha 2008), the fonoscopy lab invented its own system of classifying anonymous callers, who were assigned registration numbers so that the files could be retrieved from the archive in case new evidence emerged.Footnote16 Sound recordings were also stored in the lab on Bartolomějská Street in case the audio had to be revisited by experts. No tapes with recorded speech have been preserved to date, though, and it remains unclear how systematically and for how long the files were kept at the department. The database created in the 1990s was not personalised, but it was used as a basis for further statistical analysis. Even though the department attempted to classify unknown speakers and systematise their records, it should be noted that the profiling of speakers did not link their voices to personality traits or ethnicity. Czechoslovak audio forensics neither assumed nor investigated a connection between speech sounds and human nature. Forensic science in the 1970s no longer analysed the voice to uncover a person’s criminal characterFootnote17 or reveal their racial, ethnic, or religious identity.Footnote18 It dissected both lexical and acoustic components of speech to determine one’s age, gender, education, place of origin or immediate mental state (Kvicalova 2023, 398).

While fonoscopic expertise was primarily concerned with speaker identification, profiling, and an analysis of the environment and authenticity of recordings, the actual scope of its activities was broader than that. It cooperated with Czechoslovak aviation to research speech sounds in emotional distress; fonoscopy experts were invited to examine cockpit and operation tower recordings of airplane accidents.Footnote19 Málek took part in the reconstruction of crime scenes, such as the well-known case of the so-called Spartakiad Killer in 1985, which did not require speaker identification but simply the use of audio equipment to record the suspect describing his murders in situ.Footnote20 The staff members of the department were called upon not only when expert voice dissection was needed but also in cases requiring the professional treatment of audio tapes.Footnote21

From the very beginning, the audio forensic department also routinely evaluated sound material for court trials. Expert reviews were requested directly from prosecutors’ offices or defence lawyers at different stages of the legal process.Footnote22 Although fonoscopy was not a subsection of the secret police, much of its work was done in secret – in many cases even fonoscopy staff members had no information about the cases they were analysing. The activities of the department were not limited to the identification of unknown voices in conversations recorded by the StB but were embedded in much broader institutional and professional networks.

Sonic skills and expertise in the making: audio, video, and information sheets

In fonoscopy, expert hearing was by definition mediated by audio technologies. Although experts sometimes listened to magnetic tapes without the use of headphones, raw material subjected to fonoscopic analysis always took the form of a sound recording; thus acoustic evidence was already submitted in a circumscribed, technologically mediated form. This was both an advantage and a challenge: On the one hand, fonoscopy experts did not have to worry about converting immediate experience from the field into a material inscription, as was the case for most types of forensic traces collected at a crime scene. On the other hand, to be able to dissect the voices fonoscopically and transform the raw sonic evidence into the category of proof, the recordings had to meet a certain standard of quality.

This posed a continuous problem throughout the 1970s and 1980s, when low-quality tape recordings were often submitted to the fonoscopy lab by the police. While interrogating suspects, police officers often wrongly positioned directional microphones, reused magnetic tapes, and interrupted the speaker. Expert sound dissection was a delicate process and bad sound recordings posed a serious problem for the analysis. To improve the quality of material submitted for fonoscopic dissection, it was important that police officers be taught specific kinds of sonic skills.

I understand sonic skills as a combination of culturally determined listening techniques as well as skills required to effectively operate audio equipment (Bijsterveld 2019).Footnote23 They entail an informed use of sound recording technologies and techniques of listening, but also strategies of obtaining usable acoustic material, such as interviewing techniques or basic knowledge of the electroacoustic and phonetic means of sound analysis. An awareness of the fact that acoustic material required for spectrographic sound dissection had to meet specific standards formed part of the instruction imparted to police officers, phone operators, and other forensic experts.

Such broad distribution of sonic skills related to forensic audio analysis was important to both securing acoustic traces in a usable form and establishing the new field of fonoscopic expertise within the wider security and legal system. In order to disseminate up-to-date forensic knowledge among various professional groups, staff members of the Institute of Criminalistics authored articles and textbooks, drafted information bulletins, and co-produced more than thirty instructional films in the early 1980s, introducing various forensic methods, including fonoscopy.

The thirty-four-minute film Kriminalistická technika XI – FonoskopieFootnote24 is a very rare example of the practical demonstration of fonoscopic expertise, which introduces its main principles and methods, including the proper means of securing acoustic evidence, to criminalists and police officers. This video material is remarkable in several respects: First, it shows a variety of sonic skills pertaining to fonoscopic expertise and offers a direct glimpse into the work of fonoscopy experts inside the lab. Second, it admits from the outset that fonoscopy is required not only in the fight against “terror and blackmailing”, but also to track down the authors of dissident, “antisocial proclamations”. This is a rare confession, as the topic of political dissent is usually excluded altogether from official discussions of forensic audio analysis.

The film is framed by an illustrative case of blackmailing, in which an organised group of “anti-state actors” threatens to bomb an airplane flying from Prague to Bratislava. The case itself is fictitious but is based on actual bomb threats in which the Department of Fonoscopy was professionally involved.Footnote25 For understanding how sonic skills were imparted to police officers, the film provides unique insight into methods of instruction as well as the strategies of self-promotion of this novel expertise.

The basic information about fonoscopic methods is communicated in several ways. First, theoretical insights about linguistic, phonetic, and acoustic means of analysis are presented in simple two-dimensional diagrams, including a picture of the vocal tract and the larynx, accompanied by acoustic samples played back and altered by speed or frequency filters. The viewers are instructed that fonoscopy deals with (1) voice identification and identification of recording devices; (2) recovering the contents of recordings, especially when they are of poor quality; and (3) verification of the authenticity of recordings. This is further demonstrated in the video reconstruction of the blackmailing case, which features actual members of the criminal police, including Málek, interviewing the suspect and demonstrating how different kinds of reference acoustic material are obtained during so-called speech tests. The film shows much of the technical equipment used by the department (manufactured by Sony, Tesla, and the Danish company Bruel & Kjaer), including frequency filters, devices for the augmentation and visual examination of magnetic tapes, and the “voiceprint” – a device manufactured by the US company Voiceprint Laboratories and exported to Czechoslovakia as early as the 1960s.Footnote26 The film allows the viewers to observe Málek in action, as he operates the equipment, analysing different kinds of acoustic material.

Figure 1. Jan Málek in the fonoscopy lab.

Source: The film Kriminalistická technika XI – Fonoskopie, 1981.

The criminal case that guides the audience through the film begins with a telephone call to the Czechoslovak emergency line 158. The caller instructs the police officer who answers to retrieve a package from safety deposit box number 13 at Prague’s main railway station. The deposit box contains a magnetic tape with a pre-recorded bomb threat, which was used as reference material for subsequent fonoscopic comparison. The message itself, as shown in the video, was clearly recorded on a tape that originally played music by the famous Czechoslovak pop singer Karel Gott. The tune of the popular song “Je jaká je” (She is the way she is) can be clearly heard in the beginning and at the end of the recorded message, along with the sounds of church bells and a sawmill in the background. The same anonymous speaker, the film shows, also called the airport and repeated the threat, providing experts with yet another sound recording to compare with previous ones.

In these examples, the video demonstrates that the sine qua non of any successful fonoscopic dissection is the correct “processual, technical, and tactical approach” in acquiring the reference audio material. It emphasises the urgency of allowing the caller to speak freely and for as long as possible without unnecessary interruptions. It also shows that it is especially important for the department to receive original recordings or, at the very least, professional copies made by staff members. The film instructs police officers to sparingly listen to the recorded sounds, as repeated playback significantly worsens the acoustic quality of the recording. The instructions go as far as to show how the magnetic tapes should be stored and prepared for transportation to avoid physical damage.

In the next part of the film, a suspect is summoned to the fonoscopy lab to undergo speech tests performed by Málek. If the reference recording was made at the fonoscopy lab, all the technical and professional criteria were easily met; the detailed directions provided in the film were meant for cases where the recordings were made by police officers in police stations, prisons, or interrogation rooms. The film recommends against the use of battery-powered recording devices (as the batteries might run out during the interview), gives specific instructions regarding recording speed (which should be at least 9.5 cm/s), and demonstrates how a directional microphone should be positioned on a table. Clearly, sonic skills of all kinds could have made a difference in the final identification and/or recognition work.

The speech tests performed by Málek include recording samples of uninterrupted speech, when the person under examination speaks freely (recounting their educational and vocational background, family life, etc.) followed by recordings in which the suspect is asked to speak in a manner similar to the original recording. It was especially important that the recorded material contain all significant linguistic, phonetic, and acoustic elements present in the reference material. In the video, Jan Moravčík (the suspect) is first asked to read aloud a pre-prepared text (“reference”) containing the same phrases and expressions used in the original bomb threat call (“original”).

Reference: “We are starting a youth club, there are many of us and therefore we will organise various events. The first action we announce is this: we will be holding a singing competition on a bus during a trip to Bratislava, for which we need suitable conditions. At least half of the bus must register for the competition by tomorrow, and the tickets with the names of the interested parties must be put in the same box used for the raffle. Do not try to influence the person who will draw the order”.

Original: “We are a group called the Dynamite. There are many of us. We hate you and therefore we will be destroying you. The first action we announce is this: If you don’t meet our conditions, there will be a bomb explosion on flight OK 280 from Prague to Bratislava. Put half a million crowns in the same box used for the message by tomorrow midnight (…). Do not try to follow the person who withdraws the money”.

In addition to the film material, a series of professional publications dealing with specific areas of forensic expertise were published between 1983 and 1990, including a detailed handbook co-authored by Málek and the linguist Václava Musilová called Fonoskopie (Málek and Musilová 1989). It further explained all the different – technical, theoretical, and procedural – aspects of sound dissection and demonstrated once again that sound spectrograms, also referred to as sonograms,Footnote27 were best suited for comparing short fragments of speech, as Moravčík’s speech tests showed. Although the handbook represents by far the most detailed source on the tenets of Czechoslovak fonoscopy, basic information about criminalistic audio forensics was probably more efficiently disseminated by the police information bulletin, which included a simple information sheet with instructions regarding the best way to answer anonymous phone calls (Málek 1981a, 1981b). To supplement the printed and video material, Málek also lectured about fonoscopy to aspiring criminalists at OKTE (Odborná příprava kriminalistických techniků a expertů [forensic technicians and experts training]).

Another important source of up-to-date forensic knowledge was the periodical Kriminalistický sborník (Criminal proceedings), which first appeared in 1957 and at its peak reached a publishing run of 28,000 copies. Generations of forensic scientists, investigators, lawyers, medical examiners, prosecutors, and judges drew professional knowledge from its articles and case studies (Hlaváček 2007–2010), including those dealing with fonoscopy (Musilová 1977; Málek 1981b). Although Kriminalistický sborník did not provide in-depth accounts of forensic methods as far as fonoscopy was concerned, it played an important role in establishing it as a field of forensic expertise. In 1961, the journal also began publishing reports from international forensic symposia, which convened forensic experts from the Eastern Bloc countries – mostly Czechoslovakia, GDR, Bulgaria, and the USSR – to exchange ideas about new criminalistic methods, including voice analysis. However, as noted above, the actual extent of collaboration in voice identification methods was rather limited, and the cooperation rarely expanded beyond mere declarations. Apart from visits to Poland and the GDR, the department had no professional contacts with similar institutes in Hungary, Russia, and Bulgaria.

Audio forensic expertise was thus constructed both through precise laboratory work and its representation and communication within the forensic community and by means of the broad distribution of sonic skills in lectures, information bulletins, and an instruction film. The Department of Fonoscopy worked closely with different state institutions, while it pursued independent research on voice dissection, which included the process of turning acoustic raw data into reliable forensic evidence. In the remainder of this essay, I will reflect on the construction of sound-based objectivity in forensics against the background of the preference for visual inscriptions in the Czechoslovak legal and police system.

Enacting objectivity: visual inscriptions and probability statements

It is now a mainstream assumption of the history of science and STS to understand objectivity as a historically and culturally contingent category, negotiated and enacted by numerous human as well as non-human actors (Adam 2020; Daston and Galison 2007; Latour 1987, 2005). Forensic objectivity, then, might best be understood as an assemblage of criminal procedures, witness testimonies, and the means of representing and constructing evidence. This is created and maintained not only in forensic science laboratories and in the courtroom, but in dialogue with the much broader cultural and political environment (Kruse 2016). I would like to argue that the fashioning of sound-based evidence in fonoscopy was deeply influenced by the existing cultural notions of sound and the epistemic value of listening in the sciences. The contested nature of sound-based objectivity, I will show, contributed to an important shift in the formulation of forensic claims, which, from the 1980s onward, began to be expressed on a verbal probability scale.

In the history of modern science, objective knowledge has been most strongly associated with seeing and visual methods of analysis (Crary Citation2001; Daston and Galison Citation2007), although a growing body of sound studies research has now convincingly portrayed listening as a form of scientific observation, which, like vision, can be disciplined, trained, and moulded into an analytical instrument (Bijsterveld Citation2019; Bruyninckx Citation2018; Morat Citation2014; Tkaczyk, Mills, and Hui Citation2020). In forensics, the belief in epistemic clarity and truthfulness of visual information was perhaps most profoundly represented by the camera, which was able to fix criminal events in time, make them re-visitable, and amplify traces that could otherwise be missed (Bell Citation2018; Neale Citation2020). Photography, scale models, diagrams, and other visual means of representing the crime have been instrumental in enacting objectivity in forensics.Footnote28 Although much scholarly attention has now been paid to the relationship between forensic objectivity and visual representation of evidence, the role of sonic skills in gathering, interpreting, and producing different kinds of forensic evidence has rarely been considered by historians, which may partly stem from the fact that hearing long enjoyed the status of a “dangerously subjective” epistemic instrument (Daston and Galison Citation2007, 17). A more pragmatic explanation might derive from the fact that sound recordings were usually not made at the scene of the crime,Footnote29 and acoustic traces were thus connected with the crime scene in a less immediate way than fingerprints or blood stains.

The idea that a voice is as unique as a fingerprint goes back to the nineteenth century,Footnote30 but the notion of the “voiceprint” as a sound-based equivalent of fingerprints saw its heyday in the 1960s, when the spectrograph began making easy-to-compare graphic inscriptions of the voice.Footnote31 Unlike musical or phonetic notations, which are arbitrary representations of sound, spectrograms revived the idea that sound itself can produce an objective visual record, which is a direct reflection of its physical properties.Footnote32 The forensic application of voiceprint technology became popular thanks to its commercialisation by the US engineer Lawrence Kersta (Citation1962),Footnote33 and although the idea had already been challenged in the 1960s, the idea of “vocal fingerprints” turned out to be an enduring explanatory metaphor. Although Czechoslovak experts never subscribed to the idea that voice spectrograms were comparable to fingerprints, the notion itself proved to be a powerful rhetorical tool and was repeatedly evoked when fonoscopy experts addressed non-specialised audiences.Footnote34

In my examination of Czechoslovak audio forensics, I draw upon history of science and STS research that highlights the role of bodily skills, especially listening, in the process of knowledge-makingFootnote35 as well as STS scholarship that attends to the relationship between forms of (mostly visual) representation and the production of knowledge (Burri and Dumit Citation2008; Coopmans et al. Citation2014). Although the fonoscopic dissection of the voice relied on sound visualisations performed by the spectrograph, trained listening remained a key epistemic instrument in laboratory practice, where acoustic raw material (including original sound recordings submitted to the analysis and speech samples obtained by fonoscopy staff members) was analysed to serve as forensic evidence.

The use of sound recording in the sciences has been repeatedly associated with what Lorraine Daston and Peter Galison have described as mechanical objectivity: a tendency in the modern sciences to eliminate the subjective human factor in recording and analysing natural phenomena by replacing it with machine-generated automatic registration.Footnote36 The phonograph, the tape recorder, and the sound spectrograph were technologies that met the demands for mechanical objectivity. The practices and rhetorical strategies of the fonoscopy lab, where the spectrograph’s guarantee of a “completely objective”Footnote37 analysis featured prominently beginning in the 1970s, concurred with this discourse.

At the same time, there was always the need for a fonoscopy expert who could listen to the recorded voices, carry out speech tests at the lab, and interpret phonetic, linguistic, and electroacoustic traces so that together they could be made into reliable forensic evidence. This complies with what Daston and Galison called “trained judgement”, a discourse emphasising the role of professional training and experience in interpreting evidence, which supplemented the ideal of mechanical objectivity in the second half of the twentieth century. As Michael Mopas shows in his accountFootnote38 of the backlash against Kersta’s voiceprint identification in the late 1960s and 1970s, the sole proficiency in audio engineering did not itself constitute authoritative expertise on the human voice. To be able to scrutinise voices for forensic purposes, Málek, too, combined his original expertise in electroacoustics with subsequent university training in phonetics.Footnote39 Together with Musilová, he was able to offer comprehensive voice analysis from the time the department came into existence. In fonoscopy, objectivity was constructed as a combination of mechanical dissection and trained judgement. The role of expert listening, however, was downplayed in the final presentation of fonoscopic evidence, which foregrounded the objective visual record of the voice.

The 1981 film concludes by emphatically stating that fonoscopy expert reviews are fully accepted as evidence in criminal proceedings. But what must be done with sonic traces to translate them into a category of proof accepted and understood both by criminal investigators and a court of law? To understand how audio forensic evidence is constructed and represented in the process of investigation and in court, the notion of “inscription” is useful, as it pays attention to the process in which material traces are transformed into symbolic forms (Latour Citation2005; Latour and Woolgar Citation1986). A successful inscription requires a lot of “paperwork” to translate the original phenomena into an intelligible, convincing, and easily moveable object shared across a wide array of professional actors.Footnote40 In his ethnography of the Conseil d’Etat, Bruno Latour observes that one of the differences between science and law is that in the legal context the subject matter becomes virtually invisible under layers of textual inscriptions (Latour Citation2010, 128–9). The fact that judicial evidence is constructed primarily in documents, reports, notes, maps, and receipts heightens the requirement to translate sound-based evidence into a form that would be (literally) legible to the court of law. In the case of fonoscopy, the material trace (sound recording) was turned both into writing (in the form of laboratory notes and written reviews) and into visual inscription (sonograms).Footnote41

Forensic evidence is never transparent but always constructed through collective practices. Charles Goodwin describes the process as essentially visual, when complex and often ambiguous events are understood through what he calls “professional vision”, that is “socially organised ways of seeing … that are answerable to the distinctive interests of a particular social group”, whose members, including scientists, transform raw data into “objects of knowledge” (Goodwin Citation1994, 606).Footnote42 Inscriptions (such as diagrams, graphs, or lists) are supposed to make complex information manageable, easily readable, and presentable to non-expert audiences (Latour Citation1990).

In the discussion about audio forensics, the notion of “professional vision” could perhaps be replaced by that of “professional audition”, which served as the main point of departure for making acoustic raw data into sonic evidence.Footnote43 The importance of trained hearing remained undisputed by fonoscopy experts, who were well aware of both the benefits and limits of spectral analysis. Although their expert reviews had to be accompanied by speech sonograms, they combined aural analysis with visual inspections of spectral images in their laboratory work. In the process of analysing and producing evidence, sonograms were used in conjunction with trained listening; this was not the case in the courtroom, however, where the participants of legal processes relied on written and visual inscriptions. Sound recordings were not played aloud in the courtroom. In Czechoslovak judicial history, recorded voices did indeed resound in courtrooms from time to time, but not in cases where fonoscopic expertise was required – which is still true today.Footnote44 In terms of voice identification and recognition, there was no added epistemic value in playing the voices out loud, because, as experts confirmed,Footnote45 the acoustic quality of such playbacks would not be high enough to provide relevant information for auditory identification.

It remains unclear, however, to what extent sonograms contributed to the clarity of expert evaluation. Visual inscriptions of speech were not easy to interpret for the untrained eye, and their immediate epistemic value for participants in court trials was limited. I would like to argue that the sonograms did not necessarily make the written evaluation more accessible, as they themselves were in need of interpretation. More than illuminating the expert review, they functioned as a token of a certain type of visually grounded evidence, which was shared across forensics.Footnote46 Hence also the continuous recourse to the analogy of the fingerprint, which increased the credibility of the method and connected voice analysis to an existing notion of objectivity, intelligible to judges and prosecutors. Sonograms, too, made sound-based evidence appear more objective and therefore convincing.Footnote47 In judicial practice, however, the court sometimes summoned fonoscopy experts to orally summarise their reviews and explain the accompanying visual material, which, as Musilová has written, merely “illustrated the results” to non-experts.Footnote48 It seems clear that the formation of sound-based evidence stemmed not only from the nature of fonoscopic laboratory work, but was also directly shaped by the legal system and its expectations.

The explanatory power of forensic inscriptions used in the courtroom often seems limited. It is not only the spectrographic images of sound that are difficult to decipher, but even crime scene photography or fingerprints – prime examples of “mechanical objectivity” – are in need of expert interpretation and explanation (Adam Citation2020; Cole Citation2002, 180). Although trial participants are usually invited to “see for themselves”, the visual inscription of forensic evidence, it seems, rarely ever “speaks for itself” but requires that an expert speak on its behalf.

In Czechoslovak fonoscopy, forensic objectivity was not enacted simply by recurring to visually grounded evidence – although this was an integral part of expert analysis – but by making the relative uncertainty manageable by offering nuanced probabilistic conclusions.Footnote49 The final evaluation regarding the identity of the speaker was increasingly formulated in probabilistic, rather than conclusive, terms. The 1981 film presents the results of fonoscopic dissection in a categoric manner: the identity is either confirmed or ruled out. Towards the end of the decade, however, expert conclusions were expressed on a scale from: “no”, “it cannot be ruled out”, “it might be possible”, “probably”, “with high probability”, and “with the highest probability” (Málek and Musilová Citation1989, 76). There is clearly a more general shift to be observed from categoric to probabilistic conclusions in the audio forensics of the 1980s, not only in Czechoslovakia but also in the GDR or the UK (Bijsterveld Citation2021; French Citation2017). This contrasts with Simon A. Cole’s observation that the move away from certainty claims typical of fingerprint identification (i.e. a match, a non-match, or inconclusive),Footnote50 towards probabilistic conclusions in forensics was prompted by genetic identification methods (Cole Citation2002, 290). DNA profiling was probably the first forensic area to produce its results in the form of numerically quantifiable percentage probabilities/hits, while audio forensics relied on a verbally formulated probability scale. Still, it seems that fonoscopy was at the forefront of a more general shift in grasping the relationship between evidence, objectivity, and the means of formulating an expert opinion in forensics.

Due to the complexity of the human voice and the inconclusiveness of speech sonograms, forensic fonoscopy contributed to the framing of forensic science in terms of probabilities, which made the uncertainty deliberately visible to judges, lawyers, and criminal investigators, and, consequently, made fonoscopic analysis more transparent and reliable. I have noted above that forensic inscriptions are hybrid artefacts of laboratory and judicial work. The probability scale introduced by audio forensics shows how science and law are “mutually constituted” (Jasanoff Citation1995, xv). Although the fonoscopy experts submitted speech sonograms along with their written reports and had recourse to the idea of “completely objective” voiceprints, their adoption of a probability scale effectively counterbalanced the rhetoric of mechanical objectivity and redefined scientific authority in the courtroom. Fonoscopy opened up about the nature of forensic sound dissection in the second half of the 1980s and showed that doubts were indeed part and parcel of scientific work. As Sheila Jasanoff repeatedly observed, the “discourse of infallibility”, historically associated with fingerprinting and DNA profiling, runs the risk of “reading scientific information beyond what it can establish with reasonable certainty” (Jasanoff Citation2006, 337). Audio forensics, whose methods and subject matter were historically associated with subjectivity and uncertainty, never subscribed to such a discourse. This, I argue, was an asset rather than a disadvantage of sound-based expertise as it introduced elements of manageable uncertainty to the performance of scientific expertise in the courtroom and thus put the process of both scientific and judicial evidence-making on display.

Conclusion: the afterlives of Czechoslovak fonoscopy

In 2008, Málek was awarded the prize for police veteran of the year.Footnote51 This, along with the fact that he kept performing fonoscopic voice dissections at the Institute of Criminalistics throughout the 1990s and early 2000s, testifies to the degree of continuity between audio forensic practice under the Communist regime and after 1989.Footnote52 Despite the political, legal, and technological changes that have taken place over the last fifty years, the basic principles, outcomes, and forensic roles of criminalistic sound analysis have remained largely unchanged to this day.

After receiving his award, Málek was asked by a journalist about the future of the field; he emphatically replied that it would be automated and that “software is being developed that could identify the speaker, or at least make the pre-selection”. At the same time, he remained certain that the final interpretation of sonic evidence would still require expert judgement. The newspaper article also highlighted the fact that audio experts were required to have excellent hearing range, and that Málek himself was able to “hear the grass grow”.Footnote53 This not only corresponds to 1980s rhetoric, in which a growing degree of automation was envisioned by Czechoslovak experts,Footnote54 but it represents a common, if not mainstream, contemporary view of the future of forensic voice analysis shared by phonetic and linguistic specialists.Footnote55 At the same time, as Mopas argues in this issue, forensic automatic speech recognition (FASR) software, especially the way it is employed, might pose a serious problem for the accuracy of sound analysis. Even FASR performed since the 1990s, where matches are found in a completely automated way, requires a human expert not only to operate the system but also to interpret the results correctly. Whether the expert should be a speech scientist, an engineer, or anyone who receives basic operational training varies across countries and is a controversial topic.Footnote56

As far as speech scientists are concerned, the expert ear remains the key component of forensic evaluation. In the Czech Republic, two out of three authorised audio forensic experts combine auditory and mechanical spectral analysis in their work.Footnote57 The third, in contrast, relies on “fully automatic” speaker recognition “without human intervention”.Footnote58 In judicial practice, reviews from both types of expertise are sometimes combined, which is not common practice in today’s European courtrooms.Footnote59 A judgement issued by County Court Brno in 2013 offers a glimpse into the two types of expert reasoning.Footnote60 While Zdeněk Švenda, who performed voice identification based on a combination of signal processing by linear predictive analysis, believed that the credibility of the method derived from the fact that it was “fully objective and automated”, Marie Svobodová, the expert witness from the Institute of Criminalistics, used mechanical spectral analysis together with aural evaluation based on trained hearing, which she, somewhat surprisingly, described as “purely subjective”. She further explained to the court that the combination of both methods strengthens the final decision and went into some detail to lecture the court about the parameters of the probability scale used for formulating the conclusion. Additionally, she discussed the nature of the “voiceprint”, explaining that it was not similar to DNA profiling and that the results of these two types of biometric analysis cannot be formulated in a similar manner.Footnote61

In the court decision, we encounter similar issues and formulations present in the practice of Czechoslovak fonoscopy since the 1970s: the voiceprint, the notion of mechanical objectivity, the probability scale, and the association of hearing with subjective evaluation. Still, the importance of trained hearing was not disputed by the court, especially when accompanied by “objective”, that is mechanical or automatic, means of analysis. Perhaps, as Josephine Hoegaerts observes for the nineteenth century, the analysis of the human voice will continue to escape a fully mechanical dissection, as some of its components are best understood by the human ear. At the same time, the development of forensic audio analysis demonstrates that the complexities of the human voice, together with the contested nature of speaker identification – including the explanatory power of speech sonograms and the debate regarding the epistemic status of listening – are best understood as productive elements in the development of modern forensic science as they compellingly connected objectivity with probability claims.

The verbal probability scale introduced in Czechoslovak fonoscopy in the 1980s entailed a strong element of subjectivity: the final interpretation of the results and their rating on a scale of 1–6 was a “contingent, sociotechnical achievement” (Semel Citation2022, 282) and a direct expression of a cooperation between the human listening subject, audio technologies and acoustic measurement instruments. This is an element of speaker identification, and, by extension, voice recognition and sound detection, which has not been overcome by automation. As Beth M. Semel has recently shown in her analysis of a smartphone application which listens to speech sounds that might signal an approaching bipolar episode, the notion of mechanical objectivity embodied by a listening computer does not hold true even when digital phenotyping is concerned. She uses the term “heteromation” to emphasise the analytical move from automation to the role of technicians, designers, and annotators who enable machine listening (Semel Citation2022). The history of criminalistic speaker identification in Czechoslovakia shows that the ideal of mechanical objectivity was an inherent part of forensic sound dissection since the establishment of fonoscopy as a specific branch of forensic expertise: it first materialised in the notion of voiceprint – which continues to inform the public image of the field to this day – and was revived in the debate around the use of automatic methods of speech recognition in forensics. The probability scale, which stemmed from the interactions between audio forensic laboratory work, the judicial system of evidence-making, and cultural notions of sound and hearing, was an attempt to dismantle the fantasy of objectivity of machine-based analytical methods and as such it prefigured much of the later debates about the nature and status of scientific expertise in the courtroom.

Acknowledgments

The writing of this article was funded by the Czech Science Foundation under the research project The Second Sense: Sound, Hearing and Nature in Czech Modernity (20-30516). The first version was written for the workshop Forensic Voices: Cultures of Identification Through Sound held at Maastricht University in June 2022. The author would like to thank Václava Musilová and Jan Málek for sharing details about their professional work at the Institute of Criminalistics, Radek Skarnitzl for his insights about the recent development in audio forensics, and the anonymous reviewers for their comments.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Anna Kvicalova

Anna Kvicalova is a historian of science, religion, and the senses. In her work, she deals with the history of sound-based knowledge and listening skills. She is a permanent research fellow at the Centre for Theoretical Study (Charles University and the Czech Academy of Sciences), where she is the leader of the research project The Second Sense: Sound, Hearing and Nature in Czech Modernity. She received an MA from the University of Amsterdam and a PhD from Freie Universität Berlin. Between 2013 and 2017, she was a research fellow at the Max Planck Institute for the History of Science in Berlin. She is the author of Listening and Knowledge in Reformation Europe (Palgrave, 2019) and other publications on sound, hearing, and acoustics (published with Annals of Science; Sixteenth-Century Journal; or Technology and Culture). She is also an assistant professor in the Department for the Study of Religions at Masaryk University, Brno.

Notes

1. https://www.novinky.cz/krimi/clanek/policie-bude-hlucha-opousti-ji-jediny-expert-na-analyzu-hlasu-62843; https://www.irozhlas.cz/zpravy-domov/policii-od-noveho-roku-opousti-jediny-expert-na-analyzu-hlasu_201010111632_kbrezovska; https://www.idnes.cz/zpravy/cerna-kronika/policii-chybeji-pismoznalci-odchazi-i-jediny-expert-na-rozbor-hlasu.A101011_114708_krimi_js (accessed September 16, 2022). All translations from Czech are mine. I am grateful to Jan Málek for sharing details about his work at the Institute of Criminalistics in the interviews held on July 25 and September 6, 2019.

2. The department was established under the authority of Public Security (Veřejná bezpečnost) by the Ministry of the Interior, Resolution ČSSR No. 35/73 (1975). Criminalistics is the application of scientific methods to the investigation of crime; its results are applied to matters of criminal and civil law. The expressions criminalistics and forensics are treated as synonymous in this article.

3. I have translated the Czech word fonoskopie as “fonoscopy” in English to refer to a specific method and tradition of forensic acoustics that emerged in Czechoslovakia in the 1970s and Poland in the 1960s. The English term “phonoscopy”, in contrast, has a narrow medical meaning and refers to sound analysis made by the phonoscope.

4. See, for example, Theen (1980). A different approach is represented by Iacob et al. (Citation2018). The notion of “expertise” I adopt in this paper de-emphasises the expert’s direct intervention in public affairs.

5. In the 1960s, the Czech phonetician Přemysl Janota conducted original, internationally acknowledged research on speech’s personal characteristics by investigating the speech spectrum and artificial speech. For Janota’s listening experiments at Charles University and the exact nature of knowledge transfer between phonetics, aviation, and fonoscopy, see Kvicalova (Citation2023).

6. See the essay by Michael Mopas in this issue. These issues are also discussed in the “witness seminar” on speaker identification (Bijsterveld and Kvicalova Citation2023).

7. See Kvicalova (Citation2023, 388–9). Despite the initiation of fonoscopy in Poland in the early 1960s, its institutional development has yet to be researched. See Maciejko, Rzeszotarski, and Tomaszewski (Citation2010); and Málek and Musilová (Citation1989, 7). The term fonoscopy was first used in Polish forensics by Andrzej Szwarc in 1964 (Szwarc Citation1964).

8. Christian Koristka was in charge of audio forensics at the Department of Criminalistics established at Humboldt University in 1968 (Bijsterveld Citation2021).

9. Czechoslovak fonoscopy never assumed a standardised form to be applied to every fonoscopic analysis. See Kvicalova (Citation2023).

10. A27 (1981), 268, ABSCR. Compare to the project of “The Voice Operated Typewriter” (Li and Mills Citation2019, 143). See also Karin Bijsterveld’s article in this issue.

11. Compare to Bijsterveld (Citation2021).

12. For the regulations concerning short and long-term eavesdropping see Povolný Citation2001 as well as TA-111, TA-122, TA-133, “Situace ve využívání operativní techniky”, and “Návrh prognózy zpravodajské činnosti”. Many of those listening to the permanently wiretapped lines were women. See A27, Books of permanent service 1975–1989, ABSCR.

13. The potential of directional parabolic microphones used in the field recordings of bird songs was explored in this respect, but a similar device that could record human voice over a long enough distance without being detected was never constructed. A27/191 (1975), 22. ABSCR. A microphone capable of performing the desired function would have to be very large, thus undermining the operation’s secrecy. Despite the Czechoslovak state’s best efforts to invent original eavesdropping devices based on novel technological principles that would be difficult for foreign enemies to detect, many of the recording and monitoring devices were purchased abroad in capitalist countries, primarily West Germany (many electroacoustic devices were purchased from PK Elektronic in Hamburg) but also Denmark and the US. See ZSGŠ, BF-S53, 1979, 9. See also II. Správa FMV [“Situace ve využívání operativní techniky a některé náměty na její zefektivnění”], (May 5, 1975), ABSCR.

14. Resolution MV 1960/194–10/12–47–III/2 of the Ministry of the Interior (December 10, 1947). See also Hlaváček (Citation2007–2010).

15. Resolution of the Ministry of the Interior No. 166 (December 12, 1958), appendix č. 1, čl. 11, ABSCR.

16. See the administrative book of the Department of Fonoscopy (henceforth ABD), AFD.

17. As was the case in W. Doegen’s project in the Berlin Lautarchiv in the 1920s. See Li and Mills (Citation2019, S130).

18. The relationship between sound and the construction of race has been described in connection to the supposed vocal and aural markers of Black identity in the United States. See Stoever (Citation2016) and Eidsheim (Citation2019).

19. See ABD, July 26, 1979 (I MV ČSSR), March 7, 1980 (XI/s FMV & SL I), helicopter crash; August 14, 1981, dangerous landing; plane crash L-39 OK 184, Kbely; October 21, 1981, plane crash HA-LCF, AFD.

20. A serial killer, Jiří Straka, attacked eleven women in Prague in 1985, killing three of them. The case received significant attention in the media as the attacks spanned almost seven months and coincided with Prague hosting the Spartakiad, a mass gymnastics event, which took place every five years beginning in 1955.

21. For example, the request to copy the sound files from the wiretapped landline of the famous dissident Alexandr Vondra (August 8, 1989), ABD.

22. Some of the cases were Říha (1978, requested by defence lawyers); Machart (1982); Šiff (1984, anonymous calls), ABD.

23. The notion of sonic skills expands on Jonathan Sterne’s term audile technique, which describes listening as a distinctively modern technical skill (Bijsterveld Citation2019; Sterne Citation2003).

24. The film was produced for the Ministry of the Interior’s internal use by Krátký film Praha (Prague Short Films) at the film labs in Barrandov. Most copies of the 16 mm film were destroyed or lost; the only copy I managed to retrieve, thanks to the assistance of Zdeněk Kopecký, is archived at the Prague Police Museum.

25. Compare to Li and Mills, who describe airline bomb threats as being one of the reasons why the FBI turned to the Bell Labs for a technical solution in the 1960s (Li and Mills Citation2019, S139).

26. Voiceprint Laboratories, Inc., Somerville, N.J. The exact story of how the instrument reached Communist Czechoslovakia remains unknown.

27. Sound spectrograms are visual representations of sound frequency and intensity as it unfolds in time, which is close to how sound is perceived by the human ear. Although the words sonogram and spectrogram were often used interchangeably, the former was preferred in the Czechoslovak context.

28. The use of miniatures in courtrooms during the first half of the twentieth century is discussed by Neale (Citation2020).

29. This is not necessarily the case today, when an increasing number of sound recordings are made by mobile phones. Compare to the interview with Peter French, Comparing Voices.

30. See the essay by Josephine Hoegaerts in this special issue. See also Li and Mills (Citation2019).

31. The first sound spectrographs were used at Bell Telephone Laboratories as early as the 1940s. For this history – the research program and its later military applications – see Mills (Citation2010); Li and Mills (Citation2019); and Potter, Kopp, and Green (Citation1947).

32. This tradition goes back to Ernst Chladni’s Klangfiguren – spatial figurations of sound frequencies as they appear in thin layers of sand on glass or metal plates (Zielinski Citation2006, fn 16, 159–204), and Erlmann (Citation2010, 155–56, 189–94).

33. Kersta founded his own company, Voiceprint Laboratories, becoming a renowned expert in voice identification in the US. For more information about the nature of his collaboration with police forces and legal authorities, see Zbikowski (Citation2002).

34. The critical reflection of the idea of voiceprint in Czechoslovak fonoscopy is thoroughly discussed in Kvicalova (Citation2023).

35. The epistemic roles of sound and hearing have been studied in various experimental, academic, cultural, and technological contexts; see Sterne (Citation2003), Bruyninckx (Citation2018), Hui, Kursell, and Jackson (Citation2013), Birdsall and Tkaczyk (Citation2019), Tkaczyk (Citation2023), Mody (Citation2005), Rice (Citation2012), Thompson (Citation2002), and Bijsterveld (Citation2019).

36. Daston and Galison (Citation2007).

37. “Such phonetic characteristics can usually be measured by instruments, which implies a completely objective assessment of these phenomena” (Musilová Citation1977, 369).

38. See Mopas (2023), in this issue.

39. Málek was a graduate of CTU, Faculty of Electrical Engineering, where he majored in audio. He also took classes in phonetics at the Faculty of Arts with the well-known phoneticians Přemysl Janota and Milan Romportl (Kvicalova Citation2023).

40. See Bruno Latour’s notion of “immutable mobiles” (Latour and Woolgar Citation1986).

41. The view that the photograph represented the human body better than its written description already accompanied its first forensic uses in the second half of the nineteenth century (Cole Citation2002, 20).

42. Although Goodwin begins his article with the 1992 Rodney King case, where forensic sound analysis played a key role, he does not consider professional listening as a viable category and instead concentrates on visual aspects of expertise. For the use of audio forensic expertise in this case, see Angelika Braun’s recollection, Comparing Voices.

43. The notion of the “professional audition” in the context of audio engineering was proposed by Thomas Porcello (Citation2004). The professional audition in aural field observations is discussed in Bruyninckx (Citation2018, 536).

44. Recorded confessions were instrumental in Stalin’s show trials (1948–1953), when they were played back in courtrooms for performative and emotional, rather than forensic, motives (Vorel, Šimánková, and Babka Citation2003, 224–34).

45. See the interview with Peter French, Comparing Voices.

46. Compare to the 1969 conclusions of the Technical Committee on Speech Communication of the Acoustical Society of America, which explicitly stated that spectrograms could have misled the jury (Bolt et al. Citation1969, 600, 602).

47. Compare to the present-day audio forensic evaluations before the court, in which “objectivity” is often explicitly invoked by the expert. See the discussion in the final part of this essay.

48. Interview with Václava Musilová, June 9, 2020. The appearances of fonoscopy experts before different courts are recorded in AFD.

49. This seems to have been a more general trend in forensic audio analysis towards the end of the 1980s (Bijsterveld Citation2021; French Citation2017).

50. The current use of automated and semi-automated fingerprint identification systems is further discussed in Dror and Mnookin (Citation2010).

51. See the police announcement: https://adoc.pub/policista-roku-2008-str-.html (accessed October 15, 2022).

52. This contrasts with the situation in the GDR, where most experts involved in voice identification before the fall of the Berlin Wall in 1989 did not pass the so-called Gauck examinations and left the police in the 1990s. See the discussion with Angelika Braun, Comparing Voices.

54. An interest in new methods of automatic speech recognition was among the reasons for Málek’s first trip to Berlin in 1980. See Málek’s field notes, AFD.

55. The combination of aural, mechanical and, in certain cases, automatic analysis is recommended by most members of the International Association for Forensic Phonetics and Acoustics (IAFPA). See French (Citation2017) and Gold and French (Citation2019); see also the essay by Michael Mopas in this special issue.

56. For the discussion see Kvicalova and Bijsterveld (Citation2023).

57. Only one of the three expert witnesses, Marie Hes Svobodová, is based at the Institute of Criminalistics. Radek Skarnitzl, the head of the Institute of Phonetics at Charles University, is a leading member of the IAFPA, active in drafting new standards for forensic expert witnessing as a member of the Ministry of Justice’s advisory board. See Skarnitzl (Citation2022), and his illustrative report at https://znalci.justice.cz/dokumenty/#dokumenty. The third expert is Zdeněk Švenda, whose work is based solely on automatic recognition (“Svendaz Automatic Speaker Recognition”).

59. In the UK, for example, expert testimony based solely on automatic speaker recognition is not admissible. In Germany, it is rare for more than one audio forensic expert to be asked to submit a review. See Kvicalova and Bijsterveld (Citation2023).

60. County Court Verdict from November 26, 2013, Brno/Zlín (No. 61T 7/2013–5030).

61. Ibid., 60–63.

References

  • Archival Sources and Oral Sources
  • Archive of the Fonoscopy Department, Prague, Czech Republic (AFD).
  • Archive of the Prague Police Museum.
  • Security Services Archive of the Czech Republic, Prague, Kanice, Czech Republic (ABSCR).
  • Málek, Jan, interview with author, July 26 and September 6, 2019.
  • Musilová, Václava, interview with author, June 9, 2020.
  • Published Sources
  • Adam, A. 2020. Crime and the Construction of Forensic Objectivity from 1850. Cham: Palgrave Macmillan. https://doi.org/10.1007/978-3-030-28837-2.
  • Badenoch, A., A. Fickers, and C. Henrich-Franke, eds. 2013. Airy Curtains in the European Ether: Broadcasting and the Cold War. Baden-Baden: Nomos.
  • Bell, A. 2018. “Crime Scene Photography in England, 1895–1960.” Journal of British Studies 57 (1): 53–78. https://doi.org/10.1017/jbr.2017.182.
  • Bijsterveld, K. 2019. Sonic Skills: Listening for Knowledge in Science, Medicine and Engineering (1920s–Present). Basingstoke: Palgrave Macmillan. https://doi.org/10.1057/978-1-137-59829-5.
  • Bijsterveld, K. 2021. “Slicing Sound: Speaker Identification and Sonic Skills at the Stasi, 1966–1989.” Isis 112 (2): 215–241. https://doi.org/10.1086/714826.
  • Birdsall, C., and V. Tkaczyk. 2019. “Listening to the Archive: Sound Data in the Humanities and Sciences.” Technology and Culture 60 (2): S1–13. https://doi.org/10.1353/tech.2019.0061.
  • Bolt, R., F. S. Cooper, E. E. David Jr., P. B. Denes, J. M. Pickett, and K. N. Stevens. 1969. “Identification of a Speaker by Speech Spectrograms.” Science 166 (3903): 338–342. https://doi.org/10.1126/science.166.3903.338.
  • Broeders, A. P. A., and A. C. M. Rietveld. 1995. “Speaker Identification by Earwitnesses.” In Studies in Forensic Phonetics, edited by A. Braun and J.-P. Köster, 24–40. Trier: Wissenschaftlicher Verlag.
  • Bruyninckx, J. 2018. Listening in the Field: Recording and the Science of Birdsong. Cambridge, MA: The MIT Press. https://doi.org/10.7551/mitpress/10307.001.0001.
  • Burri, R. V., and J. Dumit. 2008. “Social Studies of Scientific Imaging and Visualization.” In The Handbook of Science and Technology Studies, edited by E. J. Hackett, O. Amsterdamska, M. Lynch, and J. Wajcman, 297–317. 3rd ed. Cambridge, MA: MIT Press.
  • Činátl, K., J. Mervart, and J. Najbert. 2018. Podoby československé normalizace: Dějiny v diskuzi. Praha: NLN, USTR.
  • Cole, S. A. 2002. Suspect Identities: A History of Fingerprinting and Criminal Identification. Cambridge, MA/London: Harvard University Press. https://doi.org/10.4159/9780674029682.
  • Coopmans, C., J. Vertesi, M. Lynch, and S. Woolgar, eds. 2014. Representation in Scientific Practice Revisited. Cambridge, MA: The MIT Press. https://doi.org/10.7551/mitpress/9780262525381.001.0001.
  • Crary, J. 2001. Suspension of Perception: Attention, Spectacle, and Modern Culture. Cambridge, MA: MIT Press.
  • Daston, L., and P. Galison. 2007. Objectivity. New York: Zone Books.
  • Dror, I. E., and J. L. Mnookin. 2010. “The Use of Technology in Human Expert Domains: Challenges and Risks Arising from the Use of Automated Fingerprint Identification Systems in Forensic Science.” Law, Probability and Risk 9 (1): 47–67. https://doi.org/10.1093/lpr/mgp031.
  • Eidsheim, N. S. 2019. The Race of Sound: Listening, Timbre, and Vocality in African American Music. Durham/London: Duke University Press.
  • Erlmann, V. 2010. Reason and Resonance: A History of Modern Aurality. New York: Zone Books.
  • French, P. 2017. “A Developmental History of Forensic Speaker Comparison in the UK.” English Phonetics 21:271–286.
  • Fulbrook, M. 2009. “The Concept of ‘Normalization’ and the GDR in Comparative Perspective.” In Power and Society in the GDR, 1961–1979: The ‘Normalisation of Rule’? edited by M. Fulbrook, 1–30. New York: Berghahn Books. https://doi.org/10.1515/9781845459130-002.
  • Gold, E., and P. French. 2019. “International Practices in Forensic Speaker Comparisons: Second Survey.” International Journal of Speech, Language & the Law 26 (1): 1–20. https://doi.org/10.1558/ijsll.38028.
  • Goodwin, C. 1994. “Professional Vision.” American Anthropologist 96 (3): 606–633. https://doi.org/10.1525/aa.1994.96.3.02a00100.
  • Hlaváček, J. 2007–2010. Kriminalistický ústav a jeho publikační tvorba. Prague: Sborník kriminalistického ústavu.
  • Hui, A., J. Kursell, and M. Jackson, eds. 2013. “Music, Sound and the Laboratory from 1750–1980.” Special issue, Osiris 28 (1): 1–11. https://doi.org/10.1086/671360.
  • Iacob, B. C., S. Dobos, R. Grosescu, V. Iacob, and V. Pasca, eds. 2018. “State Socialist Experts in Transnational Perspective: East European Circulation of Knowledge during the Cold War (1950–1980).” Special issue, East Central Europe 45: 145–300.
  • Jasanoff, S. 1995. Science at the Bar: Law, Science, and Technology in America. Cambridge, MA/London: Harvard University Press. https://doi.org/10.4159/9780674039124.
  • Jasanoff, S. 2006. “Just Evidence: The Limits of Science in the Legal Process.” In DNA Fingerprinting and Civil Liberties, edited by A. Noble and B. W. Moulton, 326–341. Special issue, A Journal of the American Society of Law, Medicine, and Ethics 34 (2).
  • Kersta, L. G. 1962. “Voiceprint Identification.” Nature 196 (4861): 1253–1257. https://doi.org/10.1038/1961253a0.
  • Kolář, P., and M. Pullmann. 2016. Co byla normalizace: Studie o pozdním socialismu. Praha: NLN, USTR.
  • Kopeček, M., ed. 2019. Architekti Dlouhé Změny: Expertní Kořeny Postsocialismu v Československu. Prague: Argo.
  • Kruse, C. 2016. The Social Life of Forensic Evidence. Oakland: University of California Press.
  • Kvicalova, A. 2023. “Sound on the Quiet: Speaker Identification and Auditory Objectivity in Czechoslovak Fonoscopy, 1975–90.” Technology and Culture 64 (2): 379–406. https://doi.org/10.1353/tech.2023.0104.
  • Kvicalova, A., and K. Bijsterveld, eds. 2023. Comparing Voices: Speaker Identification Witness Seminar. Maastricht: Maastricht University.
  • Latour, B. 1987. Science in Action: How to Follow Scientists and Engineers Through Society. Cambridge, MA: Harvard University Press.
  • Latour, B. 1990. “Drawing Things Together.” In Representation in Scientific Practice, edited by M. Lynch and S. Woolgar, 19–68. Cambridge, MA: MIT Press.
  • Latour, B. 2005. Reassembling the Social: An Introduction to Actor-Network-Theory. Oxford: Oxford University Press.
  • Latour, B. 2010. The Making of Law: An Ethnography of the Conseil d’Etat, translated by M. Brilman and A. Pottage. Cambridge: Polity Press.
  • Latour, B., and S. Woolgar. 1986. Laboratory Life: The Construction of Scientific Facts. Princeton, NJ: Princeton University Press. https://doi.org/10.1515/9781400820412.
  • Li, X., and M. Mills. 2019. “Vocal Features: From Voice Identification to Speech Recognition by Machine.” Technology and Culture 60 (2): S129–S160. https://doi.org/10.1353/tech.2019.0066.
  • Maciejko, W., J. Rzeszotarski, and T. Tomaszewski. 2010. “50 lat polskiej fonoskopii.” Problemy Kryminalistyki 269:69–83.
  • Málek, J. 1981a. “Fonoskopická expertýza.” Informační věstník VB 7 (4): 3–5.
  • Málek, J. 1981b. “Předměty pro kriminalistickou fonoskopickou expertizu.” Kriminalistický sborník 25 (7): 445–447.
  • Málek, J., and V. Musilová. 1989. Fonoskopie. Prague: Kriminalistický Ústav VB.
  • Mills, M. 2010. “Deaf Jam: From Inscription to Reproduction to Information.” Social Text 28 (1): 35–58. https://doi.org/10.1215/01642472-2009-059.
  • Mody, C. 2005. “The Sounds of Science: Listening to Laboratory Practice.” Science, Technology and Human Values 30 (2): 175–198. https://doi.org/10.1177/0162243903261951.
  • Morat, D. 2014. Sounds of Modern History: Auditory Cultures in 19th- and 20th-Century Europe. Oxford/New York: Berghahn Books.
  • Musilová, V. 1977. “Odlišnosti ve zkoumání psaných a mluvených jazykových projevů.” Kriminalistický sborník 21 (6): 367–371.
  • Neale, A. 2020. Photographing Crime Scenes in Twentieth-Century London: Microhistories of Domestic Murder. London: Bloomsbury. https://doi.org/10.5040/9781350089440.
  • Nolan, F. 1983. The Phonetic Bases of Speaker Recognition. Cambridge: Cambridge University Press.
  • Pješčak, J. 1976. Základy kriminalistiky. Prague: Naše vojsko.
  • Porcello, T. 2004. “Speaking of Sound: Language and the Professionalization of Sound-Recording Engineers.” Social Studies of Science 34 (5): 733–758. https://doi.org/10.1177/0306312704047328.
  • Potter, R. K., G. A. Kopp, and H. C. Green. 1947. Visible Speech. New York: Van Nostrand.
  • Povolný, D. 2001. Operativní technika v rukou StB. Prague: Úřad vyšetřování a dokumentace zločinů PČR.
  • Rak, R., V. Matyáš, and Z. Říha. 2008. Biometrie a identita člověka ve forenzních a komerčních aplikacích. Prague: Grada.
  • Rice, T. 2012. “Sounding Bodies: Medical Students and the Acquisition of Stethoscopic Perspectives.” In The Oxford Handbook of Sound Studies, edited by K. Bijsterveld and T. Pinch, 298–320. Oxford: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780195388947.013.0074.
  • Semel, B. 2022. “Listening Like a Computer: Attentional Tensions and Mechanized Care in Psychiatric Digital Phenotyping.” Science, Technology & Human Values 47 (2): 266–290. https://doi.org/10.1177/01622439211026371.
  • Skarnitzl, R. 2022. “O fonetické identifikaci mluvčího ve forenzním kontextu.” Naše řeč 105 (3): 117–132.
  • Sommer, V., M. Spurný, and J. Mrňka. 2019. Řídit Socialismus jako firmu. Technokratické vládnutí v Československu, 1956–1989. Prague: NLN.
  • Sterne, J. 2003. The Audible Past. Durham: Duke University Press.
  • Stoever, J. L. 2016. The Sonic Color Line: Race and the Cultural Politics of Listening. New York: New York University Press. https://doi.org/10.2307/j.ctt1bj4s55.
  • Szwarc, A. 1964. Kryminalistyczna ekspertyza zapisu magnetofonowego. Warsaw: Wydawnictwo Zakladu Kryminalistyki KGMO.
  • Thompson, E. 2002. The Soundscape of Modernity: Architectural Acoustics and the Culture of Listening in America, 1900–1933. Cambridge, MA: MIT Press.
  • Tkaczyk, V. 2023. Thinking with Sound: A New Program in the Sciences and Humanities Around 1900. Chicago: University of Chicago Press.
  • Tkaczyk, V., M. Mills, and A. Hui. 2020. Testing Hearing: The Making of Modern Aurality. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780197511121.001.0001.
  • Vorel, J., A. Šimánková, and L. Babka. 2003. Československá justice v letech 1948–1953 v dokumentech. Vol. 1. Prague: Úřad vyšetřování a dokumentace zločinů PČR.
  • Zbikowski, D. 2002. “Listening Ear: Phenomena of Acoustic Surveillance.” In CTRL: Rhetorics of Surveillance from Bentham to Big Brother, edited by T. Y. Levin, U. Frohne, and P. Weibel, 33–47. Cambridge, MA: MIT Press.
  • Zielinski, S. 2006. Deep Time of the Media: Toward an Archaeology of Hearing and Seeing by Technical Means. Cambridge, MA: MIT Press.