“Did you know that David Beckham speaks nine languages?”: AI-supported production process for enhanced personalization of audio-visual content

Pages 265-280 | Received 14 May 2020, Accepted 29 Dec 2021, Published online: 17 Jan 2022

Abstract

The introduction of artificial intelligence (AI) into the media production process has contributed to the automation of selected tasks and a stronger hybridization of man and machine in the process. At the same time, the AI-supported production process has expanded beyond the traditional three-stage model with a new phase of consumer evaluation: the collection, analysis and application of audience feedback. This has opened the way for far-reaching content personalization and thus offers a new type of media experience. Powering the production process with a constant stream of consumer data has also affected the process itself and changed its nature from linear to cyclical.

The audio-visual production process is, and always has been, complex in nature. Although the digitization of the media industry seemingly simplifies this process, offering new and increasingly accessible tools to support each of its stages, the continuous evolution of the (new) media ecosystem increases its complexity. A glut of new content formats and technological standards, combined with market globalization and the development of new platforms, makes it necessary to revise not only the tools but also the entire production process (Fuschi and Badii 2013).

The changes to the audio-visual media production process are manifold. Faced with the need to create multiple versions of content that take into account the specifics, nature and requirements of numerous media platforms, producers are under growing pressure to improve production workflows. The changes are aimed at reducing the number of human errors in the process, and thus the delays resulting from such errors, but above all the intent is to increase automation by delegating tedious, repetitive and time-consuming tasks to machines, thereby increasing workflow efficiency and reducing production costs (Dorai 2001). At the same time, the oversupply of content available to viewers not only forces content providers to improve content search and discovery systems, but also pushes creators to develop a better understanding of how audiences interact with their content and to create content that is better suited to their needs, expectations and consumption habits. This now occurs at the level of increasingly small audience groups and significantly affects the shape of the production process (Badii et al. 2008).

In response to these needs, artificial intelligence (AI)-powered tools are rapidly increasing in popularity and are increasingly used to administer production tasks and to handle content editing and media management. Although these changes affect not only the production process but also have far-reaching consequences for the very shape of the content and for audience experiences, the literature on the subject is scarce.

In the context of artificial intelligence-assisted media production practices, most scientific works focus on journalistic practice (Bodó 2019; Diakopoulos 2019; Hansen et al. 2017; Stray 2019), programmatic advertising (Busch 2016; Huang 2015), computational creativity (Chen et al. 2019; Jordanous 2016) and broadly defined technological solutions (explored by numerous authors). In terms of audio-visual production, one can additionally find reflections on machine-based support for decision-making (Fuschi and Badii 2013; Ghiassi, Lio, and Moon 2015), as well as on the role of metadata (Bailer and Schallauer 2008) or computational narrative intelligence (Riedl 2016). However, there is no discussion of changes in the internal audio-visual production process caused by the use of artificial intelligence and machine learning. It is essential to study the nature of these changes, as the use of new tools has not only influenced the way specific tasks are performed in the production process but has also changed its very nature. Moreover, as content emerges directly from the nature and organization of media work, production organization procedures have a significant impact on the shape of the content (Gans 1979; Gitlin 1980). Given that one of the main objectives of AI in audio-visual production is to enable increased responsiveness to audience needs (New European Media 2018; Zapata-Rivera and Katz 2014), it should be assumed that changes in production practices affect not only the dynamics of the process and the shape of the content but also how the audience experiences this content. Thus, the article tries to shed light on existing and developing practices regarding the use of artificial intelligence in the audio-visual media production process. In particular, the research aims to answer the questions of how AI-supported workflows affect the production process and whether and how they support the personalization of audio-visual content.

Media production process as a hybrid workflow

A series of activities which bring content into existence and shape its final form are collectively termed the media production process (Hesmondhalgh 2010). Traditionally, the audio-visual production process consists of three phases (Figure 1): pre-production, production, and post-production. These are preceded by a period of conceptual and script work (sometimes referred to as preliminary development), which ends with the film/video being sent to production (green-lit), and are followed by distribution (Gross, Foust, and Burrows 2005). Apart from its basic structure, this process is neither constant nor stable over time. It constantly adapts to the specific circumstances of 'production', scenario requirements, circumstances of development, available budget and the entities involved. The creation context is complex and is characterized by a multitude of roles, which are often concentrated in a single person – a director or producer (Fuschi and Badii 2013).

Figure 1. High level traditional media production process.

In the pre-production phase, the audio-visual content is designed and conceptualized by translating the script into specific directorial and production solutions, and its implementation is planned in detail. The content is then produced during the shooting phase. At the post-production stage, the footage, often coming from multiple sources, is combined into the final media content, which is then archived. Of course, the boundaries between these phases are blurred, which is particularly evident, for example, in the production of entirely computer-generated films, where production and post-production intertwine, or in interactive media, where the content takes its final shape during presentation (Bailer and Schallauer 2008).

In addition, in the context of audio-visual production, the term 'production' can be defined in various ways and seen as (1) the creation of a sequence of images and sound, (2) the manipulation of existing film to create new content (for example, using filters or sound and visual modifications) or (3) the process of using pre-existing images to create other media products (New European Media 2018). What these approaches have in common is that each of them requires the presence of both a human factor (as a creator) and more or less advanced technologies, making the production process itself a kind of hybrid workflow.

In the context of the current media world, the production process is characterized by hectic timing and tight delivery deadlines (Fuschi and Badii 2013). The process is oriented towards permanently maintaining or increasing high productivity, which is often achieved by reducing the time dedicated to content creation (Žnidaršič and Jereb 2011). To reduce development time, it is necessary to use techniques that can act as catalysts for the creative and organizational process and give the project team different ways of finding possible solutions (Bertoncelli et al. 2016), as well as to streamline the process by automating repetitive and lengthy activities, allowing creative teams to concentrate on creative work (Diakopoulos 2019).

AI in the audio-visual production process

Artificial intelligence has a dynamic influence on the development and transformation of many industries. It is estimated that spending on AI development will reach $110 billion in 2024 (International Data Corporation 2020). Its implementations aim to "make a computer do things that, at the moment, people do better" (Rich and Knight 2004, 3) and serve as support, reducing time and increasing the robustness, reliability and availability of products and services (Silva 1999). They are also used to support the decision-making processes of specialists or to provide second opinions (Giarratano and Riley 2005). The audio-visual industry is no exception and is increasingly taking advantage of AI to support the objectives mentioned above.

In the media production process, multiple tasks are outsourced to machines, the most common being those related to content identification (such as voice, text and image recognition within the film and audio content used, and the automatic enrichment of such content with metadata to help with further work, which is particularly important when selecting content for editing) and adaptation (for example, creating different versions of the same final edit file) (Jarek and Mazurek 2019).

Furthermore, computer-aided decision-making in the production process is currently gaining particular attention (Botega and Silva 2015; Fuschi and Badii 2013; Ghiassi, Lio, and Moon 2015; Patel and Kannampallil 2015). One area of application is business decision-making support in the film industry, where it is used, among other applications, for forecasting production revenues. Modeling performed with a dynamic artificial neural network takes into account predictive variables such as production budgets, pre-release advertising expenditures and seasonality. The solution seeks to resolve issues related to revenue estimation based on historical data and opening weekend box office results (Ghiassi, Lio, and Moon 2015).
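To make the forecasting logic concrete, the sketch below fits a generic feed-forward regressor to synthetic data built from the three predictive variables named above. It is an illustration only: the figures are invented and the model is not the dynamic neural network architecture used in the cited study.

```python
# Minimal sketch (not the model from Ghiassi, Lio, and Moon 2015): a generic
# feed-forward regressor mapping pre-production variables to revenue.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical training data: [budget ($M), ad spend ($M), release month 1-12].
X = rng.uniform([5, 1, 1], [200, 80, 12], size=(500, 3))
# Toy ground truth: revenue grows with budget and ads, with a summer seasonality bump.
y = (2.1 * X[:, 0] + 3.4 * X[:, 1]
     + 25 * np.isin(X[:, 2].astype(int), [6, 7])
     + rng.normal(0, 15, 500))

model = MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=3000, random_state=0)
model.fit(X, y)

# Forecast for a planned production: $90M budget, $40M ad spend, July release.
print(model.predict([[90, 40, 7]]))
```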

At the same time, knowledge-based systems (KBS) used in decision support processes are being tested for their ability to support creativity in content creation (Botega, Silva, and Murphy 2018). A KBS is an artificial intelligence (AI) approach capable of imitating human decision-making skills, providing empirical knowledge in a more accessible, reliable and permanent structure (Giarratano and Riley 2005). Such a system can help creative teams select the creative tools and techniques best suited to a specific task in the production process (Botega, Silva, and Murphy 2018), and thus improve and accelerate production work.
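A knowledge-based system of this kind can be pictured, in a deliberately reduced form, as a set of explicit rules mapping task attributes to suggested techniques. The sketch below is hypothetical; the rules and technique names are not drawn from Botega, Silva, and Murphy (2018).

```python
# Illustrative sketch only: a tiny rule-based (knowledge-based) recommender that
# maps task attributes to creativity-support techniques.
TASK = {"phase": "pre-production", "team_size": 6, "goal": "idea_generation"}

RULES = [
    (lambda t: t["goal"] == "idea_generation" and t["team_size"] >= 5, "brainwriting 6-3-5"),
    (lambda t: t["goal"] == "idea_generation" and t["team_size"] < 5, "mind mapping"),
    (lambda t: t["goal"] == "problem_solving", "TRIZ contradiction matrix"),
    (lambda t: t["phase"] == "post-production", "structured walkthrough"),
]

def recommend(task):
    """Fire every rule whose condition holds and collect its suggested technique."""
    return [technique for condition, technique in RULES if condition(task)]

print(recommend(TASK))  # ['brainwriting 6-3-5']
```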

However, it is necessary to stress that the above list does not in any way exhaust the tools and solutions used, aiming only to identify the most common practices and emerging trends. Furthermore, it should be emphasized that the purpose of this article is not to provide a technological or programming analysis of applied solutions, but rather to describe their role in shaping a new type of production process.

Mass-personalization of content

For decades, the experience of media was quite uniform – the content presented on TV, in the press or in cinemas in a given location (city or country) was the same, its amount was relatively limited, and it could be independently navigated and selected by the individual recipient (Kress 2003). With the development of the internet, navigation through content-rich repositories began to overwhelm audiences (Napoli 2011), while at the same time providing easy access to content that had hitherto been difficult and expensive to access, such as news in multiple foreign languages. This brought new limitations to accessibility, such as the language barrier (Napoli 2011).

The development of technology has made it possible to customize content catalogs, enabling audiences to experience and consume media more personally. Recommendation systems making use of collaborative filtering, i.e. filtering based on similarity of behavior, and content-based filtering, based on similarity of content (Tarka 2013), are intended to describe and present an offer that meets the interests of a given recipient (Brusilovsky, Kobsa, and Nejdl 2007).
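The two filtering families can be illustrated schematically as follows; the toy interaction matrix and item features are invented, and production systems are of course far more elaborate.

```python
# Schematic sketch of collaborative vs. content-based filtering on toy data.
import numpy as np

# Rows = users, columns = items; 1 = watched/liked, 0 = not seen.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
])
# Item features (e.g. genre flags: [sport, drama, comedy]).
item_features = np.array([
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
])

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Collaborative filtering: score items for user 0 via similarity of behaviour.
target = 0
others = [u for u in range(3) if u != target]
user_sims = [cosine(interactions[target], interactions[u]) for u in others]
cf_scores = sum(s * interactions[u] for s, u in zip(user_sims, others))

# Content-based filtering: build a profile from liked items, score by content similarity.
profile = interactions[target] @ item_features
cb_scores = np.array([cosine(profile, f) for f in item_features])

print("collaborative:", np.round(cf_scores, 2))
print("content-based:", np.round(cb_scores, 2))
```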

However, it is worth taking a broader view of personalization, seeing it as the act of creating unique solutions tailored to the unique needs of individual consumers (Bardakci and Whitelock 2003) or even providing every consumer with exactly what they want, in the place and at the time they expect it (Hart 1995).

From the perspective of the content provider, personalization can take various forms. On the one hand, it may manifest itself in offerings consisting of one-of-a-kind products tailor-made for individual customers. On the other, it can also boil down to offering practically identical products to different customers who have revealed identical needs (Borusiak et al. 2015). Such delivery of seemingly personalized content to clustered groups of recipients can be called mass personalization or mass customization (Bardakci and Whitelock 2003; Hart 1995).

Computerization occupies an important place in the context of content personalization. First of all, it opens up new possibilities for automating personalization and adjusting content through contextualization (Chen et al. 2019). While the demographic data and indicators describing an audience are usually stable, the needs and attitudes of viewers change from time to time, and interests shift daily. Thus, the context of content consumption, such as time of day, location or current events, can affect how the audience processes the content (Ghose, Goldfarb, and Han 2013). Creating content reactively, using the current context, through programmatic creative platforms (PCP) allows content to correspond to viewers' current interests, even in real time (Chen et al. 2019), potentially increasing their satisfaction and pleasure.
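A minimal sketch of such context-driven selection is shown below: a creative variant is chosen from a hypothetical catalog on the basis of time of day and viewing situation, which is the basic mechanic a programmatic creative platform automates at scale.

```python
# Sketch of context-driven variant selection; the context keys and variant
# descriptions are invented for illustration.
from datetime import datetime

VARIANTS = {
    ("morning", "commuting"): "15s vertical clip, captions on, no sound required",
    ("evening", "at_home"):   "60s horizontal clip, full audio mix",
    ("any", "live_event"):    "real-time highlight overlay referencing the event",
}

def pick_variant(now: datetime, situation: str) -> str:
    daypart = "morning" if 5 <= now.hour < 12 else "evening"
    # Most specific match first, then fall back to time-agnostic rules.
    return VARIANTS.get((daypart, situation)) or VARIANTS.get(("any", situation), "default edit")

print(pick_variant(datetime(2021, 6, 1, 8, 30), "commuting"))
```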

Advanced computerization also makes media projects scalable and, combined with far-reaching personalization and contextualization, becomes an indispensable element of the new production process. Matching content to a wide range of audiences requires the creation of a huge amount of diverse content and the use of many different creative designs. With the use of artificial intelligence technology, PCPs can increase the number of delivered designs in an automated fashion and on a large scale (Chen et al. 2019). And although these solutions are still mainly used in the creation of advertising content, work is being carried out on methods of getting to know the interests of user groups (through the analysis of multimedia data shared on social media) and adapting the semantic/emotional content of film trailers and TV programs to the interests of specific communities (Smith et al. 2017).

Method

This qualitatively oriented study is exploratory in nature. Expert interviews were conducted with 12 representatives of the media industry who are either directly involved in the development of tools integrating AI into production processes (n = 5) or are implementing these solutions and using them in practice in the production process (producers of audio-visual products using AI-supported production workflows, n = 7). It has been shown that well-selected experts provide a sound knowledge base and a good starting point for a comprehensive and targeted synthesis of the most frequent and pressing problems in different sectors (Liebold and Trinczek 2009). Additionally, and particularly importantly in the context of this study, expert interviews allow for the early detection of emerging trends (Dorussen, Lenz, and Blavoukos 2005). The in-depth process knowledge of the experts and the privileged observation point they occupy due to their professional roles make their observations potentially valuable (Liebold and Trinczek 2009). Therefore, expert interviews were deemed the most suitable method for investigating this rapidly changing phenomenon.

As the validity of the information collected with the help of expert interviews depends significantly on the quality of the interviewees (Dorussen, Lenz, and Blavoukos 2005), much attention was paid to sampling. Convenience sampling was used to identify the best candidates for interviews, based on their reputation and professional achievements regarding the use of AI and ML in production processes. In addition, a snowball method of selecting interviewees was used, in which some of them suggested further experts who met the criteria. All participants represented a high level of seniority in their positions and were employed in capacities such as product director, development director, research and development director, senior producer, vice president for production, head of production and executive producer.

The interviews were analyzed using grounded theory (Glaser and Strauss 1967), which is highly applicable to media production studies (Rodon and Pastor-Collado 2007). This method of analysis focuses on process-based descriptions of implementation within a specific context (Myers 1997), which here is audio-visual production.

In the data analysis process, three procedures were used to inductively generate data-based categories. These were: open coding, axial coding and selective coding, during which the principle of a continuous interplay between data collection and data analysis was applied (Rodon and Pastor-Collado 2007).

At the same time, as this study focuses on the “process”, which can be understood as “a series of evolving sequences of action/interaction that occur over time and space, changing or sometimes remaining the same in response to the situation or context” (Strauss and Corbin 1990, 165), a simplified version of the paradigm model developed by Strauss and Corbin (1990) was used. This allowed the phenomenon to be studied by clearly defining conditions, actions/interactions and consequences (Strauss and Corbin 1990) in a process.

Results

AI-Supported audio-visual production process

Although some creative tasks in the media production process have long been replaced or supported by machines (for example, in editing work), the use of artificial intelligence and machine learning significantly affects the shape of the production process. This hybrid process of human and machine work has taken on a new dimension, allowing not only the automation of an increasing number of tasks but also the continuous use of consumer data, including behaviors, preferences and ways of interacting with content.

The process, which traditionally consisted of three primary stages – pre-production, production and post-production – along with preliminary work, such as conceptual work and script development, and a whole chain of distribution activities, has now been widened. The AI-supported production process, as shown in Figure 2, has been expanded by a "consumer data collection, analysis and application" stage, which is embedded in all other stages of audio-visual production, including both the preliminary work and the distribution/publication phase.

Figure 2. AI-supported audiovisual production process.

The willingness to make broad use of consumer data in the production process, while at the same time maintaining high process efficiency, has had further consequences for production practices, as making effective use of the collected data has proven challenging in practice.

Delivering audience insights

The digitization of the media world and the possibility of collecting information on consumer preferences and habits have significantly influenced approaches to the media production process. With the help of natural language processing (NLP) and deep learning technologies, creators can continuously receive feedback to support decision-making in the creative process. Predictive analytics creates opportunities to assess what content will cause specific audience reactions. Thus, it provides tools that not only bring efficiencies in marketing (for example, when creating film trailers) but also reduce the risks associated with production (I7):

Entertainment is a risky business. Even if we produce only small video clips from the red carpet, each such production still takes up man-hours and equipment, therefore it generates costs. Using analytics to predict what is currently ‘hot’ helps us reduce these costs. Not only can we STOP producing things that nobody wants to see, but most importantly, we now know enough to do so before we start the production; and we can also observe trends. At the moment, we don’t need a large team of researchers and assistants to know who is suddenly getting more buzz. We get to know who is getting clicks and what sentiment they evoke in the audience. We can follow current or immediate future trends, without wasting resources.

At the same time, attempts are made to use behavioral and psychological data: "By determining the affective and cognitive states of our audience, we can steer the pleasure they get from interacting with the content" (I2). This solution could potentially enable advanced profiling and modeling of audience groups, taking into account their capabilities and needs, as well as the design of content appropriate to these audiences. Thus, it would be possible to personalize content to a significant degree, taking into account new, hitherto elusive aspects of profiling; but, as our interviewee notes, "we are still far away from being able to collect such data in the real world, outside of laboratory and experimental work, and to use it in practice" (I2).

Supporting pre-production work

Although the utilization of artificial intelligence at the pre-production stage mainly serves to optimize processes and thus helps save time and resources, it can potentially affect the shape of the content itself. In addition, some solutions can be applied in the creation of personalized content.

The development of tools supporting this production stage is focused on increasing the effectiveness of the implementation team across many levels and departments of the production. With the help of natural language processing, machines can assist in the planning and budgeting of production work by simplifying and partly automating the tasks that take a script from the conceptual stage to implementation. Computerized tasks can include the automated extraction of portions of a script/shooting script into different categories, such as a list of characters, the lines of each character, the type of scene, the props needed, or the visual or audio effects required, which can significantly speed up preparation for shooting. Decision-support systems are also being developed to help estimate scene length and the number of shots.
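As a rough illustration of this kind of script breakdown, the sketch below relies on screenplay conventions (INT./EXT. scene headings, uppercase character cues) and plain pattern matching. The interviewed developers did not disclose their implementations; real tools rely on more robust NLP, and the sample script is invented.

```python
# Minimal rule-based sketch of automated script breakdown.
import re
from collections import defaultdict

SCRIPT = """\
INT. NEWSROOM - NIGHT
ANNA types frantically; a phone rings.
ANNA
We go live in five minutes.
EXT. STADIUM - DAY
The crowd roars as the team bus arrives.
"""

breakdown = defaultdict(list)
current_scene = None
for line in SCRIPT.splitlines():
    if re.match(r"^(INT|EXT)\.", line):                # scene heading (type + location)
        current_scene = line
        breakdown["scenes"].append(line)
    elif re.fullmatch(r"[A-Z][A-Z ]+", line.strip()) and current_scene:
        breakdown["characters"].append(line.strip())   # uppercase cue line -> character
    elif line.strip():
        breakdown["action_or_dialogue"].append((current_scene, line.strip()))

print(dict(breakdown))
```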

Solutions supporting creative choices are also being introduced, for example in storyboard development. Dedicated programs, using a text-to-image classifier, can estimate the number of shots based on the script and assign the type of shot required (such as a close-up or full shot), the type of camera movement and the equipment to be used. Based on such data, the system searches its archives and selects prefabricated images to match the expected shots, creating a sequence of images (in this case a collection of photographs) to generate a storyboard. However, these tools come with a number of restrictions resulting from the use of a pre-defined database of photographs: at this stage of development, they are unable to reflect the visual character of the presented objects and provide only a visual list of shots.

These solutions, although not applied directly to the creation of personalized content, use similar logic and mechanics – they are based on recognizing and grouping elements, as well as learning which sequence, when compiled, will be the most satisfactory for the user (initially a member of the production team, and further on, the end-user). This approach opens up a wide range of possibilities for taking audience preferences into account, for example by including items of specific brands or with specific parameters in prop lists or storyboards.

Searching, selecting and compiling footage

As already mentioned, many tasks in the production process are repetitive, time-consuming and do not require creative input from the creator; therefore, one of AI's primary tasks is to optimize the process of reviewing and managing footage and multimedia resources. Thanks to automated content analysis in the form of image, text and voice processing, these materials can be enriched through automatic tagging, theme extraction and keyword search.

In the next step, this content can be recommended to creators smartly and contextually, and can be grouped thematically and by subject around given or recurring themes. This enables, for example, the automatic selection of shots featuring a particular actor, made in a given location or containing dialogue assigned to a scene. However, as one of the interviewees notes (I8), it is crucial that the systems focus on intelligent information management and not exclusively on automatic content processing. Otherwise, "systems will create thematic catalogs with the same enormous amount of content, which humans will have to physically go through anyway in order to assess their value and usefulness" (I8). The machines' ability not only to catalog but also to analyze and group materials, in order to provide real support for decision-making based on processed information about the analyzed content, is vital in this context.
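Conceptually, once footage has been enriched in this way, selection becomes a query over metadata rather than a manual review. The sketch below illustrates this with an invented, minimal clip archive; the metadata values and field names are assumptions.

```python
# Sketch of selection over automatically enriched footage metadata.
from dataclasses import dataclass, field

@dataclass
class Clip:
    clip_id: str
    people: set = field(default_factory=set)
    location: str = ""
    keywords: set = field(default_factory=set)

ARCHIVE = [
    Clip("c001", {"Anna Kowalska"}, "studio", {"election", "exit poll"}),
    Clip("c002", {"David Beckham"}, "red carpet", {"premiere", "charity"}),
    Clip("c003", {"David Beckham", "Anna Kowalska"}, "studio", {"interview"}),
]

def select(archive, person=None, location=None, keyword=None):
    """Return clips matching every criterion that is provided."""
    return [
        c for c in archive
        if (person is None or person in c.people)
        and (location is None or c.location == location)
        and (keyword is None or keyword in c.keywords)
    ]

print([c.clip_id for c in select(ARCHIVE, person="David Beckham", location="studio")])
```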

In practice, automatic tagging and content selection can be utilized in many areas – from reviewing large amounts of footage from the set to providing thematic suggestions based on viewers' preferences (for example, to increase the visibility of a given subject, actor or piece of news). This increases the ability to react quickly to current events, to personalize produced content, or to automatically select content to form a compilation of shots or even a ready-made video clip.

The latter functionality is increasingly used to automatically create short videos intended as a synthesis of a larger whole. The process starts with deep learning of media habits and audience preferences regarding content, is followed by automated content analysis, and concludes with pre-programmed or fully automated creation of editing paths and final clips. This solution is gaining popularity in the production of sports highlights, which, as one of the interviewees notes, stems from the repetitiveness and predictability of the broadcast – "Sport is simple. There's the whistle, a corner kick, a penalty and a goal. Nothing fancy or complicated. You know what the viewer is looking for in a highlight. It's easy to mix a clip together and there's no need to maintain any continuity of narrative" (I6). However, this solution can also be used to create other thematic audio-visual clips, e.g. a compilation of news about a person from a specific period or about specific topics (I6, I11). This allows one to imagine a situation in which different movies can be made from the same footage on the basis of viewer-related data, taking into account the tastes and preferences of the viewers from whom these data were collected (or of groups of viewers with similar preferences) (I11).
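A highlight assembler of this kind can be reduced to ranking detected events and filling a time budget, as in the illustrative sketch below. The event timings, weights and cut commands are assumptions, not a description of any interviewee's system.

```python
# Schematic highlight assembler: pre-labelled events stand in for an automated
# detector; the output is a chronological cut list rendered as ffmpeg commands.
EVENTS = [  # (start_s, end_s, type) produced upstream by content analysis
    (312, 340, "goal"), (705, 712, "penalty"), (1820, 1846, "goal"), (2400, 2406, "corner"),
]
WEIGHTS = {"goal": 1.0, "penalty": 0.8, "corner": 0.2}
TARGET_SECONDS = 60

def build_highlight(events, budget=TARGET_SECONDS):
    ranked = sorted(events, key=lambda e: WEIGHTS.get(e[2], 0), reverse=True)
    cut_list, used = [], 0
    for start, end, kind in ranked:
        if used + (end - start) <= budget:
            cut_list.append((start, end, kind))
            used += end - start
    return sorted(cut_list)  # restore chronological order for the final edit

for start, end, kind in build_highlight(EVENTS):
    print(f"ffmpeg -i match.mp4 -ss {start} -to {end} -c copy {kind}_{start}.mp4")
```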

Personalization

When discussing the computerized selection of shots, topics or characters, and the automatic creation of highlights and compilations, it is impossible to omit the personalization potential these facilities offer. Machines try to discover and understand the audience's preferences using deep learning and predictive analytics performed on data collected from various sources (mainly content distribution platforms). Based on the observed interactions with content and on in-depth semantic and sentiment analysis in social networks, the systems are meant to translate this information into content proposals tailored to these preferences and to predict the reactions that upcoming content may trigger. These systems may also be used to create content tailored to expectations in an automated fashion: by using algorithms to design content the viewer may find desirable; by synthesizing audio-visual works to create new content; by compiling these works according to a specific key; or by enriching them with information the viewer finds attractive, for example statistical data shown during a match, information about the characters in a film, or extended and supplementary historical information appearing in the content. This technology enables the creation of content that is not only tailored to the preferences of a particular recipient (or group of recipients) but also provides the ability to directly personalize messaging (for example, by addressing the consumer by name when producing a promotional film).
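In schematic terms, such personalization pairs the same base material with different enrichments depending on an inferred viewer profile. The sketch below is purely illustrative; the profile fields, overlay sources and identifiers are invented.

```python
# Sketch of preference-driven enrichment: one base clip, different overlays and
# a direct address, selected per viewer profile inferred upstream.
VIEWER_PROFILES = {
    "u42": {"name": "Maria", "interests": ["statistics", "tactics"]},
    "u77": {"name": "Tom", "interests": ["celebrities"]},
}

OVERLAYS = {
    "statistics": "possession 62% - 38%, xG 2.1 - 0.7",
    "tactics": "formation switch to 3-5-2 in the 60th minute",
    "celebrities": "spotted in the stands tonight",
}

def personalize(base_clip: str, user_id: str) -> dict:
    profile = VIEWER_PROFILES[user_id]
    return {
        "clip": base_clip,
        "greeting": f"Hi {profile['name']}, here is your recap",  # direct address
        "overlays": [OVERLAYS[i] for i in profile["interests"] if i in OVERLAYS],
    }

print(personalize("match_recap.mp4", "u42"))
```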

Versioning

Another form of content adaptation within the process is versioning, which is used to create large amounts of content with the highest possible level of matching. Digital media are characterized, among other things, by the multiplicity of platforms on which consumers consume content, and these platforms differ in terms of the technical requirements for the content they publish, but often also in terms of language, expression and content characteristics. Examples include popular social networking platforms: the popularization of square-shaped images by Instagram, the way language is used on Twitter, or the specific nature of TikTok, to which users and content creators are informally obliged to adapt. Due to formal and informal requirements, as well as the desire to appear authentic, content creators use computational solutions to create simple workflows that allow content to be tailored to multiple platforms, language versions, styles or resolutions (as noted by 9 of the interviewees). Thus, the recipient can receive, from the same provider, content conveying the same message but shaped in different ways on multiple platforms. It also offers potential for the development of transmedia (or, in this case, 'transplatform') content.
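A simple versioning workflow can be expressed as a per-platform delivery specification driving automated renders, as in the sketch below. The width, height and duration values are indicative assumptions, not official platform requirements.

```python
# Sketch of automated versioning: one master file, a per-platform spec, and
# generated ffmpeg commands (scale/crop to the target frame, cap the duration).
PLATFORM_SPECS = {
    "instagram_feed": {"width": 1080, "height": 1080, "max_seconds": 60},
    "tiktok":         {"width": 1080, "height": 1920, "max_seconds": 180},
    "youtube":        {"width": 1920, "height": 1080, "max_seconds": None},
}

def version_commands(master="master.mov"):
    cmds = []
    for platform, spec in PLATFORM_SPECS.items():
        vf = (f"scale={spec['width']}:{spec['height']}:force_original_aspect_ratio=increase,"
              f"crop={spec['width']}:{spec['height']}")
        cmd = ["ffmpeg", "-i", master, "-vf", vf]
        if spec["max_seconds"]:
            cmd += ["-t", str(spec["max_seconds"])]
        cmd.append(f"{platform}.mp4")
        cmds.append(" ".join(cmd))
    return cmds

print("\n".join(version_commands()))
```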

Localization

The versioning process also allows multiple language variants of the same content to be created. Localization is an important strand of AI/ML research, developed in the context of many sectors, and the media industry is not alone in this respect. With the help of speech recognition, speech synthesis and natural language processing, the automated (or partially automated) translation, creation, application and synchronization of subtitles is possible.
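The subtitle side of this pipeline can be sketched as speech-recognition segments passed through a translation step and assembled into a standard SRT file. In the example below, the ASR output and the `translate` callable stand in for whichever speech recognition and machine translation services a production actually uses; only the SRT assembly is concrete.

```python
# Minimal sketch of automated subtitle localization: timed ASR segments are
# translated and written out in SRT format.
from datetime import timedelta

def srt_timestamp(seconds: float) -> str:
    td = timedelta(seconds=seconds)
    h, rem = divmod(td.seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h + td.days * 24:02d}:{m:02d}:{s:02d},{int(td.microseconds / 1000):03d}"

def build_srt(segments, translate):
    """segments: [(start_s, end_s, text)] from ASR; translate: text -> target language."""
    lines = []
    for i, (start, end, text) in enumerate(segments, 1):
        lines += [str(i), f"{srt_timestamp(start)} --> {srt_timestamp(end)}", translate(text), ""]
    return "\n".join(lines)

# Hypothetical ASR output and a stand-in "translator".
asr_segments = [(0.0, 2.4, "Welcome back to the studio."), (2.6, 5.1, "Here are tonight's headlines.")]
print(build_srt(asr_segments, translate=lambda t: f"[PL] {t}"))
```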

At the same time, text-to-speech solutions have been developed which, with the help of speech synthesizers, can provide voice-over for audio-visual content; but, as experts emphasize (I1, I7, I10), this solution is far from capable of imitating a fully authentic human voice, especially when it comes to reproducing the emotions and hidden intentions of the speaker.

At the same time, much more advanced solutions, such as AI dubbing, are being developed. This technology uses 3D digital models of the face of an actor or animated character who is to speak their lines in different languages. The algorithm learns how the lips of people speaking the scripted words in the target languages move. The layout of these movements is then transferred to the created model, thanks to which the viewer is able to see the actor speaking a language he or she does not know. This solution was used in the Malaria Must Die (malariamustdie.com) campaign, in which David Beckham "speaks" nine languages (I4).

This technology also applies to advanced content personalization. It allows data points to be fed into content and keywords or phrases in the movie to be replaced. Thus, the film's protagonist can address the viewer by name or make a joke contextually referring to previously consumed content (I4, I9, I12).

The development of this AI application can be seen as problematic. First of all, ‘this is exactly the same “trick” that is used in deepfakes to sow disinformation. There is no chance that the untrained eye will catch a well done fake’ (I12). Doubts also arise in terms of execution: ‘it’s all very beautiful and smooth when we can shoot in the studio, with a well and evenly lit actor standing right in front of the camera. However, in uncontrolled conditions, when the actor turns around, for example, or something shades part of his face, it is easy to see past the facade, or the machine is not able to provide a solution at all (…) so, yes, I think it is ideal for animated and instructional films, but will we see Bond in 20 languages anytime soon? I highly doubt it’ (I8).

Feedback collection and application

Once the content is placed on its target digital platforms, real-time monitoring of content consumption across multiple platforms and devices begins, along with the collection of data on user preferences and behaviors, as mentioned above. With the help of machine learning techniques, feedback is extracted and combined in such a way that its main features can be defined (Le Thi, Le, and Dinh 2015). The information obtained can then not only be compared with the original expectations regarding results, but can also be used to (possibly) automate content editing, so that the content can be improved and put back on the platform, and as a database component that will form the basis for creating new content. Thus, the technology feeds many stages of the production of audio-visual material and results in the cyclical nature of production.
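The feedback stage can be pictured as a reduction of raw consumption events into a few per-asset features that are compared with expectations and used to flag content for re-editing. The sketch below is illustrative; the event fields and the completion threshold are assumptions.

```python
# Sketch of the feedback stage: consumption events -> per-asset features ->
# assets flagged for possible re-editing.
from collections import defaultdict

EVENTS = [  # streamed from distribution platforms
    {"asset": "ep01", "watched_s": 310, "duration_s": 600, "liked": True},
    {"asset": "ep01", "watched_s": 95,  "duration_s": 600, "liked": False},
    {"asset": "ep02", "watched_s": 580, "duration_s": 600, "liked": True},
]

def aggregate(events):
    stats = defaultdict(lambda: {"plays": 0, "completion": 0.0, "likes": 0})
    for e in events:
        s = stats[e["asset"]]
        s["plays"] += 1
        s["completion"] += e["watched_s"] / e["duration_s"]
        s["likes"] += e["liked"]
    # Average the completion ratio per asset.
    return {a: {**s, "completion": round(s["completion"] / s["plays"], 2)} for a, s in stats.items()}

def flag_for_reedit(stats, min_completion=0.5):
    return [asset for asset, s in stats.items() if s["completion"] < min_completion]

summary = aggregate(EVENTS)
print(summary, flag_for_reedit(summary))
```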

Discussion

The introduction of artificial intelligence and machine learning into the production process has contributed to the automation of selected tasks and a stronger hybridization of man and machine in the production process. By streamlining the work of the production team, the process becomes significantly more efficient and allows production time to be shortened and/or the amount of produced content to be increased. This brings creators closer to the goal of high productivity described by both Fuschi and Badii (2013) and Žnidaršič and Jereb (2011). This conclusion also coincides with the results of previous research on the use of AI in advertising (Huang 2015; Qin and Jiang 2019).

Above all, however, the use of AI in audio-visual production has caused a disruptive change in the very structure of the production process. The use of a repeated, or even continuous, evaluation of content performance and analysis of audience feedback has changed the nature of the process from (relatively) linear to cyclical. Consumer evaluation and feedback have become an integral part of the production process, and this must directly influence the shape of content and how that content is experienced by the audience (see Gans 1979; Gitlin 1980).

Of course, in the industry context, it should be noted that each step of the same process will differ depending on individual production characteristics and the context in which it is used. On the meta-level, however, it can be concluded that the AI-supported production process has expanded beyond the traditional model with a new phase – consumer evaluation and feedback – which involves the collection, analysis and application of data obtained from audience feedback and the ways audiences interact with the content. Thus, the traditionally three-phase process (consisting of pre-production, production and post-production) now consists of four phases. The process has simultaneously lost its linear structure, and insight acquisition and application take place throughout its entirety.

The AI-supported production process potentially offers producers advanced support in reducing production risks by providing them with a better understanding of customers' needs and the ways they interact with content. At the same time, it offers the promise of far-reaching content personalization, and thus of a new type of media experience, corresponding not only to viewers' interests and needs but also to their cognitive and affective capabilities. It is worth noting, however, that research work on advanced content personalization should be considered not so much advanced as experimental (Smith et al. 2017). This concerns both the study of the cognitive and affective states of viewers in their daily media consumption routines (Smith et al. 2017) and the narrative level of content, which is constrained by still-experimental computational narrative intelligence (Riedl 2016).

In addition to the broad scope of potential applications, the intent to use consumer data in the production process while maintaining high process efficiency brings with it several challenges. The high efficiency of the process depends on the skillful use of the solutions at each individual stage, which should be reflected in a smooth and stable configuration of the whole process flow (I6). As technology becomes more central to many elements of the process, and large-scale data processing and metadata management become an integral part of audio-visual production, technological reliability and ease of process management are becoming crucial for a smooth production process. All components – hardware and software alike – contributing to the media production workflow should be easy to integrate into a single digital infrastructure (Van Rijsselbergen et al. 2010) and be as intuitive and straightforward to use as possible (I2, I9). As Van Rijsselbergen and his research team (2010) point out, to achieve this successfully, metadata for both production and workflow should be correctly collected and standardized in a single data model and stored in a central system accessible to a wide range of participants in the production process. Thus, the production process must become more inclusive for production team members.
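The kind of single data model Van Rijsselbergen and colleagues call for can be sketched as one shared record type that production and workflow tools exchange in a serializable form; the field names below are illustrative assumptions, not those of any existing standard.

```python
# Sketch of a central metadata record shared across the production workflow,
# exchanged as JSON so heterogeneous tools can read and write the same model.
from dataclasses import dataclass, asdict, field
import json

@dataclass
class AssetRecord:
    asset_id: str
    phase: str                      # pre-production / production / post-production / feedback
    source: str                     # which tool or device produced it
    content_tags: list = field(default_factory=list)   # descriptive metadata
    workflow_status: str = "new"                        # workflow metadata
    audience_metrics: dict = field(default_factory=dict)

record = AssetRecord(
    asset_id="A-2021-0042",
    phase="post-production",
    source="camera_B/ingest",
    content_tags=["interview", "studio"],
    workflow_status="awaiting_review",
    audience_metrics={"completion": 0.34},
)

# Any component in the pipeline can consume the same serialized record.
print(json.dumps(asdict(record), indent=2))
```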

Conclusion

This article aimed to provide insight into changes in the internal audio-visual production process caused by the use of artificial intelligence and machine learning. It showed that increasing computerization changes the nature of the process, which becomes less linear and more cyclical. The study shows how the application of AI at different stages of production opens the way for progressive personalization of content, built on consumer insights gained in the digital environment, while increasing the efficiency and scalability of individual tasks. The article thus provides a conceptual framework that can be used for further reflection on the changing media production landscape.

In terms of future research, it is worth looking at three aspects directly related to the new audio-visual production process. First of all, it is crucial to investigate how the new process affects the structure and dynamics of the production team – what new roles have appeared, and how the relationships between team members, their workflows and their tasks have changed. It is also interesting to understand how the nature of the new process affects the content and which tools and phases of the process have a defining influence on its final shape.

It is also necessary to look at the relation between the viewer and content. It is vital to understand not only the ways in which audiences interact with content, or how they perceive new content, but above all how they perceive and execute their own role as co-creators (as they support the process with data). It would also be interesting to explore these new relationships in the context of consumer data usage in the face of the European Union's General Data Protection Regulation.

In addition, it is necessary to follow further possible changes in the process, which, together with the development of algorithmic tools, may evolve and influence the workflow hybridization procedure, or verify the role of machines in the process.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by the Netherlands Organisation for Scientific Research under Grant KI.18.044.

Notes on contributors

Izabela Derda

Izabela Derda, Ph.D., is a researcher and lecturer at the Erasmus School of History, Culture and Communication in Rotterdam, Netherlands. Her research interests lie in the investigation of media as a practice, including networks, relationships and dynamics between the involved stakeholders, experience (co-)design and media production practices.

References

  • Badii, A., A. Khan, A. Adetoye, and D. Fuschi. 2008. “Personalised Digital Media Adaptation and Delivery to Mobile Devices.” In Proceedings Third International Workshop on Mobile and Networking Technologies for Social Applications, Monterrey, Mexico.
  • Bailer, W., and P. Schallauer. 2008. “Metadata in the Audiovisual Media Production Process.” In Multimedia Semantics — The Role of Metadata. Studies in Computational Intelligence. Vol. 101, 65–84, edited by M. Granitzer, M. Lux and M. Spaniol. Berlin, Heidelberg: Springer. doi:10.1007/978-3-540-77473-0_4.
  • Bardakci, Ahmet, and Jeryl Whitelock. 2003. “Mass-Customisation in Marketing: The Consumer Perspective.” Journal of Consumer Marketing 20 (5): 463–479. doi:10.1108/07363760310489689.
  • Bertoncelli, T., O. Mayer, and M. Lynass. 2016. “Creativity, Learning Techniques and TRIZ.” Procedia CIRP 39: 191–196. doi:10.1016/j.procir.2016.01.187.
  • Bodó, B. 2019. “Selling News to Audiences – A Qualitative Inquiry into the Emerging Logics of Algorithmic News Personalization in European Quality News Media.” Digital Journalism 7 (8): 1054–1075. doi:10.1080/21670811.2019.1624185.
  • Borusiak, B., B. Pierański, R. Romanowski, and S. Strykowski. 2015. “Automatyzacja personalizacji reklamy internetowej.” Marketing i Rynek 3: 36–43.
  • Botega, L. F., and J. C. Silva. 2015. “Knowledge-Based System for Categorization and Selection of Creativity Support Techniques.” International Journal of Knowledge Engineering and Management 4: 10.
  • Botega, L. F., J. C. Silva, and G. R. Murphy. 2018. “Usability Study on the Interface of an Artificial Intelligence System for Creativity Support.” Human Factors in Design 7 (13): 41–60. doi:10.5965/2316796307132018041.
  • Brusilovsky, P., A. Kobsa, and W. Nejdl. 2007. The Adaptive Web: Methods and Strategies of Web Personalization. Berlin: Springer.
  • Busch, O. 2016. The Programmatic Advertising Principle. Heidelberg: Springer. doi:10.1007/978-3-319-25023-6_1.
  • Chen, G., P. Xie, J. Dong, and T. Wang. 2019. “Understanding Programmatic Creative: The Role of AI.” Journal of Advertising 48 (4): 347–355. doi:10.1080/00913367.2019.1654421.
  • Diakopoulos, N. 2019. “Towards a Design Orientation on Algorithms and Automation in News Production.” Digital Journalism 7 (8): 1180–1184. doi:10.1080/21670811.2019.1682938.
  • Dorai, C. 2001. “Bridging the Semantic Gap in Content Management Systems: Computational Media Aesthetics.” In Proceedings of Cosign 2001, 91–99. Amsterdam, Netherlands.
  • Dorussen, H., H. Lenz, and S. Blavoukos. 2005. “Assessing the Reliability and Validity of Expert Interviews.” European Union Politics 6 (3): 315–337. doi:10.1177/1465116505054835.
  • Fuschi, D. L., and A. Badii. 2013. “Media Production Process Management Support: The Challenges Faced by Decision Makers.” In Proceedings of the European, Mediterranean and Middle Eastern Conference on Information Systems, edited by A. Ghoneim, S. AlShawi, and M. Ali, 1–8. Windsor, UK.
  • Gans, H. J. 1979. Deciding What’s News: A Study of CBS Evening News, NBC Nightly News, Newsweek, and Time. New York: Pantheon Books.
  • Ghiassi, M., D. Lio, and B. Moon. 2015. “Pre-Production Forecasting of Movie Revenues with a Dynamic Artificial Neural Network.” Expert Systems with Applications 42 (6): 3176–3193. doi:10.1016/j.eswa.2014.11.022.
  • Ghose, A., A. Goldfarb, and S. P. Han. 2013. “How is the Mobile Internet Different? Search Costs and Local Activities.” Information Systems Research 24 (3): 613–631. doi:10.1287/isre.1120.0453.
  • Giarratano, J. C., and G. Riley. 2005. Expert Systems: Principles and Programming. Boston: Thomson Course Technology.
  • Gitlin, T. 1980. The Whole World is Watching: Mass Media in the Making and Unmaking of the New Left. Berkeley: University of California Press.
  • Glaser, B. G., and A. L. Strauss. 1967. The Discovery of Grounded Theory. Strategies for Qualitative Research. Chicago, IL: Aldine.
  • Gross, L. S., J. C. Foust, and T. D. Burrows. 2005. Video Production Discipline and Techniques. 9th ed. New York: McGraw Hill.
  • Hart, C. 1995. “Mass Customization: Conceptual Underpinnings, Opportunities and Limits.” International Journal of Service Industry Management 6 (2): 36–45. doi:10.1108/09564239510084932.
  • Hansen, M., M. Roca-Sales, J. Keegan, and G. King. 2017. Artificial Intelligence: Practice and Implications for Journalism. doi:10.13140/RG.2.2.17735.39849.
  • Hesmondhalgh, D. 2010. “Media Industry Studies, Media Production Studies.” In Media and Society, edited by J. Curran, 3–23. New York: Bloomsbury Publishing.
  • Huang, J. 2015. “A Study on Programmatic Buying in the Era of Big Data.” News Research 4: 58–60.
  • International Data Corporation. 2020. Worldwide Artificial Intelligence Spending Guide. https://www.idc.com/tracker/showproductinfo.jsp?containerId=IDC_P33198
  • Jarek, K., and G. Mazurek. 2019. “Marketing and Artificial Intelligence.” Central European Business Review 8 (2): 46–55. doi:10.18267/j.cebr.213.
  • Jordanous, A. 2016. “Four PPPPerspectives on Computational Creativity in Theory and in Practice.” Connection Science 28 (2): 194–123. doi:10.1080/09540091.2016.1151860.
  • Kress, G. 2003. Literacy in the New Media Age. London: Routledge.
  • Le Thi, Hoai An, Hoai Minh Le, and Tao Pham Dinh. 2015. “Feature Selection in Machine Learning: An Exact Penalty Approach Using a Difference of Convex Function Algorithm.” Machine Learning 101 (1–3): 163–186. doi:10.1007/s10994-014-5455-y.
  • Liebold, R., and R. Trinczek. 2009. “Experteninterview.” In Handbuch Methoden Der Organisationsforschung: Quantitative Und Qualitative Methoden, edited by S. Kühl, P. Strodtholz, and A. Taffertshofer, 33–56. Wiesbaden: VS Verlag für Sozialwissenschaften.
  • Myers, M. D. 1997. “Qualitative Research in Information Systems.” MIS Quarterly 21 (2): 241–242. doi:10.2307/249422.
  • Napoli, P. M. 2011. Audience Evolution: New Technologies and the Transformation of Media Audiences. New York: Columbia University Press.
  • New European Media. 2018. AI in the Media and Creative Industries. https://nem-initiative.org/wp-content/uploads/2018/10/nem-positionpaper-aiinceativeindustry.pdf
  • Patel, V. L., and T. G. Kannampallil. 2015. “Cognitive Informatics in Biomedicine and Healthcare.” Journal of Biomedical Informatics 53: 3–14. doi:10.1016/j.jbi.2014.12.007.
  • Qin, X., and Z. Jiang. 2019. “The Impact of AI on the Advertising Process: The Chinese Experience.” Journal of Advertising 48 (4): 338–346. doi:10.1080/00913367.2019.1652122.
  • Rich, E., and K. Knight. 2004. Artificial Intelligence. New York: Tata McGraw-Hill.
  • Riedl, M. 2016. “Computational Narrative Intelligence: A Human-Centered Goal for Artificial Intelligence.” In CHI 2016 Human Centred Machine Learning Workshop, Atlanta: Georgia Institute of Technology.
  • Rodon, J., and J. Pastor-Collado. 2007. “Applying Grounded Theory to Study the Implementation of an Inter-Organizational Information System.” The Electronic Journal of Business Research Methods 5: 71–82.
  • Silva, J. C. 1999. “Concurrent Engineering Perspective of Maintenance Aspects through an Expert System Prototype.” In AAAI’s Spring Symposium Series. Vol. 4.
  • Smith, J., D. Joshi, B. Huet, W. Hsu, and J. Cota. 2017. “Harnessing A.I. for Augmenting Creativity: Application to Movie Trailer Creation.” In Proceedings of the 25th ACM International Conference on Multimedia, 1799–1808. Mountain View, CA. doi:10.1145/3123266.3127906.
  • Strauss, A., and J. M. Corbin. 1990. Basics of Qualitative Research: Grounded Theory Procedures and Techniques. Newbury Park, CA: Sage Publications.
  • Stray, J. 2019. “Making Artificial Intelligence Work for Investigative Journalism.” Digital Journalism 7 (8): 1076–1097. doi:10.1080/21670811.2019.1630289.
  • Tarka, P. 2013. “Media Społecznościowe a Metody Personalizacji i Rekomendacji Treści Reklamowych i Oferty Produktowej.” Marketing i Rynek 6: 24–28.
  • Van Rijsselbergen, D., M. Verwaest, E. Mannens, and R. Van de Walle. 2010. “How Metadata Enables Enriched File-Based Production Workflows.” SMPTE Motion Imaging Journal 119 (4): 27–38. doi:10.5594/J11380.
  • Zapata-Rivera, D., and I. Katz. 2014. “Keeping Your Audience in Mind: Applying Audience Analysis to the Design of Interactive Score Reports.” Assessment in Education Principles Policy and Practice 21: 442–463. doi:10.1080/0969594X.2014.936357.
  • Žnidaršič, J., and E. Jereb. 2011. “Innovations and Lifelong Learning in Sustainable Organization.” Organizacija 44 (6): 185–194. doi:10.2478/v10051-011-0020-y.