Editorial

Being informed by artificial intelligence

For many years, we could draw the trichotomy of data, information, and knowledge with reasonable precision. Data are raw facts and figures that become information when massaged and placed in the right context. The value-adding activities that turn data into information are largely the domain of information systems. Knowledge, however, adds experience and expertise to information and often resides in tacit form in people’s heads. So the Knowledge Management Systems (KMS) that gained popularity in the 1990s were intended to capture knowledge (largely tacit) and put it into a system that could benefit others in the organization. KM processes include externalization (taking tacit knowledge and representing it in a KMS) and internalization (making this knowledge accessible to people who might need it). For example, a consulting firm might have a team that concluded a multi-year project in Malaysia – and their experiences, successes, failures, precautions, and guidance, captured in a KMS, could be invaluable for other teams initiating projects in that region. KM practices included creating the right incentives for knowledge to flow between people and the system, as well as embedding knowledge into the products and services offered.

Today’s Large Language Models (LLMs) that drive generative AI (gAI) are confounding this neat evolutionary path from data to information to knowledge. While human knowledge is largely built through experience with the world, AI-based knowledge relies on patterns in data. gAI, in some sense, holds a representation of a massive corpus of words (in LLMs) from which it establishes predictive relationships. Given that human knowledge, in its various manifestations, trains and builds AI-based knowledge, the two forms are closely coupled and will likely advance in tandem. What is surprising, however, is how these predictive language models can deliver useful knowledge given the right input cues. It is almost as if the massive corpus of training data fed into these foundation models layers complex neural structures that yield valuable (and even creative) outputs of the kind typically offered by humans with skills or expertise. Even the designers of these models cannot fully explain their capacity to surprise us with useful knowledge. This raises the question of whether we really need information to mediate between data and knowledge at all. Is data itself knowledge? The argument implicit in this view is that knowledge has largely been a correspondence between our observations of the world and their interpretation in our brains, in the form of models we use to make sense of those observations. Now, with the plethora of data, we do not need to understand the world to know it; all knowledge can be extracted from the data. In this view, knowledge is not internalized through human assessment but externalized through data.

This data-is-knowledge view is being embraced by many companies as they jump on the gAI bandwagon. These companies invest intensively in data and analytical capabilities and may even adopt a data-and-AI culture that comes at the cost of human interpretation and judgment. The view is reinforced by many successful practical applications of AI, from automated cars to reading radiology images. If an AI fed massive training data can achieve 99% diagnostic accuracy in reading radiology images, isn’t predictive accuracy all we need once it exceeds the accuracy humans achieve from experience? It is true that a radiologist can explain their diagnosis while the AI can only observe patterns in data – but why do we need an explanation when we can predict well? After all, accurate prediction is what allows us to function well in the world.

I would argue that the data-is-knowledge view is dangerous. I present three reasons below, though there could be more. First, AI in general (from data), including gAI, is largely about prediction; knowledge is about explanation and causality. The presumption is that if data can predict accurately, there is little reason to understand why the prediction works. While this may hold for machine learning algorithms that read an MRI, most businesses deal with customers, employees, and human behaviors. Managers want a rationale behind a decision, customers want an explanation of why their loan was denied by the bank, and even patients may want the reasoning and tradeoffs behind a diagnosis. Second, the AI is only as good as its training data – which, if drawn from the Internet, could carry significant biases. Predicting the profile of effective US Senators from historical data will inevitably include “male” in the profile, since men dominate that population, reinforcing prior prejudices. Similarly, evaluations of products and services could reflect herd effects, where users give good ratings because other ratings are positive. Such effects may not be visible to the AI in the data patterns and could cause hallucinations. Third, the data-is-knowledge assumption encourages a kind of blind dependency on AI. It is like mindless scrolling on our smartphones (as many of us tend to do), responding to algorithmic cues and training the AI on our data. Without cognitive engagement with AI, human learning will be subservient to machine learning. Such engagement involves cognitive effort: formulating problems well to extract value from the AI, asking the right questions, probing through conversational interactions, and thinking critically in evaluating the outputs.

The point I make is simple. Let’s not be too quick to abandon the data-information-knowledge trichotomy and rely heavily on a direct data-to-knowledge shortcut. Bringing the information layer back into AI is not trivial. For one, it requires translating the workings of deep neural networks, often embedded in “hidden” layers, into logic that humans can interpret. As AI engines interpolate and re-interpolate data, the audit trail becomes opaque and difficult to follow. For instance, AI has accurately predicted a person’s race from X-rays, yet we have no idea how it does so. Without this understanding, the AI could use such results to undermine positive patient outcomes, given its large corpus of data tying race to health outcomes. Moreover, our incentives work against explainability: as AI grows more complex and training sets grow larger, accuracy may go up, but explainability goes down. Explainable AI is nevertheless important to engender trust in AI outputs and in automation, to facilitate adoption, and to provide a basis for recalibrating and improving the AI.

So, let’s be wary of the data-is-knowledge presumption. It is a risky foundation on which to build AI, and it is easy to get carried away in its reinforcing cycle of synthetic data training and ubiquitous automation. The AI can run away from us. The information layer is needed on both sides of AI: to frame the right questions and access the right data sources going in, and to interpret the veracity and viability of the results for decision-making coming out. Some might call this keeping the “human in the loop,” or simply keeping the human edge over the machine.

Disclosure statement

No potential conflict of interest was reported by the author.

Additional information

Notes on contributors

Varun Grover

Varun Grover is the George & Boyce Billingsley Endowed Chair and Distinguished Professor of IS at the Walton College of Business, University of Arkansas. He has published over 400 refereed journal articles in IS. Over his 30+ year career, he has consistently been ranked among the top five researchers globally, based on his publications in top journals (like MISQ, JMIS, ISR, JAIS, etc.), citations (>52,000), and h-index (of 100). Recently, Thomson Reuters recognized him as one of 100 Highly Cited Scholars globally across all business disciplines, and a Stanford University study ranked him 6th (out of 17,971 authors) in the IS discipline. He is Senior Editor for MISQ Executive and Editor of the Journal of the AIS Section on Path Breaking Research, has served as Senior Editor for MISQ (2 terms) and the JAIS (4 terms), among others, and as Associate or Advisory Editor for 13 journals (including JMIS, ISR, JSIS, ISJ). Dr. Grover’s current work focuses on the impacts of digitalization on individuals and organizations. He is the recipient of numerous awards from USC, Clemson, University of Arkansas, AIS, Academy of Management, DSI, the OR Society, Anbar, and PriceWaterhouse, among others, for his research and teaching. He is an AIS Fellow and recipient of the prestigious AIS LEO Award for lifetime achievement.
