Abstract
Trust in artificial intelligence (AI) is often discussed by both the general public and a part of academia. The discourse on trust in AI is often presented as analogous to trust in people. However, it is unclear whether the concept of trust can suitably be extended to describe relationships between humans and other entities. In this article, I will argue that the main features of trusting relationships apply whether the trustee is a human or an AI system. This view is opposed to the claim that only humans can be trusted and that technology can, at best, merely be relied on. However, it is commonly accepted that reliance has weaker implications than trust. We often rely on those whom we need or want to do something for us, regardless of their motivation to act. I will argue that motivation is relevant for trust, both in humans and in AI. Because of this, I propose trust as a suitable goal to aim for when shaping human-AI relationships.
1 Introduction
Trust is often considered the glue of a healthy society. We need others and others need us. As artificial intelligence (AI) rapidly integrates into our lives, it becomes an integral part of our social reality. Consequently, the concept of trustworthy AI has emerged as a critical consideration when assessing AI’s role in society and the implications of its implementation. Trust appears as the central topic of numerous ethics guidelines, white papers and scientific publications [1,2,3,4,5]. However, the paradigmatic cases of trust in the literature are interpersonal. This makes it unclear whether the concept of trust can suitably be extended to describe relationships between humans and artificial systems. In this article, I argue that the main features of trusting relationships apply whether the trustee is a human or an AI system.
Trusting someone means making oneself vulnerable towards that someone and willingly accepting the risk of being betrayed [6, p. 235, 7, p. 20]. To trust a trustee, it is not only relevant whether the trustee has the skills to do what they are trusted with; the motives that lead them to do so matter too [7, pp. 11–12]. What a trustor considers an appropriate motive to trust is not necessarily rational. Trustors often trust based on their intuitions, feelings and their personal history with the trustee. I will argue that, at first sight, this emotional, and potentially irrational, side of trust makes it questionable whether trust can and should be the goal for human-AI relationships.
In the context of AI, the most popular alternative to trust is reliance. Reliance on technology has been conceptualised in terms of computational reliabilism (CR) [8], rational trust [9] or a thin notion of trust [10]. The bottom line of those approaches is the same: only humans can be trusted, whereas technological artefacts can, at best, be relied on. From that view, it follows that, when using AI, the emphasis should be placed on its technical success. By technical success, I mean that the system arrives at satisfactory results most of the time in the contexts in which it is used. For example, a language model such as OpenAI’s GPT-4 is considered technically successful if it is able to provide adequate answers to mostFootnote 1 of the inputs it receives. I will argue that technical success is not enough to satisfy social demands for AI, such as avoiding bias or respecting users’ privacy. Therefore, reliance falls short of capturing the kind of relationship that would be desirable between humans and artificial systems. The novelty of AI creates a conceptual gap between this kind of system and other technological artefacts. The situations and applications of AI differ significantly from previous technologies, as AI possesses an unprecedented ability to replicate human decision-making processes more closely than ever before. Furthermore, the outcomes or predictions generated by AI systems can be unforeseeable even by their designers.
This has led some authors to discuss these systems as having (quasi-)agency [11, p. 184]. This unpredictability, a byproduct of complex algorithms and learning processes, further distinguishes AI from more traditional, predictable technological tools. Because of this, the relationships that humans can (and should) establish with AI are also different. In this paper, I will argue that such relations should be trusting ones.
To develop the ideas above, the paper is structured as follows. In Sect. 2, I introduce the concept of ‘trust in AI’ and briefly outline its use in the literature. I highlight the contrast between the latter use of the term ‘trust’ and how it is used when referring to human relationships. This contrast opens the question of whether it is legitimate to talk about trust in AI at all. In Sect. 3, I address the positions of authors who claim that AI cannot be trusted but merely relied on, taking Ryan [9] as a representative example of such positions. In Sect. 4, I respond to Ryan’s arguments and provide reasons why I disagree with his conclusion. In Sect. 5, I revisit the concept of ‘trust in AI’, focusing on the common features of cases of interpersonal trust and cases in which the posited trustee is an AI system. In Sect. 6, I propose a theoretical model of trust that spans reliance, general trust and appropriate trust. Finally, I conclude with the claim that interpersonal trust and ‘trust in AI’ are similar enough for the latter to be considered trust and to become the goal to aim towards when shaping human-AI relationships.
2 The conceptual problem of ‘trust in AI’
2.1 Trust as an interdisciplinary concept
Trusting relationships are a necessary part of everyday life. Nobody is completely self-sufficient since, at some point, everyone needs to delegate something to someone else. Such a need for delegation makes the party who trusts vulnerable towards the trusted one. The vulnerability arises from the fact that trusting relationships always imply the possibility of the trusted party not accomplishing what is expected of them [6, 7, 12]. Nevertheless, they are trusted because the party who trusts perceives the risk of the trusting relationship ending up in disappointment as low enough [13, 14].
Trust is a widely studied concept that has been approached from a variety of disciplines, such as cybernetics, psychology, sociology, game theory or philosophy.Footnote 2 Within cybernetics, trust can be understood through the lens of second-order cybernetics, which emphasizes dynamic feedback loops between trustors and trustees. This framework aligns with the transitional order systems discussed by Marlowe [15, p. 5], where entities like AI exhibit reflexivityFootnote 3 not through full agency but through their design and interaction with users. Understanding trust in AI as part of a second-order cybernetic system underscores its relational and iterative nature, positioning trust as an emergent property of the human-AI ecosystem.
In psychology, trust is often seen as “a generalized expectancy” that the words, promises, or written statements of others can be relied upon [16, p. 651]. Such an expectancy is key to developing a healthy personality and achieving social success, and it shapes the individual’s learning process, since trust determines the degree to which they believe their informants without independent evidence. Other views in psychology, influenced by philosophical accounts, characterise trust as a psychological state involving the intention to accept vulnerability based on positive expectations about another’s behaviour [17, pp. 571–72]. These views put the focus on the vulnerable state of the trustor, which is a product of the uncertainty they have regarding the trustee’s behaviour.
In sociology, Luhmann’s account stands out [18, p. 102]. According to him, trust is a mechanism that reduces complexity in social systems by allowing individuals to act despite uncertainty about others’ actions. In short, trust is essentially cooperation. This view applies when there is a small number of parties involved, but in the case of institutions (or in Luhmann's terminology, “systems”), we can only talk about confidence, but not trust. Interestingly, there is a parallel to draw between these claims and those made by authors such as Nickel [10] or Ryan [9] regarding artificial trustees, who argue that AI can only be relied on, but not trusted.
In game theory, Gambetta’s [13] account of trust as risk assessment has been highly influential. He argues that trust is a way for the trustor to achieve their goals while minimizing risk, recognising that the trustee is as free to betray the trustor as vice versa. Gambetta even suggests the possibility of parameterizing trust on a scale from 0 to 1, where 0 represents distrust and 1 represents complete trust. In this framework, a value of 0.5 would indicate a point at which the trustor is entirely uncertain about the trustee's actions. According to this theory, the trustee’s motivations are irrelevant; what matters is whether their actions meet the trustor’s expectations and how likely it is that they will continue to do so. Although Gambetta’s account is framed in the context of interpersonal trust, it offers a broader perspective that examines how trust functions across different types of relationships between trustor and trustee. This conceptualization of trust aligns with the rational expectations that human trustors often have when engaging with artificial trustees.
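To make Gambetta’s scale concrete, the following minimal sketch treats trust as a subjective probability p on the interval [0, 1] and attaches a simple decision rule to it. The payoff values and the expected-utility rule are illustrative assumptions of mine, not part of Gambetta’s account.

```python
# A minimal sketch of Gambetta-style trust as a value on the [0, 1] scale.
# The payoffs and the expected-utility rule are illustrative assumptions.

def trust_label(p: float) -> str:
    """Map a trust value on the [0, 1] scale to Gambetta's rough anchors."""
    if p < 0.5:
        return "closer to distrust"
    if p > 0.5:
        return "closer to complete trust"
    return "complete uncertainty about the trustee's actions"

def worth_cooperating(p: float, gain_if_honoured: float, loss_if_betrayed: float) -> bool:
    """Cooperate when the expected payoff of relying on the trustee is positive."""
    return p * gain_if_honoured - (1 - p) * loss_if_betrayed > 0

print(trust_label(0.5))                # complete uncertainty about the trustee's actions
print(worth_cooperating(0.8, 10, 20))  # True: high trust outweighs the potential loss
```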
Despite extensive efforts across various disciplines to define trust, no single, universally agreed-upon definition has emerged. As a result, the term is employed in diverse ways, often reflecting conflicting interpretations. In the remainder of this paper, I will understand trust as the kind of relationship that one party has with another when the former can rely on the latter and is willing to do so (if necessary).Footnote 4 The party who trusts is commonly referred to as the trustor and the trusted party as the trustee [7]. More formally:
Definition 1
Trust is the kind of relationship that a trustor has towards a trustee when the trustor believes that they could delegate something to the trustee and they would be willing to do so if necessary.
It is worth noting that delegationFootnote 5 does not always require trust. A weaker form of relationship, where one party delegates a task without necessarily believing the other will succeed, is called reliance.
Definition 2
Reliance is the dependence that one party has towards another when the first needs to delegate something to the second.
The above definitions are tentative approachesFootnote 6 to the concepts of trust and reliance that fall within the scope of philosophy. Such definitions are highly influenced by the approaches of other authors within the discipline, in which trust has been extensively explored, with interpersonal trust widely regarded as the paradigm [20]. In the next section, I will provide a brief overview of some of the most influential philosophical accounts of trust within this context, hoping to frame and properly contextualise my own definitions.
2.2 Interpersonal trust
Generally, the trustor places trust in people to whom the trustor believes they can delegate something.Footnote 7 The reasons for forming such a belief are various. In day-to-day situations, people usually trust based on their intuitions and past experiences [19, 23]. However, neither intuitions nor personal experiences can be legitimately generalised to support the conclusion that a trustee should be trusted or distrusted, objectively. For example, I can trust my neighbour to take care of my cat because they have done so successfully in the past. However, my personal experiences with my neighbour do not constitute a solid reason for anyone else to trust them. The reason is that my relationship with my neighbour is unique, and the behaviours that result from it –such as the acts of trusting and delegating– cannot be extrapolated to anyone outside that relationship. In this sense, trust is subjective: the trustor establishes a trusting relationship with the trustee based on the former’s intuitions and their subjective experience with the latter.
There is a vast body of literature that aims to characterize trust. To shed light on the nature of ‘trust in AI’, it seems reasonable to begin with the literature on interpersonal trust. Given its influence, the work of Annette Baier [6] is a good starting point. In her foundational text, Baier describes trust as a normatively loaded attitude. That is, trust comes with some implicit “discretionary responsibility” (p. 236) towards the trustor. When analysing how the trustor can predict whether the trustee will pick up on such a responsibility, Baier points to goodwill. According to her, the trustor trusts the trustee when the former relies on the latter’s goodwill (p. 251). When one person expects goodwill from another and such goodwill ends up being absent, the former can feel betrayed by the latter (pp. 234–35). This is a dynamic that can only take place among humans: it would not make sense to expect goodwill from non-human entities, nor to feel betrayed by them, since betrayal is a feeling that is associated with intention [10, pp. 236–38, 12, p. 8]. If someone fails me on purpose, I may feel betrayed, whereas if the same happens by accident, it would seem unreasonable to feel that way [12, p. 8]. On this basis, Baier considers trust an exclusively human attitude.
Building on Baier’s work, many authors have followed in her footsteps. While not everyone agrees that trust necessarily implies goodwill, most of the literature—whether through that idea or alternative arguments—agrees in characterising trust as uniquely human [7, 12, 24].Footnote 8 In this paper, I challenge that claim. However, to use the term ‘trust in AI’ in a legitimate manner, rather than as a vague or lazy analogy, it is crucial to clarify its meaning.
2.3 Trust in AI
Nowadays, the term ‘trust’ is being used not only to describe relationships among humans. Both the general public and a part of academia often talk about trust and distrust in artificial intelligence (AI). In 2019, the European Commission published ethics guidelines that put the focus on the concept of trustworthy AI [2]. The document framed trust as a necessary condition for AI’s successful implementation in society, starting from the assumption that trusting non-human entities is possible. In the years since, many other documents have shared the same assumption and arrived at similar conclusions about the role of trust in human-AI relationships [27].
The reason for the intuitive use of the term ‘trust’ to describe human-AI relationships is that AI systems are not mere objects. AI is widely considered not a mere technological artefact but a socio-technical tool [28,29,30,31]. This means that its successful implementation concerns the interaction of social and technical factors (see [32, p. 72]). In this sense, an AI system comprises not only hardware and software but also human input and data, from which the system ‘learns’. In addition, what I mean by socio-technical toolFootnote 9 is that, unlike mere tools, the use of AI systems goes beyond extending a particular user’s ability. AI systems –particularly machine learning (ML) onesFootnote 10– seemingly “master” abilities themselves, conducting apparent decision-making. By this, I do not mean that AI systems have (full) agency. In any case, some sort of quasi-agency [33] could be attributed, more as a result of how users engage with the systems than of the systems’ abilities per se. Hence, such an appearance of agency (or quasi-agency) gives rise to the intuitive tendency to anthropomorphise some AI systems and to engage with them in a way similar to how we interact with other humans [34]. One reason for this intuitive anthropomorphisation is that the user does not perceive the mechanisms through which the system mimics human skills, but only the appearance of such replicated skills. Even if the user is familiar with how AI works, they still interact with an artefact that somehow mimics human behaviours. Therefore, the natural response is to react to such behaviours as one normally does with fellow humans. This dynamic of mimicking and engaging with mimicked behaviours includes trusting relationships.
Such a perception of AI as a potential trustee comes from its apparent ability to achieve results owing to its capacity to identify patterns and derive solutions (relatively) autonomously, beyond the explicit programming by its developers. I qualify this ability as apparent because, as mentioned, humans interact with AI systems as if they were fellow humans when such systems succeed at mimicking typical human skills. Therefore, whether AI systems achieve results autonomously or not is beside the point: as long as they are perceived as autonomous, they will be perceived as potential trustees.
In this article, I focus on artificial neural networks (NNs),Footnote 11 although the conclusions can easily be applied to other ML methods. The goal of NNs is to minimize their loss function; that is, to make predictions as accurate as possible while guarding against overfitting to the training data set. In adjusting its weights, the neural network follows a predetermined algorithmic process, primarily through backpropagation and gradient descent. While this process is deterministic, meaning it follows a set of defined rules and calculations, it is important to note that the designers do not explicitly set the final weights. Instead, the network iteratively adjusts its own weights based on the data it processes. This gives the impression of autonomy, but it is within the confines of the algorithm’s structure and the data it is trained on. Therefore, the ‘freedom’ of the system lies in its ability to autonomously find the best solution within the parameters set by its algorithm, rather than the designers specifying weight values directly. Thus, how we use and relate to AI differs from how we relate to previous technology.
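The following toy sketch, with invented data, illustrates the point made above: the designer writes down the loss and the update rule, but the final weight emerges from the data rather than being specified directly.

```python
# Toy illustration: the designer specifies the loss (mean squared error) and the
# update rule (gradient descent), but not the final weight, which is learned
# from invented data roughly following y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

w = 0.0    # initial weight, chosen by the designer
lr = 0.01  # learning rate, chosen by the designer

for _ in range(1000):
    # gradient of the mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # the update rule is fixed; the value it converges to is not

print(round(w, 2))  # ~2.04: determined by the data, not set by the designer
```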
Unlike previous technology, AI uses data to achieve results that are unpredictable even for its designers. This is so because, even if AI systems are not ‘free’ to pursue autonomous thinking as a person would, the system is able to iteratively adjust its weights based on the specific dataset and the guiding principles of backpropagation. Because of this, even if AI ‘learns’ in a significantly different way from humans,Footnote 12 its way of managing data poses a novelty that requires its users to engage with the technology in a novel way. Unlike previous technology, AI systems do not give us the means to merely extend our own abilities: they seem to exhibit abilities of their own. Therefore, the user does not merely use what the system provides, but they interact with the results, continuing to feed the system with new data and making its results better over time.
Additionally, the results that can be achieved with ML are being used in a variety of domains. Often, ML is used in high-stakes situations that deeply affect people’s lives, such as medical diagnosis (e.g. IBM Watson for Oncology), loan grants (e.g. ZestFinance), sentence attribution (e.g. COMPAS), etc. In these cases, a successful outcome is not just a technical solution to a technical problem.
When using AI for these purposes, the goal is not just to minimise a loss function so the risk of its predictions being inaccurate is as low as possible. The goal is to provide correct diagnoses to real patients, and accurately predict who will be able to pay a loan back or which convicts are likely to reoffend. Ultimately, the use of AI in making predictions extends beyond mere predictive accuracy; it significantly impacts people’s lives in profound ways. Such a potential to affect people’s lives is a consequence of the interplay between technical and social aspects that comes with AI. Thus, it is only reasonable to consider that we, as a society, should be able to trust these kinds of systems. However, the assumption that humans can actually trust AI has not been properly argued for in the literature.Footnote 13 This gap leaves it unclear whether the concept of trust can suitably be extended to describe relationships between humans and AI systems. Hence, I refer to ‘trust in AI’ in quotation marks, in order to flag its status as a posited concept.
Using the term ‘trust’ to refer to human-AI relationships without specifying how the term is being used leads to problematic ambiguities. How can non-human entities be the object of trust if they cannot exhibit goodwill [6]? How is this possible for entities that cannot be held responsible for their actions [24]? Or hold commitments [12]? Authors such as Nickel et al. [10] or Ryan [9] have pointed out the problem of using the term ‘trust’ for cases which involve AI systems as trustees. Their solution to resolve such ambiguities is to use the term ‘trust’ only to refer to relationships among humans. In contrast, when it comes to human-AIFootnote 14 relationships, these authors propose adhering to the concept of reliance. In the next section, I will use Ryan’s argument as an example of this kind of position.
3 Reliance as an alternative to trust
3.1 Reliance in the literature
Despite the attention that the topic of ‘trust in AI’ has gained in recent years, there is as yet no clear solution to the conceptual problems that it poses. Using the term ‘trust’ to describe human-AI interaction in the same way that we talk about human–human relationships seems to be a problematic option. Therefore, there are authors who propose alternative concepts to describe human-AI relationships. Before exploring alternatives to trust, it is worth noting that trust is a gradual concept. That is, relationships cannot be categorized binarily as either trust or non-trust (not to be confused with distrust). There is a rich gradation in between in which a trustee can be trusted to a lower or a higher degree, depending on the trustor’s perception of them.
In contrast to such a gradation stands the concept of reliance, which, so far, appears as the most popular alternative to the concept of ‘trust in AI’. In this paper, I understand trust and reliance as mutually exclusive states, meaning that any given relationship can be distinctly categorized as either based on trust, on reliance, or on their complementary concepts (distrust and lack of reliance, respectively). In the literature on interpersonal trust, reliance is the most widely used term to refer to a weaker version of trust that can be placed in non-humans [7, 12, 24]. When it comes to technology, Nickel et al. [10] refer to a thin notion of trust, Durán and Formanek [8]Footnote 15 use the term computational reliabilism and Ryan [9] talks about rational trust. Although these concepts are not interchangeable, for the purposes of this paper they share enough similarities that, for simplicity, I will group them under the umbrella term ‘reliance’.
3.2 AI as object of reliance
I have chosen Ryan’s [9] paper to represent the position according to which AI cannot be trusted but merely relied on.Footnote 16 In his paper, Ryan argues that AI does not have the capacity to be trusted. His main argument is that trust requires the attribution of affective or normative motives to act. Affective motives primarily refer to goodwill guiding the trustee’s actions towards the trustor. Normative motives refer to the trustee’s drive to act according to what they should do (pp. 4–5). Since AI neither possesses emotive states such as goodwill nor can be held responsible for its actions, AI cannot be the object of trust (p. 2). Ryan describes trust as a phenomenon involving a trustor A, a trustee B and some action X. Influenced by the literature on interpersonal trust, he characterises trust as follows:
-
(i)
A has confidence in B to do X.
-
(ii)
A believes B is competent to do X.
-
(iii)
A is vulnerable to the actions of B.
-
(iv)
If B does not do X then A may feel betrayed.
-
(v)
A thinks that B will do X, motivated by one of the following reasons:
-
(v.i)
Their motivation does not matter (rational trust).
-
(v.ii)
B’s actions are based on a goodwill towards A (affective trust).
-
(v.iii)
B has a normative commitment to the relationship with A (normative trust) [9, p. 5].
Following the account above, the main features of trust are described in points (i)–(v). When it comes to the motivation that A ascribes to B, Ryan points to three different hypotheses. Thus, options (v.i), (v.ii) and (v.iii) refer to different accounts of interpersonal trust, which Ryan labels in parentheses. As pointed out above, AI lacks emotive states and it cannot be held responsible for its actions. Therefore, Ryan concludes that rational trust is the only kind of trust that can be legitimately placed on AI. Rational trust refers to the kind of trust in which the trustor makes themselves dependent on the trustee concerning some specific action, regardless of the trustee’s motivation to execute such action. This concept is equivalent to reliance [10, 12]. Consequently, Ryan concludes that it is possible for humans to rely on AI, but not to trust it.
4 The conceptual problem of reliance on AI
While I agree with Ryan that rational trust is actually mere reliance, I do not believe that this is the only kind of ‘trust’ that can be placed in AI. I define reliance as followsFootnote 17:
Definition 2’
Reliance is the dependence that party A exhibits towards party B regarding some task or role δ when A needs and/or wants to delegate δ to B. A relies on B regardless of B’s motivation to do δ.
I believe that the relationships between humans and AI can go beyond reliance, as characterised above. Therefore, I disagree with Ryan and I will argue that it is possible to trust AI. The core of my disagreement lies in the above characterization of trust, particularly in point (v). From (v) it follows that the trustor trusts the trustee when the former thinks that the latter will perform some action X motivated by affective or normative reasons. I disagree. In what follows, I show that neither the attribution of affective motives nor the attribution of normative motives is strictly necessary for trust to develop.
4.1 Affective motives
Ryan’s allusion to affective motives is inspired by philosophers such as Annette Baier and Karen Jones, who hold views on trust revolving around the concept of goodwill. In her foundational paper, Baier [6] characterizes trust as an attitude of accepted vulnerability. A decade later, Jones [7] redefined trust as an affective attitude. According to her, the trustor trusts the trustee when the former perceives the latter both as competent and good-willed. That is, the trustee normally acts as the trustor expects them to because the trustee knows they are being counted on [7, p. 8]. For their part, the trustor trusts because they think that the trustee cares about the trustor and aims not to let the trustor down. Both Baier and Jones propose accounts which assume an emotional connection between the trustor and the trustee in virtue of which the latter cares about not letting the former down.
While I share Baier and Jones’ idea that the trustee’s motivation to act plays a key role in trust, I will argue that such motivation does not need to have an affective or emotional character. Often, trust is paired with an emotional relationship between the trustor and the trustee. Therefore, the focus that a part of the literature on interpersonal trust has put on the trustee’s goodwill is understandable. However, there are trusting relationships that involve less emotion than what Baier and Jones sketch. Most of the time, trustors base their trust on past experiences and the nature of such experiences is varied. Sometimes, we trust others based on their professionalism or expertise, rather than their goodwill. Examples of this can be found in employer-employee or doctor-patient relationships. I like to think that my doctor cares about my well-being and, because they care about me, they will do their best to help me. In other words, I like to think that my doctor has goodwill towards me. However, I do not know that. All I know is that my doctor’s job is to take care of my health the best they can. They may find me irritating, unpleasant and have no positive feelings towards me at all. However, the assumption that my doctor’s professionalism is above their personal judgments is enough for me to trust [40, p. 14]. We trust judges, civil servants, police officers, etc. Not all of these trustees necessarily care whether trustors are let down. However, it is important to notice that the trustees do have motives to not let trustors down, even in the absence of goodwill. A judge may not have personal concern for me, but there can be other motives that lead them to deliver a fair judgment. For instance, they might be motivated by a desire to cultivate a reputation as a fair person, which indirectly influences the outcome of my case (positively). As long as the judge has reasons to uphold justice in my case, I can trust their decision-making. In sum, my only remark in this regard is that the nature of such motives does not need to be affective.
4.2 Normative motives
Following Ryan [9], when characterising trust, another possibility is to focus on the trustor’s attribution of normative motives to the trustee. This kind of account goes a step further with respect to the affective motives summarized above. Normative accounts of trust state that the trustor not only relies on the trustee’s goodwill, but also feels that the trustee owes them such goodwill [41, p. 31]. According to this view, the trustor does not trust the trustee with just any action X.Footnote 18 X must be moral, and the trustee should be able to recognise it as such. Thus, qualifying as a trustee implies holding moral agency and, consequently, being subject to potential blame [9, p. 13].
In a trusting relationship, the trustor believes that the trustee will adhere to some sort of norm that the two of them share. For example, when I trust my neighbour to take care of my cat, I do it because I believe that both of us agree that my cat’s wellbeing is important and that looking out for the animal is the right thing to do. There is some sort of internal normativity that applies to the members of the trusting relationship. However, the term “normative motives” refers to the motives behind actions that are based on adherence to certain norms that a wider social group shares. In trusting relationships, the trustor and the trustee do not necessarily adhere to moral or social norms, but to the norms that both of them implicitly agree on. The trustee behaves as both the trustor and the trustee themselves believe they should. However, an external party may very well disagree. For example, members of a criminal organization can trust each other to perpetrate crimes. Criminals who trust each other are motivated to act according to their shared values and norms. However, these motives are hardly normative since the norms to which criminals adhere are shared by too few people to be proper norms. Because of this, even if trust implies some sort of internal normativity, the above example involving trust without normative motives shows that we had better avoid talking about normative motives in general.
In relation to the trustor’s attribution of normative motives to the trustee, Ryan introduces the idea that the trustee needs to be a full moral agent [9, p. 10]. One of his main arguments is that, if a party cannot be held responsible for their actions, then they do not qualify as a trustee.Footnote 19
However, there are cases in which the responsibility should not be traced to the trustee since they could never be fully responsible. I proceed to explain why. When we trust, we assume the risk of things not going as we want or expect. Such a risk can materialise for a variety of reasons, and not all of them can be attributed to the trustee. For example, if I trust my neighbour to keep my cat safe while I’m on holiday, I am aware that a series of unfortunate events could happen to my cat over which my neighbour has no control. My cat could eat a bee, have a heart attack or jump through the window, despite my neighbour’s desperate efforts to chase after the animal. If any of that happened, I would not blame my neighbour, or at least not fully. Trust implies moral responsibility. However, I disagree with Ryan on how such a responsibility could be potentially distributed in different kinds of situations. Trust is a complex phenomenon and the responsibility for the trusted action X does not necessarily fall entirely on the trustee B. Because of this, I do not consider that the trustee needs to be a full moral agent since the full responsibility for X does not necessarily lie with them.
In sum, my critique of Ryan’s view lies in his characterization of trust. If one understands trust as Ryan characterises it –a relationship between two parties in which one attributes affective or normative motives to the other–, then the conclusion that AI cannot be trusted does indeed follow. But this disjunctive characterization is not necessary, as I will show. In the next section, I propose a different way to understand trusting relationships and clarify what consequences my account has for ‘trust in AI’.
5 Rescuing trust in AI
Assuming that AI can only be, at best, merely relied on does not seem to satisfy the ethical demands that guidelines and white papers flag [1,2,3,4,5]. There is a mismatch between the concept of reliance and what is expected from AI in some of the high-stakes scenarios in which it is already in use. If, in those same scenarios, the trustee were human, one could confidently say that trust is required. However, it remains unclear whether it is legitimate to use the concept of ‘trust’ to describe human-AI relationships. I will argue that it is. To do so, it is necessary to be clear on what is meant by ‘trust’ in order to start using the concept without quotation marks.
5.1 Trust in humans as the paradigm
For decades, interpersonal trust has been the paradigm in the literature [6, 7, 12, 14, 18, 40]. There is also research on trustees that are not individual persons, such as institutions [42,43,44], or oneself [45,46,47], for example. There are important differences between all these kinds of trust. However, all of them share some sort of family resemblance that is strong enough to fall under the umbrella of trust. I will argue that ‘trust in AI’ can be designated by the same term. To build my case, I propose the following features as basic characteristics of trustFootnote 20:
-
(1)
Trust is a relationship in which the trustor has positive expectations towards the trustee.
-
(2)
Trust entails the risk of the trustee not behaving as the trustor wantsFootnote 21 them to.
-
(3)
Trusting relationships are built on the (potential) need or wish of delegation.
-
(4)
Trust is motives-based (which makes it distinct from reliance).
Features (1)–(2) are uncontroversial and do not enter into conflict with Ryan’s view on trust.Footnote 22 Feature (3) is relatively uncontroversial too. However, I would like to briefly highlight the “potential” in parentheses before continuing with my argument. I add this nuance to flag that trust is associated not only with the current state of things but with hypothetical scenarios.Footnote 23 The trustor believes that the trustee would behave in a certain way given certain circumstances. However, such circumstances may never materialise. For example, I trust my neighbour to take care of my cat in case I were ever away and I needed to delegate such a task. But maybe I will never go on holiday and therefore I will never need my neighbour to take care of my cat. This does not decrease the trust I place in my neighbour. Finally, feature (4) is analogous to Ryan’s point (v) (see Sect. 3.2). But unlike Ryan, I do not specify what kind of motives should be attributed to the trustee in order to qualify as a trustee. The reason is that, as long as there is a (perceived)Footnote 24 alignment between the trustor’s and the trustee’s motives, there can be a trust relationship. When describing trust at this very general level, such motives do not need to be associated with goodwill or normative motives. As I pointed out in Sect. 4, making a necessary connection between trust and goodwill fails to explain trusting relationships that are based on professionalism or expertise. Regarding normative motives, as developed in the previous section, my disagreement with Ryan is more subtle. Nevertheless, it weighs enough to not include the trustor’s attribution of normative motives to the trustee as a necessary condition to trust.
On the basis of the discussion above, I offer the following working definition of trustFootnote 25:
Definition 1’
Trust. A trustor A trusts a trustee B regarding some task or role δ (belonging to a domain ∆) iffFootnote 26 both of the following hold:
-
(a)
A has a continuing belief that B is likely to perform δ successfully, moved by motives deemed appropriate by A.
-
(b)
If A wished or needed B to do δ, then A would be willing to delegate δ to B.
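For readers who prefer a compact notation, conditions (a) and (b) can be abbreviated as follows, where Bel_A stands for A’s continuing belief, m_B for the motives A attributes to B, and the boxed arrow for a subjunctive conditional ranging over the relevant hypothetical scenarios. The symbols are shorthand introduced here for convenience, not an addition to the definition:

$$\mathrm{Trust}(A,B,\delta) \iff \underbrace{\mathrm{Bel}_A\big[\mathrm{Success}_B(\delta) \wedge \mathrm{Appropriate}_A(m_B)\big]}_{\text{(a)}} \;\wedge\; \underbrace{\big(\mathrm{Wish}_A(\delta) \vee \mathrm{Need}_A(\delta)\big) \;\Box\!\!\rightarrow\; \mathrm{Willing}_A\big(\mathrm{delegate}(\delta,B)\big)}_{\text{(b)}}$$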
Let me briefly develop the definition above. Let us start with the first condition. Condition (a) serves two purposes. The first one is to define trust as belief-based. This means that it is not possible to trust a trustee without believing that they are likely to act as the trustor expects, moved by motives shared by both the trustor and the trustee. I call this a trust belief.Footnote 27 It is possible, and in fact common, to delegate tasks to others without much thought. Sometimes, we delegate tasks to people whom we do not consider trustworthy. If we are lucky, those people may successfully perform the task we delegate to them. However, without trust belief, this is not a case of trust but of reliance. Condition (a) enables me to make this distinction. The second purpose of (a) is to ensure that A holds the trust belief in every possible world used to evaluate (b). That is, the trust belief continues over all the relevant hypothetical scenarios, namely, every scenario in which A possibly wishes or needs B to do δ. This way, trust belief operates as a background condition for trust.
Condition (b) refers to the potential need or wish of delegation that characterises trusting relationships. As stated above, trust has to do not only with current scenarios but with hypothetical ones. Thus, trusting a trustee does not mean just delegating something to the trustee, but believing that delegation is a reasonable option in some hypothetical future scenario (hence the use of the subjunctive conditional). Using once again my go-to example, trusting my neighbour means that, in the hypothetical case that I ever went on holiday, I would actually be willing to leave my cat to their care. Thus, while condition (a) describes the internal mental state of the trustor, condition (b) refers to the external consequences of such a state.
At first sight, definition 1' may appear weak, given the conditional structure of (b). However, the strength of the definition lies in (a): only when A holds a trust belief about B does A trust B. Believing that someone is trustworthy, unlike believing they are reliable, goes beyond expecting them to perform a task. Trust belief also involves motives, meaning that A trusts B not only when they believe B will do δ, but when they believe B will do δ because of some motive that A deems appropriate. This nuance makes trust a more refined concept than reliance, with the former being harder to foster than the latter. With the above characterization of trust in mind, let me go back to human-AI relationships. Now, the question is whether these relationships can fit the description I just provided.
5.2 Trust in AI as a legitimate possibility
Let us start by briefly going over features (1)–(4) (from Sect. 5.1) and checking whether they work for artificial trustees.
-
(1)
Positive expectations. Human trustors aim to establish ‘trust’ relationships with artificial trustees when they expect the system to achieve whatever it is meant to achieve. For example, users who trust generative language model systems expect the system to give coherent answers to the prompt provided.
-
(2)
Risk of failure. When a human trustor ‘trusts’ an artificial trustee, the trustor makes themselves vulnerable since they willingly accept the risk of failure of the system. Trusting a system implies expecting that such a system successfully accomplishes the task it was designed for. However, human trustors are aware that systems are fallible.
-
(3)
Need or wish for delegation. In a human-AI ‘trust’ relationship, the human needs or wishes to delegate certain kinds of tasks to the AI. In high-stakes situations, like medical diagnosis or sentence attribution, trust is required in order to use the system. It would be irresponsible to use AI that affects people’s lives in such a significant way if the system cannot be trusted.
-
(4)
Motives-based. As pointed out in (3), trusting relationships are built on the possibility of delegation, which will materialise in case the trustor wishes or needs to do so. Thus, unlike reliance, trust goes beyond successful delegation. In the case of artificial trustees, successful delegation equals technical success (typically, the systems’ predictions are mostly accurate). Trust in AI goes beyond technical success, meaning that the process that the system follows to achieve results is relevant.
Let us elaborate on point (4) since it is the most likely to cause some controversy. I start by clarifying what I mean by “motives”. It is relatively uncontroversial to define motives as the combination of desire and belief that drives an agent’s actions [52, p. 687].Footnote 28 Intuitively, motives are exclusively human, since AI systems can neither hold beliefs nor have desires. This is a common critique against the concept of trust in AI. However, when it comes to AI, I adopt a broader definition of motives, considering them as the criteria that favour a course of action. I go back to my go-to example. Let us analyse the motives of a human trustee B to take care of A’s cat. B takes care of A’s cat because they like cats and they believe that the opportunity to take care of A’s cat will bring joy to both A and B (and hopefully the cat). I call this motivation m, where m is B’s reason to do δ (take care of A’s cat). B’s motivation m acts as their reason to do δ because the expectation of future joy inclines B to do δ. Analogously, there are AI systems that operate following certain criteria that favour certain outcomes. For example, let us consider an NNFootnote 29 designed for brain tumour detection and classification. On some test data, it accurately identifies tumours in 99% of the MRI scans it processes, affirming its technical success. After some time, it is discovered that the NN detects tumours not by focusing on the pixels of the MRI scan in which the tumour can be found, but based on some other non-causal correlation factor.Footnote 30 The process that the system follows to achieve results is crucial to determine whether it is worthy of trust or not. In a wide sense, I understand the criteria used by the system to provide its outcome as the system’s “motive” to act. Thus, analogously to how human doctors can be motivated to treat their patients by their professionalism, altruism, scientific curiosity, etc., an AI diagnostic system can be “motivated” to aim for technical success by its design. In this case, the AI system operates based on the criterion that a certain pattern of pixels typically corresponds to the presence of a tumour. Not only is the system’s accuracy important, but also why the system is accurate. In this sense, trust in AI (as in humans) should be motives-based: the motives that lead a system to perform as it does play a central role when determining whether the system deserves human trust.
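The tumour-detection case can be sketched schematically as follows. The data, the ‘scanner tag’ feature and the classifier are all invented for illustration; they stand in for whatever non-causal correlate the network actually exploits.

```python
# Stylized illustration (all data invented): a classifier reaches high accuracy
# by keying on a spurious feature rather than on the causally relevant one.
import random

random.seed(0)

def make_example(has_tumour: bool):
    # 'tumour_pixels' is the causally relevant signal; 'scanner_tag' is a
    # spurious artefact that agrees with the label roughly 99% of the time.
    flip = random.random() < 0.01
    scan = {"tumour_pixels": 1.0 if has_tumour else 0.0,
            "scanner_tag": 1.0 if (has_tumour != flip) else 0.0}
    return scan, has_tumour

train = [make_example(random.random() < 0.5) for _ in range(1000)]

def shortcut_classifier(scan: dict) -> bool:
    # Looks exclusively at the spurious feature, never at the tumour pixels.
    return scan["scanner_tag"] > 0.5

accuracy = sum(shortcut_classifier(scan) == label for scan, label in train) / len(train)
print(f"apparent accuracy: {accuracy:.1%}")  # ~99%: technically successful

# Once the spurious correlation breaks (a different scanner, say), the same
# 'successful' system misses an obvious tumour.
print(shortcut_classifier({"tumour_pixels": 1.0, "scanner_tag": 0.0}))  # False
```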
One could ask whether this way of attributing motives shifts the motivation back to the system’s designers, thus leading to the conclusion that only human designers, not the systems themselves, can be trusted. When it comes to this point, I find it helpful to shift the analogy to institutional trust,Footnote 31 rather than interpersonal trust. In cases of institutional trust, the members of an institution represent such an institution with their individual actions and behaviours. Thus, institutional motives are represented by individuals through individual actions, which transcend the individuals themselves. Similarly, the designers of AI systems encode their motives into the systems they develop. Such systems will execute these motives based on their programmed algorithms and operational settings. If one understands the relationship between AI systems, human designers and their motives in this manner, the following question arises: Why talk about trust in AI specifically? Aren’t we dealing with a case of institutional trust? What distinguishes AI systems like ChatGPT, developed by OpenAI, from products like Bosch’s washing machines? As I noted, there is an analogy to be drawn between trust in AI and institutional trust, but there are some fundamental differences that set them apart. Such differences arise from the special kind of technology that AI is, different from other products that are produced by institutions. Consider the example of a washing machine produced by Bosch, which is developed by a team of engineers representing the company. Similarly, ChatGPT is developed by a team at OpenAI. The main difference between ChatGPT and a washing machine is that the latter does not develop its own criteria to “decide” which course of action to take. A washing machine operates on fixed, deterministic programs without the capacity to modify its operations based on external data. In contrast, NNs like the one powering ChatGPT dynamically adjust their weights during the training process. Even if this process is predetermined, it is not possible for the designers to anticipate every possible outcome of the system, thus giving the AI a semblance of developing its own criteria for action. The distinction between AI having motives and seeming as if AI had motives is blurry. For what it is worth, users interact with devices that act as if they had motives, and based on such a perception, a human-AI relationship is shaped. I label this motives-based relationship as trust. Because this kind of trust is analogous to, yet distinct from, trust in people or institutions, the label “trust in AI” becomes necessary.
To emphasize my conclusion, let us revisit definition 1'. If we accept the use of the term “motives” as clarified above, definition 1' is formulated in such a way that trustee B could be either a person or an AI system.Footnote 32 Human-AI relationships are typically characterised by the human’s need or wish to delegate a task δ to the AI. According to definition 1', humans trust AI when they believe that the system is likely to perform δ successfully and they approve of the process that leads the system to such success. This idea is reflected in condition (a). As stated in condition (b), such a belief makes the trustor willing to delegate δ to the system, should the trustor’s need or wish for it materialise. In short, when applied to AI, definition 1' would become the followingFootnote 33:
Definition 1’’
Trust. A human trustor trusts an AI system to perform some task δ iff both of the following hold:
-
(a)
The trustor has the continuing belief that the system is likely to perform δ successfully, following criteria deemed appropriate by the trustor.
-
(b)
If the trustor wished or needed the system to do δ, then the trustor would be willing to delegate δ to the system.
For the sake of making my argument complete, I apply definition 2’ to the AI context as well:
Definition 2’’
Reliance is the dependence that a human exhibits towards an AI system regarding some task δ when the human needs and/or wants to delegate δ to the system, with the system’s criteria to do δ being irrelevant to the human.
6 A theoretical model of trust (in AI)
As the discussion above suggests, trust relates to attributes such as reliability, safety, or security, yet it cannot be fully described as a linear combination of these attributes. My aim is to build a theoretical model of trust that moves beyond a checklist of measurable qualities to capture its unique relational and evaluative dimensions. Trust not only considers the trustee's performance but also takes into account the perceived motives behind their actions. For example, reliability ensures that a system functions consistently as intended, and safety minimizes risks to users or stakeholders. However, these qualities alone do not determine whether a system is deemed trustworthy, particularly in contexts where human values, ethical considerations, or stakeholder expectations come into play.
This distinction becomes particularly evident when we examine the role of motives, which are central to trust and appear explicitly in my model (as the element m below). Trust involves assessing why the trustee acts as they do, aligning the trustor’s expectations with the trustee’s underlying rationale or guiding principles. For instance, an AI diagnostic tool might demonstrate high reliability in identifying diseases, but if its decision-making process relies on ethically questionable shortcuts—such as biases in its training data—it may fail to earn trust. Thus, trust transcends technical attributes by incorporating an evaluative judgment of the system’s alignment with broader social and ethical norms.
Building on these considerations, I propose a theoretical model of trust that applies to both interpersonal trust and trust in AI. The model captures the complexities of trust as a spectrum of relationships that range from reliance to appropriate trust. The model is defined asFootnote 34

T = {Reliance, General trust, Appropriate trust},

whose elements are constructed on the 4-tuple (A, B, δ, m). Here, A and B represent the trustor and trustee respectively, δ is the task or role performed by B, and m refers to the motive(s) driving B's actions.
The first two elements of T have already been discussed in Sect. 5 (see definitions 1’ and 2’). The difference between them is that, while (general) trust requires trust belief, reliance does not. In consequence, for A to trust B, B’s actions need to be driven by some motive m that A approves of. In contrast, for A to merely rely on B for some task δ, B’s motives are irrelevant.
Let us discuss now the third element of T. While general trust describes how trust works, I introduce the concept of appropriate trust to describe how trust should normatively work. The difference with a more generalised notion of trust is that appropriate trust requires justified trust belief. Building on definition 1’, appropriate trust can be defined as follows:
Definition 3
Appropriate trust. A trustor A appropriately trusts a trustee B regarding some task or role δ (belonging to a domain ∆) iff both of the following hold:
-
(a)
A has a continuing justified belief that B is likely to perform δ successfully, moved by motives deemed appropriate by A.
-
(b)
If A wished or needed B to do δ, then A would be willing to delegate δ to B.
I consider that a trust belief is justified if it is supported by previous beliefs that are justified in turn. Here, I am relying on a coherentist view of justification, according to which a belief b held by an agent X is justified iff such a belief fits X's belief system.Footnote 35 I use ‘fit’ in a wide sense, meaning that b does not enter into contradiction with the rest of X's beliefs, but also that the epistemic attitude of X towards the content of b (in this case, their degree of belief) is consistent with the rest of their epistemic attitudes. This way, a coherent set of beliefs is composed of beliefs that support one another, meaning that a belief b1 is supported by a belief b2 iff b1 is more credible if b2 is true than if it were false ([61, p. 338, 62, p. 3840]).Footnote 36 A coherentist view of justification fits well with the understanding of trust I aim to present in this paper. Trust is a complex phenomenon with different roots. People trust different trustees for different reasons, making it difficult to pinpoint a single source of justification for trust beliefs—just as it is challenging to determine the basis for belief justification in general. Therefore, when defining appropriate trust, my focus is on setting more practical goals: avoiding situations where there are no clear reasons to trust and preventing contradictions, both within the trustor’s beliefs and, ideally, with external evidence.
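In probabilistic terms, this support relation (my gloss on the coherentist condition just cited, not a quotation from [61] or [62]) can be written as:

$$b_2 \text{ supports } b_1 \iff P(b_1 \mid b_2) > P(b_1 \mid \neg b_2)$$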
Lee and See [56] also include in their model the concept of appropriate trust, which aligns closely with the third element of my proposed model. ThomerFootnote 37 [63, pp. 69–81] adapts this model to human-guided algorithms (HGA), a focus closely related to that of this paper. Both models emphasize the need for trust to be proportional to the trustee’s actual capabilities, rejecting blind trust or over-reliance. Their detailed analysis of calibration,Footnote 38 resolution,Footnote 39 and specificityFootnote 40 enriches the understanding of appropriate trust, particularly in the context of AI. For instance, their notion of calibration mirrors my requirement for justified trust beliefs, ensuring that trust is rational and aligned with empirical evidence. Similarly, functional and temporal specificity extend the scope of my model by highlighting the importance of context and component-level trust.
Appropriate trust is a key element of T and becomes particularly relevant in cases of trust in AI. The reason is that subjective grounds alone should not be enough to trust AI systems, especially when used in high-stakes situations. As Ryan [9] pointed out, trust in AI demands the possibility of rational evaluation of the reasons to trust. By introducing appropriate trust as the third element of T, I propose a way in which trust can be rational without collapsing into reliance, thereby remaining a suitable goal for human-AI relationships.
The three elements of T—reliance, general trust, and appropriate trust—form a spectrum that captures varying degrees of trust. These elements are related incrementally, with reliance being the least demanding and appropriate trust being the most stringent. While all three belong to the trust spectrum, they are suited to different contexts depending on the nature of the task δ and the relationship between A and B. Appropriate trust, as the most demanding element, requires a justified trust belief and is therefore best suited for high-stakes scenarios, such as those involving human-AI relationships in critical applications. By contrast, general trust may suffice for less sensitive situations, and reliance is appropriate when the motives behind B’s actions are irrelevant to A. Ultimately, appropriate trust represents the ideal kind of relationship to strive for between humans and AI systems in contexts where errors carry significant consequences, aligning with the need for rational evaluation and justified confidence in such interactions.
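To summarise how the three elements of T relate, the following sketch compresses definitions 1’, 2’’ and 3 into two yes/no questions. The function, the flags and the example case are illustrative simplifications of mine, not part of the formal model.

```python
# Illustrative sketch of the spectrum T = {reliance, general trust, appropriate trust}.
# The boolean flags compress definitions 1', 2'' and 3; they are not part of the model itself.
from dataclasses import dataclass

@dataclass
class TrustCase:
    A: str       # trustor
    B: str       # trustee (a person or an AI system)
    delta: str   # task or role (δ)
    m: str       # motive(s) or criteria driving B's performance of delta

def place_on_spectrum(case: TrustCase, trust_belief: bool, justified: bool) -> str:
    """Locate a relationship on the spectrum ranging from reliance to appropriate trust."""
    if not trust_belief:
        return "reliance"            # no trust belief: at most reliance (definition 2'')
    if justified:
        return "appropriate trust"   # justified trust belief (definition 3)
    return "general trust"           # trust belief, not necessarily justified (definition 1')

case = TrustCase(A="clinician", B="diagnostic NN", delta="tumour detection",
                 m="classifies on tumour-relevant pixels")
print(place_on_spectrum(case, trust_belief=True, justified=True))  # appropriate trust
```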
7 Conclusion
In this paper, I have presented trust as a suitable term to describe human-AI relationships. Moving beyond the conventional notion of trust as an inherently human attribute, I introduced a definition of trust that encompasses both human and AI systems as trustees.
By including AI systems as potential trustees, I reject the position according to which technology can be merely relied on. A key difference between trust and reliance is that the former is motives-based, while the latter is not. This difference applies to artificial trustees too, meaning that human trust in AI goes beyond the technical success of the system: for trust to be built, the system needs not only to be accurate but to reach accuracy following processes that the human trustor deems appropriate. Analogous to human motivation to act, I refer to such processes as ‘motives’ (to perform).
Building on this foundation, I proposed a theoretical model of trust that captures the complexities of human-AI relationships. This model introduces a spectrum that ranges from reliance to general trust and culminates in appropriate trust, which demands justified trust beliefs. The model provides a structured approach to fostering trust in AI, particularly in high-stakes scenarios.
AI poses a special case in which humans can engage with technical artefacts analogously to how they engage with human trustees. Because of the phenomenon described above, it is plausible that AI systems behave according to the trustor’s expectations. Consequently, trust is an appropriate concept to strive for in the successful integration of AI into society.
Notes
Here, I use “most” in a loose way. There is no exact percentage of successful cases above which a system counts as technically successful. What is considered a high rate of success varies depending on the system to be rated and its application.
A more detailed overview on different accounts of trust can be found in Blanco [64], pp. 36–48.
In cybernetics, reflexion refers to the system’s ability to not only respond to input but also to reflect on its own operations, adapt, and improve its performance.
From now on, I will often use phrasings such as ‘delegation materialising’, ‘delegation taking place’, ‘leading to delegation’ or similar, in which the term ‘delegation’ is used as an abbreviation of ‘the fact that the trustor delegates something to the trustee’.
In the following, my reference to AI specifically pertains to ML algorithms, and more precisely, to neural networks (NN). Considering the vast scope of the AI domain, my decision to concentrate on this particular type of system is due to practicality.
It is not yet fully known how human minds work or how exactly we learn. However, what is known is that we do not learn using methods such as backpropagation.
Nickel et al. [10] address the relationship between humans and technology, not humans-AI relationships specifically. However, since their treatment of the topic is wider than the scope of this paper, their conclusions can be restricted to the particular case of AI.
Cogley [41] introduces this restriction in order to avoid what he calls “the trickster problem”. This problem refers to situations in which a party relies on another’s goodwill to pursue something against the good-willed party’s interest. Typical examples include scammers, criminals and, in general, anyone who takes advantage of a good-willed victim.
However, that does not mean that the trustee is always fully responsible for the consequences that follow from performing (or failing to perform) the action X with which they have been trusted by the trustor. Adopting such a position would be an oversimplification, and Ryan does not do so. He states that trustees can only be full moral agents, but he does not take the further step of concluding that trustees are always fully responsible for the outcome of the actions they are trusted with.
This list is even less restrictive than Ryan’s (see points (i)–(v) in Sect. 3.2), since I am targeting not only interpersonal trust but trust understood in a wider sense. Therefore, I will not engage with the relationship between trust and betrayal, since it is unclear how such a relationship applies to cases of institutional trust or self-trust, for example.
I talk about what the trustor wants rather than about what they expect because trust goes beyond mere prediction (unlike reliance). For example, I can rely on someone’s incompetence, but I cannot trust someone who I know is incompetent.
In a trusting relationship, the trustor believes that their motives align with the trustee’s. If the trustor’s perception of the trustee matches reality, trust will most likely be warranted (as long as other requirements, such as the trustee’s competence, are also satisfied). However, the trustor’s perception of the trustee’s motives may be distorted, leading the trustor to place unwarranted trust in the trustee. For that reason, I specified that in trusting relationships there is a perceived alignment of motives, not necessarily a real alignment. From now on, I will assume this clarification without repeating it, for the sake of readability.
Definition 1′ builds on Definition 1. At this point, I depart from Ryan’s notation and introduce δ instead of X as the action that the trustor trusts the trustee with. The reason behind the notation change is that trust is usually attached not to specific actions but to domains. Thus, when the trustor trusts the trustee with a specific task or role δ, their trust can usually be extended to other tasks belonging to the same domain ∆ to which δ belongs. This is a side note that does not substantially affect the conclusions of this paper. Nevertheless, I find it pertinent to include, as it is relevant for the trust definition above.
I will be using “iff” as an abbreviation of “if and only if”.
The following example is inspired by Windisch et al. [55].
Imagine it turns out that the NN was mainly trained on images of already diagnosed patients. The most popular treatment for the tumour in question is to surgically remove it. Thus, the NN had been focusing on the aftermath of the surgery rather than on the tumour itself in order to classify the MRIs. Of course, such aftermath would not appear on scans of undiagnosed patients. This makes the system untrustworthy, even if it has a good record of accurate predictions. (A schematic sketch of how such a shortcut might be surfaced follows these notes.)
Or an institution.
Here, I have attempted to apply Spohn's notion of credibility to Lewis' congruence, which is equivalent to what I intuitively understand by coherence. Note that in his paper, Spohn talks about b1 being a reason for b2, rather than b1 justifying b2. The nuances and differences between these kinds of epistemic relationships go beyond the scope of this paper.
I thank an anonymous reviewer for bringing Thomer’s master thesis to my attention. Besides Thomer’s own model of trust in HGAs, his work offers a great overview of different models of trust applied to the field of automation [63, pp. 44–68].
Trust calibration refers to the process of aligning the trustor’s expectations with the system’s actual capabilities, avoiding extremes such as overtrust or undertrust [63, pp. 59–60].
Resolution refers to how accurately trust reflects differences in a system's abilities. For instance, if a system's performance improves significantly but user trust increases only slightly, the resolution is poor because trust does not closely align with changes in capability [63, p. 56].
Specificity describes how trust is focused within a system. Functional specificity means trust is directed at individual parts of the system rather than the system as a whole. For example, a user with high functional specificity might trust one part of the system more than another. Temporal specificity refers to trust that adjusts based on the situation or time, meaning a user with high temporal specificity changes their trust level depending on how the system performs in different contexts [63, pp. 56–57].
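As announced in the brain-tumour note above, one hypothetical way to surface such a shortcut is to compare the classifier’s accuracy on scans that show surgical aftermath with its accuracy on pre-treatment scans: if performance collapses once the artefact is absent, the good track record rests on the wrong feature. The sketch below assumes a generic model with a predict method and uses illustrative data; it is not a prescribed test.

# Hypothetical shortcut check for the tumour-classification example.
import numpy as np

def accuracy(model, images, labels):
    predictions = model.predict(images)
    return float(np.mean(predictions == labels))

def shortcut_suspected(model, post_surgery_set, pre_treatment_set, gap=0.15):
    """Flag the model if performance drops sharply once the surgical artefact is absent."""
    acc_post = accuracy(model, *post_surgery_set)   # scans showing surgical aftermath
    acc_pre = accuracy(model, *pre_treatment_set)   # scans of undiagnosed patients
    return (acc_post - acc_pre) > gap

# Illustrative usage with dummy data: the last feature column encodes the artefact,
# and the stand-in model "detects" tumours only through that column.
class DummyModel:
    def predict(self, images):
        return images[:, -1]

post_surgery = (np.array([[0.2, 1], [0.5, 1], [0.3, 0]]), np.array([1, 1, 0]))
pre_treatment = (np.array([[0.4, 0], [0.6, 0], [0.1, 0]]), np.array([1, 1, 0]))
print(shortcut_suspected(DummyModel(), post_surgery, pre_treatment))  # True

Such a gap would not by itself prove untrustworthiness, but it would undermine the claim that the system reaches accuracy through processes the trustor deems appropriate.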
References
Gunning, D., Stefik, M., Choi, J., Miller, T., Stumpf, S., Yang, G.-Z.: Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency (DARPA) (2016)
HLEG: Ethics Guidelines for Trustworthy AI. European Commission (2019). https://ec.europa.eu/futurium/en/ai-alliance-consultation.1.html. Accessed 30 Aug 2022
Leslie, D.: Understanding Artificial Intelligence Ethics and Safety. The Alan Turing Institute (2019). http://arxiv.org/abs/1906.05684. Accessed 30 Aug 2022
OECD: OECD Guidelines on Measuring Trust. OECD Publishing (2017). https://doi.org/10.1787/9789264278219-en
United Nations: Recommendation on the Ethics of Artificial Intelligence (2021). https://unesdoc.unesco.org/ark:/48223/pf0000379920. Accessed 20 Feb 2024
Baier, A.: Trust and antitrust. Ethics 96(2), 231–260 (1986). https://doi.org/10.1086/292745
Jones, K.: Trust as an affective attitude. Ethics 107(1), 4–25 (1996). https://doi.org/10.1086/233694
Durán, J.M., Formanek, N.: Grounds for trust: essential epistemic opacity and computational reliabilism. Minds Mach. 28, 645–666 (2018). https://doi.org/10.1007/s11023-018-9481-6
Ryan, M.: In AI we trust: ethics, artificial intelligence, and reliability. Sci. Eng. Ethics 26, 2749–2767 (2020). https://doi.org/10.1007/s11948-020-00228-y
Nickel, P.J., Franssen, M., Kroes, P.: Can we make sense of the notion of trustworthy technology? Knowl. Technol. Policy 23(3), 429–444 (2010). https://doi.org/10.1007/s12130-010-9124-6
Coeckelbergh, M.: Virtual moral agency, virtual moral responsibility: on the moral significance of the appearance, perception, and performance of artificial agents. AI Soc. 24(2), 181–189 (2009). https://doi.org/10.1007/s00146-009-0208-3
Hawley, K.: Trust, distrust and commitment. Nous 48(1), 1–20 (2014). https://doi.org/10.1111/nous.12000
Gambetta, D.: Can we trust trust? In: Gambetta, D. (ed.) Trust. Making and Breaking Cooperative Relations, pp. 213–237. Basil Blackwell, Oxford (1988)
Hardin, R.: The street-level epistemology of trust. Polit. Soc. 21(4), 505–529 (1993). https://doi.org/10.1515/auk-1992-0204
Marlowe, T., Joseph, F., Laracy, R.: Philosophy and cybernetics: questions and issues. Syst. Cybern. Inform. 19(4), 1–23 (2021)
Rotter, J.B.: A new scale for the measurement of interpersonal trust. J. Pers. (1967). https://doi.org/10.1111/j.1467-6494.1967.tb01454.x
Kramer, R.M.: Trust and distrust in organizations: emerging perspectives, enduring questions. Annu. Rev. Psychol. 50(1), 569–598 (1999). https://doi.org/10.1146/annurev.psych.50.1.569
Luhmann, N.: Trust and Power. Wiley, New York (1980)
Hardin, R.: Trust and Trustworthiness. Russell Sage Foundation, New York (2002)
McLeod, C.: Trust. In Zalta, E.N., Nodelman, U. (eds.) The Stanford Encyclopedia of Philosophy, Fall 2023 Edition (2021).
Keren, A.: Trust and belief. In: Simon, J. (ed.) The Routledge Handbook of Trust and Philosophy, pp. 109–120. Routledge, New York (2020)
Hieronymi, P.: The reasons of trust. Australas. J. Philos. 86(2), 213–236 (2008). https://doi.org/10.1080/00048400801886496
Hardwig, J.: The role of trust in knowledge. J. Philos. 88(12), 693–708 (1991). https://doi.org/10.2307/2027007
Holton, R.: Deciding to trust, coming to believe. Australas. J. Philos. 72(1), 63–76 (1994). https://doi.org/10.1080/00048409412345881
Nguyen, C.T.: Trust as an unquestioning attitude. Oxf. Stud. Epistemol. (forthcoming) (2019). https://philpapers.org/archive/NGUTAA.pdf. Accessed 30 Aug 2022
Alvarado, R.: What kind of trust does AI deserve, if any? AI Ethics 3(4), 1169–1183 (2023). https://doi.org/10.1007/s43681-022-00224-x
Reinhardt, K.: Trust and trustworthiness in AI ethics. AI Ethics 3(3), 735–744 (2022). https://doi.org/10.1007/s43681-022-00200-5
Ananny, M.: Toward an ethics of algorithms: convening, observation, probability, and timeliness. Sci. Technol. Hum. Values 41(1), 93–117 (2016). https://doi.org/10.1177/0162243915606523
Benk, M., Tolmeijer, S., von Wangenheim, F., Ferrario, A.: The value of measuring trust in AI: a socio-technical system perspective. In: CHI 2022 Workshop on Trust and Reliance in AI-Human Teams, pp. 1–12 (2022). https://doi.org/10.48550/arXiv.2204.13480
Jones, A.J.I., Artikis, A., Pitt, J.: The design of intelligent socio-technical systems. Artif. Intell. Rev. 39(1), 5–20 (2013). https://doi.org/10.1007/s10462-012-9387-2
Rieder, G., Simon, J., Wong, P.-H.: Mapping the stony road toward trustworthy AI. In: Machines We Trust: Perspectives on Dependable AI, pp. 27–39. The MIT Press (2021)
Van House, N.: Science and technology studies and information studies. Annu. Rev. Inf. Sci. Technol. 38, 3–86 (2003)
Coeckelbergh, M.: Artificial intelligence, responsibility attribution, and a relational justification of explainability. Sci. Eng. Ethics 26(4), 2051–2068 (2019). https://doi.org/10.1007/s11948-019-00146-8
Placani, A.: Anthropomorphism in AI: hype and fallacy. AI Ethics (2024). https://doi.org/10.1007/s43681-024-00419-4
Nielsen, M.A.: How the backpropagation algorithm works. In: Neural Networks and Deep Learning. Determination Press (2015).
Buchholz, O., Raidl, E.: A falsificationist account of artificial neural networks. Br. J. Philos. Sci. (2022). https://doi.org/10.1086/721797
Ferrario, A., Loi, M., Viganò, E.: In AI we trust incrementally: a multi-layer model of trust to analyze human-artificial intelligence interactions. Philos. Technol. 33(3), 523–539 (2020). https://doi.org/10.1007/s13347-019-00378-3
Jacovi, A., Marasović, A., Miller, T., Goldberg, Y.: Formalizing trust in artificial intelligence: prerequisites, causes and goals of human trust in AI. In: FAccT 2021—Proc. 2021 ACM Conf. Fairness, Accountability, Transpar., pp. 624–635 (2021). https://doi.org/10.1145/3442188.3445923.
Durán, J.M., Jongsma, K.R.: Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical AI. J. Med. Ethics 47(5), 329–335 (2021). https://doi.org/10.1136/medethics-2020-106820
O’Neill, O.: A question of trust. Cambridge University Press, Cambridge (2002)
Cogley, Z.: Trust and the trickster problem. Anal. Philos. 53(1), 30–47 (2012). https://doi.org/10.1111/j.2153-960X.2012.00546.x
Bachmann, R.: Trust and institutions. In: Poff, D.C., Michalos, A.C. (eds.) Encyclopedia of Business and Professional Ethics, pp. 1–6. Springer International Publishing, Cham (2020)
Lahno, B.: Institutional trust: a less demanding form of trust? Rev. Latinoam. Estud. Av. 15, 19–58 (2001)
Pettit, P.: The cunning of trust. Philos. Public Aff. 24(3), 202–225 (1995). https://doi.org/10.1111/j.1088-4963.1995.tb00029.x
Foley, R.: Intellectual Trust in Oneself and Others. Cambridge University Press, New York (2001)
Jones, K.: Trustworthiness. Ethics 123(1), 61–85 (2012)
Lehrer, K.: Self-trust: a study of reason, knowledge and autonomy. Philos. Phenomenol. Res. 59(4), 1049–1055 (1999). https://doi.org/10.2307/2653569
Grodzinsky, F., Miller, K., Wolf, M.J.: Trust in artificial agents. In: The Routledge Handbook of Trust and Philosophy, pp. 298–312. Routledge (2020)
de Laat, P.B.: Trusting the (ro)botic other: by assumption? ACM Sigcas Comput. Soc. 45(3), 255–260 (2016). https://doi.org/10.1145/2874239.2874275
Adler, J.E.: Testimony, trust, knowing. J. Philos. 91(5), 264–275 (1994). https://doi.org/10.2307/2940754
Keren, A.: Trust and belief: a preemptive reasons account. Synthese 191, 2593–2615 (2014). https://doi.org/10.1007/s11229-014-0416-3
Davidson, D.: Actions, reasons, and causes. J. Philos. 60(23), 685–700 (1963). https://doi.org/10.2307/2023177
Raz, J.: Authority and justification. Philos. Public Aff. 3–29 (1985)
Alvarez, M.: Reasons for action: justification, motivation, explanation. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy, Winter 2017 Edition (2016)
Windisch, P., et al.: Implementation of model explainability for a basic brain tumor detection using convolutional neural networks on MRI slices. Neuroradiology 62, 1515–1518 (2020). https://doi.org/10.1007/s00234-020-02465-1
Lee, J.D., See, K.A.: Trust in automation: designing for appropriate reliance. Hum. Factors 46(1), 50–80 (2004). https://doi.org/10.1518/hfes.46.1.50_30392
Lehrer, K.: Theory of Knowledge. Westview Press, Boulder (1990)
Elgin, C.Z.: Non-foundationalist epistemology: holism, coherence, and tenability. In: Steup, M., Sosa, E. (eds.) Contemporary Debates in Epistemology, pp. 156–167. Blackwell Publishing, Malden (2005)
Olsson, E.: Coherentist Theories of Epistemic Justification. In: Zalta, E., Nodelman, U. (eds.) The Stanford Encyclopedia of Philosophy, Winter 2023 Edition (2021).
Levi, I.: The Enterprise of Knowledge: An Essay on Knowledge, Credal Probability, and Chance. The MIT press, Cambridge (1980)
Lewis, C.I.: An Analysis of Knowledge and Valuation. Open Court, LaSalle (1946)
Spohn, W.: Epistemic justification: its subjective and its objective ways. Synthese 195(9), 3837–3856 (2018). https://doi.org/10.1007/s11229-017-1393-0
Thomer, J.L.: Trust-Based Design of Human-Guided Algorithms. Master’s thesis, Massachusetts Institute of Technology (2007)
Blanco, S.: Trusting as a Moral Act: Trustworthy AI and Responsibility (Doctoral thesis, unpublished) (2025)
Acknowledgements
The paper has benefited significantly from the regular feedback from Eric Raidl, Hong Yu Wong and the input from the participants of the closing conference for the AITE project. I would like to extend my gratitude to Andrew Kirton, whose insightful commentary on an early version of this paper was very helpful and contributed positively to the refinement of this work. I am also grateful for the comments by the participants of the Peritia Conference “Expertise and Trust” (University College Dublin), the fPET 2023 Conference (TU Delft), and the 17th CLMPST Conference (Universidad de Buenos Aires).
Funding
Open Access funding enabled and organized by Projekt DEAL.
SB is funded by the Baden-Württemberg Foundation (program “Verantwortliche Künstliche Intelligenz”) as part of the project AITE (Artificial Intelligence, Trustworthiness and Explainability). SB is also supported by the Deutsche Forschungsgemeinschaft (BE5601/4–1; Cluster of Excellence ‘Machine Learning—New Perspectives for Science’, EXC 2064, project number 390727645).
Ethics declarations
Conflict of interest
SB has no relevant financial or non-financial interests to disclose.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.