A Software Agent Illustrating Some Features of an Illusionist Account of Consciousness
By Luke Muehlhauser
Editor’s note: This article was published under our former name, Open Philanthropy. Some content may be outdated.
Updated: November 2017
One common critique of functionalist and illusionist theories of consciousness[1: For a quick introduction to some functionalist theories of consciousness, see Weisberg (2014), chs. 6-8. On illusionist theories of consciousness in particular, see Appendix F of my earlier report on consciousness and moral patienthood, and also Frankish (2016a).] is that, while some of them may be “on the right track,” they are not elaborated in enough detail to provide compelling accounts of the key explananda of human consciousness,[2: I make this critique of functionalist theories of consciousness in Appendix B of my report on consciousness and moral patienthood. Below I quote some additional examples of (roughly) this objection being made. During a February 2017 “Ask Me Anything” session on Reddit.com, David Chalmers …] such as the details of our phenomenal judgments, the properties of our sensory qualia, and the apparent unity of conscious experience.[3: On phenomenal judgments, see e.g. Chalmers (1996), pp. 184-186 and 288-292; Molyneux (2012); Graziano (2016). On the properties of our sensory qualia, see e.g. Clark (1993); O’Regan (2011). On the apparent unity of conscious experience, see e.g. Bayne (2010); Bennett & Hill …]
In this report, I briefly describe a preliminary attempt to build a software agent which critics might think is at least somewhat responsive to this critique.[4: Other motivations for this project included (1) the intuition that to understand something better, it is often helpful to try to build it, and (2) a desire to test the intuition that many theories of consciousness seem as though they’d be satisfied by a relatively simple computer program. On (1): …] This software agent, written by Buck Shlegeris,[5: Buck Shlegeris wrote the software agent, and the appendix containing notes for programmers who might want to review the agent’s code. I (Luke) suggested the specific theories to attempt implementing, and wrote the rest of the report. The main body of the report is written in my voice because it …] aims to instantiate some cognitive processes that have been suggested, by David Chalmers and others, as potentially playing roles in an illusionist account of consciousness. In doing so, the agent also seems to exhibit simplified versions of some explananda of human consciousness. In particular, the agent judges some aspects of its sensory data to be ineffable, judges that it is impossible for an agent to be mistaken about its own experiences,[6: The software agent would better illustrate an illusionist approach if we could say that it is clear that it “mistakenly judges that it is impossible for an agent to be mistaken about its own experiences,” but we decided not to put in the extra work required for the agent to clearly satisfy …] and judges inverted spectra to be possible.
I don’t think this software agent offers a compelling reply to the critique of functionalism and illusionism mentioned above, and I don’t think it is “close” to being a moral patient (given my moral intuitions). However, I speculate that the agent could be extended with additional processes and architectural details that would result in a succession of software agents that exhibit the explananda of human consciousness with increasing thoroughness and precision.[7: In this sense, this project is similar in motivation to many other “machine consciousness” research projects (Cold Spring Harbor Laboratory 2001; Gamez 2008; Reggia 2013; Aleksander 2017). Arguably, the major distinguishing characteristic of the present project is merely its particular …] Perhaps after substantial elaboration, it would become difficult for consciousness researchers to describe features of human consciousness which are not exhibited (at least in simplified form) by the software agent, leading to some doubt about whether there is anything more to human consciousness than what is exhibited by the software agent (regardless of how different the human brain and the software agent are at the “implementation level,” e.g. whether a certain high-level cognitive function is implemented using a neural network vs. more traditional programming methods).
However, I have also learned from this project that this line of work is likely to require more effort and investment (and thus is probably lower in expected return on investment) than I had initially hoped, for reasons I explain below.
How the agent works
The explanation below is very succinct and may be difficult to follow, especially for those not already familiar with the works cited below and in the footnotes. Those interested in the details of how the agent works are encouraged to consult the source code.
The agent is implemented as a Python program that can process two types of text commands: either an instruction that the agent has “experienced” a color, or a question for the agent to respond to.
Each color is identified by its number, 0-255, such that (say) ‘20’ corresponds to my quale of ‘red,’ ‘21’ corresponds to my quale of something very close to but not quite ‘red,’ and so on.[8: This number is analogous to the gensym name in Drescher’s “qualia as gensyms” account (Drescher 2006, ch. 2).] Upon being “experienced,” each color is stored in the agent’s memory in the order it was experienced (color1, color2, etc.).
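The command interface and color memory described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the agent’s actual code (class and method names here are hypothetical); the key point is that the agent stores each color’s absolute value in order, even though (as described below) its reasoning system will not have access to those values.

```python
# Hypothetical sketch of the agent's "experience a color" command:
# each color is an integer 0-255, stored in the order experienced.

class ColorMemory:
    def __init__(self):
        self._colors = []  # absolute values; hidden from the reasoning system

    def experience(self, value):
        # Record one "experienced" color.
        assert 0 <= value <= 255
        self._colors.append(value)

    def nth(self, n):
        # 1-indexed, so nth(3) is "the 3rd color you saw."
        return self._colors[n - 1]

mem = ColorMemory()
for v in (20, 21, 200, 20):
    mem.experience(v)
print(mem.nth(1))  # -> 20
```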
To respond flexibly to questions, the agent makes use of the Z3 theorem prover. Upon receiving a question, the agent passes all of its axioms (representing its knowledge) to Z3, which serves as its general reasoning system. Z3 then returns a “judgment” in response to the query. These “phenomenal judgments” are meant to instantiate, in simplified form, some (but far from all) explananda of human consciousness.
First, consider judgments about colors — a familiar kind of phenomenal judgment in humans. The software agent also makes judgments about colors. Specifically, the agent judges that each color it has experienced has some absolute ‘value’ (red experiences are intrinsically red and not, say, blue), but (like a human) it doesn’t know how to say what that value is, other than to (e.g.) say whether a color is more similar to one color or another (e.g. red is more similar to orange than it is to blue). This is because the agent’s reasoning system doesn’t have access to the absolute values (0-255) of the colors it has seen (even though they are stored in memory), and it also doesn’t “know” anything about how its reasoning system works or why it doesn’t have access to that information. Instead, it only has access to information about the magnitude of the differences between the colors it has seen. Thus, when asked “Is the 1st color you saw the same as the 6th color you saw?” the agent will reply “yes” if the difference is 0, and otherwise it will reply “no.”[9: Technically, the agent’s “yes” reply is “necessarily true,” and its “no” reply is “necessarily false.” The other possible replies from Z3 are equivalent to “Both that statement and its negation are possible” (which I translate as “I don’t know”) and “The axioms I was …] And when asked “Is the 1st color you saw more similar to the 2nd color you saw, or the 3rd color you saw?” the agent is again able to reply correctly. But when asked “Is the 4th color you saw ‘20’?” it will respond “I don’t know,” because the reasoning system doesn’t have access to that information.
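The information barrier just described can be illustrated in plain Python (the actual agent encodes this as first order logic axioms passed to Z3; the function names below are hypothetical). The point of the sketch is that only pairwise differences cross the boundary into the reasoning system, so same/similar questions are answerable while absolute-value questions are not.

```python
# Illustrative sketch: the reasoning system receives only the pairwise
# differences between experienced colors, never their absolute 0-255 values.

def make_reasoner(colors):
    # Precompute the differences; the absolute values stay behind this boundary.
    diffs = {(i, j): colors[i] - colors[j]
             for i in range(len(colors)) for j in range(len(colors))}

    def same(i, j):
        # "Is the ith color you saw the same as the jth?" (1-indexed)
        return "yes" if diffs[(i - 1, j - 1)] == 0 else "no"

    def more_similar(i, j, k):
        # "Is color i more similar to color j, or to color k?"
        dj = abs(diffs[(i - 1, j - 1)])
        dk = abs(diffs[(i - 1, k - 1)])
        return j if dj <= dk else k

    def absolute_value(i):
        # "Is the ith color you saw '20'?" -- the information isn't available.
        return "I don't know"

    return same, more_similar, absolute_value

same, more_similar, absolute_value = make_reasoner([20, 25, 200, 20])
print(same(1, 4))             # -> yes (difference is 0)
print(more_similar(1, 2, 3))  # -> 2 (20 is closer to 25 than to 200)
print(absolute_value(4))      # -> I don't know
```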
This is somewhat analogous to Chalmers’ suggestion that ineffability is an inevitable consequence of information loss during cognitive processing, and of our lack of direct cognitive access to the facts about that process of information loss.[10: Chalmers (1990): Very briefly, here is what I believe to be the correct account of why we think we are conscious, and why it seems like a mystery. The basic notion is that of pattern processing. This is one of the things that the brain does best. It can take raw physical data, usually from the …]
This agent design naturally leads to another phenomenal judgment observed in humans, namely the intuitive possibility of an inverted spectrum, e.g. a situation “in which strawberries and ripe tomatoes produce visual experiences of the sort that are actually produced by grass and cucumbers, grass and cucumbers produce experiences of the sort that are actually produced by strawberries and ripe tomatoes, and so on.” For our purposes, we imagine that the agent has spoken to other agents, and thus knows that other agents also talk about having color experiences, knows that they seem to believe the same things about how e.g. red is more similar to orange than to blue, and knows that they also don’t seem to have access to information about the ‘absolute value’ of their color experiences. In that situation, the agent concludes that inverted (or rotated) spectra are possible.[11: To elicit this judgment, we can ask the agent a question of the form “For all 2 agents and one hue, is it true that the experience of agent 1 and that hue is the same as the experience of agent 2 and that hue?” The reasoning system replies: “Both that and its negation are possible.” For …]
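Why the inverted spectrum is consistent with everything such agents can say to each other can be shown with a toy calculation (again a plain-Python illustration, not the agent’s Z3 encoding): inverting every absolute color value preserves every difference magnitude, so no similarity question can distinguish an agent from its inverted counterpart.

```python
# Toy illustration of the inverted-spectrum intuition: flipping every color
# value (v -> 255 - v) leaves all difference magnitudes unchanged, so the
# answers to every "same?"/"more similar?" question are identical.

def similarity_answers(colors):
    # All the information the reasoning system ever has: |difference| for
    # every ordered pair of experienced colors.
    n = len(colors)
    return [abs(colors[i] - colors[j]) for i in range(n) for j in range(n)]

original = [20, 25, 200, 20]
inverted = [255 - v for v in original]  # 235, 230, 55, 235

print(similarity_answers(original) == similarity_answers(inverted))  # -> True
```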
Finally, another phenomenal judgment familiar to humans is the judgment that while one can be mistaken about the world, one cannot be mistaken about what one has experienced. In the software agent, this same judgment is produced via a mechanism suggested by Kammerer (2016), which Frankish (2016b) summarized this way:
[According to Kammerer’s theory,] introspection is informed by an innate and modular theory of mind and epistemology, which states that (a) we acquire perceptual information via mental states — experiences — whose properties determine how the world appears to us, and (b) experiences can be fallacious, a fallacious experience of A being one in which we are mentally affected in the same way as when we have a veridical experience of A, except that A is not present.
Given this theory, Kammerer notes, it is incoherent to suppose that we could have a fallacious experience [i.e. an illusory experience] of an experience, E. For that would involve being mentally affected in the same way as when we have a veridical experience of E, without E being present. But when we are having a veridical experience of E, we are having E (otherwise the experience wouldn’t be veridical). So, if we are mentally affected in the same way as when we are having a veridical experience of E, then we are having E. So E is both present and not present, which is contradictory…
For details on how this mechanism is implemented in the software agent, see the code.
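The contradiction Kammerer describes can be condensed into a few lines of propositional bookkeeping. This is our own reconstruction for illustration, not Kammerer’s formalism or the agent’s actual Z3 axioms; the function name is hypothetical.

```python
# Minimal sketch of Kammerer's argument: suppose a "fallacious experience of
# an experience E" and check whether the supposition is coherent.

def fallacious_experience_of_E_is_coherent():
    # By definition of a fallacious experience: we are mentally affected as
    # in a veridical experience of E ...
    affected_as_in_veridical_E = True
    # ... while E itself is not present.
    E_present_by_supposition = False
    # But per the innate theory: being affected as in a veridical experience
    # of E just is having E, so E is present.
    E_present_derived = affected_as_in_veridical_E
    # The supposition is coherent only if these two agree; E cannot be both
    # present and not present.
    return E_present_derived == E_present_by_supposition

print(fallacious_experience_of_E_is_coherent())  # -> False (incoherent)
```

So an agent reasoning with this innate theory judges that it can be mistaken about the world but not about its own experiences, which is the judgment the software agent reproduces.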
Some lessons learned from this project
In my 2017 Report on Consciousness and Moral Patienthood, I listed a more ambitious version of the present project as a project that seemed especially promising (to me) for helping to clarify the likely distribution of phenomenal consciousness (and thus, on many theories, of moral patienthood).[12: In my earlier report, I mentioned the present project in section 5.1: …I’d like to work with a more experienced programmer to sketch a toy program that I think might be conscious if elaborated, coded fully, and run. Then, I’d like to adjust the details of its programming so that it more …] I still think work along the lines begun here could be helpful, but my estimate of the return on investment from such work has decreased, mostly (but not entirely) because my estimate of the cost of doing this kind of work has increased. In particular:
- Implementing the proposed mechanisms (e.g. from Chalmers and Kammerer) requires a large amount of “baggage” in the code (e.g. for using a theorem prover) that doesn’t illuminate anything about consciousness, but is required for the code to be set up so as to implement the proposed mechanism. This “baggage” requires substantial programming work, and also makes it more cumbersome to write (and read) a full explanation of how the program implements the proposed mechanisms.
- Before the project began, I guessed that in perhaps 20% of cases, the exercise of finding a way to program a suggested mechanism would lead to some interesting clarification about how good a proposal the mechanism was, e.g. because the proposed mechanism would turn out to be incoherent in a subtle way, or because we would discover a much simpler mechanism that provided just as good an explanation of the targeted explanandum. However, based on the details of our experience implementing a small number of mechanisms, I’ve lowered my estimate of how often the exercise of finding a way to code a proposed mechanism of consciousness will lead to an interesting clarification.
- A project like this would benefit greatly from career consciousness scholars who are more steeped in the literature, the thought experiments, the arguments, the nuances, etc. than either Buck or I are.
- I don’t think a program which implements three (or even five) mechanisms will be enough to learn or demonstrate the main thing I’d hoped to learn/demonstrate, namely that (as I write above) “the agent could be extended with additional processes and architectural details that would result in a succession of software agents that exhibit the explananda of human consciousness with increasing thoroughness and precision [such that] perhaps after substantial elaboration, it would become difficult for consciousness researchers to describe features of human consciousness which are not exhibited (at least in simplified form) by the software agent, leading to some doubt about whether there is anything more to human consciousness than what is exhibited by the software agent…”
- Even if we took the time to implement (say) 10 proposed mechanisms for various features of consciousness, it’s now clear to me that a compelling explanation of those mechanisms (as implemented in the software agent) would be so long that very few people would read it.
For these reasons and more, we don’t intend to pursue this line of work further ourselves. We would, however, be interested to see others make a more serious effort along these lines, and we would consider providing funding for such work if the right sort of team expressed interest.
Appendix: Notes to users of the agent’s code
This appendix is written by Buck Shlegeris, who wrote the code of the software agent, which is available on GitHub.
In this appendix, I explain some of the decisions I made in the course of the project, and explain some of the difficulties we encountered.
I wrote the code in Python because it’s popular, easy to read, and has lots of library support. The main library we use is the Python bindings for Z3, which is a popular theorem prover.
Almost all of the complexity of this implementation is in the first order logic axioms that we pass to Z3. The rest of the code is mostly a very simple object oriented sketch of the architecture of an agent.
Implementing proposed mechanisms of conscious experience in Z3 was difficult. Expressing yourself in first order logic is always clunky, and Z3 often couldn’t prove the theorems we wanted unless we expressed them in very specific ways. I suspect that a programmer with more experience in theorem provers would find this less challenging.
Also, there were many ideas that we wanted to express but which first order logic can’t handle. I’ll mention three examples.
First, it would have been easier to express human-like intuitions about inverted spectra if the theorem prover could reason about communication between agents, e.g. if it could prove something like “No matter what question system A and system B ask each other, they won’t be able to figure out whether their qualia are the same or not.” This can’t be expressed in first order logic, but I believe it can be expressed in modal logic. Perhaps this kind of project would work better in a modal logic theorem prover.
Second, it’s not easy to express the fuzziness of beliefs using first order logic. A lot of our intuitions about consciousness feel fuzzy and unclear. In first order logic (FOL), we’re not able to express the idea that some beliefs are more intuitive than others, or that you believe one thing by default but could be convinced to believe another. For example, I think the typical human experience of the inverted spectrum thought experiment goes like this: you’ve never thought about inverted spectra before, and you’d casually assumed that everyone else sees colors the same way you do; then someone explains the thought experiment, and you realize that your beliefs are actually consistent with it. This kind of default belief, defeasible by explicit argument, cannot be expressed in first order logic.
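The default-then-defeated pattern can at least be simulated procedurally, even though FOL cannot express it. A minimal sketch, with hypothetical names, in the project’s language:

```python
# Sketch of "belief by default, defeasible by explicit argument": the belief
# holds unless some argument known to the agent defeats it.

def believed_by_default(default_belief, defeaters, known_arguments):
    # If any known argument is a defeater, the default belief is overturned.
    if any(arg in defeaters for arg in known_arguments):
        return not default_belief
    return default_belief

defeaters = {"inverted spectrum thought experiment"}

# Before hearing the thought experiment, the default belief stands:
print(believed_by_default(True, defeaters, set()))  # -> True

# After someone explains it, the default is defeated:
print(believed_by_default(True, defeaters,
                          {"inverted spectrum thought experiment"}))  # -> False
```

This is roughly the behavior that default logic formalizes declaratively; the procedural version here just hard-codes the defeat relation.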
Logicians have developed a host of logical systems that try to add the ability to express concepts that humans find intuitively meaningful and that FOL isn’t able to represent. I’m skeptical of using the resulting logical systems as a tool to get closer to human decision-making abilities, because I think that human logical reasoning is a complicated set of potentially flawed heuristics on top of something like probabilistic reasoning, and so I don’t think that trying to extend FOL itself is likely to yield anything that mirrors human reasoning in a particularly deep or trustworthy way. However, it’s plausible that some of these logics might be useful tools for doing the kind of shallow modelling that we attempted in this project. Some plausibly relevant logics are default logic and fuzzy logic, potentially combined into fuzzy default logic.
Third, I couldn’t directly express claims about the deductive processes that an agent uses. For example, Armstrong (1968) proposes a theory about a deductive process that humans might have: namely, that in certain conditions, we reason from “I don’t perceive that X is Y” to “I perceive that X is not Y.” To express this, we might need a logic with features of default logic or modal logic.
In general, Z3 is optimized for projects which require the expression of relatively complicated problems in relatively simple logics, whereas for this project we wanted to express relatively simple problems in relatively complicated logics. Perhaps a theorem prover based on something like graph search over proofs would be a better fit for this type of project.
Sources
| DOCUMENT | SOURCE |
|---|---|
| Aleksander (2017) | Source (archive) |
| Armstrong (1968) | Source (archive) |
| Bayne (2010) | Source (archive) |
| Bennett & Hill (2014) | Source (archive) |
| Bjorner (2017) | Source (archive) |
| Brook & Raymont (2017) | Source |
| Buck Shlegeris | Source (archive) |
| Byrne (2015) | Source |
| Chalmers (1990) | Source (archive) |
| Chalmers (1996) | Source (archive) |
| Chalmers (2017a) | Source (archive) |
| Chalmers (2017b) | Source (archive) |
| Chalmers (2017c) | Source (archive) |
| Clark (1993) | Source (archive) |
| Cold Spring Harbor Laboratory (2001) | Source (archive) |
| Drescher (2006) | Source (archive) |
| Feynman (1988) | Source (archive) |
| Frankish (2016a) | Source (archive) |
| Frankish (2016b) | Source (archive) |
| Gamez (2008) | Source (archive) |
| Graziano (2016) | Source (archive) |
| Herzog et al. (2007) | Source (archive) |
| Kammerer (2016) | Source (archive) |
| Loosemore (2012) | Source (archive) |
| Marinsek & Gazzaniga (2016) | Source (archive) |
| Molyneux (2012) | Source (archive) |
| O’Regan (2011) | Source (archive) |
| Reggia (2013) | Source (archive) |
| Rey (1983) | Source (archive) |
| Rey (1995) | Source (archive) |
| Rey (2016) | Source (archive) |
| Shlegeris (2017) | Source (archive) |
| Tomasik (2014) | Source (archive) |
| Weisberg (2014) | Source (archive) |
| White (1991) | Source (archive) |
Footnotes