We've reached a point where machines can emulate humans at a very high level (ChatGPT, LaMDA, DALL-E). Computers can now generate art, music and language. But is there meaning in the output of a machine? Physics tells us that the world is purely material, i.e. the human experience is simply a computation that can be replicated by a machine. What are the implications of a purely physical world? With the acceleration of AI, will we reach a point where behavior alone implies consciousness, or is there something distinct about human experience, and as a result, human art? This is an opinion editorial piece by Brandon Tory Thorpe, Artist and former Staff Engineer at Google AI. It is intended for audiences interested in Artificial Intelligence and The Arts, with no prerequisite engineering experience required for readability.
Tonight I buzzed my lips, and my five-month-old daughter buzzed back. We held a conversation this way for fifteen minutes. With no clue what the other meant, we only knew what we felt...in sync, in agreement: the fundamental unit of understanding. I know that through the years that follow we'll build layers of communication on top of the protocol we defined today. We'll talk politics and travel. We'll talk religion and technology. But what it all compiles down to are two humans buzzing their lips at each other because of the feeling of mutual understanding, the empathy and beauty that it creates within us. Our feelings are not a side effect of our physical reality. Our feelings are our reality. All other communication, including the experience of life itself, is a way to encode these feelings at a higher level of abstraction.
There's a conversation in computer science today about what it means to be intelligent, and transitively, what it means to have consciousness. The majority of physicists believe that all things are physical, i.e. every phenomenon in the universe is a composition of particles governed by known laws. This position, typically called mechanics (both Quantum and Newtonian), is what got us planes, self-driving cars and Wi-Fi, and what got us to the Moon. Consequently, this also means that with enough computational power, every event in the universe can be computed in advance. This is where, if you don't pay close attention to the coded language, you can miss the implication. If the universe is completely deterministic, then there is no free will.
By free will I do not mean omnipotence, where one can do anything; rather, I mean that within any system, an entity can make a conscious decision which suspends the outcome of the system such that full knowledge of the outcome cannot be precomputed. With enough computational power we will see a day when a computer will predict everything that you will do and say, before you do or say it. Our science tells us that this is the truth, and yet our intuition makes humans cling to the idea that our thoughts are our own. Our decisions are our own. Our experiences are our own. Why do we cling to our intuition in instances when science tells us otherwise? Our intuition is wrong quite often, and yet our intuition is what made us the dominant species on the planet. Is it worth reconciling our objective analysis of the world with our intuition?
Natural Language Understanding (NLU) is a form of Artificial Intelligence that can do things like answer hard questions really fast by learning the meaning of human language. Given a question, NLU systems can retrieve the information whose meaning most likely matches the intent of the question. In Machine Learning these tasks are sometimes called "retrieval", or "sequence to sequence" modeling. To do this, the system encodes a phrase as a vector, a sequence of numbers that represents its meaning. Once you have a vector of meaning you can do interesting things like search a space of all other "meaning" vectors to find the nearest neighbors, and that's where the ability to retrieve answers really fast starts to shine. You can also encode and decode meaning into alternate representations such as the dialog used in conversational AI.
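For readers who want to see the nearest-neighbor idea concretely, here is a minimal sketch in Python. It assumes the meaning vectors already exist: the three-number vectors below are hand-written stand-ins rather than the output of any real encoder, and the names cosine_similarity, nearest_neighbors and MEANING_INDEX are hypothetical, chosen only for this illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=2):
    """Return the k texts whose vectors point most nearly the same way as the query."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]


# Hypothetical hand-written "meaning" vectors. A real system would learn
# vectors with hundreds of dimensions from data.
MEANING_INDEX = {
    "The ocean covers most of the planet.": [0.9, 0.1, 0.0],
    "Rip currents pull swimmers away from shore.": [0.7, 0.6, 0.1],
    "Sourdough needs a long, slow rise.": [0.0, 0.1, 0.9],
}

# Stand-in for an encoded question such as "Why is the sea dangerous?"
query_vector = [0.8, 0.5, 0.0]

print(nearest_neighbors(query_vector, MEANING_INDEX, k=2))
```

In this toy space the sentence about rip currents ranks first and the sentence about bread is never retrieved. Production systems typically replace the brute-force loop with an approximate index so that the search stays fast over millions of vectors.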
With unlimited compute, this architecture yields models that can empower machines to read all the books on earth and answer any question you can think of. This works really well for information grounded in objective reality. Where it breaks down is in subjective experience. Where art succeeds, physics, math and engineering all fail at encoding and transmitting subjective experiences.
In 2020 I lost a friend to the Pacific Ocean. A strong Brooklyn-born fighter in peak athletic shape, he gave up his own life to save his son, who was caught in a rip current. I say this to emphasize that the meaning of the word ocean may be different for me than it is for you. Aggregating statistical patterns produces an objective representation of the ocean, but cannot contextualize my individual fear of drowning, and hence the feeling that the ocean gives me.
It's a paradox that brings us back to the question of free will. It's not that subjective experience is too big to fit in memory, or too hard to encode—it's that it is by nature private. It is not possible for you to have the same subjective experience as me, without being me. The fact that my decisions and experiences are private is virtually synonymous with my being free. This is because if my inner thoughts and experiences are public, then there exists some objective description of my behavior which can be used to, for example, put me in jail before I've committed a crime.
The beauty of our human languages, of our art, of our music, lies in the ability to imperfectly communicate a private subjective experience from one human to another. The artist is the courier of a message. They cannot ensure that everyone will understand, but those that do, understand deeply.
We've taken it as a given that objectivity is the base of the meaning of the world, and that our emotions are built on top of those objective truths (mechanics) as a somewhat inconvenient side effect of being human. But what if our emotions are fundamental, and the entire point of objective reality is only to ground language, i.e. to provide an environment from which to pull context when communicating sequences of emotions? A way to give the words meaning, such that we can communicate with each other more deeply.
In other words, my feelings about water will never mean the same thing to you unless you've experienced the ocean. There are messages in this world for us to receive that can only be understood through experiences. Language alone is meaningless.
To experience anything requires consciousness. The world was created as a way for us to communicate meaning between conscious entities. Where math is a lossless encoding that communicates objective reality, language is a lossy encoding that attempts to vectorize subjective experience and ground it within the subjective experience of someone else, e.g. by metaphor, or storytelling.
If AI is unable to have its own experiences it can never ground meaning subjectively, meaning it will always represent inputs using a learned space that's essentially based on the aggregate meaning across all of its training data. If AI is able to have its own experiences, then by metaphor it can build representations based on its own learned experiences, but those experiences will be limited to that which a machine can have.
Recommendations from unconscious AI are averages, often lacking the polarity that makes art piercing. Recommendations from conscious AI would in theory be based on the AI's personal preferences, i.e. on machine experience. The best recommender system would be a machine which experiences the world in the same way that we do, has its own private experience, and grounds meaning subjectively. When framed this way, the best recommender system is a human.
The question is this: is language meant to describe the world or is the world meant to describe language?
As discussed, meaning can be vectorized as a representation using an encoder. This is a sequence of numbers that can hold the meaning of a phrase, a paragraph, an entire novel, or a song. Theoretically, computing distances between these vectors should give the most objective analysis of the similarity of meaning. This is how recommender systems like Google Search or Spotify's music recommendations work. We represent art in the form of a number and recommend similar ones. But the numbers do lie. They lie because art is meant to communicate a subjective experience. Using metrics to quantify meaning only yields the aggregate meaning...not the personal, painful, beautiful, strange meaning that only you understand, because you've been through the same experience as the artist. Worse yet, this creates a feedback loop, where successful content with the same "on average" meaning is constantly elevated in the ranking algorithm, which incentivizes artists to play the average in order to remain economically viable. As recommender systems become more prominent, all art may converge to meaning the same thing: pay me.
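To make the "aggregate meaning" problem tangible, here is a small Python sketch. Everything in it is an assumption made for illustration: the two-number vectors, the two listeners and the three song titles are invented, and real recommender systems are far more elaborate. The sketch only shows the arithmetic of averaging, where two opposite personal experiences cancel out and the item nearest the average is the one that moves nobody.

```python
def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_item(target, catalog):
    """Return the title whose vector lies nearest to the target vector."""
    return min(catalog, key=lambda title: squared_distance(target, catalog[title]))


# Hypothetical 2-number "meaning" space for the word ocean:
# first number: calm (+) versus dread (-); second: joy (+) versus grief (-).
LISTENERS = {
    "grew_up_surfing": [0.9, 0.8],
    "lost_a_friend_to_the_sea": [-0.9, -0.8],
}

# Invented catalog of songs, encoded in the same hypothetical space.
CATALOG = {
    "Sunlit Swell": [0.8, 0.9],
    "Undertow Elegy": [-0.8, -0.9],
    "Generic Beach Playlist": [0.0, 0.1],
}

# The aggregate listener: opposite experiences average out to nearly nothing.
aggregate = mean_vector(list(LISTENERS.values()))

print("aggregate pick:", closest_item(aggregate, CATALOG))
for name, vector in LISTENERS.items():
    print(name, "pick:", closest_item(vector, CATALOG))
```

Each listener, taken individually, is matched with a song that speaks to their own experience. The averaged listener is matched with the most neutral item in the catalog, which is the pull toward the middle that the feedback loop then reinforces.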
In conclusion, being you is a private experience. This is the beauty of life. This is why, particularly in art, word of mouth is so powerful...in our honest, imperfect attempts at sharing our private firsthand experiences with one another, we gain trust in a way that cannot be replicated in an objective recommender system. As we increase compute power we must remember to value that which makes us human. And with that, think deeply on what the most important areas of work are for humans in the future.