Our contention is that the most consequential aesthetic and ideological debates today occur at the level of protocols, upstream of traditional media forms. Those who design the affordances, incentives, and structures of participation shape the conditions under which culture is produced and perceived. Yet this substrate remains largely invisible, uncritiqued and unchallenged. “Protocol Art” proposes a shift toward intervening at this foundational layer by designing new systems, social contracts, and rule sets that encode and execute values. Artists are challenged to engage in protocol design, not simply as commentary but as direct competition. In this light, art is alive and urgent.
Full transcript (generated by Whisper)
That looks about right. Okay. Wonderful. I think we're ready to start then. Thank you very much, Hito, Francis, everyone for having me. It's been a wonderful program. I regret that I couldn't attend yesterday, but I look forward to watching the documentation. With this talk, I'll say with some regret, I kind of want it to turn into a longer essay that isn't written yet. So I'm testing some ideas here, but basically going to use it as an opportunity to introduce some of the work we've done and maybe some provocations. This is the first provocation, which is to say that the most interesting opportunities to intervene in the future are barely visible. The future looks like regular humans operating under a different protocol or a different operating system. That's the initial provocation I'm going to try and qualify. The title of this talk is Protocol Art. This is the term that we've been using for the best part of a decade to try and describe, desperately, what it is we do. And interestingly, it's actually been picked up in a few areas. My definition of protocol art is changing all the time, but I'm borrowing a term here from Ted Chiang, the great science fiction author, to describe it as strange rules upstream of the production of media.
I was pointed to this by Venkatesh Rao, the great writer: Ted Chiang's definition is that science fiction is about strange rules and fantasy is about special people. So the provocation is to suggest that the protocol layer is the level of abstraction at which culture now operates. The person sitting at Google or OpenAI or Meta setting the incentives at the level of protocol, platform, or algorithm determines cultural outcomes downstream. So I'm going to try a little experiment here. This is a bit of a leap of faith. I saw this on Twitter the other day, where a bunch of people were sharing their Instagram recommendations. Fortunately, mine is not that bad, it seems. It seems like I'm a good person. But basically, the suggestion here is that none of this is organic, right? This is an observation. My partner Holly and I lived in the Bay Area for a long time as the Web 2.0 boom was taking place, and we swiftly learned that people you've never heard of, setting incentives and structuring media, were having these incredible downstream aesthetic impacts on culture, setting suggestions to prompt others to produce the media that would most dutifully satisfy the goals of the platform or whatnot.
And this is odd, right? And I think most people intuitively understand this to be true. You intuitively understand that with the feedback mechanisms on a Twitter or an Instagram or a TikTok or whatever platform you use, there's this kind of strange dowsing going on, right? You have an opaque algorithm that has its own incentives, its own goals, and then we are all individuated, throwing stones in the dark to try and navigate our way through the platform, to give it what it wants. So on the left here is a GPT-generated meme suggesting that there's nothing organic to the idea that a YouTube video does better when someone shows their face, or whatever it takes to boost the visibility. So in this instance, there's an invisible protocol developer somewhere setting strange rules that determine, whether explicitly or subconsciously, how work is going to be made in that environment. And of course, there's nothing necessarily new to this, right?
We're within a cultural institution; there are other strange rules and incentives set here in order to encourage the production of work. The provocation I would make is that once we know this, once we know that there is this massive aesthetic and political fallout from the creation of rules, it makes most sense to actually intervene at the protocol level, right? If there are indeed rules being set that have these downstream impacts, it makes sense to get your hands dirty and go through the rather laborious process of tweaking them, which is, yeah, where we are. Oh, yeah. This is fun. This is Jon Postel, who is known as the god of the internet. Regrettably dead now, but a major proponent of TCP/IP and the SMTP protocol. I won't go into it too much, and I certainly wouldn't make any comparisons to the accomplishments of these people, but I think there's a whole history of people who've set conditions that are now understood as perfectly natural, but that were actually engineered and designed. The other provocation is to say that if you take my logic that the people setting the strange rules of the internet do have
this downstream aesthetic impact, it pushes us, I think, to think differently about critique, right? Critique as a shareable piece of media, circulated within these ruled, engineered environments, perhaps only goes so far. There is an argument to suggest that if you are willing to intervene at the protocol level, this is a form of executable critique: critique that can pick up legs and not be ignored, ideally. Not being too utopian about this. All right. So to jump back a little bit, there's some writing on this in a book that we published last year called All Media Is Training Data. And this is a nice way to reset things and talk about our work, if you keep those provocations in mind. We've been working with machine learning for the best part of 10 years, and very early on we started to get very interested in the idea of the training data. The experiments that we did ultimately ended up manifesting in a record and tour
that we released in 2019. And a couple of interesting things happened, right? When we first began training models at home with a consumer GPU, the first question was, okay, what's the best way to do this? And the first question you have to answer when training a model is: what data do you use? And so the incentives reared their ugly head immediately, where you're like, huh, okay, if we were to train a model on, I don't know, name someone famous, that would be a very quick way to gather a lot of attention to this idea of consent or compensation within these new model environments. That seemed cheap to us at the time. So what we instead decided to do was say, okay, if all media is training data moving forward, what would it mean to put together an ensemble of people whose explicit goal was to produce data for our models? Ideally producing models that had a particular bespoke character and a particular objective, right? So we put together a training ensemble.
This is an odd portrait of the different characters we put together. Once a week, a group of people would come to our studio and sing songs in order to train a machine learning system that we would use to make music. By the time we ended up releasing that music and going on tour, we decided to extend the training ceremonies out to the public, and so we would hold these performances. This is from the Volksbühne. Maybe the audio is working? [Audio clip plays: "...the sense will guard us while we sleep. Emergent light, emergent light appears..."] Thank you. So the basic idea there was a couple of things. One, the implicit acknowledgement that when we start talking about machine learning models, rather than the very hackneyed perspective of the machine learning model as a thing in itself, there is actually a third actor involved: these are better understood as collective accomplishments, like incredible human collective accomplishments. That asked some really interesting questions of us. Namely, and this will come up later, that when you're dealing with a machine learning model, the outputs are ultimately the product of all of us and no one in particular, which is a really juicy new challenge.
So I think that's really interesting, and I think that's something that we need to deal with when we're thinking about data rights and so on, which we'll get onto later. This initial exploration of saying, okay, what if we could train our own models on fully consenting groups of people, then moved over to the individual level. My wife Holly is a musician and uses her voice quite a lot. So in 2020, 2021, we did a project called Holly+, based on a similar principle. The idea of the Holly+ model, and ultimately protocol, was to share her voice with everybody on earth, to allow them to permissionlessly use her voice with no questions asked, and to establish a regime of compensation that was opt-in: basically, as a result of existing personality rights, if you wanted to profit from works made with Holly's voice and use Holly's name, you would ask her and co-sell those works together. So the idea here is you could go forward and use Holly's voice in your production or film or whatnot, and if you wanted to collaborate together, you would basically split any profits made from that.
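As an aside, the two-tier logic of that protocol (permissionless use by default, an opt-in profit split only when a work is co-signed and sold under Holly's name) is simple enough to sketch. This is a minimal illustration under my reading of the talk; the 50/50 split and all names are invented for the example, not the actual Holly+ terms:

```python
# Hedged sketch of the Holly+ two-tier logic as described in the talk.
# The 50/50 split and all names here are illustrative assumptions,
# not the actual Holly+ terms.

from dataclasses import dataclass

@dataclass
class Work:
    co_signed: bool   # did the creator ask Holly to co-release the work?
    revenue: float    # profit made from the work

def settle(work: Work, split: float = 0.5) -> dict[str, float]:
    """Return payouts under a permissionless, opt-in-split protocol."""
    if not work.co_signed:
        # Default tier: use of the voice is permissionless, no questions
        # asked, and the creator keeps all proceeds.
        return {"creator": work.revenue, "holly": 0.0}
    # Opt-in tier: a co-signed, co-sold work splits its profits.
    return {"creator": work.revenue * (1 - split), "holly": work.revenue * split}

print(settle(Work(co_signed=False, revenue=100.0)))  # {'creator': 100.0, 'holly': 0.0}
print(settle(Work(co_signed=True, revenue=100.0)))   # {'creator': 50.0, 'holly': 50.0}
```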
This idea caught on. I know that it's been fairly influential in some corners of the AI industry, and it culminated, I guess, in us showing off this idea of identity play: the idea of permissionlessly sharing your identity with others as an interesting new affordance that's native to these models. We went and presented it at TED with the singing model: "And with this microphone, you'll hear a live version of Holly+, developed with Voctro Labs." [Clip plays.] Okay. So the basic idea there, again, is to say that we have an interesting opportunity to reconsider IP, and also to reconsider the individual, when we think about the native affordances of these tools. Namely, this idea of identity play, inviting others to be you in a kind of permissioned way, is a new and interesting kind of interaction. A little later, the next year, we were approached by a bunch of people specifically about Holly+. The now regrettably deceased former CEO of YouTube was like, what are you going to do with this stuff? We moved on really quickly from the voice model to think about web-scale training data.
So most in the room are probably familiar with these politics at this point. There are a lot of people who are very exercised about the idea that billions of pieces of media, trillions of tokens, are being taken from the internet in order to train the models that we all enjoy. In 2022, we decided to put together this website, haveibeentrained.com. It still works if you want to visit it. It basically allows anybody to go and search and see whether they are present within these large training sets. From that, because it turned out to be quite popular, we thought it would be a good idea to implement what we were calling a consent protocol: bringing into the world a machine-readable opt-out. This proved to be quite successful as an experiment. To date, there are at least 2 billion pieces of media registered through the Spawning opt-out. That's actually a low estimate; we can't know the full number, because we've released a lot of open source tools for people to be able to make consent claims over their work. I thought I'd break it down here in kind of an interesting and simple way.
When you think about copyright, broken down in its most simple form: by default, copyright is restrictive. By default, once you put a mark on paper and it's connected to you, nobody can do anything with it, unless you optionally choose to be permissive. What we were advocating for with the opt-out is that, by default, everything is public, and you have the option to restrict the public from working with what you do, which is an inversion of the fundamental principles of copyright. We would argue, and I don't have enough time to go much more into it, that this is a much more native and interesting way to think about AI training data. What's good in a sense, although it does keep me up at night with people getting mad at me online, is that this principle has been adopted and seems to be fairly influential. The UK is currently deliberating implementing this as its official AI policy, much to the chagrin of all the media companies. Creative Commons has also picked up this approach and is going to be doing some work there.
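A minimal sketch of that inversion, plus the registry lookup a trainer might perform before using a URL, could look like the following; the endpoint and JSON shape are hypothetical stand-ins, not Spawning's actual API:

```python
# Sketch of the default inversion described above, plus a registry
# lookup a trainer might perform. The endpoint and JSON shape are
# hypothetical stand-ins, not Spawning's actual API.

import json
import urllib.request

OPT_OUT_REGISTRY = "https://registry.example.org/optouts"  # hypothetical

def copyright_default(url: str, licensed: set[str]) -> bool:
    """Classic copyright: restricted by default, usable only if permitted."""
    return url in licensed

def opt_out_default(url: str, opted_out: set[str]) -> bool:
    """Opt-out consent: public by default, usable unless restricted."""
    return url not in opted_out

def fetch_opt_outs(urls: list[str]) -> set[str]:
    """Ask the (hypothetical) registry which URLs carry opt-out claims."""
    req = urllib.request.Request(
        OPT_OUT_REGISTRY,
        data=json.dumps({"urls": urls}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return set(json.load(resp)["opted_out"])

# A trainer would then filter its crawl list before training:
# usable = [u for u in crawl if opt_out_default(u, fetch_opt_outs(crawl))]
```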
All this to say that we're quite proud of this. A lot of this stuff was very hacked together, but from the simple observation that there are fundamental protocols that can be experimented with, we've actually seen quite a lot of downstream results in places that I never would have expected. So here's another maybe interesting project we did, the year after, 2023, I guess. This project is called Kudurru, which, and I'm probably pronouncing that terribly, is named after the kudurru boundary stones of Mesopotamia. Please don't get mad at me; it may be Babylonia. But kudurru stones are quite interesting if you look them up as objects. They were designed to sit, a little bit like this speaker, around a property. At the top, you would see the provenance of the property, so it would be like a registry that would tell you who owned what. And at the bottom would be a curse, and the curse is what would happen if you were to break the rules of the property. So anyway, what we did was think: okay, on the one hand, we've built this very interesting...
Holly describes it as manners for data, this very nice, reasonable way in which AI data could possibly be managed. That would be the carrot. What would happen if we built a stick? So with Kudurru, what we ended up doing was purchasing, I think it was 2,000 to 3,000 domain names that appear really commonly in the largest data sets, squatting them, and using them ostensibly to produce a listening network. To really simplify this: when a human being goes to view a website, you will go to yourfavoriteartists.com, whatever, if people have websites anymore, and it's quite easy to tell you're a human. If that same IP address were to go to 30 different sites that we owned at the same time, we could reasonably discern that that was an AI scraper, because most humans don't randomly go to 30 sites at a time. What that ostensibly allowed us to do was produce a decentralized listening network where we could then communicate to everybody else in the network that this IP existed, and they could block access. So it's a very adversarial, aggressive way of dealing with this issue, but it proved quite successful.
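A minimal sketch of that detection heuristic, as I understand it from the description (an IP hitting many unrelated honeypot domains inside a short window gets flagged and broadcast to the network), might look like this; the window and threshold values are illustrative guesses, not Kudurru's actual parameters:

```python
# Sketch of the detection heuristic: one IP hitting many unrelated
# honeypot domains inside a short window is very unlikely to be human.
# The window and threshold are illustrative guesses, not Kudurru's
# actual parameters.

import time
from collections import defaultdict

WINDOW_SECONDS = 60
DOMAIN_THRESHOLD = 30   # "most humans don't randomly go to 30 sites at a time"

hits: dict[str, list[tuple[float, str]]] = defaultdict(list)

def record_hit(ip: str, domain: str) -> bool:
    """Record a request; return True if the IP now looks like a scraper."""
    now = time.time()
    # Keep only hits inside the sliding window.
    hits[ip] = [(t, d) for t, d in hits[ip] if now - t < WINDOW_SECONDS]
    hits[ip].append((now, domain))
    distinct_domains = {d for _, d in hits[ip]}
    return len(distinct_domains) >= DOMAIN_THRESHOLD

def broadcast_block(ip: str) -> None:
    """In the real network this would notify every participating site."""
    print(f"blocking {ip} across the listening network")

# Per request, each honeypot site would run:
# if record_hit(ip, domain): broadcast_block(ip)
```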
So this is web traffic for AI image training on the internet, and the two troughs that you see are us turning the Kudurru network on and all AI training globally stopping for two half-hour periods. I have some of this in the book, and there are a lot of fun examples of AI devs freaking out on various Discords, not knowing what's going on. But the case in point here was to present an adversarial scenario. This is the Waluigi, the dark vision of a future, if we don't come to some kind of common civil terms about AI data and people's agency over it. In a free-for-all, which is what many people are advocating for, by default all data is public, but your option at that point is to become adversarial, right? And we're already seeing this. We've seen it in the past couple of years: anyone who believes they have a large trove of data that they do not want to go directly into Google's or OpenAI's models is now blocking off access to that data. The problem is that that also blocks off the public's access to the data, right?
So in that vision of this, Reddit will be fine, the model trainers will be fine, but the average person browsing the internet will now have obstacles in their browsing. Ultimately, in a great irony, in the spirit of free information you're actually balkanizing the internet. Whereas, of course, in contrast, the opt-out that we put forward, which we believe is rather civil and very permissive, would avoid all this. So this, in a sense, is the dark vision of what the internet may become if we don't come to some reasonable terms. Okay, so jumping forward, other experiments we've done in this domain. In 2021, we released a project called Classified. I'll show a little bit of it. And the idea there was to produce a piece of software we called Beacons at the time. The idea of Beacons was to dive into the latent space of models and try to understand what the fundamental characteristics of an embedding or a concept are inside the model. What's really useful in terms of our research is that Holly is more famous than me, so she actually has embeddings in these models to be able to query and understand.
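One way to approximate that kind of probe with open tools is to compare the text embedding of a name against candidate attributes in a CLIP model and see which associations are strongest. This is a reconstruction of the general idea, not the Beacons software itself:

```python
# Reconstruction of the general idea with open tools, not the Beacons
# software itself: compare the text embedding of a name against
# candidate attributes in CLIP and see which associations are strongest.

import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

name = "Holly Herndon"
attributes = ["ginger hair", "blue hair", "a microphone", "a mountain"]

inputs = processor(text=[name] + attributes, return_tensors="pt", padding=True)
with torch.no_grad():
    emb = model.get_text_features(**inputs)
emb = emb / emb.norm(dim=-1, keepdim=True)   # normalize for cosine similarity

# Higher scores suggest attributes the model associates with the name.
for attr, score in zip(attributes, (emb[1:] @ emb[0]).tolist()):
    print(f"{attr}: {score:.3f}")
```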
This is an early image of what the model apparently fundamentally understands Holly to be. Oh, yeah, I should have put up the cool animation while I was doing that. And actually, I really enjoyed the noise images that were just shown; this neuron portrait of Holly from that time complements the noise. What we're doing here is really trying to get a decent understanding of what her concept was inside this model. One, because it's kind of interesting. And two, because at the moment a lot of the conversation is about not wanting to be in models, right? Don't take my data, so on and so forth. And very swiftly, and I totally agree with that conversation, it will likely move to: wait, you think I said that?
This is how I'm represented in models? What is the impression of you inside these models? And what's interesting, given the climate of critique around AI, is that some of these critiques hold water and others are over-embellished. There was this incredible focus on AI getting things right. We heard about deepfakes, we heard about misinformation, so on and so forth; this is going back a few years. But nobody was asking the question, which was one of the principal early questions of the internet: do you get to be who you want to be on this new information substrate? There's a core tension there, right? A core tension between getting something right, determining that somebody or something is objectively true, which is problematic in and of itself, and this idea of saying, well, if this is going to be the new information substrate, and I change my hair or my gender, do I get to do that? Is there any mechanism to express myself within this new medium? Or am I fixed as basically whatever I was tagged the most as? Of course, there are all kinds of other attack vectors to complicate that.
So the Whitney approached us to do something for the biennial, and we started a project that we called xhairymutantx. The idea was to start from that point of saying, okay, we have very little agency in this conversation; how might you try and assert some? Our first idea was, okay, it turns out that poisoning these models is very difficult. Almost all known techniques, including some quite famous ones, fall short. They don't really work in practice. So could we come up with a new mechanism to poison the model that might have a hope in hell of working? The idea we came up with was cliché poisoning. So it turns out that, let's say I'm a guy with a hat and glasses, right? If
I did want to change who I am, the fastest way to convince a classifier, to smuggle new data in and actually poison it, would be to keep my hat and glasses and change everything else, right? So we determined that Holly's embedding was ultimately ginger hair. So we're like, okay, let's make a ridiculous costume, this hairy golem costume here, of her ginger hair; lean into the cliché, and then use that as a mechanism to produce a whole new version of her concept in the model. And that's what we did. We created a model based on an overwhelming amount of data that might ultimately hope to change or alter her embedding. We produced a model trained on various perspectives of this costume and then released a website under the Whitney Museum's domain, which importantly is very trusted, for anybody to be able to generate images using that model. So this is a single image created with this poisoning mechanism. It has the ginger hair in there. It is conceivably Holly, but it is also something else. And so you still get this decentralized determination as to what the concept is, but there's somewhat of a Trojan horse mechanism in there.
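A sketch of the data-assembly step behind that strategy might look like this: hold the anchor attribute constant, vary everything else, and caption every image with the target name so fine-tuning drags the embedding toward the costume. Paths, captions, and counts are invented for illustration:

```python
# Sketch of the "cliché poisoning" data-assembly step: hold the anchor
# attribute constant, vary everything else, and caption every image with
# the target name so that fine-tuning drags the embedding toward the
# costume. Paths, captions, and counts are invented for illustration.

import itertools
import json
import pathlib

TARGET = "Holly Herndon"
settings = ["a forest", "a gallery", "a beach", "a server room"]
angles = ["front view", "profile", "seen from above"]

records = []
for i, (setting, angle) in enumerate(itertools.product(settings, angles)):
    records.append({
        "image": f"costume_shots/{i:04d}.jpg",          # photos of the costume
        "caption": f"{TARGET}, {angle}, in {setting}",  # always the target name
    })

pathlib.Path("poison_set.jsonl").write_text(
    "\n".join(json.dumps(r) for r in records)
)
# Fine-tuning an image model on poison_set.jsonl would, in principle,
# rewrite what the name resolves to, while the cliché attribute keeps
# the new data close enough to the existing concept to be accepted.
```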
Now, time will tell if this approach works; model training and text encoders have changed since then. But ultimately, the goal of this project was to ask that question: do we get to determine who we are on the next information substrate? And I still have not heard a good answer to that question. Okay, so moving forward: last year we had a show at Serpentine. We called it The Call. This is one of the training objects that we produced for the show. The basic idea of the show was to come up with a new kind of AI training protocol, because obviously we've gathered a lot of experience with this; we feel like we're somewhat authorities on the topic. Can we put forward a different way of doing this? Understanding, for those who know and for those who don't, that putting together a model is actually kind of a big process, and there are multiple different stages in putting together a model that allow you to assert some agency, or manipulate or maneuver the model in a way in which you'd like, right?
So you have the data collection process itself, which we've already gone over: this idea of producing training data deliberately, understanding that that data is going to be used to train models. There's the data labeling, which is a kind of semantic process of determining that this data be called something, so that it might be invoked in future through prompts. And then there's the model making and the model interactions. So our basic idea for The Call was to say: what if we wrote a songbook, drawing on folk music traditions from the UK and actually from the American South, where my wife's from? And the goal of this book is that it's a document where, if you were to sing every song in the book, you would produce all the necessary data, all of the different phonemes and sounds and notes, to produce a perfect model of your voice, or a very, very accurate model of your voice. Here's an example of a training song I'll play for a minute. [Audio clip plays.] Anyway. So the book was toured around 15 different volunteer choirs in the UK.
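The coverage idea behind the songbook can be sketched as a simple check: do the lyrics jointly cover the full phoneme inventory? A real pipeline would use a pronunciation dictionary such as CMUdict and a grapheme-to-phoneme model; the toy inventory and mapping below are stand-ins:

```python
# Toy sketch of the songbook's coverage idea: do the lyrics jointly
# cover the full phoneme inventory? A real pipeline would use a
# pronunciation dictionary such as CMUdict and a grapheme-to-phoneme
# model; the inventory and mapping here are tiny stand-ins.

PHONEMES = {"AA", "AE", "IY", "UW", "M", "N", "S", "T", "K", "L"}  # toy set

def phonemes_of(line: str) -> set[str]:
    """Stand-in grapheme-to-phoneme step."""
    mapping = {"a": "AA", "e": "IY", "u": "UW", "m": "M", "n": "N",
               "s": "S", "t": "T", "k": "K", "l": "L"}
    return {mapping[ch] for ch in line.lower() if ch in mapping}

def coverage(songbook: list[str]) -> float:
    """Fraction of the phoneme inventory the songbook covers."""
    covered = set().union(*(phonemes_of(song) for song in songbook))
    return len(covered & PHONEMES) / len(PHONEMES)

print(coverage(["la la la", "seen unseen", "make a mark"]))  # 0.8 on the toy set
```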
We did a whole tour of the place and established a very specific training protocol. So people are standing in a very specific way, being guided to contribute the data, their voices, in very deliberate ways in order to produce the richest data set we possibly could. We have each voice individually mic'd, we have ambisonic room recordings, so on and so forth. We went across the whole country. And the interesting configuration that we decided to put together was a data trust, for all of these different choirs to co-own the data set together. So we established a governance mechanism where we have a data delegate who ostensibly works like a politician for them and negotiates on their behalf. It really is an experiment to try and demonstrate a couple of things. One, that these data trusts are possible. Because unfortunately, for better or worse, I would say for the better, when you're dealing with the sticky question of AI data and value, dealing with individuals is almost useless. Unless you're Taylor Swift.
You really have to think about the size of the data necessary to produce models, but also to produce something valuable. You really do have to think about collective IP, and there have been very few experiments to do that. The second part, of course, is that this idea of a populace co-owning a data set, and by extension a model, together is a pretty interesting prototype for thinking about how the public domain or public services could work in future. It's relatively bloodless to think about this with singing data, but one could by extension think about different kinds of social contracts around health data, or even the content produced in educational institutions, using a very similar model. The third thing we were trying to demonstrate is that it is actually possible to produce a performant AI model with consenting people, and that that model would have its own character. It's certainly not going to be as general as a Suno or a Udio, but we didn't want it to be, right? And so here's an example of links diffusion, the choral AI model we trained on the consenting data of the UK choirs, which I think is quite nice.
[Audio clip plays.] The model can produce infinite new compositions, many of which are rubbish, some of which are absolutely beautiful. What we're trying to communicate with this and the greater exhibition is, one, the idea of reviving a very antiquated idea of the public domain, which I think is incredibly beautiful; it's a great shame that this idea has gone so out of fashion. In fact, there are many people, I'd imagine, under the age of 30 here who would need a reminder about what that is. And also just to revive this idea, which is a time-tested idea, that submission to something greater than the sum of its parts is beautiful in and of itself, right? When we talk about AI data, a lot of people like to get very individualistic very quickly, start talking about money that probably isn't going to come to them, very balkanized, and that's actually not the most beautiful or interesting thing about these models. The interesting thing about these models, and the interesting thing about the conundrum of AI data, is that it is unavoidably a collective question.
Yeah. So, how would one speculate on the idea of a public AI? How would one push that forward? Last year, we decided to take this idea a bit further, releasing, I think, the largest ever data set of public domain images, called PD12M. You can get this yourself at source.plus if you want. These images went through a lot; it was a pain in the ass, honestly, to get it done, but there are 12 million high quality images, and it serves as a really interesting archive, basically, of all public domain images that have ever been posted to the internet, to our knowledge. For what it's worth, in the next month or so we're going to be releasing PD40M, because we found some clever ways to increase the size of this data set for AI training. 40 million images should be sufficient to produce a very, very beautiful AI model that nobody owns. And the good news is, we're training it. We've begun the training of Public Diffusion, which is a fine-tunable foundational image model trained solely on public domain data and, I repeat, owned by nobody. It's a really interesting intervention, I think, into this entire conversation to say: well, what if we just had models that were owned by nobody?
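For anyone who wants to poke at the data, a sketch of pulling PD12M metadata might look like this, assuming the set is mirrored on Hugging Face under a name like Spawning/PD12M with URL and caption fields; verify the actual location and schema via source.plus before relying on it:

```python
# Sketch of pulling PD12M metadata, assuming the set is mirrored on
# Hugging Face under a name like "Spawning/PD12M" with URL and caption
# fields. Verify the actual location and schema via source.plus before
# relying on this.

from datasets import load_dataset

ds = load_dataset("Spawning/PD12M", split="train", streaming=True)

for i, row in enumerate(ds):
    if i >= 3:
        break
    # Field names are assumptions; inspect row.keys() on the real data.
    print(row.get("url"), "|", str(row.get("caption"))[:60])
```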
Being someone that's played around with models for a very long time, I actually resent the fact that I and most people in this room are caught between warring corporations, right? And as a result, we don't get the nice things. We don't get to go and experiment, for example, with fine-tuning models on our work and offering them for money. I don't care; I don't want someone to generate Super Mario on a model that I release, but as a result of models like this not existing, I'm prevented from experimenting. The second part, which I think is really important, is that between these warring corporations there are very few economic experiments happening, and that's a really bad thing, because we really need to be experimenting very quickly with what an art career, with what the business of art, looks like after these models. And the good news with Public Diffusion is it looks really good. So you can see on the left, this is a mountain landscape. On the right, this is Flux Dev, which is one of the most popular models in the world.
It's one of the best models. Now, of course, ours looks really good here because it turns out there are loads of paintings of mountains in the public domain, because the public domain is usually populated by things that are really old. So, to reiterate the point I mentioned earlier: the opportunity with a public AI is that models are, by definition, everybody and no one in particular. That's what they are. They are monumental human accomplishments, where we have scraped the internet, everyone has contributed to it, and your output doesn't really resolve to any one individual. That's really interesting. Models are only as smart as those using them. There's a lot of evidence for this; the principle is called the eisegesis theory of models. If you go to a model and you say, what's the price of Bitcoin going to be in December, it will give you a bad answer. But if you go to a model and say, I have this crazy idea with these crazy reference points, can you help me flesh it out? All of a sudden, the model becomes really smart. It becomes as smart as you are.
Ultimately, I think that's a really, really strong argument for investment in public education, because it suggests that the strongest economies of the future are going to be those with the most educated populace, able to use these models well. And then the third point, which will probably be popular in here, is this idea of models exposing individualist myths. What I love about the AI question, and why I keep returning to these often boring, bureaucratic, unglamorous things, is that you can't very well take well-established left, right, conservative, progressive ideals to these models and come out satisfied. They frustrate you, because they're new. And so my argument here is that this ultimately should be a progressive's dream: the idea that a Dewey-style liberal, open, social education is of benefit in an AI world; the idea that AI models are all of us and no one in particular; and the idea that AI models can't really be accurately tamed by fairy tales we tell ourselves about the individual genius. The AI models will refuse these. That's a great opportunity for intervention. And so one thinks, okay, well, what kind of new social contracts?
This comes up a great deal with Sam Altman and so on: people always talk about social contracts. We need to rethink the social contract, this is going to be such a big deal. Nobody ever proposes one, right? Again, if you take people at their word, and I do actually believe that the AI substrate will be a bigger deal than the internet, I don't have time to go into it but I genuinely believe that, there are all kinds of demands that you can make once you start thinking of these models as public services. Why not think of ways to contribute to the general public with your data in return for better public services? Why not eliminate the sunk costs of pursuing art, the $100,000 of debt if you're lucky, honestly, which might be the ultimate reason why illustrators are the most concerned about these infinite illustration machines? And could you possibly tax AI profits to fund IRL spaces and institutions that will provide meaning? The good news is that this proposal is something I've been working on for some time, and it's actually been picked up by a lot of people.
It's been picked up by a bunch of think tanks, and it's something that's going into the world. I guess to close, I actually mentioned this to Hito earlier, so I thought it would be a shame not to bring it up in the context of a public AI. This is one thing we're working on at the moment. Its code name is Seraphim, but I think I'll change that. Again, going down this rabbit hole of producing a public model, you learn things. You learn that there's a ton of mountains in the public domain. No one needs to put another mountain in the public domain; there are plenty. But you also learn that there are loads of things that this model will not know, because we aren't scraping the entire web. My favorite example that I like to bring up is that there are no corgis in the public domain, for some reason. So we know what a dog is, but we don't know what a corgi is. And it also turns out that in order to saturate a concept in latent space, to give a model a rich understanding of a new concept, you don't actually need that much data.
You need as few as 600 high-quality examples. So if we could somehow coordinate 600 people, or one person with a corgi and a lot of time, to produce 600 images of a corgi, that corgi would forever be enshrined in the public domain, and we would not need a single image more of a corgi, right? So the idea here is that we're producing a coordination agent that can actually prompt humans to produce the data necessary to fill and populate the public domain with these concepts. It's still early, but I love the idea, basically, of a countervailing protocol where people can freely offer their data for everybody else to use, with the model becoming more general and more impactful as it goes. There's a website with very little information on it.
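The coordination logic being described is easy to sketch: track how many public-domain examples exist per concept and issue requests until the roughly 600-example saturation point mentioned above is reached. The counts below are invented for illustration:

```python
# Sketch of the coordination logic: track how many public-domain
# examples exist per concept and request contributions until the
# roughly-600-example saturation point is reached. Counts are invented.

SATURATION = 600   # "as few as 600 high-quality examples"

public_domain_counts = {"mountain": 250_000, "dog": 14_000, "corgi": 0}

def requests_needed(concept: str) -> int:
    return max(0, SATURATION - public_domain_counts.get(concept, 0))

for concept in public_domain_counts:
    n = requests_needed(concept)
    if n:
        print(f"please contribute {n} freely licensed images of: {concept}")
    else:
        print(f"{concept}: saturated, no further images needed")
```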
I hope that the initial provocations were received well and that the work makes some sense in that context. I do need to flesh out this talk to be a bit more illustrative of other works that I admire, maybe, in this tradition. But yeah, it's a great pleasure to be here with you all, and I hope that was fun. Great. I'm sure there will be lots of questions, or remarks or comments. Yes, Antonio and then Constant. Thank you, Hito, and thank you, Mat, for the great talk. I have a question and a curiosity. The curiosity is about the musical score, because I saw that there were some colored, ornament-like shapes, and I was wondering whether that was part of the notation or not, because there's the traditional musical score and then all those elements there. I wanted to know about them. But then, of course, I wanted to know more about the criteria through which you selected the images of PD12M, and now PD40M, because the content there is going to shape a lot the kind of images that may be produced. You said something about the presence of a lot of older images and so on. If you can tell us a bit more about the images, the content of this training set. Absolutely. So, first with the book: it doesn't really show in this picture, but we produced a gold leaf overlay, and that's actually to communicate the solfège.
So, it was written in solfège, which here is mostly associated with the tradition of Sacred Harp singing. So yes, there are the solfège notes for Sacred Harp singers, and the lyrics for people who want to read the lyrics. That's why it looks like that. Okay. To the point about PD12M, it's actually interesting; there are a couple of things that went into it. One, the reason why there are more classic paintings in there is that that's just what's in the public domain, but it's also what's high quality. Now, the question of high quality is really fascinating here. High quality to a machine in this particular case means bigger than 1024 pixels, or in some cases much bigger. High definition. High definition, exactly. But we also used aesthetic ranking systems. This gets really interesting. Most people don't know this, but Midjourney, given the theme of many of the talks earlier in this conference, were one of the pioneers of this idea of aesthetic ranking. I think I was actually one of the first 10 people on the Midjourney Discord, probably along with Simon. And the idea at the time was you would go there, you would generate something, and you would give it a score, and that score then went into future trainings in order to aesthetically bias the models.
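For context, aesthetic rankers in this lineage typically work by regressing a score from a CLIP image embedding with a small learned head (the LAION aesthetic predictor is roughly this shape). Here is a hedged sketch, with an untrained placeholder head, so the scores mean nothing until it is fitted to human ratings:

```python
# Hedged sketch of how aesthetic rankers in this lineage typically work:
# a small learned head regresses a score from a CLIP image embedding
# (the LAION aesthetic predictor is roughly this shape). The head here
# is an untrained placeholder, so scores mean nothing until it is
# fitted to human ratings.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
head = torch.nn.Linear(512, 1)   # would be trained on human ratings

def aesthetic_score(image: Image.Image) -> float:
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        emb = model.get_image_features(**inputs)
        emb = emb / emb.norm(dim=-1, keepdim=True)
        return head(emb).item()

# Dataset curation would then keep only images above some threshold:
# keep = [img for img in images if aesthetic_score(img) > 5.0]
```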
There's actually an incredible developer called Rivers Have Wings, a pseudonymous developer, who also worked on this, and I think JD Pressman also worked on a lot of these aesthetic ranking systems. What they do, in a sense, is give you a rather normie understanding of whether something meets an aesthetic criterion. A classic painting would read as very aesthetic; a picture of a receipt or, you know, this piece of paper would not. But I think there's actually a lot of room for experimentation there. As an art practice, I think there's a lot of room for it, and I think it's very important to practice in and of itself: to say, okay, what would be a lens you could produce, an aesthetic data set you could produce, to produce a certain outcome? Because these things are inherently biased by normie aesthetics. Yeah. But aesthetic rankings can also be extremely problematic if we think about the famous LAION-Aesthetics subset and the criteria with which it was established, which were kind of abhorrent. So, how did you stay away from that kind of, yeah?
Well, it's a long conversation, but the good news is that there's not that much gross stuff in the public domain. But we actually, weirdly, did become rather expert in some of the gross stuff, because through Have I Been Trained we were using the LAION datasets and then ended up building a lot of classifiers, filters basically, to be able to discern whether the content of an image is problematic, let's just say. And actually, one of the reasons, and it's a real shame in some ways, that the LAION datasets are now not available is because they didn't go through that process, which kind of sucks. So we have a lot of filters that we can now lean on in that case. But the cool thing about the aesthetic ranking system is that once the model is trained, you'll be able to, I forget the nomenclature, but it's like you'll be able to write something like AES2 in order to generate the image at the aesthetic level that you want. And I think that there's a lot more work to be done.
And that actually is the kind of work that isn't going to be emphasized by most people developing these models; I wouldn't say Midjourney, actually, because I think they do a pretty good job at this. I think there are questions that artists could ask of models that could make the process a bit weirder. Yeah. Thanks, Mat. So I had to think of organ donation, actually: the law changed in the Netherlands, where organ donation switched to having to opt out by default. I didn't know that. It's amazing. Yeah, so it's kind of interesting. And I was thinking, what would be your arguments against radical Enteignung, expropriation, actually just reclaiming the data, let's say the data that was already ingested, in the sense of the models that have already been made? If the societal impact is so large, why wouldn't we as a society actually appropriate that, in a radical notion that it belongs to everyone? I think what you're doing is a really beautiful thing, but I'm just wondering what the arguments would be against going, let's say, one step further. Yeah, totally. I didn't know that about organ donation.
I don't know if I'd advocate for that, because I'd have to think about it, but that's pretty interesting. I think organ donation is a very important thing. If I infer correctly, and I hope I answer your question, there are two sides to this. One is this idea of reclaiming information once it has been compressed into weights. Stable Diffusion 1.5 exists, and it will exist as Stable Diffusion 1.5 for the rest of time. Our argument always was that we have a window of time, before models get so good, to start playing around with these consent protocols, because inevitably those models will be replaced by better, higher-performing models. In the case of images, I think we've already crossed that point, regrettably, for better or worse. So you kind of have to deal with the unfortunate reality that things that have been ingested are gone, and actually our principle on this, that things were public by default unless you stated otherwise, allows for that. There's a second part, which is, I don't like to try and be a soothsayer because you'll always be embarrassed, but this principle of consent, I think, goes much further than training diffusion models.
I anticipate that where this substrate is going will bleed into daily life. You're already seeing this a little bit with people speaking to their phone with ChatGPT, Sam Altman and Jony Ive teasing wearables, and so on and so forth. I use the hell out of ChatGPT, I pay too much money for it, but it's incredibly useful to me. This idea of having consenting relations, when everybody in this room has a device that is taking my words into your private context, is something that has to be worked out. And I actually think the great opportunity with these machine learning models is that through natural language you are now actually able to write or set new protocols or laws, and you can actually use them to do things like this, in ways that you wouldn't be able to before. A case in point: there was a funny Black Mirror episode, I can't remember the exact conceit of it, but basically someone didn't read the terms and conditions and ended up in a kind of hellscape.
And you're like, okay, well, in a good world now, we would all have the ability to digest the terms and conditions of something and pass or fail them, right? Going forward, we shouldn't ever have to sign up to something that we didn't understand, because we have these models with large context windows, and we have these laws that we can't even understand. To that end, the idea of consent being something that you take with you, that you can express in far more detail than a simple yes or no, seems to me to be very important and fundamental to the challenges of the next 10 years, or however long it's going to be. That's my easiest answer as to why I'm not a free-info utopian on this: I do think you need necessary protections in place, particularly in law, and I think the opt-out gives you enough flexibility to allow AI training to take place, while also not going too heavy-handed making decisions about an information substrate we haven't even seen mature yet.
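One could imagine that portable consent as a small structured policy an agent applies to a service's terms on your behalf; the schema below is invented purely for illustration:

```python
# Invented illustration of "consent you take with you": a structured
# personal policy, richer than a yes/no opt-out, that an agent could
# apply to a service's terms on your behalf. The schema is made up.

my_consent_policy = {
    "train_generative_models": "allow",
    "train_on_voice": "ask",        # requires an explicit request to me
    "resell_personal_data": "deny",
}

def evaluate(terms: dict[str, bool]) -> str:
    """Pass/fail a service's terms against the personal policy."""
    for clause, required in terms.items():
        rule = my_consent_policy.get(clause, "ask")  # unknown clauses need review
        if required and rule == "deny":
            return f"fail: terms require '{clause}'"
        if required and rule == "ask":
            return f"needs review: '{clause}'"
    return "pass"

print(evaluate({"resell_personal_data": True}))     # fail
print(evaluate({"train_generative_models": True}))  # pass
```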
I hope that makes some sense. I hope that was a decent answer. Is your argument then, almost like in the development of medication, that we would have to leave it within private hands for the substrate to develop further? Because with this expropriation, let's say, of data: if you would conceive that data to have been stolen, and if you would, as a society, believe that the benefit to society would be so large, you might say we should consider expropriating that from private hands. What would then be the arguments not to do that, and to stay within the realm of this, let's say, consent condition? I'm really wondering, like, is it development or not? Sorry, Hito. Oh, yeah, please. No, I think, I mean, there's a protocol already for doing what you suggest, which is taxation. I think the basically expropriated model should be taxed really hard to socialize some of the profit. One of the arguments not to underestimate is that if we were to try to expropriate a model, let's say, we don't know what it's made on, we don't know what kind of parameters are baked into it.
I'm not really sure I want to own it, right? So in that sense, making a model from scratch, and having more control over parameters, biases, the architecture, we were discussing the architecture of the latent space and the homophily baked into it, et cetera, is a good idea. It's not going to be perfect, but at least it's a step further, pushing the whole train of thought one step further, and then we can open up the possibilities from there. Sorry, there was a question. Yeah. Thank you. Thank you for the talk and your answers. The first thing I would like to say is that maybe some of you already heard that the French state came up with a solution. Maybe I have to get closer. Sorry. It's a great effect. So, I'm a musician, and I'm always wondering how we can move on. And the French state came up with this solution about streaming. They said, of all the money that has been generated in France with streaming, they take 3% and put it back into social funds for musicians. And I think this could be a role model for other fields, like ChatGPT, but the thing I would like to ask you is the following.
I would like to come up with two scenarios, and I'd be really curious which one you think is more likely. Scenario one: given the fact that companies like the New York Times are pushing lawsuits against companies like OpenAI, do you think, as a consequence, there will be some sort of transparency concerning the AI and the data that's been used for the training of their models, and some sort of compensation? And scenario two: via great ideas like Spawning and the opt-out, do you think there's going to be a critical mass of people opting out of all those services, big enough to push companies like OpenAI to come to the people gathered in Spawning and say, hey, we got the message, we need to find a solution? Great questions. To address the first point, about the French levy: I think that's amazing. I've actually done some policy work on this.
My recommendation to the UK government was a variation on this, not knowing that it was already a thing: basically, implement a levy to then fund arts education, public spaces, venues, things that people actually enjoy doing, that are actually not in peril from machine learning at all, and that are actually the reason why people fall in love with art in the first place. The second point, about the New York Times and OpenAI: I guess I'm being recorded, so I have a private opinion on this, but I think it's leverage for them to get a big check from OpenAI. That's not taking a position on who's right or wrong in this. But ultimately, there are only so few players that have such large, valuable corpora of data that I think a lawsuit in that circumstance is basically opening negotiations. And we've seen that test prove out with, for example, Suno and the large record labels. They brought suits, there was a big fuss, and now they're negotiating to work together. This is kind of how Spotify worked, right? All the big music labels own a big part of Spotify.
So I think it will most likely go in that direction. The third point, which I think is perhaps the most controversial, but I think is pragmatic and actually quite elegant, is that in my mind, the implementation of the opt-out serves two purposes. One, it serves the purpose to allow for people who really doggedly don't want to participate to not participate. I think everyone should have that right. Number two, I know just as well as everybody else does, that it will be such a small minority of people that it will not impact the development of models at all. So the scenario that a large group of the public are going to come forward and remove enough data to slow down AI development is a fantasy, it will never work. Part of the reason is because most of the data that will be ingested into these models is orphaned. We don't know who made it. Those people probably don't even know. You know, a lot of the data, I mean, you're talking a vast amount of data here. And so ultimately, the most pragmatic solution here is to say, use the leverage of this current moment to push for the opt-out.
Understanding that it's not going to impede ChatGPT necessarily, what it does do is carve out a fundamental human right to be able to say: wait, wait, wait, this is mine. That's exactly it. And that's going to become important. Personally, Holly and I do not care; use our artwork for whatever. I don't see a problem with it. But the right to not participate, I think, is going to be incredibly crucial, because we have no idea where this goes, and I think it's going to get weird. I think it's going to get really strange. I try not to be a soothsayer, as I say, but all signs point to this getting really unusual. And I think having that fundamental right with regard to training data is important, because we know about the weakness of copyright. I mean, the EU does have some laws in place, for example with personally identifying information, the GDPR stuff, that are already really good. But I just think that enshrining that option gives you maximum optionality, long story short.
But I don't think it's going to look like a public uprising and taking over. The best way to do that, I would argue, is closer to this public AI proposal, where you say: actually, wait a second, within the European Union, for example, we've got a lot of people, a lot of information, and they're very used to socialized services. Why can't you produce a really good model? It's just math, genuinely. The people working at these frontier model companies are really brilliant, but how many Americans are in there? I always bring up this point, and Hito likes to bring it up too: image AI, most of it was born in Munich, right? Stable Diffusion comes from latent diffusion, which was a research project born in that city. If you look at the major AI companies in the US: Swiss, German, British, Canadian, Russian... there's a lot of talent here. The regrettable thing is that the talent moves, because you get paid a lot of money to leave. So I think there's a lot of room to think about quite ambitious political configurations around this.
If you just take people at their word and say, okay, maybe the economy is going to change radically, cool. There's kind of a first-to-the-post opportunity here to say: okay, cool, if we're proposing new social contracts, why not get in early and make one that doesn't sound ridiculous, one that is actually feasible and practicable? And who am I to propose it? But that's really what I think, and where I hope the conversation goes. We actually do have a lot of agency here, because at least from my limited, or well, let's say extensive, interactions with this stuff: nobody has a clue. There's no master plan. There's no dark conspiracy, I genuinely don't think so. People just don't know. They're throwing a lot of money and compute at something, and no one has a clue where this goes. So in my mind, if your risk tolerance is relatively high, which mine is, throw out some ideas. It's all to play for, as they say. Yeah. Okay. Hi, Mat. Thanks for the great talk.
And my question is actually also about the opt-out, because I have a little problem with it. How do you administer it? Because in the end, anybody can come along and say, well, that's mine; only you would know. So as much as I would love the axis of evil of copyright hoarders to fall and be replaced by your system, it sounds like you are trying to replace it with another centralized system: the authority over who has opted out. And is Spawning set up the way I would imagine, where you set up Spawning early so you are in place to take the administrative fees, the little processing fees, if some service wants to ask: is this opted out? So you just painted the most bearish case, and I'll answer each of your questions. Sorry. No, no, no, I actually agree. I'm used to it. We set this up as a centralized entity because no one else wanted to do it. The goal from the beginning was that it, one, would never be something we charged for; all the code is out and available publicly. Number two is that we wanted to federate it.
And so actually there will be some news, which I can't talk about right now, over the next couple of months, where the entire goal of this was to say: it took a small team to demonstrate. Going back to that protocol point earlier, we showed that this opt-out protocol would work. Stable Diffusion used it. It's integrated into Hugging Face, so on and so forth. We've demonstrated that there are ways to do it. Now we don't want to be in charge of it, for exactly that reason. The other thing that's important is that nobody would sign up to it otherwise. In discussions that we've had with, let's say, the big players, the fact that it is centralized, one point of vulnerability, dictates that they would never sign up to it. So that then presents a policy problem. So whether you call it market forces or ethics, everything points to this being federated and not owned by anybody. But I agree, you have to take my word for it in the short term. Though long term, to be honest, even if I were to try and be nefarious about it, it wouldn't work. You know, nobody would...
It's relatively easy to put together. The difficult thing, ultimately, is to put together the example that demonstrates such an idea could work. Because now, when someone from Google lobbies the EU and says an opt-out is totally unworkable, there is a legislator there who can say: no, it's not. Look, this worked pretty well, with some kinks that need to be ironed out. So yeah, thank you, Mario. I appreciate it. We no doubt agree on everything. Good answer. Thanks. I just have a small comment. I think you should balkanize more. I should balkanize more? Yeah. Because, I mean, this is a discussion we also had previously. What you're suggesting is basically an old Yugoslav model of worker self-management: collectively owned models, collectively governed models, et cetera. And I think you should just push this strand of balkanization forward, radically and without any ambiguity. Period. I think I might agree. Okay. Good. We agree on some things. So, yeah, and also a comment I think is super useful: all the sort of infrastructural experiments you're suggesting. I've followed these experiments for a long time, and I see a super interesting way of, you know, learning by going forward.
And I know the opt-out model was not a solution; I think it was never intended to be a solution, but it opened the door to keep experimenting. And I guess Public Diffusion is not going to be perfect. It's going to open up new problems, et cetera. But there will be another possibility to make another step forward, whether that's expropriation or whatever. But I think it enables, first of all, the possibility to say no, and also the option of pushing further. That's my comment on that. Thank you. Maybe, Boris, you have another one. And then I think we should, yeah. Would you like to wrap it up, or shall we respond directly? Okay, then let's progress in that direction. First, thanks. It was an amazing talk. Oh, thank you. Thank you. I think we should applaud one more time.