[ci skip] Update draft

Luca Beltrame 2022-11-22 00:12:38 +01:00
parent aa0ecc4104
commit cc5b298c03
Signed by: einar
GPG key ID: 4707F46E9EC72DEC

@ -3,6 +3,7 @@ categories:
- general
- anime
comments: true
toc: true
date: 2022-11-20 22:17:41+01:00
disable_share: true
draft: true
@ -19,9 +20,13 @@ You might have heard it in the past few months: some areas of the Internet are b
In this post, you'll be guided by Yumiko, Satsuki, and Maya: the first two are characters created by someone I know (who wants to remain anonymous), which I then expanded in cooperation with their creator for a certain project from a few years ago; the last is... well, a character with an interesting history, which will be explained later.
{{< multithumb "images/2022/11/yumiko-hi.png" "images/2022/11/satsuki-hi.png" "images/2022/11/maya-hi.png" >}}
But first, an introduction for those unfamiliar with this world is in order.
## What is this AI you speak of?
{{< imgthumb src="images/2022/11/yumiko_question.png" size="500x" caption="Even Yumiko wants to know!" >}}
{{< imgthumb src="images/2022/11/yumiko_question.png" size="600x" caption="Even Yumiko wants to know!" >}}
Unless you were hibernating in Alpha Centauri, waiting for the right moment to start the invasion, you have probably heard about the use of various "machine learning" methods to have a computer program (to put it very simply) "learn" particular features out of a data set (chess games, video games, images, sounds...) and use them to perform various tasks, such as [playing games](https://www.deepmind.com/blog/alphastar-mastering-the-real-time-strategy-game-starcraft-ii), [generating text](https://towardsai.net/p/l/gpt-3-explained-to-a-5-year-old), [tackling complex scientific problems](https://alphafold.ebi.ac.uk/about), and many more.
@ -31,6 +36,39 @@ One important point, which is relevant for the background of all this, is that a
You know, it sounds awfully like the prologue to [an incident involving a former MIT researcher and a jammed printer](https://albertopettarin.it/faif2/faif2.xhtml#x1-40001), but I'll leave that out for now. We'll get back to it later.
### On a robot that can paint, and one event that changed everything
## A robot that can paint
{{< imgthumb src="images/2022/11/maya-confused.png" size="600x" caption="Eeh~ I don't really understand all this stuff..." >}}
In January 2021, OpenAI announced [DALL-E](https://openai.com/blog/dall-e/), a play on Disney's WALL-E and renowned artist Salvador Dalí:
an extension to their GPT-3 system (which generated text) that allowed the generation of images from natural language. This meant that the text "an apple on a wooden table" would produce (more or less) an image of an apple on a wooden table. To prevent any potential liability (they said "misuse" or similar terms, but it was mainly about liability), both sexual and violent images were removed from the (massive) data set used for training. However, despite the "Open" in the name, DALL-E was only available to OpenAI's paying customers (like GPT-3). There was no way to alter, modify, or improve the whole deal unless OpenAI wanted to (in part they did, with [DALL-E 2](https://openai.com/dall-e-2/)).
Similar models were made by other companies, like [Midjourney](https://www.midjourney.com/), but likewise, they kept everything to themselves. Some places started offering generation services (free, or at a price), but it looked like some sort of niche interest. Until summer 2022.
## One event that changed everything
That summer, Patrick Esser (from RunwayML) and Robin Rombach (from the Machine Vision & Learning research group at LMU Munich) released [Stable Diffusion](https://stability.ai/blog/stable-diffusion-announcement), another approach to generating images, trained on about 5 billion reference images. The big deal is that, although with some restrictions, the model was substantially *more open* than the others: it was publicly available, and it allowed modification and reuse. That was when AI generation exploded.
{{< imgthumb src="images/2022/11/satsuki-flame.png" size="600x" caption="Explosions? Mess with Satsuki, and you'll get burned." >}}
Many people, from researchers to particularly smart lay people, started working on improvements to the models, the algorithms, and everything that revolved around Stable Diffusion. Although Stable Diffusion aimed at all kinds of art, specialized models were made, for example, to draw [anime-style art](https://gist.github.com/harubaru/f727cedacae336d1f7877c4bbe2196e1). In addition, [NovelAI](https://novelai.net), a company which provided a service to create stories using GPT-style language models, developed a custom (and high-quality) model to draw anime art. Said model was also [somehow leaked](https://twitter.com/novelaiofficial/status/1578529189741080576), and prompted [further modifications (link in Chinese)](https://www.bilibili.com/read/cv19603218), although of questionable legality.
The advantage of all of this is that a regular user, provided a suitable high-end GPU (NVIDIA or AMD, although Intel's Arc could also prove useful in the future) is available, can generate art. There's a [plethora of software available to do so](https://www.reddit.com/r/StableDiffusion/comments/wx7f50/stable_diffusion_user_interfaces_and_how_to/), [although the most popular one is very ambiguous on licensing](https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/2059). There is still [plenty of movement (warning: some links contained there may be NSFW)](https://rentry.org/sdupdates3) in the field, as well.
The rest of this post, in fact, deals with what I actually did with all this stuff.
## Art-challenged?
Those who are familiar with me or my background know it already: art was *never* my forte. In high school, my lowest grades were in art and related subjects. I was never, ever able to draw more than stick figures. It wasn't a big deal: whatever I lacked in art I made up for in other disciplines (like science).
So what does that have to do with AI art? Let me tell you two anecdotes.
## Ideas, and lack of implementation
As I discussed several times with the anonymous creator of Yumiko and Satsuki, it would've been nice to actually see the characters "spring to life".
But neither of us could do it. As a matter of fact, we did ask (and pay) artists in the past. Despite sometimes troublesome relationships (involving, in one case, a dispute on a payment site), this helped shape the characters and even prompted new ideas for them. One of the issues was that often, for reasons of cost or time, we would cut some of the planned ideas for the images.
## Building upon a memory
{{< imgthumb src="images/2022/11/maya_confused.png" size="500x" caption="I don't really understand all this stuff..." >}}