Avatar Me!

Many of my friends have these cartoons, which they call avatars, gracing their bios on LinkedIn, Facebook, and elsewhere. Naturally, I developed a serious case of “Avatar Envy”. 

Being the intensely curious person that I am, I reached out to them and asked how they did it. The responses were all the same: “Just upload an image of yourself and ask the AI to create a cartoon of the image.” Choosing which image of myself to use wasn’t an issue: I would use the LinkedIn Learning bio image that graces this series.

Tom Green, dressed in a grey jacket and open-neck white shirt, is leaning on a table.
My bio pic.

Still feeling chastened by my experience with ChatGPT, I decided to venture into this new realm using Google’s Gemini AI and Adobe Firefly, which my friends recommended.

Gemini has a very basic interface. However, there is a category called “image.” Clearly, this would be a good starting point, and Gemini even mentions it uses something called Imagen. Since I don’t see anything that requires me to upload an image, I assume the “All Knowing Google” knows who I am. But based on my experience with ChatGPT, I hedge my bets and make sure it uses me, the other Tom Green.

The Gemini prompt reading: “Create a cartoon image of Tom Green LinkedIn Learning author.”
I start with this prompt to see where we go.

The result was quite surprising. Do I really look like I can balance a microphone on one finger? And I wouldn’t be caught dead wearing that T-shirt and shirt combo. Mind you, the hair colouring was appreciated, and the glasses and the rather pronounced chin were somewhat accurate, but adding my name at the bottom and that sticker on my computer really bothered me.

The cartoon of Tom Green featuring his name, two microphones and a computer with a LinkedIn Learning sticker.
The result.

Clearly, I was missing something. That something was a reference image. It turns out that item is hidden in the dropdown menu when you click the + sign. There it is: Upload image. I upload the image, which appears in the chat area, and then add: “Create a cartoon image of the subject of this photo.”

The original photo is above the prompt reading: “Create a cartoon image of the subject of this photo.” The resulting image looks nothing like the author.
When your beloved asks, “Who is that?” it is back to the drawing board.

Close, but no cigar. I needed a second opinion. I show the image to my beloved and her response is: “Who is that?” This prompt needs some work. Back to Gemini and a new prompt: “Create a cartoon using this reference image.”

A used car salesman?

This time I am not breaking out the surrender flag. Obviously, I need to flesh out the prompt. Not only that, I want it in 3D, not 2D. As one of my friends said when I asked how to do it in 3D: “Did you say 3D in the prompt?”

As I am learning, prompts require much more context and detail to achieve accuracy. The more detail, the more accurate the result. Not knowing where to start, I turned to the “Oracle of Google” and asked: “How to create an accurate 3D avatar from a prompt in Gemini using a photo portrait.” After instructing me to upload the photo, the Oracle stated that the second step is to craft a detailed prompt that includes style, pose, and any specific features you want to emphasize. The third step is to make sure you reference the photo in the prompt. The Oracle even suggested a prompt, which I thought I would use: “Create a photorealistic 3D avatar of the person in the reference photo with a slightly smiling expression, facing the camera and in a casual pose.”

As you may notice, I am slightly smiling, facing the camera, and relaxed in the photo. How did the Oracle know?
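For the code-minded, the Oracle's recipe (style, pose, features to emphasize, and an explicit mention of the reference photo) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the AvatarPrompt class and its field names are hypothetical and have nothing to do with Gemini itself. The code simply assembles the text of a prompt; it does not upload a photo or call any AI service.

```python
from dataclasses import dataclass, field


@dataclass
class AvatarPrompt:
    """Hypothetical helper that assembles an avatar prompt from its parts."""
    style: str        # e.g. "photorealistic 3D"
    expression: str   # e.g. "slightly smiling"
    pose: str         # e.g. "facing the camera and in a casual pose"
    features: list[str] = field(default_factory=list)  # optional details to emphasize

    def render(self) -> str:
        # Always mention the reference photo, as the Oracle's third step advises.
        prompt = (
            f"Create a {self.style} avatar of the person in the reference photo "
            f"with a {self.expression} expression, {self.pose}."
        )
        # Append any specific features only when some were supplied.
        if self.features:
            prompt += " Emphasize: " + ", ".join(self.features) + "."
        return prompt


prompt = AvatarPrompt(
    style="photorealistic 3D",
    expression="slightly smiling",
    pose="facing the camera and in a casual pose",
)
print(prompt.render())
```

With these values, the rendered text is word for word the prompt the Oracle suggested. Filling in the features list, say with "grey hair" and "thick-framed glasses", is one way to front-load the details that otherwise have to be coaxed out one iteration at a time.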

3D photorealistic image of the author that doesn't look like the author in the reference image.
Yikes!

This is where the Oracle’s 4th point, Iterate and Refine, comes into play. According to the Oracle, “The initial generation may not be perfect (You think!), so you will likely need to iterate by specifying to Gemini what needs to be adjusted.” Good advice.

My first iteration was: “Change the hair colour to grey and have the subject, with crossed arms, leaning on a table.”

Gemini couldn’t do it. Its response was: “I am still learning to generate certain kinds of images, so I might not be able to create exactly what you are looking for yet, or it may go against my guidelines. If you would like to ask me for something else, just let me know!”

It was time to learn how to iterate. I started by asking Gemini to change the hair color to grey, remove the background and change the shirt color to white.

The avatar wearing a white T-shirt and sporting grey streaks in the hair.
Not even close.

The glasses were missing and the hair needed to be greyer.

Eyeglasses and more pronounced grey streaks are added to the avatar, which still doesn't resemble the author.
The glasses and more grey hair are added to whoever this is.

The face needs to be thinner and the frames for the eyeglasses need to be thicker.

Eyeglasses are changed and the hair is greyer.
Still not looking like me.

The hair should be white.

The hair color is changed to white.
Needs more work.

I felt the face should be thinner and the chin longer.

The face is thinner and the chin is longer and the avatar is starting to resemble the author.
Stop wasting time and go with this one.

Conclusion 

The original image with the avatar shown in the upper left corner of the image.
Me and my avatar.

This was close enough for me. As you can see, there inevitably comes a point where you need to step back and accept what you have. When it comes to creating avatars, it is all too easy to be Goldilocks and waste an inordinate amount of time attempting to get it “just right.” Still, the lesson for me was the Oracle’s 4th point: “You will likely need to iterate by specifying to Gemini what needs to be adjusted.” This also supports my observation that Generative Art has an issue: you can’t fully describe “intent.” The best you can do is be specific about what you’re trying to create, and eventually you have to settle for a good-enough result.

While fumbling around trying to create this avatar, I found myself agreeing with many of the claims Sari Azout makes in her article “How I stopped Worrying About AI and Learned to Value my Humanity.” In the article, Sari examines the disparity between our expectations of AI and its actual capabilities. One of these gaps is that we expect AI to reduce our workload, giving us more leisure time. Sari points out that, in reality, AI expands what’s possible, raises expectations and standards, and creates more work.

Tell me about it.

In the next installment of “The Bumbling Prompter,” I embark on a quest to use Adobe Firefly to see if I can do even better than Gemini to create my avatar.

It is so correct what you are writing .... It needed a lot of experience to create Europe's first Bicycle Avatar ... https://youtu.be/SdUD8RlbX2o

Nice efforts here, Tom. Interesting evolution on each iteration, though you are spot on - using a language model to try and clone your image can return quite varied outputs. There are ways around this. Various platforms out there either allow you to do this yourself, OR provide a service to do it for you. This can be done with image uploads OR actual video footage of you. An example of this with a historical figure (Einstein) is here - you can even have a chat with him: https://trulience.com/avatar/8118392647806109895

It's like an art form if you have a vision.

I feel your pain. I tried to create an avatar in a similar manner, with results that (I hope) don't look too much like my current profile image.
