By naykdhodlr | naykdhodlr | 6 Nov 2024




The last decade has brought the revolutionary realm of Artificial Intelligence (AI) to the attention of the general public.

For a deeper understanding of Artificial General Intelligence (AGI) and Artificial Super Intelligence (ASI), refer to this article, which elaborates on both and offers an opinion on their social influence going forward:

The most common AI available for public use is a language model produced by OpenAI, known as ChatGPT. Since its introduction, it has passed through several iterations, each gaining a larger 'cloud' of universal knowledge that fuels its algorithm in responding to the 'prompt' queries put to it.

The subject of this post is Generative Artificial Intelligence (GAI), and the progression of GAI's capabilities in the realm of art in all its formats and media: graphics, photography, visual arts, videography, etc.

Within GAI, a Text-to-Image Synthesis Model is characterized by its ability to generate original images from natural language descriptions, or prompts.

The following lists some notable Text-to-Image Synthesis Models, from early research to current tools:

  1. BiLSTMs for color generation: A model designed specifically for color generation, using Bidirectional Long Short-Term Memory (BiLSTM) networks.
  2. alignDRAW (2015): One of the first modern text-to-image models, extending the DRAW architecture with a recurrent variational autoencoder and attention mechanism.
  3. StackGAN (2016): A generative adversarial network (GAN) architecture for text-to-image synthesis that stacks two GAN stages: the first sketches a low-resolution image from the text, and the second refines it into a higher-resolution result.
  4. AttnGAN (2017): An Attentional Generative Adversarial Network for fine-grained text-to-image generation, utilizing attention-driven, multi-stage refinement.
  5. Stable Diffusion (2022): A text-to-image model built on a latent diffusion architecture, capable of generating high-quality images from text descriptions.
  6. DALL-E 2 (2022): A text-to-image model developed by OpenAI, known for its ability to generate photorealistic images from text prompts.
  7. Imagen (2022): A text-to-image model developed by Google Brain, capable of generating high-quality images from text descriptions.
  8. Midjourney (2022): A text-to-image model known for its ability to generate artistic and photorealistic images from text prompts.
  9. Dzine (formerly …): A text-to-image model known for its ability to generate artistic and photorealistic images and video from text prompts.
  10. Leonardo: A text-to-image model known for its ability to generate artistic and photorealistic images from text prompts.

These models have been developed and refined over the years, with some achieving state-of-the-art results in text-to-image synthesis tasks. They utilize various techniques, such as attention mechanisms, generative adversarial networks, and diffusion models, to generate high-quality images from text descriptions, a.k.a. prompts.

Being keen to explore the capabilities of these various models, I took the chance offered by short-term free access to challenge their capacities. The following are image samples from four of those listed above:

Prompt: 'a decades of humanity's existence pass into millennia, unfortunately, it's collective intelligence is superseded by its demonstrative stupidity.'

DALL-E 2 Prompt: 'Star Wars prince of evil advancing a realm of violent fear'

MIDJOURNEY Prompt: 'Create the image of a unique automobile design incorporating the iconic features of both Porsche and Ferrari'

LEONARDO In this instance, rather than exploring what could be generated from a 'prompt', the aim is to take an image from Midjourney into Leonardo, to measure Leonardo's ability to enhance an AI prompt-generated image:





As can be observed, the clarity, alteration, and realism are much improved from one AI model to the other.

The opportunity the model affords to enhance photographed images and have them illustratively, creatively altered is astonishing, to say the least:

Photograph: Street Graffiti Art

Revision: Graffiti Art



These are just a few of the many images either uniquely created from a worded prompt, or produced as an alteration or variation of an existing image, be it AI-generated or an original photograph. There is also the opportunity to create a video stream generated from a worded prompt, which is as yet to be explored.

For the community of artists of all genres and media, these generative tools are both a curse and a blessing, for reasons that are self-evident. That said, the challenge is to look for new opportunities to exploit the advantages Generative Artificial Intelligence affords going forward, as this is obviously just the beginning.




A freelance writer delving into an eclectic array of topics of interest, crypto-development being one of the many.


Welcome: what you will find developing from this Blog in the form of individual Posts, is an eclectic array of form and format delving into an equally eclectic array of subject matter. The objective of this Blog is to convey the meanderings of a curious mind expressed through poetry, short-story, photography, and graphic-digital art. If any of this tickles your fancy: please proceed and hopefully your curiosity will be satisfactorily served. Too, a quick critical comment is always appreciated.
