Artificial intelligence is changing the way we interact with the digital world. Because AI models can learn from data and keep improving, they can be applied to a very wide range of tasks.
In particular, there are six main ways it can be used:
- text-to-text
- audio-to-text
- text-to-audio
- text-to-image
- image-to-text
- image-to-image
Let's see how these six ways of usage can transform our digital experience and improve our daily lives.
Before we start, we need to define two basic concepts common to all models: Prompt and Neural Models.
What are Prompts for AI models?
A prompt is the text a human being provides as input to the model. Like clues in a crossword puzzle, prompts guide the model towards a desired decision or prediction.
What are Neural Models in AI?
Neural models in Artificial Intelligence are machine learning algorithms inspired by the functioning of the human brain. These neural networks are composed of interconnected artificial neurons, each of which processes information in a way loosely analogous to a biological neuron.
These models are obtained through training: a series of inputs is fed to the network, the outputs it produces are compared with the desired outputs, and the coefficients (weights) of the network are adjusted to reduce the difference. Training requires a large amount of high-quality data; otherwise, the results will be inaccurate.
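The training idea can be sketched with a single artificial neuron. This is a minimal illustration under stated assumptions, not how production models are trained: the dataset, the learning rate, and the names `predict` and `train` are invented for the example. The neuron learns the rule y = 2x + 1 by repeatedly comparing its output with the desired output and nudging its two coefficients.

```python
# Minimal sketch: one neuron (weight w, bias b) trained by gradient descent.
# The dataset and hyperparameters are illustrative assumptions.

def predict(w: float, b: float, x: float) -> float:
    """Output of a single linear neuron."""
    return w * x + b


def train(samples, epochs: int = 2000, lr: float = 0.01):
    """Adjust w and b so that predictions approach the desired outputs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = predict(w, b, x) - target  # output minus desired output
            w -= lr * error * x                # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error                    # gradient of 0.5 * error**2 w.r.t. b
    return w, b


# Inputs paired with the desired outputs of the rule y = 2x + 1.
samples = [(x, 2 * x + 1) for x in range(-3, 4)]
w, b = train(samples)
print(round(w, 2), round(b, 2))
```

With clean data like this, the coefficients should settle close to 2 and 1. Real networks apply the same loop to millions or billions of weights, which is why the quantity and quality of the data matter so much.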
Text Models
Text generation is one of the areas of Artificial Intelligence that has seen increasing attention in recent years.
Thanks to the machines' ability to process large amounts of data and learn from existing models, it is possible to create software capable of producing high-quality texts that seem written by humans.
The possibilities can be divided into several types. For simplicity, we will refer to the examples officially presented by OpenAI.
FAQ or Q&A Answer Generator
This type of system uses machine learning to understand users' questions and provide relevant, precise answers. What distinguishes these models is that they can keep improving their answers by analyzing the feedback users provide.
Prompt:
I am a highly intelligent bot for answering questions. If you ask me a question that has a root in truth, I will give you the answer. If you ask me a nonsensical, deceptive, or unclear question, I will respond with "Unknown".
Q: What is the life expectancy in the United States?
Answer:
A: The life expectancy in the United States is 78 years.
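In code, a prompt like the one above is usually just a string assembled from fixed instructions plus the user's question. The sketch below shows one way to do that. The names `build_qa_prompt` and `answer` are hypothetical, and `complete` stands for whatever function actually calls a text model; no specific vendor API is assumed. A stand-in that returns the answer from the example is used so the code runs on its own.

```python
from typing import Callable

# Fixed instructions, copied from the example prompt.
INSTRUCTIONS = (
    "I am a highly intelligent bot for answering questions. "
    "If you ask me a question that has a root in truth, I will give you the answer. "
    "If you ask me a nonsensical, deceptive, or unclear question, "
    'I will respond with "Unknown".'
)


def build_qa_prompt(question: str) -> str:
    """Combine the fixed instructions with the user's question."""
    return f"{INSTRUCTIONS}\n\nQ: {question}\nA:"


def answer(question: str, complete: Callable[[str], str]) -> str:
    """Send the prompt to a model; `complete` is a hypothetical model call."""
    return complete(build_qa_prompt(question)).strip()


def fake_complete(prompt: str) -> str:
    """Stand-in for a real model, returning the answer from the example."""
    return " The life expectancy in the United States is 78 years."


print(answer("What is the life expectancy in the United States?", fake_complete))
```

Keeping the instructions separate from the question makes it easy to reuse the same behavior for every question a user asks.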
Text Classification Systems
These systems are used when we want to categorize or classify a textual input. They can classify individual words, phrases, or entire blocks of text.
Once again, they are easy to improve through user feedback: whenever incorrect behavior is reported, the model can be adjusted to avoid repeating the same or similar mistakes.
Prompt:
Extract the keywords from the following text:
Black-on-black pottery is a 20th and 21st-century pottery tradition developed by Puebloan Native American ceramic artists from the northern New Mexico. Traditional reduction-fired pottery has been produced for centuries by pueblo artists. 20th-century black-on-black pottery is produced with a smooth surface, with designs applied through selective burnishing or application of refractory slip. Another style involves sculpting or incising designs and selectively burnishing raised areas. For generations, several families from the Kha'po Owingeh and P'ohwhóge Owingeh pueblos have made black-on-black pottery using techniques passed down from matriarchal ceramicists. Artists from other pueblos have also produced black-on-black pottery. Some contemporary artists have created works that honor the pottery of their ancestors.
Answer:
pottery, black-on-black, 20th century, 21st century, Puebloan, New Mexico, reduction-fired, smooth surface, designs, selective burnishing, refractory slip, sculpting, incising, selectively burnishing, Kha'po Owingeh, P'ohwhóge Owingeh, matriarchal ceramicists, contemporary artists, ancestors.
Prompt:
Classify the sentiment in the following tweets:
- "I can't stand homework"
- "This sucks. I'm bored 😠"
- "I can't wait for Halloween!!!"
- "My cat is adorable ❤️❤️"
- "I hate chocolate"
Result:
Sentiment ratings of the tweets:
- Negative
- Negative
- Positive
- Positive
- Negative
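When a classification prompt is used inside a program, the model's reply has to be matched back to the inputs. The sketch below builds the prompt from a list of tweets and pairs each tweet with its label. It assumes the reply is a hyphen list with one label per tweet, as in the example; the function names and the "Neutral" label are assumptions added for illustration.

```python
# Labels the parser accepts; "Neutral" is an assumption, not part of the example.
ALLOWED_LABELS = {"Positive", "Negative", "Neutral"}


def build_sentiment_prompt(tweets):
    """Assemble the classification prompt in the layout of the example."""
    lines = ["Classify the sentiment in the following tweets:"]
    lines += [f'- "{tweet}"' for tweet in tweets]
    return "\n".join(lines)


def parse_sentiments(tweets, reply):
    """Pair each tweet with the label found on the matching '- ' line."""
    labels = [
        line.strip().lstrip("-").strip()
        for line in reply.splitlines()
        if line.strip().startswith("-")
    ]
    if len(labels) != len(tweets):
        raise ValueError("reply does not contain one label per tweet")
    for label in labels:
        if label not in ALLOWED_LABELS:
            raise ValueError(f"unexpected label: {label!r}")
    return list(zip(tweets, labels))


tweets = ["I can't stand homework", "I can't wait for Halloween!!!"]
reply = "Sentiment ratings of the tweets:\n- Negative\n- Positive"
for tweet, label in parse_sentiments(tweets, reply):
    print(f"{label}: {tweet}")
```

Validating the labels matters because a model's reply is free text: a program should reject a reply it cannot interpret rather than silently mislabel the input.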
Chatbot
This involves using artificial intelligence to provide support through chatbots. Once again, training and optimization are easy as users can provide negative feedback when the chatbot makes errors.
Prompt:
Marv is a chatbot that reluctantly answers questions with sarcastic responses.
You: How many pounds are in a kilogram?
Result:
Marv: Again? There are 2.2 pounds in a kilogram. Take note.
Free Generation
Free generation is one of the most interesting theoretical uses of Artificial Intelligence. It refers to the ability of AI to generate content, such as texts, poems, songs, or even works of art, from only basic guidelines.
The idea behind this application is to endow AI with its own creativity, capable of producing original and innovative material without human intervention.
In this case, training is very complex as it requires a large amount of data and significant hardware.
Prompt:
Product description: A pair of shoes that fits any foot size.
Keywords: adaptable, suitable, omni-fit.
Product name:
Result:
Omni-Fit Shoes
Translations
Translation is similar to free generation but focuses on converting text between different languages, or even between different programming languages (it is not absurd to imagine a completely autonomous Rosetta Stone in the future).
Once again, training is very complex due to the large amounts of data required.
Prompt:
Translate into 1. French, 2. Spanish, and 3. Japanese:
Where is the nearest museum?
Result:
- Où se trouve le musée le plus proche?
- ¿Dónde está el museo más cercano?
- 最も近い博物館はどこですか?
Transformations
These are systems that allow transformations and reprocessing of textual content into others. Use cases can be very diverse, such as grammar correction, synthesizing, or rewriting content with different writing styles.
These systems are among the most complex to train: the trained network ends up very different from the one training started with, so a large number of weights have to be modified.
Prompt:
Summarize for an elementary school student:
Jupiter is the fifth planet from the Sun and is the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two and a half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye at night, and has been known to people of ancient civilizations. It is named after the Roman god Jupiter. When observed from Earth, Jupiter can be bright enough to cast visible shadows and is on average the third-brightest natural object in the night sky after the Moon and Venus.
Result:
Jupiter is a large and bright planet that orbits the Sun. It is the fifth planet from the Sun and can be seen with the naked eye at night. It is named after the Roman god Jupiter and can be bright enough to cast visible shadows. It is on average the third brightest object at night after the Moon and Venus.
Prompt:
Correct the grammar of the following sentence:
Jupiter and the fifth planet from the sun.
Result:
Jupiter is the fifth planet from the Sun.
Prompt:
Find the contact information from the following email:
Dear Kelly,
It was wonderful talking to you at the seminar. I found Jane's presentation really interesting.
Thank you for the book. Here is my address: 2111 Ash Lane, Crestview CA 92002
Regards,
Maya
Result:
Contact information for Maya:
Name: Maya
Address: 2111 Ash Lane, Crestview CA 92002
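An extraction result like this one is most useful once it has been turned into structured data. The sketch below parses a reply made of "Key: value" lines into a dictionary. It assumes the model keeps that layout, which is not guaranteed, and the function name `parse_fields` is hypothetical.

```python
def parse_fields(reply: str) -> dict:
    """Turn 'Key: value' lines into a dictionary, skipping lines without a value."""
    fields = {}
    for line in reply.splitlines():
        key, separator, value = line.partition(":")
        if separator and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# The reply from the example above.
reply = (
    "Contact information for Maya:\n"
    "Name: Maya\n"
    "Address: 2111 Ash Lane, Crestview CA 92002"
)
contact = parse_fields(reply)
print(contact["Name"])
print(contact["Address"])
```

The heading line ends with a colon but carries no value, so it is skipped; only the name and the address end up in the dictionary.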
Audio Processing
Thanks to advanced algorithms, AI can analyze, recognize, and improve audio.
Filtering background noise, distinguishing voices, and translating conversations into different languages in real time are all possible thanks to AI.
Artificial intelligence can also create new forms of sound and music art, generating sounds and melodies autonomously and innovatively. Interest in AI in the field of audio processing is growing, and its applications are already evident in teleconferencing systems, virtual assistance services, and the production of high-quality audiovisual content.
From Text to Audio
The use of artificial intelligence for converting text to audio is becoming increasingly widespread. There are many applications that allow the transformation of text into an audio file, making it easier to listen to books, articles, and documents. This technology is particularly useful for people with visual impairments or for those who prefer listening over reading.
This technology can be used to create podcasts or intelligent voice assistants. Thanks to artificial intelligence, the computer-generated voice can be made increasingly natural and human-like, thus improving the user experience.
These systems are also evolving in the generation of audio content from scratch, with Google leading the way and producing impressive results:
Prompt:
The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.
Result:
The generated audio is not reproduced here. The example is taken from the examples page of Google's MusicLM research: https://google-research.github.io/seanet/musiclm/examples/
From Audio to Text
This involves the ability to convert speech into written text.
This technology is already used in many sectors, for example to transcribe interviews, conferences, and political speeches. Its use is not limited to these cases: it can also be applied in the medical field, to transcribe medical reports, and in the legal field, to transcribe legal texts.
Furthermore, the audio-to-text conversion technology can be used to create automatic subtitles in videos, improving accessibility for people with hearing impairments.
Several AI systems provide this functionality, one of which is OpenAI's Whisper.
Input: (audio recording, not reproduced here)
Result:
Before he had time to answer, a much encumbered veerer burst into the room with the question, I say, Can I leave these here? These were a small black pig and a lusty specimen of black red gamecock.
The example, in this case, is also taken from official resources provided by OpenAI.
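The automatic subtitles mentioned earlier follow directly from a transcription that carries timing information. The sketch below converts timed segments into the SubRip (SRT) subtitle format. It assumes each segment is a dictionary with `start` and `end` times in seconds and a `text` field, which is the shape the open-source `whisper` package is understood to return (worth checking against its documentation). The timings in the usage example are made up for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Build SRT text: a counter, a time range, then the subtitle text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical timings; the text comes from the transcription example.
segments = [
    {"start": 0.0, "end": 4.2, "text": " Before he had time to answer,"},
    {"start": 4.2, "end": 7.5, "text": " I say, Can I leave these here?"},
]
print(to_srt(segments))
```

Working in whole milliseconds avoids rounding problems when the seconds value is a floating-point number.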
Image Processing
Artificial intelligence has revolutionized the way images are processed and analyzed. Thanks to deep learning techniques, neural networks can learn to recognize objects, faces, and patterns within an image and classify them or create new ones.
Here, the two best-known and most advanced tools are Midjourney and the open-source alternative Stable Diffusion.
From Text to Image
This is a fairly simple mechanism and is nothing more than asking artificial intelligence to generate images from a prompt.
Here are some examples taken from the subreddits of the two platforms:
The request was to create a photo of the former US president as a homeless person.
Here, the request was to show what a selfie taken by Native Americans in the 1800s would have been like.
The results are astonishing, and it is very difficult for anyone to recognize these photos as generated.