Part one delivered insights on Responsible AI, Edge AI, and Conditional Computation. Part two examines further trends that will influence AI development in 2022 and beyond.
The problem of data scarcity may come as a surprise when set alongside the problem of data flooding. Yet both topics are expected to be crucial in defining the future directions of Artificial Intelligence and Machine Learning.
- Part 1
AI Trend 1: Responsible AI
AI Trend 2: Edge AI
AI Trend 3: Conditional Computation
- Part 2
AI Trend 4: Synthetic Media
AI Trend 5: Large Language Models and Natural Language Processing
AI Trend 6: Continual Learning
AI Trend 7: AI Cybersecurity
Synthetic media

Synthetic data is a term that refers to data generated by neural networks rather than collected from the real world. Machines are increasingly proficient at generating convincing images of food or faces, sounds and music, and natural speech. This opens up multiple use cases in 2022.
Building synthetic datasets
It is no coincidence that Facebook and Google are among the leaders of AI development. Pushing the boundaries is directly connected to the availability of data, and Google, Facebook, and other tech giants have access to immense amounts of data in multiple varieties.
Yet many companies are not so lucky. According to the Datagen report, only one percent of computer vision teams have not experienced a project cancellation due to insufficient data. The most common issues encountered by these teams include poor annotation (48%), lack of domain coverage (47%), and the scarcity of data (44%).
The challenge of insufficient or missing data leads to multiple problems, including hidden biases when a particular type of data (for example, faces of people from ethnic minorities) is underrepresented.
Also, generating synthetic data is far cheaper than building a real-world dataset, and legally simpler than collecting images and managing the intellectual property rights to them.
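The core idea behind synthetic datasets can be shown at toy scale: learn the distribution of scarce real data, then sample as many new records as needed. The sketch below uses a simple per-feature Gaussian as the "generator" for clarity; real pipelines use GANs or diffusion models, and all values here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# A tiny "real" dataset: 200 samples with 2 features, standing in
# for data that is scarce or expensive to collect.
real = rng.normal(loc=[5.0, -2.0], scale=[1.5, 0.5], size=(200, 2))

# Fit a simple parametric model (a per-feature Gaussian) to the real
# data and sample new, synthetic records from it. Production systems
# replace this with a GAN or diffusion model, but the principle is
# the same: learn the data distribution, then sample from it.
mu, sigma = real.mean(axis=0), real.std(axis=0)
synthetic = rng.normal(loc=mu, scale=sigma, size=(10_000, 2))

# The synthetic set can be arbitrarily large, and carries no personal
# data or licensing constraints from the original records.
print(synthetic.shape)  # (10000, 2)
```

The synthetic samples match the statistics of the real data while being orders of magnitude more plentiful, which is exactly the property computer vision teams short on annotated data are after.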
Supporting creative work and marketing efforts
As mentioned above, there are many use cases for synthetic data in creative fields. Machines can deliver AI-powered voice acting for computer games or animated series, short music pieces and support video creation by applying particular filters on recorded material.
While developing such tools can be troublesome, their growing availability encourages vendors to deliver them in an easy-to-use form for creators.
AI can be used not only to support creative work but also to deliver pieces of art in its own right. This challenges the traditional approach, in which there was a strong bond between a piece of art and its creator. That bond remains intact if AI is used merely as a creative tool.
A good example of an artist who uses AI as a sophisticated art creation tool is Ivona Tau, who contributes to Tooploox’s own works and delivered a comprehensive piece about AI art for our blog.
Yet a problem arises when art is created by a machine without human interference. AICAN is a Generative Adversarial Network (GAN) trained on paintings by renowned artists, including Michelangelo, Kandinsky, and Warhol, among others.
Continual learning and data flood
According to Seagate data, 44 zettabytes of data had been produced by 2020, and by 2025 the world is predicted to create 463 exabytes of data every single day. To grasp the scale: the diameter of the Milky Way galaxy is approximately 1.7 zettameters, and all of Earth's oceans hold about 1.398 zettaliters of seawater.
The number of devices producing data rises dramatically with IoT and 5G technologies boosting this process. This is also a growing concern for AI.
Newly gathered data is excellent fuel for existing AI models, helping them improve their performance and accuracy. On the other hand, using all of this data in a traditional training process would cost a tremendous amount of computing power and storage.
Continual learning is a trend focused on addressing this challenge through constant learning on inflowing data. The model does not forget the knowledge it has already gathered, but rather accumulates new concepts, much as humans learn throughout life.
Tooploox has contributed to this trend by submitting research papers that tackle the main challenges related to continual learning processes. Just to name some examples:
On the robustness of generative representations against catastrophic forgetting – catastrophic forgetting occurs when a neural network loses previously acquired skills while learning new ones. This paper tackles the problem by comparing generative and discriminative neural representations, showing the robustness of the former.
Multiband VAE: Latent Space Partitioning for Knowledge Consolidation in Continual Learning – this work deals with the problem of forgetting by providing a network with the ability to consolidate similar knowledge and forget what has been learned in a controlled manner.
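One common recipe in this field is experience replay: keep a small memory of past examples and mix them into every new training batch, so old knowledge is rehearsed instead of overwritten. The sketch below illustrates the data-handling side of that idea with a reservoir-sampled buffer; sizes and names are illustrative, not from any of the papers above.

```python
import random

random.seed(0)

MEMORY_SIZE = 100
memory: list[int] = []  # small store of past examples
seen = 0                # how many stream examples observed so far

def observe(example: int) -> None:
    """Add a streamed example to memory via reservoir sampling,
    keeping the buffer a uniform sample of everything seen so far."""
    global seen
    seen += 1
    if len(memory) < MEMORY_SIZE:
        memory.append(example)
    else:
        j = random.randrange(seen)
        if j < MEMORY_SIZE:
            memory[j] = example

def training_batch(new_examples: list[int], replay: int = 8) -> list[int]:
    """Mix fresh data with rehearsed samples drawn from memory."""
    rehearsed = random.sample(memory, min(replay, len(memory)))
    return new_examples + rehearsed

# Simulate two sequential "tasks" arriving as a stream.
for x in range(1000):                 # task A: examples 0..999
    observe(x)
batch = training_batch(list(range(1000, 1010)))  # task B arrives
print(len(batch))                     # 10 fresh + 8 rehearsed = 18
```

Because the buffer still holds task-A examples when task B arrives, every batch rehearses old knowledge alongside new data, which is the basic mechanism for fighting catastrophic forgetting.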
Large language models and NLP
Large Language Models are Natural Language Processing (NLP) machine learning models that perform language-related tasks in a human-like and increasingly convincing way. These models are known for generating texts indistinguishable from those written by humans.
The “large” refers to the massive neural networks beneath these models, trained on gargantuan datasets consisting of whole libraries of texts as well as internet resources.
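At its core, a language model learns which token tends to follow which, then samples from those statistics to generate text. The toy bigram model below shows that principle in a few lines; large language models scale the same idea up by replacing word counts with billion-parameter transformers trained on web-scale corpora. The tiny corpus is invented for illustration.

```python
import random
from collections import defaultdict

random.seed(1)

# A toy training corpus; real models use whole libraries of text.
corpus = "the cat sat on the mat and the cat ran".split()

# Count which word follows which (bigram statistics).
follows: dict[str, list[str]] = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start: str, length: int = 5) -> str:
    """Generate text by repeatedly sampling a likely next word."""
    words = [start]
    for _ in range(length):
        options = follows.get(words[-1])
        if not options:
            break  # no known continuation
        words.append(random.choice(options))
    return " ".join(words)

print(generate("the"))
```

Every generated sentence is locally plausible because each step follows observed statistics; the leap to "indistinguishable from human text" comes purely from scaling the model and the data.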
Large language models have been delivered by major tech players, including Google, Facebook, and Microsoft.
Google Translate is one of the best examples of AI-powered tech that users access every day without even knowing that they are using neural networks. NLP models will become increasingly multilingual, delivering AI-based services to more niche markets, with the aim of outrunning the competition and better serving multinational companies.
Conditional computing also influences large language models and NLP: providing a model with contextual information greatly enhances its ability to produce relevant output. This technology has already been seen, for example, in the automated analysis of YouTube movie reviews, and with the rise of automated assistants it will only continue to grow.
Inhuman languages included
One of the most interesting applications of NLP models is Copilot – an AI assistant that works alongside the programmer to deliver code faster and more efficiently. The network was trained not on natural language, but on the vast code repositories of GitHub.
AI cybersecurity

The U.S. Healthcare Cybersecurity Market 2020 report shows that more than 90% of healthcare organizations in the US suffered at least one cybersecurity breach over the past three years. Also, according to the Deep Instinct report, malware use increased by 358% in 2020, while ransomware use increased by 435% over the same period.
Cybersecurity is of critical concern, especially considering the growth of connected devices powered by IoT and 5G trends, among others.
This particular field will be increasingly influenced by AI – on the light side and the dark side of cybersecurity alike. A Europol-issued report offers an interesting dive into the matter, showing multiple malicious use cases of AI, including password cracking and impersonating users for fraud.
On the other side, there are increasingly effective spam filters and malicious activity detectors that can spot the signs of an attacker breaking into a system early enough to prevent further harm.
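The statistical core of such a filter can be sketched in a few lines: score a message by how strongly its words are associated with known spam versus legitimate mail. Real filters use far richer features and models; the training messages and thresholds below are invented for illustration.

```python
import math
from collections import Counter

# Tiny labeled training sets (invented examples).
spam_train = ["win free prize now", "free money click now"]
ham_train = ["meeting notes attached", "lunch at noon tomorrow"]

spam_counts = Counter(w for m in spam_train for w in m.split())
ham_counts = Counter(w for m in ham_train for w in m.split())

def spam_score(message: str) -> float:
    """Log-likelihood ratio of spam vs. ham per word,
    with add-one smoothing for unseen words."""
    score = 0.0
    for w in message.split():
        p_spam = (spam_counts[w] + 1) / (sum(spam_counts.values()) + 2)
        p_ham = (ham_counts[w] + 1) / (sum(ham_counts.values()) + 2)
        score += math.log(p_spam / p_ham)
    return score  # positive = spam-like, negative = ham-like

print(spam_score("free prize now") > 0)   # True: spam-like words
print(spam_score("meeting at noon") > 0)  # False: ham-like words
```

The same scoring idea extends to intrusion detection: replace words with log events or network features, and a positive score becomes an early warning worth investigating.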
Like the steam engine of centuries past, AI technology is transforming businesses and society around the world every day, with a growing number of users who have no idea that they are using AI-based tools.
If you feel that some of these trends can support your business and provide you with new possibilities for development, don’t hesitate to contact us now to share your thoughts!