2021 was a big year for artificial intelligence and paved the way for further innovation in 2022. Things are moving fast, but sometimes it can be hard to discern what’s valuable amidst all the hype, so we brought together our favourite people from the field to share their highlights from 2021 and predictions for the coming year. I’ll kick things off…🚀
Gráinne: 🌱…For years the "explosion" in AI has been widely predicted but never fully realized – companies have consistently invested resources in developing data applications but have often stumbled on productionizing them and delivering real value. I chose the 🌱 emoji because I think 2021 was the year in which that AI investment really began to pay off, and I believe (and hope) that the growth will only accelerate from here.
Daniel: 🏍…Like the motorcycle, AI is getting people from point A to point B much faster – and in style 😎. Not everyone understands how it works and many are still hesitant to get on, but most people agree that it can be a next-level tool if used correctly. On a darker note, if no one makes any rules around how to use it, people will get hurt and/or die – here's one person who's voting for more regulations around AI.
Edouard: 👻 …I feel like everyone used to talk about working in AI as the sexiest job of the 21st century, but in 2021 it seems like crypto is the new hip thing and fewer people are talking about cool AI breakthroughs these days.
CJ: 🏚 … I think the biggest problem in AI is that 80% of projects fail. The Zillow algorithm was a pretty public failure, but I think it's just indicative of the whole field. AI needs to sit closer to product, and companies need to mature in their data savviness, or this will continue.
CJ: I think NLP has really moved forward by leaps and bounds, with tools like GPT-3, but I'm personally more curious about the work in bioinformatics.
Gráinne: I think the development of GPT-3 in 2020 and the subsequent developments of large language models really allowed an explosion in "Data Imagination" in 2021. By that I mean that the wide availability of models approaching general artificial intelligence has allowed data scientists (and in fact everyone in data-forward fields) to be creative in thinking about how data applications can solve real problems in a valuable way, and this creativity is contagious (as much as Omicron). An example, of course, is Spoke 😄 where we are solving the problem of extracting relevant information from workplace communication using technology and ideas only developed over the past couple of years!
Daniel: I'd say the most exciting (scary?) development is the battle around creating and detecting fake text/videos. I think the way we look at the truth of what we see in text or video is truly coming into question for the first time. If models can really generate text that humans can't detect as computer-generated, there's an unlimited amount of false information that can flood any website in seconds – which will change the way we consume/trust what we read. This same thing is happening around videos, with at least the catch that some of these videos are absolutely hysterical – I loved the deepfake roundtable myself. I'm picturing things like making yourself the lead actor in a movie or watching yourself try on clothes "in real life" coming out in the upcoming years.
Edouard: I don’t want to talk about a specific model here, but I would say what’s most exciting is that I feel like more and more companies are starting to include AI in their processes and it’s becoming natural to have Data Scientists (and Data Engineers, obviously, without whom Data Scientists are nothing) in all companies.
Daniel: For healthcare, I'm advising a startup that's building a simulated cell for targeted cancer treatments (turbine.ai), and we're already seeing some really game-changing results in drug discovery. We already saw a lot of improved speed in the development of COVID vaccines, and DeepMind made AlphaFold available – a huge development around protein folding. All of these (and more) are giving the medical field tools to build on, which will only accelerate the next impact wave of AI on healthcare.
For cybersecurity, the problem is only getting bigger as we move more and more online – every device added to a network is inevitably a potential point of failure for a security system. Identifying these points of failure becomes harder and harder to do without the help of AI, and I expect to see some exciting things popping up as a result in 2022.
CJ: Less corona? Just kidding. But no, really. This year I'm focusing on IMPACTFUL AI. I want to see some of the glorious models pouring out of research centers in California applied to real-world problems.
Edouard: One thing that comes to mind is obviously HealthTech…there are many applications of AI, from online consultations with digital agents (online consultations are now a big thing thanks to/because of the pandemic and I’m sure they could benefit from some automation) to better forecasting of new variants and better testing/diagnosing/healing of Corona.
Gráinne: Personally I am very excited about the field of Data Ethics – the philosophical discourse on the subject, the legal developments that will follow and the technical advancements in methods of applying ethics to data applications. There have been a lot of very cool tools built for determining the fairness of AI, for example, and I hope to see a lot more interesting papers on the topic. But I think I am most excited about the establishment of forums where discussions on ethics can be held in the public domain, because the seedling 🌱 is going to grow into a giant tree 🌳 quite rapidly and I think we should have a process in place as a community to explore ethical issues.
If I am allowed a second one ☺️ I will add that the development of analytics engineering is something I think is very exciting. Over the past years, the Data Science community has benefited a huge amount from applying software development principles to data science development, and I think the analytics field will see the same benefits with increased adoption and growth of tools like dbt.
CJ: After we've spent two years coming up with ways that AI can help the health care industry and aid in the pandemic, I would like to note that none of those algorithms is being used. None is a big word, so I'll hedge my bets and say very few... but still that is a massive waste of potential. I'd like to see someone dig into why there is no impact on the pandemic from AI.
Edouard: If I’m being honest, I’d say voice assistants are one of the most widespread areas of AI but also the most frustrating in my experience. They work reasonably well for native English speakers but for most other languages they’re just not there yet, even though Alexa and Google claim they will make your life easier.
Gráinne: I think the concept of "feature stores" is a little overhyped. I see feature stores more as an interim step before sophisticated data and model/code version control. If the data (or the data-generating process) is properly versioned (help us, analytics engineers!) and so are the transformations used in feature engineering, I don't see the need to build a complex product around the output of the two. Perhaps this is simply a difference in preference between the philosophies of functional and object-oriented programming, but I would rather focus on tracking and managing changes to the processes (e.g. transformation functions) than to the results. I am willing to have my mind changed on this though, so come convince me!
Daniel: I'd probably swap overhyped for misunderstood and then say AutoML. I think AutoML is being sold as a tool that will swap out data scientists and have them sitting on the street in a few years – count me as a non-believer. Don't get me wrong, there's huge potential in AutoML and this year should take it from mostly a buzzword to actually being used in production at many companies, especially with most of the cloud providers bundling it with their cloud offerings. But since modeling is only a part of the job of a data scientist, they'll still be needed to put all of the pieces together for a product. In my experience there are always edge cases in preprocessing the data, creating features, building the model, understanding the results (and more) that greatly affect the final product.
So while AutoML might someday be able to do all of these things generically, I think that day is much farther away than many people picture when they read (over-)hyped blog posts about AutoML. The same goes for computer-generated code and software engineers. The smarter the computer programs get, the more programmers we'll need, and when we no longer need programmers, the computers might not need us anymore... 🙃
So there we have it…I think it’s fair to say that we’re pretty excited about what AI has to offer in 2022 🦾
Special thanks to Edouard (Data Scientist at Revolut, in Berlin), CJ (Data Lead @ jayway by devoteam, in Stockholm), and Daniel (Head of Data Science at Adyen, in Amsterdam) for their insights.