Jan 16, 2024 · 8 min read

Next-gen AI hardware devices

What next-gen AI hardware devices are, why new form factors emerge, and why behavior change may be their biggest adoption challenge.

What are they? Their relevance & biggest challenge.

A few months back, Sam Altman (CEO of OpenAI) met Sir Jony Ive (Apple’s former chief design officer). The primary reason for the meeting was reportedly “some sort of next-gen AI device”.

Two biggies (from AI and hardware design) meeting about an “AI device” should tell you something. Not convinced? Let’s try again.

Just a few days back, a new hardware device (called rabbit) was released. It is a simple device with a minimalist interface, yet it is a new form factor built on top of LAMs (Large Action Models). You can take pictures with it, speak with it, ask it to perform tasks, and so on.

Do you know the surprising part? Within 4 days of launch, 40,000 units were sold at $199/unit. That product looks something like this:

Let me show you a few products:

Meta’s Smart Glasses: These AI-infused glasses are designed by Meta. They can make calls, send texts, translate in real time, show answers to your queries, support gesture control, etc.

Tab AI: A wearable pendant with an AI-controlled microphone that constantly listens to you and the people around you. Think of it as a companion that is available 24/7 and gives you insights all the time.

Humane AI Pin: A wearable AI chest pin with which you can talk, gesture, take photos, or communicate with an AI assistant.

Looking at all the above devices, and things happening around the world, it is pretty evident we are on the verge of a new form factor. So now, let me go back & start again.

What are Next-gen AI hardware devices?

Generally, when we say next-gen devices, we talk about them relative to current devices: how much the design, performance, functionality, etc. have improved compared with what already exists.

However, right now we are about to witness a jump of a kind we have never seen before. We are going to see new form factors, new capabilities and, most importantly, “a tight integration between the hardware & the software with AI as a mediator”.

These are the “next-gen AI hardware devices”.

But why a new breed of hardware (form factor)?

You see, whenever generational changes are happening in tech, existing devices simply cannot accommodate those changes. We need to develop new form factors that can accommodate the changes.

For example:

Cassette (magnetic tape) → CD player (laser tech) → Pen drive (flash memory) → Hard disk (magnetized ferromagnetic platters)

All the above devices are primarily storage devices. However, when we move from one phase to the next, the former form factor simply cannot accommodate the technological leap, and that leads to a new form factor.

You can find examples like the above all the time & everywhere (computers, mobiles, cameras, etc.).

Similarly, our current hardware won’t be able to fully support the next-gen AI that we are going to witness, and therefore the emergence of new hardware (form factor) is bound to happen.

What kind of changes could these devices bring?

I’ll try to explain as simply as possible. Let’s take one user journey:

“You want to book a flight ticket from Bangalore to San Francisco for 10th May 2024. You also want to opt for meals, and wifi during the journey”.

With the existing tech, this is probably how the user journey looks:

In this journey, there are 3 major parties involved.

Humans (doing all the manual work like clicking, typing, searching, reading, etc. This is where UX/UI comes in)
Software (to support humans in achieving the goal: travel apps, payment gateways, search engines, etc.)
Hardware (to support the software in helping humans achieve the goal: screen, memory, processor, etc.)

Humans instruct → Software, which instructs → Hardware

What we humans do here are called “Actions”, and this is exactly where the next-gen AI devices could make a difference.

In this picture, just imagine all the items in the red box are getting completed without the interference of humans, and that’s what these new-gen AI devices could accomplish very soon.

So how might these new-gen AI devices work?

By now, most of you are probably aware of LLMs (Large Language Models). Even if you have never heard the term, you have almost certainly heard of ChatGPT, which is built on an LLM.

In a simple way, this is how LLMs work:

A huge neural network with billions of parameters is trained on a very large amount of text (a big share of what is publicly available). Then it works as below:

1. Take the input from the user, and
2. Use everything it has learned (its weights) to find the most probable continuation of that input, and
3. Give that as output.
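
To make the “most probable continuation” step a little more concrete, here is a tiny Python sketch. It is only an illustration: the candidate words and their scores are made up, and a real LLM computes such scores with a neural network over a huge vocabulary, one piece of text at a time.

```python
import math

# Hypothetical scores ("logits") a model might assign to candidate next words
# after the prompt "Book a flight to San". The numbers are made up.
TOY_LOGITS = {"Francisco": 4.0, "Diego": 2.5, "Jose": 2.0, "sandwich": -1.0}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits.values())
    exps = {tok: math.exp(score - highest) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def most_probable_next(logits):
    """Pick the candidate with the highest probability."""
    probs = softmax(logits)
    return max(probs, key=probs.get)


next_word = most_probable_next(TOY_LOGITS)
print(next_word)
```

The model repeats this kind of choice again and again, each time adding the chosen piece to the text, until the answer is complete.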

This is where LLMs stop. After giving the text output, their job is done, and that is exactly where LAMs come into the picture.

A very basic picture (for easy understanding) of how LAMs work:

A LAM also does all 3 steps that an LLM does (as explained above); however, a new step is added.

4. Ability to perform the action based on the inference.

In our flight ticket booking example, when you have given all the necessary information to an LLM, it can give you the details of the best plan.

However, a LAM will go further and book the ticket for you, removing all the UX/UI and manual effort (clicking, typing, searching, etc.).
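
The difference can be sketched in a few lines of Python. Everything here is hypothetical: `infer_plan` is a keyword rule standing in for the model, and `book_flight` only returns a message where a real device would talk to an airline or travel service. The point is just the shape: inference produces a plan, and the extra step executes it.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Structured result of the inference steps (1 to 3)."""
    action: str
    params: dict = field(default_factory=dict)


def infer_plan(request: str) -> Plan:
    """Stand-in for the model: a keyword rule with hard-coded trip details."""
    if "flight" in request.lower():
        return Plan(
            "book_flight",
            {
                "origin": "Bangalore",
                "destination": "San Francisco",
                "date": "2024-05-10",
                "addons": ["meals", "wifi"],
            },
        )
    return Plan("unknown")


def book_flight(origin, destination, date, addons):
    """Hypothetical action; it only reports what it would have booked."""
    return f"Booked {origin} -> {destination} on {date} with {', '.join(addons)}"


# Actions this toy system knows how to perform.
ACTIONS = {"book_flight": book_flight}


def run(request: str) -> str:
    plan = infer_plan(request)      # an LLM-style system stops here
    handler = ACTIONS.get(plan.action)
    if handler is None:
        return "No action available for this request"
    return handler(**plan.params)   # step 4: perform the action


print(run("Book a flight ticket from Bangalore to San Francisco"))
```

A request the toy system has no action for, such as `run("Buy a cricket bat")`, simply falls through to the “no action available” message.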

If you are in regular touch with the tech world, you might have heard about ChatGPT plugins or “agents”, which behave much like LAMs. However, the primary difference is that these “agents” are customized for very specific tasks & they are of no use outside that task.

For example, you can build an agent using ChatGPT + Zapier that can perform the above ticket booking, but if you want it to purchase an English willow bat from Flipkart, it won’t be able to do it. That would need a separate agent, and so on.

Currently, we are at a very early stage with LAMs. Unlike LLMs, LAMs take actions with real-world consequences, which makes their adoption difficult until the general public gains some confidence in them.

“Next-gen hardware devices + LAMs → Next-gen AI hardware devices”

Biggest challenge?

The biggest challenge for these devices will be “breaking the behavior”.

You see, behavior is a very tricky characteristic of humans. It is very hard to develop a behavior, but it is much harder to break it once developed. For the last decade, we have become habituated to using technology in a certain way (at least for 99% of consumer tech):

A screen to see the activities, actions, or steps.
A proper UX/UI.
Confirmation at each critical step.

Only when all 3 of these are in place do we feel confident, because we have built this habit over a decade. When any one of them is missing, our confidence drops.

For example, in the same flight ticket example, would you feel comfortable and confident with a device doing the entire thing from just your voice command? That means looking for flights, picking the best one (based on time, safety, brand, etc.), adding the add-ons, interacting with the payment gateway, and so on.

If your answer is yes, then let’s try this.

You want to apply for a visa. Would you feel confident just telling a device “Apply for the visa”?

It’s not that the device will get it wrong; it is just that we are habituated in a certain way and our brain won’t accept a new pattern immediately. This is not a new phenomenon.

When computers first appeared decades ago, many organizations were not comfortable trusting a computer’s calculations. Until that point, humans were doing the calculations, and suddenly trusting computers was a behavioral change that took its sweet time.

A similar phenomenon will happen here too. Adopting these devices requires a change in behavior, and that will be their biggest challenge.

Along with “breaking the behavior”, there are a few more major challenges these devices could face:

Predictability of the system behavior (for example, ask ChatGPT a similar question multiple times and it will give a different variation of the answer each time. Now scale the impact of this problem up by 100x, and that’s a big deal).
Training of these systems (there is a saying in data science: “The quality of the output depends on the quality of the input”. Creating high-quality training data is a big challenge).
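
The predictability point can be illustrated with a small Python sketch. It assumes the variation comes from the system picking answers at random in proportion to their probability instead of always taking the top one; the answers and probabilities below are made up for illustration.

```python
import random

# Hypothetical answer probabilities for one question; the numbers are made up.
ANSWER_PROBS = {"Flight A": 0.5, "Flight B": 0.3, "Flight C": 0.2}


def greedy_answer(probs):
    """Always return the single most probable answer (repeatable)."""
    return max(probs, key=probs.get)


def sampled_answer(probs, rng):
    """Draw an answer at random in proportion to its probability."""
    answers = list(probs)
    weights = [probs[a] for a in answers]
    return rng.choices(answers, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed only so the demo is reproducible

# Ask the same "question" 200 times and collect the distinct answers.
greedy_runs = {greedy_answer(ANSWER_PROBS) for _ in range(200)}
sampled_runs = {sampled_answer(ANSWER_PROBS, rng) for _ in range(200)}

print(greedy_runs)
print(sampled_runs)
```

Always taking the top answer gives the same result every time, while sampling gives more than one. A varied answer is harmless in a chat, but it matters far more when the answer decides which ticket actually gets booked.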

Both of the above look like software problems, but their impact would be felt by the end user, which can affect the adoption of these devices.

My take on these devices?

All the products I mentioned at the beginning (like rabbit, Humane AI Pin, Tab AI, etc.) could utterly fail. But we still need them on the journey until we reach the global optimum where these devices can succeed.

Iterations are necessary for any tech to become big. If you remember, between the keypad & the full-blown touch screen, we saw tens of form factors. Let me show you:

All the above devices (to be precise, form factors) failed. But they paved the way for the devices that we have today. Similarly, we don’t know what kind of devices will become the next big thing, but we are in for a ride & that is the beautiful part. You’ll be part of one of those lucky generations that gets to witness a full-blown pivot in front of its eyes.

Last note:

I have written this article after reading, thinking, watching, and understanding several resources. There is a chance my assumptions could go wrong, but that’s the beauty, isn’t it? Unpredictability.

Thanks for reading. Please share your opinion in the comments.

I’m Naresh, a product & tech enthusiast excited to witness the tech pivots in the upcoming years. You can reach out to me on Twitter here.

Originally published on Medium