A quick look at the current state and future of AI Agents

Author: jolestar

I experimented with AI Agents last week and attended the ai16z event in Beijing the day before yesterday. I wanted to see what AI Agents can actually do today and think about what they might do in the future.

The current state of AI Agents reminds me of the meme of a person hidden inside a vending machine. Everyone imagines that AI Agents have started to develop autonomous consciousness, but in reality there is a developer hidden inside each one. (Please imagine the picture here. I tried to get an AI to generate it, but found that AI does not understand "hide".)

How the AI Agent Framework works

The AI Agent framework currently acts as glue. It binds together clients (Twitter, Discord, Telegram, etc.) and various plug-ins (chains, etc.), provides a base library (memory storage, session isolation, context generation, etc.), and connects to the interfaces of the various AI platforms.
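The glue role described above can be sketched in a few lines. This is only an illustration under my own assumptions, not the code of any real framework: the names Message, AgentRuntime, chain_plugin and echo_model are hypothetical, and echo_model stands in for a call to a real AI platform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Message:
    client: str   # where the message came from, e.g. "twitter", "discord"
    room_id: str  # conversation the message belongs to
    user: str
    text: str


ModelFn = Callable[[str], str]     # prompt -> completion (hypothetical interface)
Plugin = Callable[[Message], str]  # returns extra context for the prompt


@dataclass
class AgentRuntime:
    model: ModelFn
    plugins: List[Plugin] = field(default_factory=list)
    # Memory is keyed by (client, room_id), which gives session isolation.
    memory: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def build_context(self, msg: Message) -> str:
        """Context generation: history, then plug-in notes, then the new message."""
        history = self.memory.get((msg.client, msg.room_id), [])
        notes = [plugin(msg) for plugin in self.plugins]
        return "\n".join(history + notes + [f"{msg.user}: {msg.text}"])

    def handle(self, msg: Message) -> str:
        reply = self.model(self.build_context(msg))
        log = self.memory.setdefault((msg.client, msg.room_id), [])
        log.append(f"{msg.user}: {msg.text}")
        log.append(f"agent: {reply}")
        return reply


def chain_plugin(msg: Message) -> str:
    # Hypothetical plug-in: a real one would query a chain here.
    return "[chain] balance lookup is available"


def echo_model(prompt: str) -> str:
    # Stand-in for a real model API call.
    return f"(reply based on {len(prompt.splitlines())} lines of context)"


runtime = AgentRuntime(model=echo_model, plugins=[chain_plugin])
print(runtime.handle(Message("discord", "room-1", "alice", "hi")))
```

The point of the sketch is that the framework itself contains no intelligence: it routes messages, keeps per-session memory, and assembles the prompt that is sent to the model.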

How to combine the AI Agent framework with applications and business scenarios

Since AI took off last year, all kinds of platforms and tools have emerged. They are all trying to solve one core problem: how to combine AI with applications. Some AI platforms offer plug-ins, some build workflow models, and some embed AI in traditional applications. The key questions are: 1. Where is the application's interactive entry point? 2. How can AI be combined with existing business logic?

The entry point that most AI platforms offer users is a dialog box that looks like a chat window. Evidently everyone believes that interaction with AI applications should be "anthropomorphic". The clever thing about AI Agents is that they plug directly into the open IM and social systems that already exist, which users accept far more readily than a newly created interface.

As for combining AI with existing business logic, the AI Agent approach is to let developers build AI decisions into business scenarios. Programs require certainty: an if condition is either true or false, so conventional code cannot handle fuzzy business logic directly. AI can reduce a fuzzy, complex judgment to a precise condition, which then fits seamlessly into the business logic.

Take replying to messages in a group chat. A traditional IM bot needs an explicit command in the message to trigger it. With AI, you can implement a method such as shouldReplyMessage that returns true or false based on the context.
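Here is a minimal sketch of such a method, written as should_reply_message in Python. It assumes a hypothetical model callable that takes a prompt and returns text; fake_model is only a stand-in so the example runs without any AI platform, and the prompt wording is my own.

```python
from typing import Callable, List

ModelFn = Callable[[str], str]  # prompt -> completion (hypothetical interface)


def should_reply_message(model: ModelFn, history: List[str], message: str) -> bool:
    """Ask the model for a YES/NO verdict and reduce it to a strict boolean."""
    prompt = (
        "You are a bot in a group chat. Decide whether the newest message "
        "is addressed to you or needs your answer.\n"
        "Answer with exactly one word: YES or NO.\n\n"
        "Recent messages:\n" + "\n".join(history) + "\n\n"
        "Last message:\n" + message
    )
    answer = model(prompt).strip().upper()
    # Anything other than a clear YES counts as NO, so a vague or
    # malformed model answer never triggers a reply.
    return answer.startswith("YES")


def fake_model(prompt: str) -> str:
    # Stand-in for a real model: says yes when the last message mentions the bot.
    last = prompt.rsplit("Last message:\n", 1)[-1]
    return "yes" if "@bot" in last else "no"


if should_reply_message(fake_model, ["alice: hello"], "bob: @bot what is gas?"):
    print("the bot replies")
```

The calling code stays ordinary if/else logic; only the evaluation of the condition is delegated to the model.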

The main functions of AI in business logic scenarios are:

1. "Intent" discovery: instructions in the prompt tell the AI to find the "intent" in the user's text message, based on the context, and map that intent to specific code.

2. Decision support: the AI converts fuzzy, complex conditions into a definite true/false or an enumeration value, which is then fed into the business logic.
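Both functions can be sketched together: the model's free-text answer is reduced to an enumeration value, and that value selects the code to run. The Intent values, the handlers and fake_model are hypothetical, chosen only for illustration.

```python
from enum import Enum
from typing import Callable, Dict

ModelFn = Callable[[str], str]  # prompt -> completion (hypothetical interface)


class Intent(Enum):
    CHECK_BALANCE = "check_balance"
    TRANSFER = "transfer"
    UNKNOWN = "unknown"


def discover_intent(model: ModelFn, text: str) -> Intent:
    """Map free text to one enum value; unrecognized answers become UNKNOWN."""
    options = ", ".join(intent.value for intent in Intent)
    prompt = (
        f"Classify the user's message as one of: {options}.\n"
        f"Reply with the label only.\n\nRequest: {text}"
    )
    label = model(prompt).strip().lower()
    try:
        return Intent(label)
    except ValueError:
        return Intent.UNKNOWN


# Each intent maps to ordinary, deterministic code written by a developer.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.CHECK_BALANCE: lambda text: "balance handler called",
    Intent.TRANSFER: lambda text: "transfer handler called",
    Intent.UNKNOWN: lambda text: "sorry, I did not understand that",
}


def dispatch(model: ModelFn, text: str) -> str:
    return HANDLERS[discover_intent(model, text)](text)


def fake_model(prompt: str) -> str:
    # Stand-in for a real model, keyed on simple phrases in the request.
    request = prompt.rsplit("Request: ", 1)[-1].lower()
    if "how much" in request:
        return "check_balance"
    if "send" in request:
        return "Transfer "
    return "maybe something else"


print(dispatch(fake_model, "How much do I have?"))
```

Falling back to UNKNOWN keeps the model's output from ever reaching a handler the developer did not explicitly write.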

At this point, many people may feel disappointed with AI Agents. They assumed that an AI Agent can do anything once it has been taught. In fact, because of the context limits of large models, it is impossible (at least for now) to build a universal AI that can do anything. The good news is that programmers don’t have to worry about losing their jobs. There are still plenty of programmers behind the AI, and people are still needed to stack up if/else statements. The key difference is that the range of business scenarios programs can handle is expanding.

Two types of AI Agents

At the event, Shaw was asked the following question. The market has two expectations for AI Agents: 1. An AI Agent plays a role, has its own ID and brand, and provides services to users. 2. Each user has a personal AI Agent, the equivalent of a personal assistant, that helps with some of the user's tasks. Which of the two will be more popular? He thinks both directions are promising and that they may eventually merge.

The first direction is the one the market is exploring now. It amounts to turning services into AI agents. In the future there may be no app interface at all; every app will become a personified AI agent. The second direction is turning application clients into agents. In the future an application client will be a plug-in of the assistant agent, and the application's local data will become part of the agent's memory store. The same plug-in will also handle communication with the service agent in the cloud. This is a new application architecture that will change the entire infrastructure.

AI Agent Infrastructure Requirements

1. The infrastructure should be permissionless. Otherwise AI Agents will be blocked by all kinds of anti-abuse measures. Services should instead deter attacks through economic cost (Gas). Platforms that are relatively closed will be hit harder by this shift, and the enthusiasm for open platforms seen in the early days of Web2 will be rekindled.

2. AI Agents need to be able to control funds and make payments so that they can cover those costs.

In other words, future services, whether blockchain-based or not, will need to support crypto-style authentication based on private keys as well as crypto-based payments.
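The idea of protecting a service by economic cost rather than by blocking callers can be sketched as follows. This is a toy model under my own assumptions: MeteredService, its prepaid balances and the price unit are hypothetical, and private-key authentication of the caller is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict


class InsufficientFunds(Exception):
    """Raised when an account cannot pay for a call."""


@dataclass
class MeteredService:
    price_per_call: int  # in the smallest unit of some token (hypothetical)
    balances: Dict[str, int] = field(default_factory=dict)

    def deposit(self, account: str, amount: int) -> None:
        self.balances[account] = self.balances.get(account, 0) + amount

    def call(self, account: str, request: str) -> str:
        # No CAPTCHA, no rate limit, no allow-list: anyone may call,
        # human or agent, as long as the call is paid for.
        balance = self.balances.get(account, 0)
        if balance < self.price_per_call:
            raise InsufficientFunds(account)
        self.balances[account] = balance - self.price_per_call
        return f"handled: {request}"


service = MeteredService(price_per_call=2)
service.deposit("agent-1", 5)
print(service.call("agent-1", "get quote"))
```

Flooding such a service is possible but costs the attacker money on every request, which is the same mechanism Gas provides on a chain.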

Combining AI Agents with the chain

Beyond the two points above, how AI Agents can be combined with the chain is a direction everyone is exploring. At the event I talked with Mikkke about focEliza, which he is working on. Of the two types of AI Agents described earlier, at least the first needs an execution or verification environment provided by the chain. Once an AI Agent provides services to the outside world, trust becomes an issue, and the role it plays is essentially the same as that of a smart contract.

The name "smart contract" was controversial back in the day. It is just a piece of code, so how can it be "smart"? AI can make smart contracts truly worthy of the name. The difficulty is how to call an AI interface from within the smart contract environment. Since running a large model inside a verifiable environment is still a long way off, a solution similar to an oracle is the more feasible path.
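The oracle-style path can be illustrated with a toy simulation: the contract only records a request, an off-chain process calls the model, and the answer is written back in a later call. ContractSim and run_oracle_once are hypothetical names and this is not a real chain API. The sketch also makes the remaining weakness visible: the contract has to trust the registered oracle's answer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

ModelFn = Callable[[str], str]  # prompt -> completion (hypothetical interface)


@dataclass
class ContractSim:
    """Toy stand-in for an on-chain contract."""
    oracle: str  # address allowed to post answers
    pending: Dict[int, str] = field(default_factory=dict)
    results: Dict[int, str] = field(default_factory=dict)
    next_id: int = 0

    def request_ai(self, prompt: str) -> int:
        # On-chain step: record the request; no model runs here.
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = prompt
        return request_id

    def fulfill(self, sender: str, request_id: int, answer: str) -> None:
        # On-chain step: accept an answer only from the registered oracle.
        if sender != self.oracle:
            raise PermissionError("only the registered oracle may fulfill")
        if request_id not in self.pending:
            raise KeyError(request_id)
        del self.pending[request_id]
        self.results[request_id] = answer


def run_oracle_once(contract: ContractSim, address: str, model: ModelFn) -> int:
    """Off-chain step: answer every pending request with the model."""
    handled = 0
    for request_id, prompt in list(contract.pending.items()):
        contract.fulfill(address, request_id, model(prompt))
        handled += 1
    return handled


contract = ContractSim(oracle="0xoracle")
rid = contract.request_ai("Is this claim valid?")
run_oracle_once(contract, "0xoracle", lambda prompt: "yes")
print(contract.results[rid])
```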

There are many requirements surrounding AI Agents. How do AI Agents acquire public knowledge? How do AI Agents determine facts? How do AI Agents identify the same user on different platforms? How do you store the "memory" in smart contracts? If I have multiple devices, each with an AI Agent installed, how do they share memory?

You will find that on-chain data, on-chain relationships, DID, P2P networks and the other ideas originally developed for Web3 all take on new meaning and find new use cases.

Conclusion

I will repeat the conclusion from my 2021 talk on AI and blockchain: an Internet that is friendlier to AI is also friendlier to humans. It was just a passing thought at the time, but now that future has arrived.