Everyone in AI is talking about Manus. We put it to the test.

The new general AI agent from China had some system crashes and server overload—but it’s highly intuitive and shows real promise for the future of AI helpers.

Since the general AI agent Manus was launched last week, it has spread online like wildfire. And not just in China, where it was developed by the Wuhan-based startup Butterfly Effect. It’s made its way into the global conversation, with influential voices in tech, including Twitter cofounder Jack Dorsey and Hugging Face product lead Victor Mustar, praising its performance. Some have even dubbed it “the second DeepSeek,” comparing it to the earlier AI model that took the industry by surprise for its unexpected capabilities as well as its origin.

Manus claims to be the world’s first general AI agent, using multiple AI models (such as Anthropic’s Claude 3.5 Sonnet and fine-tuned versions of Alibaba’s open-source Qwen) and various independently operating agents to act autonomously on a wide range of tasks. (This makes it different from AI chatbots, including DeepSeek, which are based on a single large language model family and are primarily designed for conversational interactions.)
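To make that distinction concrete, here is a minimal sketch of the orchestrator-plus-workers pattern that multi-agent systems of this kind typically use. Everything in it is hypothetical: the model names, the call_model helper, and the task-decomposition logic are illustrative stand-ins, not Manus's actual implementation.

```python
# Illustrative sketch of an "orchestrator + worker agents" pattern.
# NOT Manus's actual code: call_model and all model names are stand-ins
# for what would be real LLM API calls in a production system.

from dataclasses import dataclass


def call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a call to an LLM API."""
    return f"[{model}] response to: {prompt[:50]}..."


@dataclass
class WorkerAgent:
    name: str   # role of this agent, e.g. "researcher" or "writer"
    model: str  # the model this agent runs on

    def run(self, subtask: str) -> str:
        # Each worker operates independently on its own subtask.
        return call_model(self.model, subtask)


class Orchestrator:
    """Decomposes a task with a planner model, then delegates to workers."""

    def __init__(self, planner_model: str, workers: list[WorkerAgent]):
        self.planner_model = planner_model
        self.workers = workers

    def run(self, task: str) -> str:
        # 1. Ask the planner model to break the task into steps.
        plan = call_model(self.planner_model, f"Break into steps: {task}")
        # 2. Farm the plan out to independent worker agents by role.
        subtasks = [f"As the {w.name}, do your part of: {plan}"
                    for w in self.workers]
        results = [w.run(s) for w, s in zip(self.workers, subtasks)]
        # 3. Merge the workers' outputs into a single answer.
        return call_model(self.planner_model,
                          "Summarize these results: " + " | ".join(results))


if __name__ == "__main__":
    agent = Orchestrator(
        planner_model="planner-llm",
        workers=[WorkerAgent("researcher", "search-llm"),
                 WorkerAgent("writer", "writing-llm")],
    )
    print(agent.run("Compile a list of notable reporters covering China tech."))
```

The point is the structure, not the specifics: a planner model decomposes the job, independent workers each execute a piece, and the results are merged, whereas a chatbot would answer the whole request in a single model call.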

Limited Access and User Experience

Despite all the hype, very few people have had a chance to use it. Currently, under 1% of the users on the waitlist have received an invite code. (It’s unclear how many people are on this list, but for a sense of how much interest there is, Manus’s Discord channel has more than 186,000 members.)

MIT Technology Review was able to obtain access to Manus, and when I gave it a test-drive, I found that using it feels like collaborating with a highly intelligent and efficient intern: while it occasionally lacks understanding of what it's being asked to do, makes incorrect assumptions, or cuts corners to expedite tasks, it explains its reasoning clearly, is remarkably adaptable, and can improve substantially when given detailed instructions or feedback. Ultimately, it's promising but not perfect.

Testing Manus

To put it to the test, I gave Manus three assignments:

  • Compile a list of notable reporters covering China tech.
  • Search for two-bedroom property listings in New York City.
  • Nominate potential candidates for Innovators Under 35.

Task 1: Listing Reporters

The first list Manus produced was incomplete; the agent cited "time constraints" as its reason for cutting corners. With additional guidance, however, it compiled a more comprehensive and well-structured list.

Task 2: Apartment Search

Manus initially struggled with the vaguely specified requirements but refined its results after additional input. The final output was well organized, with categories like "best overall," "best value," and "luxury option."

Task 3: Finding Innovators Under 35

This was the most challenging task. While Manus efficiently worked through selection criteria and research strategies, it was stymied by paywalls and delivered an incomplete list. Even after three hours, it had produced detailed profiles for only three candidates before falling back on a broader but less reliable list.

Final Thoughts

Overall, Manus is a highly intuitive tool with great potential, though it suffers from system instability, crashes, and difficulty handling large data loads. While it occasionally outperformed ChatGPT DeepResearch, its reliability is still an issue.

Manus’s transparency and collaborative approach make it a promising alternative for white-collar professionals, developers, and small teams. If its infrastructure improves, it could become a significant player in the AI agent space.

Manus isn't perfect yet, but it shows the growing impact of Chinese AI companies on the future of autonomous AI agents.
