It’s not too hard these days to generate a research report or an article with the help of AI: figure out the topic, prompt ChatGPT, and it will produce a draft you can polish into a decent article. However, I found myself even lazier than that. I just want to give the AI a topic and some key points and let it generate a long, comprehensive article for me.

There are two major blockers for a general AI to do this. First, it can’t access the internet to get the latest information. Second, there is an output length limit for most AI models (4,096 tokens even for GPT-4, which in practice tends to come out around 500 words).

Multi-agents & CrewAI

Then I came across a framework called CrewAI, which includes an example of using it to generate Instagram posts. The main concepts of CrewAI are agents, tasks, tools, and crews. Agents are powered by LLMs and can execute tasks. A crew is a group of agents that work together to complete a complicated task like generating a research report. Along the way, agents can use tools like web search and scraping to finish their tasks.

Sounds good, so I followed the Instagram-post example and created a few agents: a researcher, a writer, and an editor. The researcher agent collects information from the internet, the writer agent produces the draft, and the editor agent polishes it. Then I created a crew with these agents and let them work together to generate a research report.
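To make the structure concrete, here is a minimal pure-Python sketch of that researcher → writer → editor pipeline. This is deliberately *not* the real CrewAI API, just the shape of the idea; `call_llm` is a hypothetical stand-in for an actual model call.

```python
# A toy sketch of the agent/task/crew idea -- NOT the real CrewAI API.
# `call_llm` is a hypothetical stand-in for an actual model call.
def call_llm(prompt: str) -> str:
    return f"[llm output for: {prompt[:40]}...]"

class Agent:
    def __init__(self, role: str, goal: str):
        self.role, self.goal = role, goal

    def run(self, task: str, context: str = "") -> str:
        prompt = (f"You are a {self.role}. Goal: {self.goal}.\n"
                  f"Task: {task}\nContext: {context}")
        return call_llm(prompt)

class Crew:
    """Runs (agent, task) steps sequentially, feeding each output onward."""
    def __init__(self, steps):
        self.steps = steps

    def kickoff(self, topic: str) -> str:
        context = topic
        for agent, task in self.steps:
            context = agent.run(task, context)
        return context

researcher = Agent("researcher", "collect up-to-date information")
writer = Agent("writer", "write a draft from the research notes")
editor = Agent("editor", "polish the draft")

crew = Crew([
    (researcher, "research the topic"),
    (writer, "write the draft"),
    (editor, "polish the draft"),
])
article = crew.kickoff("How to use AI to generate long articles")
```

The real framework adds tools, memory, and delegation on top, but the sequential hand-off of context between agents is the core loop.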

Then I kicked off the crew process in the terminal and monitored it. I have to admit it feels like managing a team of AI agents working for me. After about 10 minutes the process finished and I got the article. The quality looks good, and it covers all the key points with up-to-date information. However, it fell short of my expectations on length: although I prompted the agents with clear instructions that length mattered, the output was still around 500 words, no matter how many times I tried.

Then I figured out that this was a limitation of the LLM itself. The flow is a sequence that ends at a single output node, the editor agent’s task, so the final result simply can’t exceed GPT-4’s output length limit.

Besides that, the execution is costly. Because it ingests content from Google searches multiple times, a single run can easily consume over 100k tokens, or about $3 if you use GPT-4. It had better be a great output, yet the length is still a concern.
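As a sanity check on that number, here is a quick back-of-the-envelope estimate. The per-token prices below are assumptions based on GPT-4’s list pricing at the time ($0.03 per 1k input tokens, $0.06 per 1k output tokens); check current pricing before relying on this.

```python
# Rough cost estimate for a multi-agent run.
# Prices are assumptions (GPT-4 list prices at the time of writing):
INPUT_PRICE_PER_1K = 0.03   # $ per 1k input tokens
OUTPUT_PRICE_PER_1K = 0.06  # $ per 1k output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1000 * INPUT_PRICE_PER_1K
            + output_tokens / 1000 * OUTPUT_PRICE_PER_1K)

# ~100k tokens ingested from web search, ~2k tokens of article output
print(round(estimate_cost(100_000, 2_000), 2))  # → 3.12
```

Most of the cost comes from repeatedly feeding scraped search results back into the model, not from the article text itself.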

Divide and Conquer

One of the most fundamental strategies in computer science kicks in: divide and conquer. If I can’t generate a whole article in one go, can I generate it in several parts and then combine them? That sounds possible.

But you can’t simply divide the article into several parts and let the agents work on them separately. Since the agents are not aware of each other, they can’t work on the same article coherently. So there has to be a master agent that looks after the structure of the article and assigns tasks to the sub-agents.

I created a master agent that writes the introduction and conclusion, divides the article body into several parts, and assigns each part to a sub-agent. The sub-agents generate the content for their parts and return it to the master agent, which then combines the parts into the final article.

As a first step the draft looks like:

# Introduction
abcd

# Part 1 (assigned to agent 1)
# Part 2 (assigned to agent 2)
...

# Conclusion
xyz

Since each section is now a separate task, the output length is no longer a concern. The master agent can also contain the logic to control the length of each part. It all works well; I just found the last step, combining the parts, a bit tricky: all agents are supposed to use an LLM, but here I simply need the parts concatenated. At the moment I combine them manually, but it should be relatively easy to implement a custom tool or a separate script for this.
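That combining step doesn’t need an LLM at all; a small script can stitch the sections together. A minimal sketch (the function name and the placeholder contents are illustrative, mirroring the draft skeleton above):

```python
# Combine independently generated sections into one article -- no LLM needed.
def combine_article(intro: str, parts: dict[str, str], conclusion: str) -> str:
    sections = ["# Introduction", intro]
    for title, body in parts.items():
        sections += [f"# {title}", body]
    sections += ["# Conclusion", conclusion]
    return "\n\n".join(sections)

draft = combine_article(
    intro="abcd",
    parts={"Part 1": "content from agent 1", "Part 2": "content from agent 2"},
    conclusion="xyz",
)
print(draft)
```

Since the master agent already fixes the section order, plain string concatenation is enough here; the only real design choice is where the section titles come from.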

Open source models

The second blocker for scaling up is the price; you have to admit $3 for a 500-word article is not negligible. So I also tried open-source LLMs like Llama 3. I used the API from Groq, which is generous enough to provide a million tokens almost for free. With llama3-70b it works mostly fine, though from the logs I can tell it sometimes loses the action context mid-run and has to backtrack and try again. It still manages to generate the article, but the quality is not as good as GPT-4; I’d place it somewhere around GPT-3.5. But it’s almost free, so I can’t complain too much.
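Groq exposes an OpenAI-compatible endpoint, so switching to it is mostly a configuration change in whatever client the framework uses. A configuration sketch (the model id and environment variable name are assumptions; check Groq’s documentation for current model ids):

```python
import os
from openai import OpenAI  # any OpenAI-compatible client works

# Groq is OpenAI-compatible; only the base URL and API key change.
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],  # assumes this env var is set
)

response = client.chat.completions.create(
    model="llama3-70b-8192",  # Groq's Llama 3 70B model id at the time
    messages=[{"role": "user", "content": "Summarize divide and conquer."}],
)
print(response.choices[0].message.content)
```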

I also tried llama3-8b. It still manages to generate, but the quality drops dramatically. Considering you can even run it on your own machine, though, it can be a good fit for simple use cases.

Alternative platforms

Besides CrewAI, AutoGen, backed by Microsoft, is another platform that offers multi-agent capabilities. I haven’t tried it yet, but I had a glance at how it works; under the hood it also relies on LLMs to run agents, so you should expect similar results and the same issues brought by the LLM itself.

Another interesting and promising tool is GPT Researcher’s multi_agents. It is designed to generate research reports. I tried several runs and was impressed by its cost efficiency and long-article generation capability: for a topic like “How to use AI to generate long articles”, it can generate a 4k-word article in the detailed style for under $1, which is quite impressive. I almost switched to this tool for my task, but I found it’s not as flexible as CrewAI for customizing the agents and output format. Also, the sections don’t seem to know about each other, so it’s hard to generate a coherent article. However, if you expect exactly the same format for all your articles, GPT Researcher is a wonderful choice.

Conclusion

This is just my simple experiment with multi-agents for tasks like long-article generation. It’s a fast-evolving area, and I believe more tools and platforms will come out. Some platforms are even tackling complex tasks like automatic code development. It looks both exciting and a bit scary, but if it has to come, we’d better be prepared.