Conversation
This is much, much faster than sequential analysis. On a codebase of approximately 12,500 files I estimate that sequential analysis would take ~10 days; in batch mode it takes less than a day. However, the quality of the output is not as good, so I need to investigate the root cause. I have hardcoded quite a lot of Anthropic/Bedrock-specific types and logic into the batch services, but it should be easy enough to move that out into a BedrockAnthropic batch_client and keep the main codebase generic.
|
I have re-run that subdirectory without gleaning and it seems much better. I'm trying it on a larger codebase now. Edit: updating this in case anyone has similar issues. Without gleaning I saw far fewer hallucinations, so I tried it on the full codebase. There the result quality dropped quite a lot: it didn't seem to know about key components and frequently referenced irrelevant things. When I looked at the entities and relationships the RAG system was supplying with the query, I noticed that a lot of garbage was being sent. It turned out that because I was gathering data from all […], I cut the gathered data down to just […]. Incidentally, the batch run took less than 6 hours, and that includes waiting a few hours for Bedrock capacity. I kicked off a sequential build at the same time and it is currently at 2.5%. |
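The fix described above amounts to trimming the gathered graph data before it is injected into the query. A minimal sketch of that idea, with hypothetical field names and scoring (not the actual code from this thread), could look like:

```python
def trim_context(entities, relationships, max_entities=20, max_relationships=30):
    """Illustrative sketch: keep only the highest-scoring entities and the
    relationships that connect them, so the prompt is not padded with
    irrelevant graph data. Field names ("score", "name", "source", "target")
    are assumptions for the example."""
    top = sorted(entities, key=lambda e: e["score"], reverse=True)[:max_entities]
    names = {e["name"] for e in top}
    kept = [
        r for r in relationships
        if r["source"] in names and r["target"] in names
    ][:max_relationships]
    return top, kept
```

Anything along these lines that caps the retrieved context should reduce the "garbage" entities reaching the model.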
|
Hi @tbtommyb, thanks for sharing. I am researching batch processing as it fits this type of task better and is way cheaper. I did a few experiments with the Anthropic API. I tried Bedrock, but I don't like the 50-message minimum per request, and Bedrock's limits are still not the best. |
|
You could write different BatchClients for Anthropic, OpenAI, etc. and their batch APIs and reuse some of my logic. Bedrock is what I have access to, so I use it. For embeddings I just use the standard embedding client in the codebase; they are fast enough that it's not really worth batching. |
I am not expecting you to merge this PR as it is quite messy with a lot of line noise (not sure why; I did use ruff), but I thought you might be interested in my approach.
I have been working to implement batch analysis using Anthropic on AWS Bedrock because I find the sequential analysis too slow on large codebases. I have hardcoded quite a lot of Anthropic/Bedrock-specific types and logic into the batch services, but if you want to support batch mode and settle on an interface, I can try to refactor it into a BedrockAnthropic batch_client. Separate clients could then implement support for OpenAI's and Anthropic's batch modes and keep the main codebase generic.
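To make the interface idea concrete, a provider-agnostic client could be sketched as a Protocol; all names here are illustrative suggestions, not the PR's actual API:

```python
from typing import Protocol, Sequence


class BatchClient(Protocol):
    """Hypothetical provider-agnostic batch interface."""

    def submit(self, requests: Sequence[dict]) -> str:
        """Submit a batch of requests; return a provider-side job id."""
        ...

    def poll(self, job_id: str) -> str:
        """Return the job status, e.g. 'in_progress' or 'completed'."""
        ...

    def results(self, job_id: str) -> list[dict]:
        """Fetch completed responses, keyed back to the original requests."""
        ...


class BedrockAnthropicBatchClient:
    """One concrete implementation; an OpenAIBatchClient etc. would mirror
    the same three methods. The bodies are stubbed for illustration."""

    def submit(self, requests: Sequence[dict]) -> str:
        # Would call Bedrock's batch inference API here.
        return "job-123"

    def poll(self, job_id: str) -> str:
        return "completed"

    def results(self, job_id: str) -> list[dict]:
        return []
```

The main codebase would depend only on `BatchClient`, keeping provider-specific types out of the core services.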
Analysis using this batch mode is much, much faster than sequential analysis. On a codebase of approximately 12,500 files I estimate that sequential analysis would take ~10 days. In batch mode it takes less than a day, and that includes a summarization call somewhere that I haven't yet moved to batch mode, which alone took 10 hours.
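A quick back-of-envelope check on what those numbers imply per file:

```python
files = 12_500
sequential_days = 10

# Implied per-file latency if the whole sequential run takes ~10 days.
seq_seconds_per_file = sequential_days * 24 * 3600 / files  # 69.12 s/file

# Batch mode finishing in under a day implies at least a ~10x speedup.
min_speedup = sequential_days / 1
```

So even with the un-batched 10-hour summarization call included, batch mode is an order of magnitude faster end to end.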
Since the existing code is based on sequentially analysing documents, I have had to make a lot of changes. The basic flow is:
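In outline, a batch driver along these lines (chunk documents, submit each chunk as one job, poll until complete, collect results) can be sketched as follows; this is a hypothetical sketch with assumed names, not the PR's actual code:

```python
import time


def run_batch_analysis(documents, client, batch_size=100):
    """Hypothetical driver loop: chunk documents into batches, submit each
    batch as one job, poll until it completes, then collect the results."""
    results = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        # custom_id lets responses be matched back to their source documents.
        requests = [
            {"custom_id": str(start + i), "prompt": doc}
            for i, doc in enumerate(batch)
        ]
        job_id = client.submit(requests)
        while client.poll(job_id) != "completed":
            time.sleep(30)  # batch jobs typically run for minutes to hours
        results.extend(client.results(job_id))
    return results
```

Any object exposing `submit`/`poll`/`results` would work here, which is what would keep the main codebase generic across providers.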
However, in experimenting with the output I find that the batched RAG generates less accurate answers than the sequential RAG. It hallucinates entities a bit more and seems to have less breadth available. For example, I asked it a question about a subdirectory of a React/Redux app. The sequential RAG generated a pretty good answer, but the batched one focused inappropriately on Redux reducers for some reason. I am now wondering if that is due to the gleaning step, as I see you warn that it causes hallucinations. I will try without gleaning, but please let me know if you can see anything obviously wrong in the code. I track error rates at the various steps and they are all well below 1%, so I am not sure what else might be wrong.