Q: 7
A Generative AI Engineer has been asked to build an LLM-based question-answering application. The
application should take into account new documents that are frequently published. The engineer
wants to build this application with the least development effort and have it operate at
the lowest cost possible.
Which combination of chaining components and configuration meets these requirements?
Options
Discussion
Nah, I'm sticking with A. D looks nice because of agents, but setting one up adds work that isn't needed for basic RAG. B is a common trap: fine-tuning is pricey and overkill just to keep the answers current. If someone has seen otherwise in recent exams, let me know.
Not B, it's A. Frequent fine-tuning (option B) is way more effort and cost than just updating retriever indexes. A lines up with RAG, fits new docs well, and is less complex than setting up agents or fine-tuning. D sounds tempting but agent setup is overkill for this use case. Pretty sure it's A here, unless I'm missing something subtle.
I don’t think it’s D, A fits better. Fine-tuning and agent configs add both cost and dev time, while using a retriever like in A covers new docs cheaply. Not totally certain but A looks right here.
A tbh
I don't think it's D, agents add extra config and maintenance which isn't the lowest effort. A avoids the trap of B (fine-tuning is costly) so pretty sure A is right here.
Official Databricks docs and a bit of hands-on with prompt chaining cover setups like A pretty well.
A for this one
C or D for me. Both mention agents or prompt engineering, which feels like less setup than a whole retriever pipeline. I think C is especially simple, just doing the prompt work plus LLM, so maybe lowest cost? Could be missing something though.
Makes sense, A fits the RAG pattern and handles new docs easiest. Pretty sure that's right.
Not B, A. Fine-tuning every time would be expensive, RAG covers updates without much extra work.
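To make the retriever-vs-fine-tuning point concrete, here is a minimal sketch of the option A pattern. It's an assumption-heavy toy: the bag-of-words "embedding", the in-memory index, and the call_llm placeholder are all made up for illustration, not Databricks or vector-store APIs. The point is that keeping answers current only means appending new documents to the index; the LLM itself is never retrained.

```python
# Toy sketch of the retriever + LLM (RAG) pattern from option A.
# embed() is a throwaway bag-of-words stand-in and call_llm is a placeholder
# you would swap for your real embedding model and model serving endpoint.

from collections import Counter
import math

# Toy vector store: list of (term-count vector, original text) pairs.
index: list[tuple[Counter, str]] = []

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def add_documents(docs: list[str]) -> None:
    # Newly published documents are embedded and appended to the index.
    # No fine-tuning or retraining of the LLM is involved.
    index.extend((embed(d), d) for d in docs)

def answer(question: str, call_llm, k: int = 2) -> str:
    # Retrieve the top-k most similar documents and stuff them into the prompt.
    q = embed(question)
    top = sorted(index, key=lambda pair: -cosine(pair[0], q))[:k]
    context = "\n\n".join(text for _, text in top)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)
```

When new docs land, you just call add_documents again; nothing about the model or its serving endpoint changes, which is why option A stays cheaper in both dev effort and operating cost than B's repeated fine-tuning runs or D's agent setup.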