The Art of Summarization
Summarization has always played a crucial role in making dense information more digestible, memorable, and actionable. From the ancient Greek epitomes to modern-day executive summaries, the ability to distill complex content into its essence helps us understand, communicate, and make decisions faster. Every day, we are inundated with information. No person has the time or energy to attend every meeting, watch every TV show, read every book, or listen to every political speech. Summarization is indispensable.
Passing the Baton: How AI is Taking Over Summarization Duties
Historically, summarization was a highly regarded, human-driven skill, requiring excellent reading speed and comprehension. Especially prevalent in fields like academia and journalism, summaries help us keep tabs on recent developments in our industry or on developing news stories. Authors often round up their main messages at the end of a chapter or article, much like the moral at the end of each of Aesop’s fables. Even social media and discussion boards make widespread use of summaries. The term “TL;DR” (Too Long; Didn’t Read) is often used to introduce a summary of a lengthy post.
But now, for the first time in history, we are beginning to shift the task of summarization away from humans and onto AI.
The Summarization Potential of LLMs
Summaries are invaluable for breaking down complex information into the key takeaways, helping you grasp the “big picture” quickly. This efficiency can save countless hours, and large language models (LLMs) excel at it. Thanks to their broad training and ability to recognize intent, they can turn a twenty-minute read, like a whitepaper, into a quick two-minute overview.
The Challenge: Keeping LLMs on the Right Path in AIOps
AIOps requires a precise and context-aware approach for diagnosing issues and recommending solutions. LLMs, being generalists, are prone to taking liberties with what they consider important or irrelevant in a prompt if it’s not made explicit, often leading to unsatisfactory responses. This raises the question: how can we harness their summarization strengths while minimizing these risks?
It takes a nuanced approach. The LLM needs some hand-holding upfront, but the results are well worth the effort. If we combine intelligent systems built on domain expertise and traditional AI with the English-language mastery of Gen AI, we can achieve high-quality summarization in precision-critical tasks, including investigating root causes in observability.
Imagine this scenario: A network operations team is dealing with intermittent outages affecting part of their server infrastructure. To troubleshoot, the team collects logs, performance metrics, and historical incident reports. They believe they’ve identified the root cause but want to create a clear summary report to circulate within the organization, ensuring no details or affected systems are overlooked.
To streamline this, the team has developed a program that categorizes and translates incident data into a concise prompt for a large language model (LLM). The LLM then generates an error-free summary. This summary aligns with the team’s initial hypothesis, pointing out a potential link between high CPU usage and specific network traffic spikes. Additionally, it suggests that a recent software update might be responsible, as similar issues were logged after the update was implemented.
Armed with this summary, the team investigates further and confirms that the software update indeed contains a bug causing resource exhaustion under certain conditions. The summary not only provided a data-driven second opinion but also served as a clear communication tool. Now, the network operations team can effectively share the issue and its cause with cross-functional teams, ensuring better visibility and understanding across the organization.
This example illustrates the clear value LLMs can offer—with a bit of assistance—in enhancing observability within AIOps.
Key Considerations
When designing a high-precision system that involves LLMs, a few key considerations must be addressed.
- Eliminate the Possibility of Hallucinations: LLMs need to be carefully guided to avoid generating inaccurate or misleading summaries.
- Extract the Most Critical Information: The ability to distill information to its most important elements is key to effective summarization in AIOps.
- Weave in Domain Expertise: Integrating domain-specific knowledge ensures that the summaries produced by LLMs are relevant and actionable.
The Blueprint: Crafting a Robust Framework for LLMs in AIOps
To deploy LLMs effectively in AIOps, we need a layered approach. As we have discussed, using LLMs in observability is not going to be as easy as bolting one onto your production environment. We are at a point with the still-evolving technology where we need to set it up for success and limit its scope.
Using LLMs to help investigate root causes requires a “micro summarization” effort upfront. This could be described as the context-gathering phase, during which everything the LLM will need, and nothing more, is collected and formatted into a prompt. The purpose of this micro summarization is to enhance the performance of later stages of the summarization process.
The workflow then looks something like the following:
In observability, we are always watching for certain triggers, such as unexpected latency or downtime. As soon as an incident occurs, our systems collect related data on all sorts of important events happening at that time, including anomalies, correlated issues, errors, failures, and so on, using the environment topology as a guide. This is essentially the first round of summarization, as it focuses the scope on a particular area or areas. Knowing what information is important and from which systems is a skill tuned over years of experience by domain experts, such as senior SREs or tenured developers. Our system is the accumulation of this domain expertise.
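This first, topology-guided round of summarization can be sketched as follows. This is an illustrative sketch only, not the actual system: the `Event` and `Topology` types and the one-hop neighbor scoping are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    component: str
    kind: str      # e.g. "anomaly", "error", "failure"
    message: str

@dataclass
class Topology:
    # Adjacency list: component -> directly connected components
    edges: dict = field(default_factory=dict)

    def neighbors(self, component: str) -> set:
        return set(self.edges.get(component, []))

def collect_related_events(incident_component: str,
                           topology: Topology,
                           all_events: list) -> list:
    """First round of summarization: keep only events from the
    incident's component and its immediate topological neighbors."""
    in_scope = {incident_component} | topology.neighbors(incident_component)
    return [e for e in all_events if e.component in in_scope]
```

A real system would encode far richer scoping rules (how many hops to follow, which dependency types matter), which is exactly where the accumulated domain expertise lives.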
Once the right row-level data is gathered, it is not yet ready to be fed into the LLM. It must first be converted into an LLM-friendly text prompt. This process involves careful categorization to maintain the integrity of the information and prevent context loss.
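The categorization step might look like the following sketch, which groups row-level records by type so the prompt preserves structure rather than dumping raw rows. The category names and prompt wording are assumptions for illustration, not the production format.

```python
from collections import defaultdict

def build_prompt(records: list) -> str:
    """Convert gathered records into a categorized, LLM-friendly prompt.

    Each record is a dict with "category" and "detail" keys
    (hypothetical schema for this sketch)."""
    categories = defaultdict(list)
    for rec in records:
        categories[rec["category"]].append(rec["detail"])

    sections = []
    for category in ("anomalies", "errors", "correlated_issues"):
        details = categories.get(category)
        if details:
            lines = "\n".join(f"- {d}" for d in details)
            sections.append(f"{category.replace('_', ' ').title()}:\n{lines}")

    header = "Summarize the incident below and propose likely causes.\n\n"
    return header + "\n\n".join(sections)
```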
A model alone cannot distinguish between causes, effects, and side effects. Neither can the average person. It takes someone who is highly tuned and experienced in processing complex relationships between components. A skilled SRE has a mental map of the system and an inherent scoring mechanism they use for prioritization. Building on this wealth of domain expertise, we developed a series of rules to instruct the LLM on what information and correlations to prioritize above others. The resulting scoring and prioritization system equips the LLM with further context, making its responses more accurate and useful. At this point, we have what we need to tell the LLM, “This is the data we gathered, and here is how to prioritize it.”
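A rule-based prioritization like the one described above can be sketched in a few lines. The specific weights and signal names below are invented for illustration; in practice they would be derived from SRE expertise and tuned over time.

```python
# Hypothetical scoring rules: higher scores mean higher priority
# in the prompt. The numbers are placeholders, not real tuning.
PRIORITY_RULES = {
    "failure": 100,          # hard failures outrank everything
    "error": 50,
    "anomaly": 25,
    "correlated_issue": 10,
}

def score_event(event: dict) -> int:
    score = PRIORITY_RULES.get(event["kind"], 0)
    if event.get("on_critical_path"):   # topology-aware boost
        score *= 2
    if event.get("recent_change"):      # e.g. follows a deployment
        score += 30
    return score

def prioritize(events: list) -> list:
    """Order events so the prompt leads with what matters most."""
    return sorted(events, key=score_event, reverse=True)
```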
Running every prompt through the LLM would be inefficient. Some prompts may already have been run, or sufficiently similar issues may already have been summarized. In those cases, we don’t want the LLM to create a new summary from scratch. Instead, we first consult our incident database. If there is a close match, we return the existing summary.
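The lookup-before-generate step might be sketched as below. `SequenceMatcher` is a deliberately simple stand-in for the matching logic; a real system would more likely use embedding-based similarity search over the incident database.

```python
from difflib import SequenceMatcher
from typing import Optional

def find_existing_summary(prompt: str,
                          incident_db: dict,
                          threshold: float = 0.9) -> Optional[str]:
    """Return a stored summary if a past prompt is a close match,
    otherwise None (meaning the LLM should be called)."""
    best_ratio, best_summary = 0.0, None
    for past_prompt, summary in incident_db.items():
        ratio = SequenceMatcher(None, prompt, past_prompt).ratio()
        if ratio > best_ratio:
            best_ratio, best_summary = ratio, summary
    return best_summary if best_ratio >= threshold else None
```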
If no match is found, the prompt is fed into the LLM, which generates a summary and proposes potential causes. The LLM’s output is then reviewed and validated by human experts before being added to the incident database. In the case of meta-incidents, where an SRE may need to review multiple summaries at once, a meta-summary, basically a summary of connected summaries, is available.
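Assembling a meta-summary prompt is straightforward once the individual summaries exist. The wording below is illustrative; the point is that validated per-incident summaries are stitched into a single prompt so the LLM can summarize the summaries.

```python
def build_meta_prompt(summaries: list) -> str:
    """Combine validated summaries of connected incidents into
    one prompt for a meta-summary."""
    numbered = "\n\n".join(
        f"Incident {i}:\n{s}" for i, s in enumerate(summaries, start=1)
    )
    return ("The following incident summaries appear to be connected. "
            "Produce a single meta-summary of the shared root cause "
            "and overall impact.\n\n" + numbered)
```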
Peering into the Future: The Evolving Role of LLMs in Observability
Future developments in LLM technology will likely focus on improving contextual accuracy and reducing the occurrence of hallucinations. But the road to get there involves integrating real-time feedback loops where human experts continually train and adjust the models.
In our case, expanding the incident database and refining the prioritization rules will enhance the LLM’s ability to generate useful summaries in increasingly complex environments.
In Summary
Summarization is undeniably one of the most important superpowers LLMs bring to the table. But to leverage it effectively in high-precision fields like observability and root cause analysis, we must create systems that give LLMs context and specific instructions. Broaden the scope too much, and the LLM will not be able to say anything specific about an issue. Provide too little context, and the model may hallucinate. We built a system that incorporates domain expertise into the flow, so the model can intelligently point SREs in the right direction and allow them to more easily communicate within their organization.