Encrypted or not, your AI chats aren’t private
Microsoft researchers have uncovered a critical privacy weakness in the way large language models deliver streaming responses over encrypted connections. The findings show that an attacker who can observe traffic patterns, without breaking encryption, can infer whether a user is discussing highly sensitive topics with an AI model. The researchers argue that the threat is realistic and scalable, and that it currently affects nearly every major commercial provider.
The study, titled “Whisper Leak: A Side-Channel Attack on Large Language Models” and published as an open-access preprint, analyzes how network-level metadata generated during LLM streaming reveals hidden information about user queries. It shows that attackers do not need to read or decrypt content. They only need to log packet sizes, sequence timing, and transmission patterns that emerge as the LLM generates each token.
A side-channel attack built from encryption’s blind spot
Modern LLMs typically return answers through a token-by-token streaming process, where each token is packaged into a separate encrypted message. While the text itself is unreadable under TLS, the timing and size of packets can still be observed. The researchers show that this metadata follows consistent patterns tied to the content being generated, creating a side channel that can reveal sensitive information about the conversation.
The threat model assumes a passive network observer with access to the encrypted traffic stream. This could be an ISP, a compromised Wi-Fi hotspot, a national monitoring system, or any party capable of recording internet packets without altering them. Because the attacker does not need to modify traffic or break encryption, the approach resembles long-standing side-channel attacks that exploit timing, power consumption, or cache behavior in cryptographic systems.
The researchers designed the experiment to mirror a realistic privacy threat by framing the attack as a simple binary classification task: detect whether a user is asking about a sensitive topic, such as the legality of a particular criminal activity. To do this, they prepare two categories of prompts: a small cluster of sensitive-topic questions and a much larger, diverse set of unrelated queries. The design reflects real-world conditions in which sensitive prompts are rare and buried within massive volumes of ordinary user interactions.
The team then collects encrypted traffic traces for all interactions across 28 different LLMs from major providers and converts each session into a sequence of TLS record lengths and inter-arrival times. This metadata forms the raw input for the classification models.
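To make that conversion concrete, the sketch below parses a captured session into the two metadata sequences the attack relies on. It is not the authors' tooling: the capture filename and server port are assumptions, and raw TCP payload lengths stand in for exact TLS record lengths.

```python
# Minimal sketch (assumed filename and port): turn a captured HTTPS session
# into the metadata the paper describes -- packet sizes and inter-arrival
# times -- using scapy. Raw TCP payload lengths are a rough proxy for TLS
# record lengths here.
from scapy.all import rdpcap, TCP

def trace_features(pcap_path, server_port=443):
    packets = rdpcap(pcap_path)
    sizes, gaps = [], []
    prev_time = None
    for pkt in packets:
        # Keep only server-to-client TCP segments that carry payload
        if TCP in pkt and pkt[TCP].sport == server_port and len(pkt[TCP].payload) > 0:
            sizes.append(len(pkt[TCP].payload))
            gaps.append(0.0 if prev_time is None else float(pkt.time - prev_time))
            prev_time = pkt.time
    return sizes, gaps

sizes, gaps = trace_features("llm_session.pcap")  # hypothetical capture file
```

The two resulting sequences, packet sizes and timing gaps, are exactly the per-session metadata that feeds the classifiers described next.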
Machine learning models extract hidden meaning from packet patterns
To analyze this encrypted network data, the authors train three distinct machine learning models, each targeting different structural patterns:
- A gradient-boosted decision tree model that ingests flattened numerical features extracted from packet sequences.
- A bidirectional recurrent neural network with an attention component that learns temporal dependencies across tokens.
- A modified DistilBERT model that treats discretized packet characteristics as tokens in a sequence classification problem.
The authors note that these three approaches cover a wide spectrum of analytical techniques: classical feature-based learning, deep recurrent time-series modeling, and transformer-based sequence analysis. Despite the variation in architecture, all three models produce strong results.
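As an illustration of the first approach only, the sketch below trains a generic gradient-boosted classifier on flattened, fixed-length vectors of packet sizes and timing gaps. The library choice (scikit-learn), the trace length cap, and the feature layout are assumptions, not the paper's exact configuration.

```python
# Classical feature-based learning on flattened traces (illustrative only).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

MAX_LEN = 200  # assumed cap on the number of packets considered per session

def flatten_trace(sizes, gaps, max_len=MAX_LEN):
    """Pad or truncate one (sizes, gaps) trace into a fixed-length vector."""
    sizes = (list(sizes) + [0] * max_len)[:max_len]
    gaps = (list(gaps) + [0.0] * max_len)[:max_len]
    return np.array(sizes + gaps, dtype=float)

def train_classifier(traces, labels):
    """traces: list of (sizes, gaps) tuples; labels: 1 = sensitive topic, 0 = other."""
    X = np.stack([flatten_trace(s, g) for s, g in traces])
    y = np.array(labels)
    clf = GradientBoostingClassifier()
    clf.fit(X, y)
    return clf
```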
The primary metric used is the area under the precision-recall curve, which is appropriate for imbalanced datasets where sensitive prompts are rare. Across nearly all providers tested, the performance reaches extremely high levels. Many models achieve AUPRC values above 98 percent, and several approach near-perfect discrimination even when only packet sizes are used as input.
The authors also test a high-imbalance scenario designed to reflect the likely real-world ratio of sensitive queries to generic ones. Even at a ratio of ten thousand generic prompts for every sensitive one, the models maintain high precision at non-trivial recall levels. This means an adversary could reliably detect some sensitive queries with essentially no false positives.
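In code, that evaluation amounts to computing the area under the precision-recall curve and then asking what precision is achievable at a useful recall level. The sketch below uses scikit-learn; `y_true`, `scores`, and the recall threshold are placeholders rather than the paper's exact settings.

```python
# Evaluation sketch for an imbalanced detection task (illustrative settings).
from sklearn.metrics import average_precision_score, precision_recall_curve

def evaluate(y_true, scores, min_recall=0.05):
    # Area under the precision-recall curve (AUPRC)
    auprc = average_precision_score(y_true, scores)
    # Best precision among operating points that still recover at least
    # `min_recall` of the sensitive queries
    precision, recall, _ = precision_recall_curve(y_true, scores)
    usable = precision[recall >= min_recall]
    return auprc, usable.max() if usable.size else 0.0
```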
The researchers conclude that packet-level metadata forms a stable fingerprint that correlates with the content being generated. Even though the text is encrypted, the underlying generative process leaks patterns that effectively identify the topic of conversation.
Reproducible results across nearly every commercial provider
The researchers test 28 models, including multiple variants provided by major companies. The results reveal consistent vulnerabilities across the industry.
The study also highlights differences between providers based on how responses are streamed. Models that send one token at a time create strong, predictable sequences that attackers can easily analyze. Some providers that already use token batching show reduced leakage, but even these are not fully protected.
The experiment also includes a larger data collection run on a popular model to test how classification accuracy scales with more training samples. The findings show that performance continues to improve as the dataset grows. This suggests that real attackers, who can continuously collect traffic over months or years, would likely achieve even higher success rates than reported in the paper.
According to the study, this is not an exotic form of attack requiring specialized equipment. Packet logging tools like tcpdump are sufficient, and attackers need no access to internal model details, source code, or provider configurations. The risk emerges from the fundamental design of streaming generative models.
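To illustrate how low the barrier is, a passive observer could capture everything the classifiers need with a single tcpdump invocation; the interface name and output file below are placeholders.

```python
# Passive capture of encrypted traffic to a TLS endpoint (placeholders only).
import subprocess

subprocess.run([
    "tcpdump",
    "-i", "eth0",               # interface being monitored (placeholder)
    "-w", "llm_session.pcap",   # capture file, as consumed by the parsing sketch above
    "tcp", "port", "443",       # capture filter: encrypted HTTPS traffic only
], check=True)
```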
Mitigation strategies show partial protection but no complete fix
The researchers evaluate three mitigation techniques that LLM providers can implement at the server level: random padding, token batching, and packet injection. Each option targets a different part of the side-channel vulnerability.
Random Padding
This method adds noise to the size of each packet by inserting extra data. Although padding weakens the link between token length and packet size, the reduction in attack accuracy is modest. Many models still show high classification performance because timing signals and overall size trends remain intact.
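A rough server-side sketch of the idea follows; the message format is an assumption, and a real implementation would pad before or during encryption and use a cryptographically secure random source rather than Python's `random` module.

```python
# Illustrative random padding: each streamed chunk carries throwaway filler
# so its encrypted size no longer tracks the token text exactly.
import json
import random

def pad_chunk(token_text, max_pad=64):
    filler = "X" * random.randint(0, max_pad)             # random-length junk
    return json.dumps({"t": token_text, "pad": filler})   # receiver ignores "pad"
```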
Token Batching
Batching multiple tokens into each message removes fine-grained timing patterns and smooths the sequence of packet sizes. Simulations show that batching can significantly reduce side-channel leakage for certain models. However, it is not universally effective. Some models continue to leak enough structural information for high-confidence classification.
Providers also face a tradeoff: batching increases latency and changes how responses feel to the user. It also complicates streaming-based applications that rely on gradual output.
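A minimal sketch of batching, with `generate_tokens` and `send` standing in for a provider's own streaming machinery, shows why the tradeoff exists: nothing reaches the user until a batch fills or the response ends.

```python
# Illustrative token batching: several tokens share one outgoing message,
# blurring per-token packet sizes and timing at the cost of added latency.
def stream_batched(generate_tokens, send, batch_size=8):
    buffer = []
    for token in generate_tokens():
        buffer.append(token)
        if len(buffer) >= batch_size:
            send("".join(buffer))   # one message carries several tokens
            buffer.clear()
    if buffer:
        send("".join(buffer))       # flush whatever remains at the end
```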
Packet Injection
Injecting fake packets attempts to disguise both timing and size patterns. While this reduces leakage further than padding alone, it requires substantial bandwidth overhead, sometimes doubling or tripling the data sent to the user. That level of overhead may be unacceptable for providers serving global, high-volume traffic.
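The bandwidth cost is visible even in a toy sketch of the idea, where dummy messages that the client silently discards are interleaved with real chunks; all names, sizes, and probabilities here are illustrative assumptions.

```python
# Illustrative cover traffic: fake messages of random size are mixed into the
# stream, obscuring timing and size patterns but inflating bytes on the wire.
import random

def stream_with_cover_traffic(chunks, send, dummy_prob=0.5, max_dummy=256):
    for chunk in chunks:
        send({"type": "data", "body": chunk})
        if random.random() < dummy_prob:
            filler = "0" * random.randint(16, max_dummy)
            send({"type": "dummy", "body": filler})  # client drops these
```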
The authors conclude that while these techniques help, none eliminate the leak. Partial mitigation may impede casual attackers, but determined adversaries with large training datasets and long-term access to traffic would still succeed.
Responsible disclosure shows mixed industry response
The paper documents an extensive coordinated disclosure process beginning in mid-2025. The researchers privately notified all companies whose models were tested, providing technical details and suggested mitigations. Some leading providers, including OpenAI, Microsoft, Mistral, and xAI, implemented padding or other defenses shortly after disclosure. Others did not respond or chose not to act.
The responses varied widely. Some companies were receptive and acknowledged the risk. Others argued that packet-level metadata is inherently difficult to protect and expressed uncertainty about the feasibility of comprehensive mitigation.
The study states that the vulnerability persists across the ecosystem even after these partial fixes. Since the attack relies on fundamental properties of streaming generative models, the authors argue that future research should explore more radical architectural changes, such as compressing output into batches by default or rethinking how streaming protocols are structured.

