Home / IT Security & Compliance / Eurostar’s AI Chatbot Exposed by Basic Security Flaws

Eurostar’s AI Chatbot Exposed by Basic Security Flaws

Dec 23, 2025

Samuel DuvainsSoftware Integration Advisor

The rapid integration of artificial intelligence into customer-facing services has created a new frontier for digital interaction, but it has also exposed a concerning trend where the allure of cutting-edge technology overshadows the necessity of fundamental cybersecurity. A recent in-depth analysis of Eurostar’s new AI chatbot serves as a stark illustration of this very problem, revealing that the sophisticated Large Language Model (LLM) at its core was left vulnerable by a series of elementary security oversights. The investigation, conducted by cybersecurity firm Pen Test Partners, uncovered how classic web and API vulnerabilities were not only present but amplified by the AI’s architecture. This case study highlights a critical lesson for the industry: no amount of advanced AI can compensate for a failure to implement and enforce the foundational principles of digital security, a lapse that can turn a helpful tool into a significant liability.

The Illusion of Security and a Critical Flaw

At first glance, the architecture of Eurostar’s chatbot suggested a thoughtful approach to security, designed to manage the unpredictable nature of an LLM. The system was built on a REST API that processed the entire conversation history with each turn of dialogue, a common practice for maintaining context. To safeguard interactions, the design incorporated a system of guardrails intended to filter out disallowed or malicious content before it could reach the AI model. Furthermore, each message that successfully passed these checks was given a cryptographic signature, theoretically providing a verifiable indicator of its integrity. This multi-layered approach was meant to create a secure environment by validating user inputs and controlling the AI’s outputs. However, the theoretical strength of this design was completely undone by a critical oversight in its real-world implementation, creating a facade of security that crumbled under scrutiny.

The entire security framework hinged on a single, catastrophic flaw in its server-side enforcement. While the chatbot meticulously sent the complete conversation history to the backend with every new message, the server was programmed to verify the cryptographic signature of only the most recent entry in that conversation. This oversight created a gaping loophole. An attacker could easily craft a dialogue where the final message was entirely benign—a simple “Hello,” for instance—which would pass the guardrail check and receive a valid signature. Using standard client-side development tools, the attacker could then intercept the request and alter any of the previous messages in the conversation history, injecting malicious prompts or commands. Because the server only validated the last message, the tampered, malicious content in the earlier part of the dialogue was accepted as trusted context and fed directly into the LLM, effectively rendering the entire guardrail and signature system useless.

Exploiting the Breach from Prompt Injection to Data Leaks

Once the primary security mechanism was bypassed, the researchers gained direct, unfiltered access to the underlying LLM, enabling a classic prompt injection attack. This allowed them to steer the chatbot far from its intended purpose of providing travel assistance. While the vulnerability did not expose other users’ personal data, it did allow for the successful extraction of sensitive operational details from the system. By carefully crafting their injected prompts, the security team compelled the model to reveal its own system prompt—the core set of instructions that defines its personality, constraints, operational rules, and response formatting. The leakage of a system prompt is particularly dangerous because it provides an attacker with a detailed blueprint of the AI’s internal logic, making it far easier to craft more sophisticated and targeted attacks in the future, especially as the chatbot’s capabilities might expand to handle sensitive functions like booking modifications or personal data retrieval.

The consequences of the initial breach cascaded into other serious vulnerabilities. The chatbot was designed to format some of its responses using HTML, such as embedding hyperlinks to help articles, but it failed to sanitize the output generated by the LLM before rendering it in the user’s browser. Capitalizing on the successful prompt injection, researchers were able to trick the model into generating and returning arbitrary HTML and JavaScript code instead of legitimate support links. This flaw created a severe Cross-Site Scripting (XSS) vulnerability. A malicious actor could exploit this to inject malicious scripts, sophisticated phishing links, or other harmful content directly into the chat window, delivered from the trusted Eurostar domain. This would make users far more likely to interact with the malicious content, potentially leading to session hijacking or credential theft. Further investigation also uncovered weak validation of conversation and message IDs, which could theoretically be leveraged to execute a stored XSS attack, replaying a malicious payload into another user’s session.

A Broken Disclosure Process and Broader Implications

The technical failures discovered in the chatbot were significantly compounded by a deeply problematic and poorly managed vulnerability disclosure experience. Despite following the company’s published Vulnerability Disclosure Program (VDP), the initial report submitted by Pen Test Partners in June went unanswered for weeks, as did subsequent follow-up attempts. It was only after a colleague of the researcher resorted to using a private message on LinkedIn to escalate the issue that a response was finally received from Eurostar. This lack of a timely and professional communication channel immediately raised concerns about the company’s commitment to working with the security community. The friction highlighted a critical gap between having a VDP on paper and having a functional, responsive process in practice, undermining the very trust such programs are meant to build.

Eurostar’s eventual response revealed an alarming breakdown in its internal security processes. The company claimed it had no record of the initial disclosure, admitting it had outsourced the management of its VDP after the report was submitted and, in the process, had apparently lost an unknown number of incoming vulnerability reports. More disturbingly, at one point in the exchange, Eurostar’s team suggested the researchers were attempting blackmail—a baseless and unprofessional accusation given that no threats had been made. This reaction demonstrated a profound misunderstanding of good-faith security research. This entire ordeal underscores how poor process management and a confrontational attitude can severely damage relationships with the security community, ultimately hindering a company’s ability to protect its customers. It serves as a powerful reminder that technical security and procedural integrity must go hand in hand.

Lessons for the AI Era

The security lapses identified in Eurostar’s system were not unique; they were representative of a broader industry trend where the rush to deploy AI has often outpaced the implementation of fundamental security measures. The incident reflected a common mindset that treats the underlying web and API infrastructure as secondary to the novelty of the LLM itself. As a result, classic vulnerabilities that have been understood for decades were left open, creating pathways for exploitation that the AI, by its nature, could not defend against. This case made it clear that securing an AI-driven application requires a holistic approach, where every component, from the user interface to the backend API and the model itself, is subject to rigorous security validation. Trust in a digital service is not component-specific; a failure in the chatbot could erode a customer’s confidence in the entire brand, demonstrating that security is an integral part of the user experience.