Layered Defense in a Box: Multiple, correlated statistical analysis modules detect attacks that other defenses miss
OWASP defines the attack surface of a web application as the sum of all paths for data and commands into and out of the application, combined with the code that protects them and the data behind them. For organizations with even a moderately complex website, this comprises a vast area—and attacks can occur anywhere on it. This is why Cayuga Networks developed a next-generation solution with multiple layers of web application attack detection.
Each of these attack analysis modules delivers deep inspection and behavior analysis of web traffic to protect against malicious application activity. In addition, they share suspicious events with a centralized intelligence module, which aggregates evidence to determine the relative risk of any anomaly they detect.
How it works — Finding needles in a rapidly moving haystack
Network and Application Layer processing
Cayuga Networks next-generation web app protection employs a very fast detection engine to observe the network packet stream and detect various TCP and IP evasion techniques, decoding evasions such as overlapping TCP-segment and IP-fragment tricks. The detection engine also performs many important post-parsing actions, including decoding of directory-traversal, URL, hex, and Unicode evasions. It employs Code Flow Analysis (CFA™) technology to inspect inbound network traffic and sessions.
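The URL-evasion decoding step can be illustrated with a minimal sketch. This is not Cayuga Networks' implementation—`normalize_url` and its bounds are hypothetical—but it shows the general idea: repeatedly percent-decode until a fixed point (defeating double-encoding evasions), then collapse directory-traversal segments.

```python
from urllib.parse import unquote

def normalize_url(raw: str, max_passes: int = 5) -> str:
    """Repeatedly percent-decode until the URL stops changing
    (bounded to avoid decoding loops), then collapse traversal."""
    url = raw
    for _ in range(max_passes):
        decoded = unquote(url)
        if decoded == url:
            break
        url = decoded
    # Collapse directory-traversal sequences such as /a/../b -> /b
    parts = []
    for seg in url.split("/"):
        if seg == "..":
            if parts:
                parts.pop()
        elif seg not in ("", "."):
            parts.append(seg)
    return "/" + "/".join(parts)
```

A double-encoded traversal like `/a/%252e%252e/c` decodes to `/a/%2e%2e/c`, then `/a/../c`, and finally normalizes to `/c`—the form an inspection engine must see to match it against attack patterns.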
When CFA finds indications of hidden attack code, it analyzes the grammar of the suspect string across multiple languages—in parallel across the whole packet stream at web speed—to identify the likely language and verify that the transitions are grammatically allowed in the observed language. Using machine learning, CFA infers whether each incremental snippet in the stream contains valid code in the respective languages. It then recursively updates the probability that the string is grammatically correct until it can confidently confirm or deny that it is valid code. Obfuscated code is decoded at this level before it is delivered to the statistical analysis modules.
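The recursive probability update described above resembles sequential Bayesian updating. The sketch below is an illustration under that assumption, not CFA's actual algorithm: each snippet contributes a likelihood ratio (how much more probable the snippet is under the "valid code" hypothesis than under "benign text"), and the posterior is folded forward until it crosses a confidence threshold.

```python
def update_code_probability(prior, likelihood_ratios, hi=0.99, lo=0.01):
    """Fold per-snippet likelihood ratios into a posterior probability
    that the stream contains valid code; stop early once either
    confidence threshold is crossed."""
    p = prior
    for lr in likelihood_ratios:  # lr = P(snippet | code) / P(snippet | benign)
        odds = (p / (1 - p)) * lr
        p = odds / (1 + odds)
        if p >= hi or p <= lo:
            break  # confident enough to confirm or deny
    return p
```

A run of code-like snippets (each, say, three times likelier under the code hypothesis) drives the posterior above the confirmation threshold within a few updates, while benign-looking snippets drive it toward the denial threshold.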
The statistics engine employs numerous statistical analysis modules to correlate events for addresses, URLs and domains. It continuously calculates statistics on more than 100 types of suspicious indicators to determine an Anomaly Score. It also correlates network statistics to compute a stress index for all URLs and domains. Additionally, the statistics engine learns the traffic pattern and history of the various domains and URLs.
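A toy version of per-domain indicator accounting makes the Anomaly Score idea concrete. The indicator names and weights here are hypothetical placeholders (the real engine tracks more than 100 indicator types); the point is the shape of the computation: weighted running counts per domain, summed into a score.

```python
from collections import defaultdict

class DomainStats:
    """Running per-domain statistics: count weighted suspicious
    indicators and derive a simple anomaly score from them."""

    # Hypothetical indicator weights for illustration only.
    WEIGHTS = {"sql_keyword": 3.0, "encoded_payload": 2.0, "rare_header": 1.0}

    def __init__(self):
        self.counts = defaultdict(int)

    def observe(self, indicator: str):
        self.counts[indicator] += 1

    def anomaly_score(self) -> float:
        # Unknown indicators get a small default weight.
        return sum(self.WEIGHTS.get(k, 0.5) * n for k, n in self.counts.items())
```

Because the counts persist per domain, the engine can also learn each domain's normal traffic pattern over time and flag deviations from that baseline.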
Code Detection and HoneyPot Orchestration
Code detection is critical because code found in HTTP requests is highly indicative of an attack. When the detection engine determines a high probability that a snippet contains valid code, the Decisis honeypot orchestrator automatically sends the snippet into a sandboxed detonation chamber that has the same versions of application stack components and even the same code as the customer’s environment but lacks the important data. The orchestrator may then safely replay the code to see what it does and whether or not it poses a credible threat.
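The orchestration decision can be sketched as a simple dispatch, with all names (`orchestrate`, `Verdict`, the stub sandbox) being hypothetical illustrations rather than the Decisis API: snippets below the confidence threshold are ignored, while confident detections are replayed in an isolated environment and classified by the result.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    malicious: bool

class StubSandbox:
    """Stand-in for a sandboxed detonation chamber; a real one would
    mirror the customer's application stack without the real data."""
    def replay(self, snippet: str) -> Verdict:
        # Trivial placeholder check for demonstration purposes.
        return Verdict(malicious="rm -rf" in snippet)

def orchestrate(snippet, code_probability, sandbox, threshold=0.95):
    """If the detection engine is confident the snippet is valid code,
    hand it to the sandbox for safe replay and return its verdict."""
    if code_probability < threshold:
        return "ignore"
    verdict = sandbox.replay(snippet)
    return "threat" if verdict.malicious else "benign"
```

The key design point is that replay happens only in the sandbox, so observing what the code does never puts production data at risk.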
Needle and Bot Detection
Needle Detection enables custom or ad hoc inspections and rapid response to critical, newly discovered attack vectors. Bot Detection offers an important clue to the nature of the web visitor. Techniques employed include verifying browser-level behavior, checking cookie support, and examining the HTTP headers for bot indicators. For search agent bots like those employed by Google, we perform additional confirmation steps.
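The header-inspection portion of bot detection can be sketched as below. The marker list and indicator names are illustrative assumptions, and this omits the behavioral checks and the extra confirmation steps (such as verifying that a claimed Googlebot really resolves to Google's network) mentioned above.

```python
BOT_UA_MARKERS = ("bot", "crawler", "spider", "curl", "wget")

def bot_indicators(headers: dict) -> list:
    """Collect simple bot indicators from HTTP request headers
    (a toy subset of the checks described in the text)."""
    hits = []
    ua = headers.get("User-Agent", "").lower()
    if not ua:
        hits.append("missing_user_agent")
    elif any(m in ua for m in BOT_UA_MARKERS):
        hits.append("bot_user_agent")
    if "Cookie" not in headers:
        hits.append("no_cookie_support")          # bots often skip cookies
    if "Accept-Language" not in headers:
        hits.append("no_accept_language")         # browsers almost always send this
    return hits
```

Each hit is just one clue; a visitor is classified by the accumulated evidence, not any single indicator.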
Machine learning does the heavy lifting in determining which anomalies to prioritize. Every detection module contributes Events, each associated with a suspicious attack indicator. The Decisis Inference Engine then centralizes the suspicious events, correlating indicators of attack across modules. As the incriminating evidence accumulates past critical thresholds, it elevates events to Anomalies. Anomaly scores, used for prioritization, are calculated within a statistical framework by multiple algorithms operating on various features. This scoring mechanism determines which anomalies are most interesting and merit automated validation or active investigation.
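The centralize-correlate-elevate flow can be sketched as follows. The class name, per-event weights, and threshold are hypothetical; the sketch only shows the mechanism: modules report weighted Events against a source, scores accumulate across modules, and a source crossing the threshold is elevated to an Anomaly.

```python
from collections import defaultdict

class InferenceEngine:
    """Toy event centralization: accumulate weighted Events per source
    across detection modules and elevate to an Anomaly past a threshold."""

    def __init__(self, threshold: float = 10.0):
        self.threshold = threshold
        self.scores = defaultdict(float)   # source -> accumulated evidence
        self.anomalies = set()             # sources elevated to Anomalies

    def report(self, source: str, module: str, weight: float):
        """Called by any detection module when it observes an Event."""
        self.scores[source] += weight
        if self.scores[source] >= self.threshold:
            self.anomalies.add(source)
```

Because every module reports into the same engine, weak signals that no single module would act on can still combine into a high-priority Anomaly.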