Case Study: High-Performance Keyword Scanning

Activity Summary

A fast-growing provider of email security solutions was looking to further improve the speed and accuracy of their text scanning engine. This involved a rewrite of their core lexical parser routines, developed in portable C++ to achieve maximum efficiency. They contracted Delera Systems to create an innovative, flexible scanner to be embedded in their line of products.

Problem

Our client is a key player in the email security solution market and creator of the industry's first 64-bit security appliance. With a number of new products under development by their in-house staff, they went looking for an outsourcing partner with a very specialized skillset: know-how in the area of lexical parsing and text scanning with complex expressions.

The performance, stability and portability requirements for the scanner were extremely high. Scanning for suspect keywords consumes a large amount of CPU time and, in some cases, memory. The code would need to run on both 32-bit and 64-bit systems, handling hundreds of thousands of corporate emails and other documents each hour. Because of the firm's focus on scanning appliances, the engine had to deliver superior performance in a wide variety of environments.

Solution

Delera Systems built a custom scanner engine from the ground up. Our consultants analyzed the specific requirements, the volumes of data that would need to be handled, and the main processing costs. We came up with a solution that was both very fast (not relying on regular expressions) and very flexible (processing multiple-word conditions with proximity clauses, negative and positive scoring, and query operator grouping).
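
To give a rough idea of what proximity clauses with positive and negative scoring can look like without regular expressions, here is a minimal sketch in C++. It is not the delivered engine: the names ProximityRule, index_words, rule_matches and score_text are hypothetical, the rule format is an assumption made for illustration, and query operator grouping is left out entirely.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical rule format (an assumption, not the client's query language):
// two keywords that must occur within max_gap word positions of each other.
struct ProximityRule {
    std::string first;
    std::string second;
    std::size_t max_gap;  // maximum distance between the two words, in words
    int weight;           // positive raises the score, negative lowers it
};

using PositionIndex = std::unordered_map<std::string, std::vector<std::size_t>>;

// Split text into lowercase alphanumeric words and record where each occurs.
PositionIndex index_words(const std::string& text) {
    PositionIndex index;
    std::string word;
    std::size_t position = 0;
    auto flush = [&]() {
        if (!word.empty()) {
            index[word].push_back(position++);
            word.clear();
        }
    };
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            word.push_back(static_cast<char>(std::tolower(c)));
        } else {
            flush();  // any non-alphanumeric character ends the current word
        }
    }
    flush();
    return index;
}

// True if some occurrence of rule.first lies within rule.max_gap words
// of some occurrence of rule.second, in either order.
bool rule_matches(const PositionIndex& index, const ProximityRule& rule) {
    auto a = index.find(rule.first);
    auto b = index.find(rule.second);
    if (a == index.end() || b == index.end()) return false;
    // Position lists are sorted ascending, so a two-pointer sweep is enough.
    std::size_t i = 0, j = 0;
    while (i < a->second.size() && j < b->second.size()) {
        const std::size_t pa = a->second[i];
        const std::size_t pb = b->second[j];
        const std::size_t gap = pa > pb ? pa - pb : pb - pa;
        if (gap <= rule.max_gap) return true;
        if (pa < pb) ++i; else ++j;
    }
    return false;
}

// Sum the weights of all rules that match the text.
int score_text(const std::string& text, const std::vector<ProximityRule>& rules) {
    const PositionIndex index = index_words(text);
    int total = 0;
    for (const auto& rule : rules) {
        if (rule_matches(index, rule)) total += rule.weight;
    }
    return total;
}
```

With a rule set such as {"wire", "transfer", 2, 5} and {"newsletter", "unsubscribe", 10, -3}, a message containing "wire the transfer" gains 5 points, while one that also mentions a newsletter and an unsubscribe link nearby loses 3 of them. The text is tokenized once and each rule only walks two short position lists, which is one plausible way to avoid repeated regular-expression passes over the same message.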

The scanning engine was delivered as portable C++ source and was thoroughly tested on various platforms, including some 64-bit configurations. Its impressive performance and low defect rate ensured the smooth transition of our client's codebase to the new scanning technology.

Service Offerings

The following Delera Systems service offerings were applied to this project:

Technologies

  • C++ (cross-platform g++ compiler)
  • Flex- and Bison-driven query interpreter
  • Various 32-bit and 64-bit Linux-based environments

 
Copyright (C) 2005-2007 Delera Systems.