Computers that learn to spot the odd one out

William Knight
Infosecurity Today, September/October 2006

Computers that learn from experience populate science fiction novels and manage dystopian cities. But imperceptibly, the technology has taken a central role in information security, and this role is set to expand.



Instead of programming a computer with a million instructions to cover all eventualities, couldn't you simply set an application loose and teach it how to do its job, rather like you might teach your toddler to use a potty?

Impossible? Not so. In fact, the discipline of Machine Learning (ML) has been making steady, if under-publicised, progress for years. Many successful ML applications have been developed, from data mining that determines where and when financial fraud has occurred, to cars that learn to follow the road without the help of a human driver.

The reason these fascinating applications remain largely unknown as examples of ‘computers that learn’ is that the benefit lies not in the learning itself but in the effects of the learning, says Dr Steve Moyle, external lecturer on the software engineering programme at Oxford University and CTO and founder of start-up information security firm Secerno. “How much stuff do you use, but not know how it works?” he asks.

“Businesses want business solutions,” says Michael Azzoff, analyst at Butler Group. He believes that machine learning companies have learnt the hard way, through commercial failure and assimilation, to concentrate on benefit. That ML algorithms and techniques are now embedded in search routines, counter-terrorism products and data mining applications shows how the industry has matured, he adds.

The maturity has come because ML tackles problems that traditional programming could not touch: situations in which reducing a problem to logical program statements, executed one after the other, can never work because the logical depths cannot be fathomed.

“You can't specify exactly and completely what you're looking for. You're looking for anomalous patterns of behaviour, something that's not normal or expected,” explains Dr Richard Overill, Department of Computer Science, King's College London.

Machine Learning is applicable “wherever you can't stipulate precisely what is, or is not, acceptable behaviour within a system, be it anti-malware, intrusion prevention, fraud detection, anti-terrorism, or any comparable situation,” adds Overill.

Exactly this condition arises in financial organizations, where criminals constantly adapt their techniques such that defenders are left guessing the timing and form of the next attack. And learning systems have been given a boost with the emphasis on compliance and regulation. For example, financial companies are now forced to run processes that can identify and report ‘suspicious’ activity.

The difficulty is in defining what is suspicious, but “this is where learning can come into its own,” says Rosemary Turley, director of marketing at Norkom Technologies. Programs can be taught to recognise known examples of suspicious behaviour and then infer whether a new situation is suspicious based on what is already known.

This can even extend to watching cash machines, Turley explains. Withdrawals of up to 200 euros might be perfectly normal, but that amount removed every two minutes is most definitely suspicious, she says.
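
To make the teaching-by-example idea concrete, here is a minimal Python sketch (not Norkom's actual system; the amounts, frequencies and labels are invented): a handful of labelled cash-machine records train a simple nearest-neighbour classifier, which then judges new activity by its similarity to what is already known.

```python
import math

def scale(amount_eur, per_hour):
    """Put the two features on comparable scales before measuring distance."""
    return (amount_eur / 100.0, per_hour)

# Labelled training examples: (amount in euros, withdrawals per hour) -> label.
TRAINING = [
    (scale(200, 1), "normal"),       # a single 200-euro withdrawal
    (scale(120, 1), "normal"),
    (scale(50, 2), "normal"),
    (scale(200, 30), "suspicious"),  # 200 euros every two minutes
    (scale(180, 20), "suspicious"),
    (scale(200, 25), "suspicious"),
]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(amount_eur, per_hour, k=3):
    """Label a new situation by a vote among the k most similar known examples."""
    query = scale(amount_eur, per_hour)
    nearest = sorted(TRAINING, key=lambda ex: distance(ex[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

print(classify(150, 1))    # -> normal: an ordinary withdrawal
print(classify(200, 28))   # -> suspicious: the same amount, drawn every couple of minutes
```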


[Photo: King's College London's Richard Overill: ML for the anomalous]


Worldwide, it's certain that adaptive techniques are needed. According to the latest Deloitte global security survey, more than three-quarters of the world's leading 150 finance groups suffered a serious security breach in the last 12 months.

David Lacey, Honorary Fellow of the Jericho Forum and freelance security consultant, believes machine learning techniques will become ever more widespread. “You could be interacting with millions of devices that are adapting and learning. An intelligent monitoring system will give you a lot more variety and allow you to control more complex situations; there is no other way,” he says.

And this realisation is dawning in areas away from finance. The content monitoring and filtering (CMF) market is scrabbling to include learning algorithms in its products. With over 400 million domain names registered and as many as 10 per cent given over to pornography, databases of whitelists and blacklists cannot hope to keep up to date, says Nick Outteridge, director, OEM Technology Partnerships, Puresight.

“Algorithms that recognise pornography, or recognise sensitive information, as easily as a human are needed,” he adds.

But these algorithms remain extremely technical. No user is interested in understanding that their sexual-content filtering system uses a “pre-trained, back-propagated artificial neural network.” An organization just needs to know that 98% of all porn sites will be filtered out; that's selling the benefit.
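
For readers who do want to peek inside the box, the sketch below shows what a “back-propagated artificial neural network” amounts to in miniature: a tiny feed-forward network trained by gradient descent on two made-up page features. The features, data and architecture are assumptions for illustration only; a real content filter would be far larger and trained on genuine labelled pages.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training pages: [flagged-keyword density, image-to-text ratio] -> 1 = block, 0 = allow.
X = rng.random((200, 2))
y = ((0.7 * X[:, 0] + 0.3 * X[:, 1]) > 0.5).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of four units and a single output unit.
W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)
lr = 0.5

for _ in range(3000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)      # hidden activations
    p = sigmoid(h @ W2 + b2)      # predicted probability of "block"

    # Backward pass: propagate the cross-entropy error signal layer by layer.
    d_out = p - y                            # error at the output unit
    d_hid = (d_out @ W2.T) * h * (1 - h)     # error at the hidden layer

    # Gradient-descent weight updates, averaged over the batch.
    W2 -= lr * h.T @ d_out / len(X); b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_hid / len(X); b1 -= lr * d_hid.mean(axis=0)

# Once trained, the network filters unseen pages by thresholding its output.
test = np.array([[0.9, 0.8],    # keyword-heavy page
                 [0.1, 0.2]])   # innocuous page
scores = sigmoid(sigmoid(test @ W1 + b1) @ W2 + b2)
print(scores.round(2))          # high score for the first page, low for the second
```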

So learning algorithms are created by academic companies, sold through OEM licences and embedded in UTM devices and security software suites. Unbeknown and unsung, devices and applications are learning how to protect us.

“People want a box to manage,” says Outteridge, and he seems well placed to benefit as an OEM supplier of learning content filters. CMF companies are queuing up to apply ML to their products.

“We can see the value in emerging dynamic content recognition technologies, which analyse web pages and categorise them in real time. Numbers we have seen suggest these can catch more than 97% of pornography, quickly and accurately,” says Mike Clark, MD of ApplianSys, a CMF supplier.

And Nick Kingsbury, CEO of Chronicle Solutions, plans to license learning algorithms to “deduce relationships that are not immediately obvious,” and to recognise behavioural problems, for example damaging gambling habits. This will be added to Chronicle's content-capturing solution.

But there is another aspect to ML that is worthy of exploration: rather than a broad-brush perimeter defence, ML solutions can wrap tightly around an application, forming a micro-perimeter.

Secerno's Moyle fervently believes in this approach, and the company's first product is tightly targeted at protecting SQL databases. “There are so many elements to IS that you have to focus on specific elements and specific assets themselves. If you try and put state-of-the-art learning at the perimeter and watch every packet go past on the network, you've got very little hope.”
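
One way to picture such a micro-perimeter (an illustration of the general idea, not Secerno's actual technique) is to learn the structural ‘shapes’ of the SQL queries an application normally issues during a training period, and afterwards refuse anything whose shape was never seen:

```python
import re

def query_shape(sql: str) -> str:
    """Reduce a query to a structural fingerprint by stripping out literals."""
    shape = sql.strip().lower()
    shape = re.sub(r"'[^']*'", "?", shape)   # string literals -> ?
    shape = re.sub(r"\b\d+\b", "?", shape)   # numeric literals -> ?
    shape = re.sub(r"\s+", " ", shape)       # normalise whitespace
    return shape

# Training phase: observe the application's normal traffic.
normal_traffic = [
    "SELECT name, balance FROM accounts WHERE id = 42",
    "SELECT name, balance FROM accounts WHERE id = 7",
    "UPDATE accounts SET balance = 100 WHERE id = 42",
]
learned_shapes = {query_shape(q) for q in normal_traffic}

def allow(sql: str) -> bool:
    """Enforcement phase: permit only queries whose shape was seen in training."""
    return query_shape(sql) in learned_shapes

print(allow("SELECT name, balance FROM accounts WHERE id = 99"))          # True
print(allow("SELECT name, balance FROM accounts WHERE id = 1 OR 1 = 1"))  # False: the injected clause changes the shape
```

Because the defence is trained on one application's traffic rather than on everything crossing the network, the set of ‘normal’ behaviour stays small enough to learn thoroughly.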

His ideas strike a chord with Lacey: “The future is about decoupling security from the infrastructure level and moving it to the data level. And maybe, in between, moving it to the applications. Because things will change faster and faster, there is no alternative but to invest in machine learning systems.”

William Knight is a technology writer with 18 years' experience in software development and IT consulting. He writes for titles that include Computing, JavaPro and Gantthead.com.


[Photo: Secerno's Steve Moyle: there's no hope at the perimeter]

Important Machine Learning Techniques

Artificial Neural Networks (ANN)
Taking inspiration from the brain by modelling neurons and the millions of connections between them. Excellent at analysing noisy and complex data and flagging target or odd examples in large data sets.

Genetic Algorithms (GA)
Modelling Darwin's theory of survival of the fittest, genetic algorithms find best-fit solutions in a wide landscape of possibilities and ‘mate’ or ‘mutate’ good solutions to ‘evolve’ better ones.
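
The select-mate-mutate loop fits in a few lines; the sketch below uses an invented ‘count the ones’ fitness function purely for illustration.

```python
import random

random.seed(1)
GENOME_LEN, POP_SIZE, GENERATIONS = 20, 30, 60

def fitness(genome):
    """Toy fitness landscape: the more 1-bits, the fitter the solution."""
    return sum(genome)

def mate(a, b):
    """Single-point crossover between two parent genomes."""
    cut = random.randint(1, GENOME_LEN - 1)
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.05):
    """Flip each bit with a small probability."""
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    # Survival of the fittest: keep the best half as parents.
    population.sort(key=fitness, reverse=True)
    parents = population[: POP_SIZE // 2]
    # Breed the next generation from randomly paired parents.
    children = [mutate(mate(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print(fitness(best), best)   # converges towards the all-ones genome
```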

Computational Immunology (CI)
Drawing inspiration from the human immune system, antibody agents detect and attack anomalies in information flow. Useful agents are evolved while less successful agents are killed off. CI utilises other ML techniques to achieve its goal and, while largely experimental at present, commercial examples are expected within five years.
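
As a flavour of the approach, the sketch below implements negative selection, one classic CI technique: randomly generated ‘antibody’ detectors that bind normal (‘self’) behaviour are killed off, and the survivors flag whatever they do bind. The bit-patterns and matching rule are invented for illustration.

```python
import random

random.seed(2)
PATTERN_LEN = 8

def matches(detector, pattern, threshold=6):
    """A detector 'binds' a pattern if enough positions agree."""
    return sum(d == p for d, p in zip(detector, pattern)) >= threshold

def random_pattern():
    return tuple(random.randint(0, 1) for _ in range(PATTERN_LEN))

# 'Self': bit-patterns summarising behaviour observed during normal operation.
self_patterns = {random_pattern() for _ in range(40)}

# Negative selection: generate random detectors and kill off any that bind 'self'.
detectors = []
while len(detectors) < 50:
    candidate = random_pattern()
    if not any(matches(candidate, s) for s in self_patterns):
        detectors.append(candidate)

def anomalous(pattern):
    """Whatever a surviving detector binds is treated as non-self."""
    return any(matches(d, pattern) for d in detectors)

# Normal behaviour is never flagged, by construction of the detector set...
print(anomalous(next(iter(self_patterns))))   # -> False
# ...while a pattern far from anything seen in training is likely to be flagged.
print(anomalous((1,) * PATTERN_LEN))
```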

Bayesian Learning
Calculates a specific probability that a hypothesis is correct based upon previous examples. Bayesian solutions are among the most practical approaches for automatically assigning categories to novel examples.
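
The sketch below shows the idea in its simplest practical form: a naive Bayes classifier that assigns a category to a new message from word counts in previous, labelled examples (the messages are invented).

```python
import math
from collections import Counter, defaultdict

# Previous examples: labelled messages, invented for illustration.
TRAINING = [
    ("win a free prize now", "suspicious"),
    ("free money click now", "suspicious"),
    ("meeting agenda for monday", "normal"),
    ("monthly account statement attached", "normal"),
]

category_counts = Counter(label for _, label in TRAINING)
word_counts = defaultdict(Counter)
for text, label in TRAINING:
    word_counts[label].update(text.split())
vocabulary = {word for counts in word_counts.values() for word in counts}

def classify(text):
    """Assign the category with the highest posterior probability (in log space)."""
    scores = {}
    for label, label_count in category_counts.items():
        total_words = sum(word_counts[label].values())
        # log P(category) + sum of log P(word | category), with add-one smoothing.
        score = math.log(label_count / len(TRAINING))
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) /
                              (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("claim your free prize"))          # -> suspicious
print(classify("agenda for the monday meeting"))  # -> normal
```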

Decision Tree Learning
A widely deployed and practical method of inferring information from examples using simple if-then rules in a top-down tree. Particularly useful where errors are likely in the training examples.
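
The compact sketch below learns such a tree top-down in the ID3 style, repeatedly splitting on the attribute that best separates the labels in a handful of invented login records:

```python
import math
from collections import Counter

# Toy login records (invented): attribute values -> label.
EXAMPLES = [
    ({"hour": "night", "location": "foreign", "failures": "many"}, "alert"),
    ({"hour": "night", "location": "local",   "failures": "many"}, "alert"),
    ({"hour": "day",   "location": "foreign", "failures": "few"},  "ok"),
    ({"hour": "day",   "location": "local",   "failures": "few"},  "ok"),
    ({"hour": "night", "location": "local",   "failures": "few"},  "ok"),
]

def entropy(rows):
    """How mixed the labels are in this set of examples."""
    counts = Counter(label for _, label in rows)
    return -sum(c / len(rows) * math.log2(c / len(rows)) for c in counts.values())

def build(rows, attributes):
    """Grow the tree top-down, always splitting on the most informative attribute."""
    labels = {label for _, label in rows}
    if len(labels) == 1 or not attributes:
        return Counter(label for _, label in rows).most_common(1)[0][0]

    def split_entropy(attr):
        total = 0.0
        for value in {attrs[attr] for attrs, _ in rows}:
            subset = [r for r in rows if r[0][attr] == value]
            total += len(subset) / len(rows) * entropy(subset)
        return total

    best = min(attributes, key=split_entropy)   # largest information gain
    branches = {value: build([r for r in rows if r[0][best] == value],
                             [a for a in attributes if a != best])
                for value in {attrs[best] for attrs, _ in rows}}
    return (best, branches)

def predict(tree, attrs):
    """Follow the learned if-then rules down to a leaf label."""
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches[attrs[attribute]]
    return tree

tree = build(EXAMPLES, ["hour", "location", "failures"])
print(tree)   # e.g. ('failures', {'many': 'alert', 'few': 'ok'})
print(predict(tree, {"hour": "night", "location": "foreign", "failures": "few"}))  # -> ok
```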

[Photo: David Lacey: ML is applicable when you don't know what to look for]