
Are there any Areas Artificial Intelligence couldn’t be applied to?

Dr Cyril Onwubiko
C-MRiC.ORG

Invited Guest Lecture
Kingston University, London (KU), Surbiton, Surrey, UK

22nd November 2019

Abstract

AI, ML, DL and RL are being applied to most everyday aspects of our lives, ranging from Autonomous Vehicles (self-driving cars) and Social Media to Cyber Security and Digital Imagery for Clinical Diagnosis.

It is debatable what AI could not be applied to, and where.

Context

Good, Old-Fashioned AI (GOFAI): Rules Engines, Expert Systems, Knowledge Graphs, or Symbolic AI.

New AI (New-AI): e.g. Neural Networks, Deep Neural Networks, NLP, Reinforcement Learning and Deep Reinforcement Learning.

Building Blocks

Artificial Intelligence (AI) is a branch of Computer Science that ‘trains’ computers to behave intelligently by carrying out tasks that normally require human intelligence, e.g. visual perception and speech recognition. It encompasses ML, DL, RL, DRL, NLP, and stochastic processes, decision and predictive analysis. Put simply, AI is “the science and engineering of making intelligent machines” (John McCarthy).

Machine Learning (ML) is a branch of AI generally regarded as programs that alter themselves. An ML model can modify itself when exposed to more data and/or a different data set, e.g. Neural Networks.

Deep Learning (DL) is a subset of Machine Learning that does well at identifying patterns in unstructured data, and is used mostly in Computer Vision for image identification, object detection and image classification. Simply put, it means more accuracy, more math, and more compute because of the deep hidden layers.

Building Blocks

Reinforcement Learning (RL) is a branch of AI that focuses on goal-oriented algorithms that solve complex tasks through ‘reward’ and ‘penalty’, akin to Game Theory. It is usually described using concepts such as agents, environments, states, actions, rewards and penalties. These algorithms are rewarded when they make good decisions and penalised when they make bad ones, hence the word reinforcement. E.g. AlphaGo.

Deep Reinforcement Learning (DRL) is a subset of RL that applies Deep Learning.
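The reward-and-penalty idea can be illustrated with a small sketch. This is a minimal tabular Q-learning example, not something taken from the lecture: the five-cell corridor environment, the reward values and all names (`step`, `train`, `q_table`) are assumptions chosen for illustration.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells. The agent starts in
# cell 0 and the goal is cell 4. All values here are illustrative assumptions.
N_STATES = 5
ACTIONS = (-1, +1)            # move left, move right
GOAL = N_STATES - 1


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    if next_state == GOAL:
        return next_state, 1.0, True      # reward for reaching the goal
    return next_state, -0.1, False        # small penalty for every other move


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: learn action values from rewards and penalties."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # one row per state
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore occasionally, otherwise exploit the best known action.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal cells: index 1 means "move right".
policy = [0 if left > right else 1 for left, right in q_table[:GOAL]]
print(policy)
```

In this toy setting the agent should end up preferring "move right" in every cell, because that choice is reinforced by the goal reward while every detour is penalised.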

Example of a Deep Neural Network for DL
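As a rough companion to this example, the forward pass of a small deep network can be sketched in a few lines. This is only an illustration under assumptions: the layer sizes (4 inputs, two hidden layers of 8 units, 3 outputs), the random untrained weights and the helper names are all hypothetical and are not taken from the lecture.

```python
import math
import random


def relu(values):
    """Rectified linear activation, applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer; `weights` has one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(inputs, layers):
    """Pass the input through the hidden layers, then a softmax output layer."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))


rng = random.Random(42)
sizes = [4, 8, 8, 3]          # assumed: input, two hidden layers, output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
probs = forward([0.2, 0.7, 0.1, 0.9], layers)
print(probs)
```

The "deep" part is simply the stack of hidden layers between input and output; training (adjusting the weights from data) is deliberately left out of this sketch.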

Some AI Use Cases

• Voice Recognition
• Autonomous Cars
• Voice Search
• Cyber Security
• Fraud Detection
• Social Media
• Predictive Analysis
• Credit Rating
• Network Distribution
• Preference Recommendation
• Analytics
• Facial Recognition
• Image Recognition
• Motion Detection
• Threat Detection
• Sentiment Analysis
• Imagery Identification
• Resource Analysis
• Games
• Load Balancing

General Principles & Concepts

© 2019 C-MRIC

ML Problem Space

• ML Algorithms
• Feature Engineering
• Data Set (Training & Testing)

Data Sets

• Ground Truth Data is still an issue

• Data has become relatively more available than in the previous decade, e.g.

• ImageNet

• MNIST

• Kaggle

• CRAN

• UCI (UC Irvine) Machine Learning Repository

• CTU-13

• VisualData

Publicly available

Artificially generated

https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research

Feature Engineering

While feature engineering is generally known to boost classification metrics, it creates a need for substantial investment in expensive and scarce data science expertise. We find that reliance on domain expertise and feature engineering severely inhibits the feasibility of applying existing correlation and filtering methods in practice [2].

[2] Egon Kidmose, PhD Thesis, Network-based Detection of Malicious Activities – A Corporate Network Perspective, 2018

• Cost – Expensive

• Domain Experts / Domain Knowledge

• Data Scientist

• Complex and Convoluted

• Tedious

Parsers / Plugins

User → Data → Fields → Features
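One way to read this slide is that parsers turn raw data into named fields, and features are then derived from those fields. The sketch below illustrates that reading only; the log format, the field names and the chosen features are hypothetical assumptions, not something specified in the lecture.

```python
import ipaddress
import re

# Assumed log format: "<ISO timestamp> key=value key=value ..."
FIELD_PATTERN = re.compile(r"(\w+)=(\S+)")


def parse_fields(line):
    """Parser step: split a raw log line into named fields."""
    timestamp, _, rest = line.partition(" ")
    fields = dict(FIELD_PATTERN.findall(rest))
    fields["timestamp"] = timestamp
    return fields


def is_private(address):
    """True for private (RFC 1918 style) addresses, False if unparseable."""
    try:
        return ipaddress.ip_address(address).is_private
    except ValueError:
        return False


def extract_features(fields):
    """Feature step: derive model inputs from the parsed fields."""
    hour = int(fields["timestamp"][11:13])
    return {
        "hour_of_day": hour,
        "outside_office_hours": hour < 8 or hour >= 18,   # assumed hours
        "failed": fields.get("status") == "failed",
        "internal_source": is_private(fields.get("src", "")),
    }


record = "2019-11-22T22:15:00 user=alice src=10.0.0.5 action=login status=failed"
fields = parse_fields(record)
features = extract_features(fields)
print(features)
```

Each feature here encodes a piece of domain knowledge (what counts as office hours, what counts as internal), which is exactly why feature engineering depends on domain experts.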

Features, Reliability & Relevance

[3] SSDeep Project – context triggered piecewise hashes (CTPH) - https://ssdeep-project.github.io/ssdeep/index.html

Indicators of Compromise (IoC)

• Constantly evolving features

• Feature Reliability & Relevance

Features                    Features        Features            Features
IP Address (IPv4 or IPv6)   FilePath        FileHash-SHA256     Encrypt-AES256
Domain                      CIDR            FileHash-IMPHASH    Encrypt-AES128
Hostname                    CVE             Mutex               Encrypt-AES224
FQDN                        Email           SSDeep / CTPH [3]   Encrypt-DES
URL                         FileHash-MD5    GeoIP               Encrypt-Unknown
URI                         FileHash-SHA1   DNS                 YARA
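Several of the indicator types listed above can be told apart by their shape alone. The following is a minimal sketch of such a classifier, assuming simple heuristics: the function name `classify_indicator`, the regular expressions and the returned labels are illustrative choices, and the checks are far from exhaustive. Note that an IMPHASH has the same 32-hex-character form as an MD5 hash, so this sketch cannot distinguish the two.

```python
import ipaddress
import re

# Hex digest length -> assumed hash type (IMPHASH would also match length 32).
HASH_TYPES = {32: "FileHash-MD5", 40: "FileHash-SHA1", 64: "FileHash-SHA256"}
HEX = re.compile(r"^[0-9a-fA-F]+$")
CVE = re.compile(r"^CVE-\d{4}-\d{4,}$", re.IGNORECASE)
URL = re.compile(r"^https?://", re.IGNORECASE)
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
HOSTNAME = re.compile(r"^(?=.{1,253}$)([A-Za-z0-9-]{1,63}\.)+[A-Za-z]{2,63}$")


def classify_indicator(value):
    """Guess the indicator type of a string using shape-based heuristics."""
    value = value.strip()
    try:
        return f"IPv{ipaddress.ip_address(value).version}"
    except ValueError:
        pass
    if "/" in value:
        try:
            ipaddress.ip_network(value, strict=False)
            return "CIDR"
        except ValueError:
            pass
    if HEX.match(value) and len(value) in HASH_TYPES:
        return HASH_TYPES[len(value)]
    if CVE.match(value):
        return "CVE"
    if URL.match(value):
        return "URL"
    if EMAIL.match(value):
        return "Email"
    if HOSTNAME.match(value):
        return "Hostname"
    return "Unknown"


for sample in ("192.0.2.1", "2001:db8::1", "10.0.0.0/8", "CVE-2019-0708",
               "https://example.com/a", "user@example.com", "example.com"):
    print(sample, "->", classify_indicator(sample))
```

Shape says nothing about reliability or relevance: an IP address that was malicious last month may be benign today, which is why the features above need constant re-evaluation.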

Applications in Cyber Security


• User behaviour analysis
• Insider threat detection
• Malware family identification
• Network traffic profiling
• Malware detection
• Spam filtering
• Network anomaly detection
• C2 detection

Learning approaches:

• Unsupervised (no labels)
• Supervised (with labels)
• Incremental (learn continuously)
• Batch (learn only once or in discrete steps)

The problem one is trying to solve dictates the data, features and ML algorithms used [4].

[4] Scott Miserendino, BluVector Inc.
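The batch versus incremental distinction can be made concrete with a deliberately simple stand-in for a learned model: a mean and standard deviation used to score anomalies. The sketch below is an assumption-laden illustration, not a method from the lecture; the traffic numbers are made up, and Welford's algorithm is used here for the incremental update.

```python
import math
import statistics


class RunningStats:
    """Incremental mean and standard deviation (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, value):
        """Learn from one new observation without storing past data."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def stdev(self):
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


def z_score(value, mean, stdev):
    """How many standard deviations `value` lies from the mean."""
    return 0.0 if stdev == 0 else (value - mean) / stdev


# Made-up observations, e.g. kilobytes sent per minute by one host.
traffic = [510, 495, 502, 498, 505, 500, 490, 507]

# Batch: learn once from the whole data set.
batch_mean = statistics.mean(traffic)
batch_stdev = statistics.stdev(traffic)

# Incremental: learn continuously, one observation at a time.
running = RunningStats()
for value in traffic:
    running.update(value)

print(batch_mean, running.mean)
print(z_score(5000, running.mean, running.stdev))
```

Both routes arrive at the same statistics for the same data; the practical difference is that the incremental version can keep adapting as new traffic arrives, while the batch version has to be recomputed.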

Cyber Security Case Studies

• Malware Detection
  • Malware detection, network-based attack detection, code detection, and C2 & Bots

• Profiling & Security Analytics
  • User & Entity profiling, behavioural analytics, big data, security and web analytics, etc.

• Cyber Security Operations Centre
  • Security Monitoring, Flow Analysis, Log Collection & Collation, Correlation, Analysis, Cyber Incident Management, ‘Human-in-the-loop’, etc.

Malware Detection

• Intrusion detection systems (IDS) using Rule-based, Heuristic, and Machine Learning (Supervised & Unsupervised Learning) approaches

• Detect Bot / C2 (Command & Control)

• Correlating Intrusion detection alerts on Bot malware infections

• Content inspection – content is inspected against feature set of malignant (malicious) content

• Temporal heuristic & behavioural analysis

• Featureless Engineering for Anomaly detection

Featureless Engineering, without Domain Expertise

• A case study of Filtering & Correlation of IDS Alerts

• IDSs produce high volumes of logs/alerts and raise many false positives

• Reliability / Trust?

• Alert ≠ Incident
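To show why an alert is not the same as an incident, the sketch below collapses repeated alerts into candidate incidents. It is a plain rule-based time-window grouping, swapped in for illustration; it is not the featureless learning approach of the case study, and the alert fields (`time`, `src`, `signature`) and the 300-second window are hypothetical.

```python
def correlate_alerts(alerts, window_seconds=300):
    """Group alerts that share source and signature and arrive close together.

    An alert joins an existing group when it has the same (src, signature)
    key and arrives within `window_seconds` of that group's latest alert.
    """
    groups = []
    latest_group = {}      # (src, signature) -> index of most recent group
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = (alert["src"], alert["signature"])
        index = latest_group.get(key)
        if (index is not None
                and alert["time"] - groups[index]["last_seen"] <= window_seconds):
            groups[index]["count"] += 1
            groups[index]["last_seen"] = alert["time"]
        else:
            latest_group[key] = len(groups)
            groups.append({
                "src": alert["src"],
                "signature": alert["signature"],
                "first_seen": alert["time"],
                "last_seen": alert["time"],
                "count": 1,
            })
    return groups


# Made-up alerts; `time` is seconds since the start of the capture.
alerts = [
    {"time": 0, "src": "10.0.0.5", "signature": "bot-beacon"},
    {"time": 60, "src": "10.0.0.5", "signature": "bot-beacon"},
    {"time": 120, "src": "10.0.0.5", "signature": "bot-beacon"},
    {"time": 1000, "src": "10.0.0.5", "signature": "bot-beacon"},
    {"time": 30, "src": "10.0.0.9", "signature": "port-scan"},
]
incidents = correlate_alerts(alerts)
for incident in incidents:
    print(incident)
```

Here five alerts reduce to three candidate incidents. Choosing the grouping key and the window is itself a form of domain expertise, which is the dependency a featureless approach tries to avoid.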

Conclusions (1)

• AI, ML, DL & DRL have been applied to solve many real-life problems, ranging from autonomous vehicles to malware detection, correlation of alerts and behavioural analytics.

• Cloud computing, advances in GPUs, possibly Quantum Computing, and the availability of test data sets, e.g. ImageNet and MNIST, have helped industrialise AI applications.

Conclusions (2)

• AI does not have all the answers! Context information is elusive with AI/ML-based solutions.

• AI will always find something; whether what it has found is correct, or can be explained, is still a cause for concern, e.g. biased algorithms, ‘black-box’ models, unrelated contextual interpretations.

• Human-in-the-loop is, and will continue to be, part of most applications of AI.

• We foresee cooperative and collaborative Human-to-AI applications.

Cyber Science 2020 Conference
https://www.c-mric.com/conferences

Thank You!

Centre for Multidisciplinary Research, Innovation and Collaboration (C-MRiC.ORG)

[email protected]

@CMRiCORG

www.C-MRiC.ORG

