Artificial Threat Intelligence

To explain how AI and machine learning can benefit cyber security, it helps to have an understanding of how machines make sense of large volumes of data. Typically, they do this using knowledge representations such as ontologies. Simply put, ontologies are systems made up of distinct “stuff” known as entities, and their relationships with each other. For instance, the following is a very simple ontology of different types of coffee.

In this simple Venn diagram, the individual ingredients are entities, and together with the relationships between them they form an ontology.

From this ontology, we can determine that a combination of milk, coffee, and foam becomes a latte, for example.
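To make the idea concrete, here is a minimal sketch of the coffee ontology as a Python data structure. Only the latte recipe comes from the text above; the other drinks, and the names `DRINKS` and `identify_drink`, are illustrative assumptions.

```python
# Each drink is an entity related to the ingredient entities it contains.
# Only "latte" is taken from the article; the other entries are assumed
# purely for illustration.
DRINKS = {
    "espresso": frozenset({"coffee"}),
    "macchiato": frozenset({"coffee", "foam"}),
    "latte": frozenset({"coffee", "milk", "foam"}),
}


def identify_drink(ingredients):
    """Return the drink whose ingredient set matches exactly, or None."""
    wanted = frozenset(ingredients)
    for drink, contains in DRINKS.items():
        if contains == wanted:
            return drink
    return None


print(identify_drink(["milk", "coffee", "foam"]))  # latte
```

The machine does not "know" what a latte is; it only follows the relationships recorded in the ontology.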

Use of knowledge graphs in cyber security for Artificial Threat Intelligence

In the realm of cybersecurity, ontologies are used to represent the real world inside a machine-learning environment. In this ontology you can see malware sitting at the center, surrounded by various other entities that could relate to that malware.

For example, in the image below, the entity ‘MalwareCategory’ could be a banking trojan or a worm, and the ‘AttackVector’ entity might indicate spam, SQL injection, or a particular vulnerability the malware is exploiting.
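One common way to hold such an ontology in code is as a list of (subject, relation, object) triples. The sketch below assumes that representation; the malware instance `ExampleTrojan` and the relation names are hypothetical, and only the entity types and example values come from the text.

```python
# Triples of (subject, relation, object).
# "ExampleTrojan" and the relation names are hypothetical placeholders.
TRIPLES = [
    ("ExampleTrojan", "is_a", "Malware"),
    ("ExampleTrojan", "has_category", "banking trojan"),
    ("ExampleTrojan", "uses_attack_vector", "spam"),
    ("ExampleTrojan", "uses_attack_vector", "SQL injection"),
]


def objects_of(subject, relation, triples=TRIPLES):
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]


print(objects_of("ExampleTrojan", "uses_attack_vector"))
# ['spam', 'SQL injection']
```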

Using this type of ontology, a machine can begin to understand the real world — in this case, the threats faced by a network.
Of course, for a machine to build up a comprehensive picture, it needs to consume data and classify any data points it recognizes as referring to these particular entities.
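The classification step can be sketched with a simple lookup table of known terms. This is an assumption for illustration, not a description of any particular product: the `GAZETTEER` contents, the function name `tag_entities`, and the sample sentence are all made up.

```python
import re

# Hypothetical lookup table mapping known terms to ontology entity types.
GAZETTEER = {
    "banking trojan": "MalwareCategory",
    "worm": "MalwareCategory",
    "spam": "AttackVector",
    "sql injection": "AttackVector",
}


def tag_entities(text):
    """Return (term, entity_type) pairs for each known term found in text."""
    lowered = text.lower()
    found = []
    for term, entity_type in GAZETTEER.items():
        # Word boundaries stop "worm" from matching inside "wormhole".
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append((term, entity_type))
    return found


print(tag_entities("New banking trojan spreads through spam emails"))
# [('banking trojan', 'MalwareCategory'), ('spam', 'AttackVector')]
```

A real system would need far richer recognition than exact term matching, but the output has the same shape: pieces of text labelled with the entity they refer to.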

From there, the next step is to convert those entities into events, which in turn will have their own classifications, for example, the Attacker, or the Target, or the Method.

In this case, a post on Twitter is being analyzed using a simple rule-based approach.

This rule-based system is looking to identify entities that would indicate a cyber attack. It’s able to identify an operation, it’s able to identify two organizations, and it may also be able to identify a target.
This type of system is valuable in cyber security because it can process very large numbers of data points quickly, irrespective of source language.
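A rule-based analysis of this kind can be sketched in a few lines of Python. Everything specific here is an assumption: the sample post is invented, `KNOWN_ORGS` is a hypothetical list of organization names, and the three rules (a hashtag starting with "Op", a name lookup, and a word following "targets", "targeting" or "against") are stand-ins for whatever rules a real system would use.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical organization names the rules are allowed to recognize.
KNOWN_ORGS = ["ExampleCrew", "ExampleBank"]


@dataclass
class Event:
    operation: Optional[str] = None
    organizations: List[str] = field(default_factory=list)
    target: Optional[str] = None


def extract_event(post):
    """Apply three simple rules to a post and collect what they find."""
    event = Event()
    # Rule 1: a hashtag beginning with "Op" names the operation.
    op = re.search(r"#(Op\w+)", post)
    if op:
        event.operation = op.group(1)
    # Rule 2: any known organization name mentioned in the post.
    event.organizations = [org for org in KNOWN_ORGS if org in post]
    # Rule 3: the word after a targeting verb is treated as the target.
    target = re.search(r"\b(?:targets|targeting|against)\s+(\w+)", post)
    if target:
        event.target = target.group(1)
    return event


post = "ExampleCrew announces #OpExample targeting ExampleBank"
print(extract_event(post))
```

For the invented post above, the rules pick out one operation, two organizations, and one target, which mirrors the kind of result described for the Twitter example.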
