b3af84c8cedae0b735d896bd1ad0103f8c89ff36 — terceranexus6 1 year, 1 month ago 4b2d689
adding docs to explain the use of data
1 files changed, 25 insertions(+), 0 deletions(-)

A docs/making_sense_of_data.md
A docs/making_sense_of_data.md => docs/making_sense_of_data.md +25 -0
@@ 0,0 1,25 @@
# Making sense of data

## Pivoting malware

The point of using STIX is to make relationships among indicators easier to handle. In this case, I'm pivoting on malware. When creating `stix` files, I generate a new UUID for each indicator and for each relationship, but the UUID of the malware stays the same. That way, if I look for anything related to a given malware, I can see the whole picture. The malware entries should be stored in the `malware_IDs.csv` file, following this structure:

    <malware name>,<UUID>
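
As a rough illustration of how this file could be consumed, here is a minimal sketch that loads it into a name-to-ID lookup. The function name `load_malware_ids` and the sample row are hypothetical, and the sketch assumes the second column holds the full ID string exactly as it should appear in the STIX output.

```python
import csv
import io


def load_malware_ids(csv_file):
    """Map each malware name to its fixed ID (hypothetical helper)."""
    ids = {}
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        name, stix_id = row[0].strip(), row[1].strip()
        ids[name] = stix_id
    return ids


# Made-up sample content; the real file would be opened with
# open("malware_IDs.csv", newline="") instead.
sample = io.StringIO(
    "examplemalware,malware--11111111-2222-4333-8444-555555555555\n"
)
malware_ids = load_malware_ids(sample)
print(malware_ids["examplemalware"])
```

Because the ID is read from the file rather than regenerated, every new batch of indicators can point its relationships at the same malware object.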

A STIX ID has a particular structure: `<type>--<UUID>`. If you want to create a new malware to pivot around, you can use the `create_id.py` script, passing the type as its argument. Following this logic, the type should be "malware":

    python3 create_id.py malware

Then add the resulting ID to the `malware_IDs.csv` file.
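
For reference, an ID of the `<type>--<UUID>` form can be produced in a few lines. This is only a sketch of the idea, not the contents of `create_id.py`: the function name `make_stix_id` is hypothetical, and using a random (version 4) UUID is an assumption.

```python
import uuid


def make_stix_id(object_type):
    """Build an ID of the form <type>--<UUID> (hypothetical helper).

    Assumes a random version 4 UUID is acceptable for this object type.
    """
    return f"{object_type}--{uuid.uuid4()}"


new_id = make_stix_id("malware")
print(new_id)  # a different ID on every run
```

Since each call yields a fresh value, the ID should be generated once per malware and then kept in the CSV, not recreated on every run.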

## Making sense of hashes

The hashes will likely be related to hacktools, encrypted files, scripts, miners, etc. The most important fields are the name, the description and the hash value. Some of them are not outright malicious, but definitely suspicious (this happens a lot with suspicious ads that are related to malware downloads, for example). This is why we also use the IoC type value, as well as the hash type.

The rest is relevant information about the malware that is supposed to be related to the IoC. Since this information might be repeated within the same CSV, a future version will remove this redundancy.

## Making sense of domains