vahadruya / Capstone_EDA_Global_Terrorism_Analysis

A comprehensive, data-driven analysis of the Global Terrorism Database (GTD) to uncover global terrorism patterns, trends, and impacts. It covers the most used attack and weapon types, the most frequent targets, and the yearly distribution of casualties, number of attacks, success rates, and more, both globally and for specific countries and terrorist organisations.

Global Terrorism Dataset Analysis

Table of Contents
  1. About the Project
  2. Dataset Description and Cleaning
  3. Data Analysis Structure
  4. Insights from Data Analysis
  5. Libraries Used
  6. Contact

NOTE: Use the link to the final notebook if it does not open in Colab. Alternatively, use nbviewer to view the plotly and folium visualizations.

About the Project

The Global Terrorism Analysis project aims to provide critical insights into the patterns and drivers of terrorism-related incidents worldwide. Terrorism poses a significant threat to global security and stability, affecting countries, communities and the general public across the world. By analysing historical data, this project seeks to understand the temporal trends, geographical hotspots, most prominent organisations and their attack types, and other factors contributing to terrorism incidents.

Dataset Description and Cleaning

The dataset is quite vast, with 181,691 rows (each row representing a unique terrorist attack) and 135 columns. Several columns had more than half of their values missing. These were all filtered out, and the necessary columns were carefully selected from those remaining. Their characteristics are described below.

NOTE: The variables colour-coded with a green background below are the "primary variables", i.e., the variables primarily used for the EDA.

[Image: table describing the selected variables, with the primary variables highlighted in green]
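
The missing-value filter described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `drop_sparse_columns`, the toy data, and the column names `iyear`, `nkill` and `ransomnote` are assumptions used only for the example.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, max_missing: float = 0.5) -> pd.DataFrame:
    """Return a copy of df without columns whose share of missing values exceeds max_missing."""
    # isna().mean() gives the fraction of missing values per column.
    missing_share = df.isna().mean()
    keep = missing_share[missing_share <= max_missing].index
    return df.loc[:, keep].copy()


# Toy frame standing in for the real dataset (hypothetical values).
toy = pd.DataFrame({
    "iyear": [2001, 2002, 2003, 2004],             # no missing values, kept
    "nkill": [1.0, np.nan, 0.0, 3.0],              # 25% missing, kept
    "ransomnote": [np.nan, np.nan, np.nan, "x"],   # 75% missing, dropped
})

cleaned = drop_sparse_columns(toy)
print(list(cleaned.columns))  # ['iyear', 'nkill']
```

The 0.5 threshold mirrors the "over half missing" rule in the text; which of the surviving columns to keep is still a manual choice.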

A column named casualties was created from the killed and wounded columns for each terrorist attack, representing the total human impact of the attack. The total number of terrorist events, casualties, suicide attacks, and successful attacks are called the "Impact" metrics, which highlight the effect of terrorism.
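
A minimal sketch of how the casualties column and the four Impact metrics could be derived with pandas. The column names `nkill`, `nwound`, `suicide` and `success` are assumed here, and treating a missing killed or wounded count as zero is also an assumption; the notebook may handle missing values differently.

```python
import numpy as np
import pandas as pd


def add_casualties(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with casualties = killed + wounded."""
    out = df.copy()
    # Assumption: a missing count contributes zero to the total.
    out["casualties"] = out["nkill"].fillna(0) + out["nwound"].fillna(0)
    return out


def impact_metrics(df: pd.DataFrame) -> pd.Series:
    """Compute the four Impact metrics for any slice of the data."""
    return pd.Series({
        "attacks": len(df),
        "casualties": df["casualties"].sum(),
        "suicide_attacks": df["suicide"].sum(),      # suicide is a 0/1 flag
        "successful_attacks": df["success"].sum(),   # success is a 0/1 flag
    })


# Toy rows standing in for real attacks (hypothetical values).
toy = pd.DataFrame({
    "nkill": [2.0, np.nan, 0.0],
    "nwound": [5.0, 1.0, np.nan],
    "suicide": [1, 0, 0],
    "success": [1, 1, 0],
})

with_casualties = add_casualties(toy)
metrics = impact_metrics(with_casualties)
print(metrics.to_dict())
```

Because `impact_metrics` accepts any DataFrame, the same function can be reused on a filtered slice such as one country or one organisation.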

Data Analysis Structure

The dataset is analysed in three separate sections, each of which contains multiple sub-sections. Various kinds of plots, such as line plots, bar plots, combined line-bar plots, pie charts, and geographical heatmaps, are used in this project, mostly built with plotly and folium. The sections are:

  1. Global Analysis: In this section, the entire dataset is considered for a holistic analysis, across all countries and all terrorist organisations, to understand the general characteristics of terrorist attacks and their effect on the populace. Both univariate and bivariate analyses are performed on features such as attack type, yearly variation of attacks, and target type, for all the impact metrics.
  2. Local Analysis: This section analyses a subset of the dataset covering specific countries of interest. These countries are chosen based on the heatmap and/or the plots that display hotspots of terrorist activity. The countries chosen are Iraq, Afghanistan, Pakistan and India.
  3. Analysis of specific terrorist organisations: This section analyses specific terrorist organisations, selected based on their notoriety, prominence and impact. They are the Taliban and the Islamic State of Iraq and the Levant (ISIL). A few other terrorist organisations, such as Shining Path and Boko Haram, are also analysed briefly.
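
The three sections differ mainly in which rows are kept before the same yearly aggregation is applied. The sketch below illustrates that idea under stated assumptions: the helper `yearly_impact`, the toy data, and the column names `iyear`, `country_txt`, `gname`, `casualties`, `suicide` and `success` are illustrative, not taken from the notebook.

```python
from typing import Optional

import pandas as pd


def yearly_impact(df: pd.DataFrame,
                  country: Optional[str] = None,
                  group: Optional[str] = None) -> pd.DataFrame:
    """Aggregate the impact metrics per year, optionally for one country and/or organisation."""
    subset = df
    if country is not None:
        subset = subset[subset["country_txt"] == country]
    if group is not None:
        subset = subset[subset["gname"] == group]
    return (
        subset.groupby("iyear")
        .agg(
            attacks=("casualties", "size"),        # number of rows per year
            casualties=("casualties", "sum"),
            suicide_attacks=("suicide", "sum"),
            successful_attacks=("success", "sum"),
        )
        .reset_index()
    )


# Toy rows standing in for real attacks (hypothetical values and group names).
toy = pd.DataFrame({
    "iyear": [2013, 2013, 2014, 2014],
    "country_txt": ["Iraq", "Iraq", "Iraq", "India"],
    "gname": ["Group A", "Group B", "Group A", "Group C"],
    "casualties": [10, 5, 3, 2],
    "suicide": [1, 0, 1, 0],
    "success": [1, 1, 0, 1],
})

global_view = yearly_impact(toy)                  # section 1: all rows
iraq_view = yearly_impact(toy, country="Iraq")    # section 2: one country
group_view = yearly_impact(toy, group="Group A")  # section 3: one organisation
print(iraq_view)
```

The resulting per-year table is in a shape that a plotly line or bar chart can consume directly.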

Insights from Data Analysis

  • The number of terrorist attacks and casualties rose sharply in the 21st century, especially in the Middle East, South Asia and Sub-Saharan Africa, peaking in 2014. The percentage of suicide attacks in these regions also started increasing over the same period.
  • Bombings and Armed Assaults are by far the most common attack methods, and most suicide attacks are bombings. Private Citizens and Property, the Military, and the Police were the most frequently targeted categories.
  • The terrorist hotspots in the 20th century were primarily the United Kingdom, El Salvador and Peru. The IRA, the FMLN and Shining Path, respectively, were the prominent terrorist organisations in these countries.
  • Iraq, Afghanistan, Pakistan and India are the terrorist hotspots of the 21st century, with the highest numbers of attacks, casualties and suicide attacks. A single coordinated event in the USA by Al-Qaida, recorded as just 4 attacks, was among the most devastating, claiming over 19,000 casualties.
  • The Islamic State (ISIL) is the most active terrorist organisation in Iraq, with over 4,000 attacks despite emerging as late as 2013. It was also active in several areas of Turkey and Syria. Baghdad was the most targeted city in Iraq, and over a quarter of the attacks were suicide bombings.
  • The Taliban is the most active terrorist organisation in Afghanistan. It operates almost entirely within the country, with a few attacks along the Pakistan border as well. Its primary targets are the Afghan Police and Military.
  • India has witnessed several Maoist and Communist attacks in the 21st century, in Naxalite-affected states such as Jharkhand and Chhattisgarh. Most of these attacks are Bombings and Armed Assaults, with very few suicide attacks, unlike in Iraq.
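
Findings such as the dominant attack types and how many of those attacks were suicide attacks come from simple grouped counts. A minimal sketch follows; the column names `attacktype1_txt` and `suicide`, the helper `attack_type_summary`, and the toy rows are assumptions for illustration and do not reproduce the notebook's figures.

```python
import pandas as pd


def attack_type_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count attacks per attack type and how many of them were suicide attacks."""
    summary = df.groupby("attacktype1_txt").agg(
        attacks=("suicide", "size"),          # rows per attack type
        suicide_attacks=("suicide", "sum"),   # suicide is a 0/1 flag
    )
    # Share of each attack type's incidents that were suicide attacks.
    summary["suicide_share_pct"] = 100 * summary["suicide_attacks"] / summary["attacks"]
    return summary.sort_values("attacks", ascending=False)


# Toy rows standing in for real attacks (hypothetical values).
toy = pd.DataFrame({
    "attacktype1_txt": [
        "Bombing/Explosion", "Bombing/Explosion", "Bombing/Explosion", "Armed Assault",
    ],
    "suicide": [1, 1, 0, 0],
})

summary = attack_type_summary(toy)
print(summary)
```

Applying the same function to a country-level or organisation-level slice gives the comparisons drawn in the bullets above, for example suicide attacks in India versus Iraq.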

Libraries Used

Pandas, Matplotlib, Seaborn, folium, plotly

Contact

LinkedIn, Gmail
