How to Quickly Plot Data with Python on your Computer

Here’s a quick tutorial on plotting data with Python on your computer. Python is one of the easiest ways to plot and visualize data.

    Recently, I had some data I wanted to examine and plot quickly. Python came to mind as an easy way to explore the data and chart areas of interest.

    While it is easy to play with Python in a web kernel (DataCamp, Codecademy, Kaggle), I wanted to be able to chart the data on my own computer. After searching through a few tutorials, I realized the information needed to do this is scattered across the internet.

    1. Installing Python

    The first part of the process is to install Python and the required dependencies on your computer.

    Easy Way – Anaconda

    The easiest way to install Python is to install the Anaconda distribution for data science. It works on macOS, Windows, and Linux.

    https://www.anaconda.com/distribution/

    You can download either the Python 3 or the Python 2 version. I personally recommend installing Python 3.

    Once the download is completed, you can launch the package installer and complete the installation.

    Hard Way – Homebrew (macOS)

    1. Check your python version

    $ python --version
    Python 2.7.3 :: Continuum Analytics, Inc.

    2. Install Xcode

    To run Homebrew properly on your Mac, you first need to install Xcode (or its Command Line Tools) as a dependency.

    Option A

    $ xcode-select --install

    Option B

    Go to the App Store and Install Xcode.

    3. Install Homebrew

    http://homebrew.sh

    In your Terminal paste the following command:

    $ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    

    To verify that the installation worked as expected, type the following command:

    $ brew doctor
    Your system is ready to brew.

    4. Installing Python

    To install Python, run the brew command in your Terminal. You can also use brew to install packages other than Python.

    $ brew install python3 

    Once you have successfully installed python3, you can check the version to verify that the installation was successful.

    $ python3 --version

    To start a Python session type in:

    $ python3
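    Whichever install route you chose, it can help to confirm that the interpreter can actually find the libraries used later in this tutorial. The snippet below is a minimal sketch using only the standard library; the function name check_environment and the REQUIRED list are my own illustration, not part of Anaconda or Homebrew.

```python
import importlib.util
import sys

# Packages this tutorial imports later on.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn"]


def check_environment(packages=REQUIRED):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_environment().items():
        print(f"{name:<12}", "found" if found else "MISSING")
```

    If a package is reported as missing, Anaconda users can usually add it with conda install, and Homebrew users with pip3 install.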

    2. Selecting your Editing Experience

    For the rest of this tutorial, I’m going to assume you installed the Anaconda Package.

    1. Start Anaconda Navigator

    Anaconda Navigator offers several tools. Jupyter Notebook is the most common one; it provides a great framework for doing data analysis and annotating your steps along the way.

    For this tutorial, we are going to use Spyder, which provides an IDE-like experience for quickly iterating on our analysis. It is similar to RStudio or MATLAB.

    2. Launch Spyder

    3. Importing your Data

    First, we want to add the dependencies required to read and plot data. We will import Pandas, NumPy, Matplotlib, and Seaborn into our Python file.

    # Import Required Python Dependencies 
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    

    Next, we are going to import the file you want to analyze. For the purpose of this exercise, we are going to use sample data from Kaggle. Kaggle is a great place where you can download and use different data sets to explore trends or practice your data science skills.

    We will import this file into Pandas so we can easily plot the DataFrame using Pandas and Seaborn.

    To add a file to a Pandas DataFrame, you can use the pd.read_csv() command. We will also clean up our data to make it more readable and cut off parts we don’t want to analyze.

    # Read in the file 
    terror = pd.read_csv('globalterrorismdb_0718dist.csv', encoding='ISO-8859-1', low_memory=False)
    
    # Rename Columns for better readability
    terror.rename(columns={'iyear':'Year','imonth':'Month','iday':'Day','country_txt':'Country','region_txt':'Region','attacktype1_txt':'AttackType','target1':'Target','nkill':'Killed','nwound':'Wounded','summary':'Summary','gname':'Group','targtype1_txt':'Target_type','weaptype1_txt':'Weapon_type','motive':'Motive'},inplace=True)
    
    # Select columns we are most interested in analyzing
    terror=terror[['Year','Month','Day','Country','Region','city','latitude','longitude','AttackType','Killed','Wounded','Target','Summary','Group','Target_type','Weapon_type','Motive']]
    
    # Calculate the casualties (Both Killed + Wounded)
    terror['casualties']=terror['Killed']+terror['Wounded']
    
    print(terror.head(4))
    

    Output:

       Year  Month  Day    ...     Weapon_type Motive casualties
    0  1970      7    2    ...         Unknown    NaN        1.0
    1  1970      0    0    ...         Unknown    NaN        0.0
    2  1970      1    0    ...         Unknown    NaN        1.0
    3  1970      1    0    ...      Explosives    NaN        NaN
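    Note that row 3 in the output has NaN casualties: adding two columns in Pandas yields NaN whenever either value is missing. The sketch below reproduces this on a tiny made-up DataFrame (the numbers are invented, not taken from the dataset) and shows one possible alternative, which assumes a missing count can be treated as zero. That assumption may undercount, so whether it is appropriate depends on your analysis.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the renamed 'Killed' and 'Wounded' columns (values are made up).
sample = pd.DataFrame({
    "Killed": [1.0, 0.0, np.nan, 2.0],
    "Wounded": [0.0, np.nan, np.nan, 3.0],
})

# Plain addition: any row with a missing value becomes NaN.
sample["casualties"] = sample["Killed"] + sample["Wounded"]

# Alternative (assumption): treat a missing count as zero before adding.
sample["casualties_filled"] = sample["Killed"].fillna(0) + sample["Wounded"].fillna(0)

print(sample)
```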

    Lastly, we are going to print out some descriptive information about the data.
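    The original post does not show this step, so the following is only a sketch of what it might look like, using standard Pandas summary methods on a hypothetical miniature DataFrame (demo and its values are made up). On the real data you would call the same methods on terror.

```python
import pandas as pd

# Hypothetical miniature of the 'terror' DataFrame, just for illustration.
demo = pd.DataFrame({
    "Year": [1970, 1970, 1971, 1972],
    "Country": ["Mexico", "Japan", "Japan", "Greece"],
    "Killed": [1.0, 0.0, None, 2.0],
})

print(demo.shape)            # (rows, columns)
demo.info()                  # column dtypes and non-null counts
summary = demo.describe()    # count, mean, std, min, quartiles, max for numeric columns
print(summary)
print(demo["Country"].value_counts())  # how often each country appears
```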

    4. Charting your Data

    Using Seaborn, we can plot some areas of interest. For this exercise, I will plot two simple visualizations of the data.

    Terrorist Activities Each Year

    plt.style.use('fivethirtyeight')
    
    # Using Seaborn we can plot the Terrorist attacks by Year 
    plt.subplots(figsize=(15,6))
    sns.countplot(x='Year',data=terror,palette='RdYlGn_r',edgecolor=sns.color_palette('dark',7))
    plt.xticks(rotation=90)
    plt.title('Number Of Terrorist Activities Each Year')
    plt.show()

    Attack Methods by Terrorists

    # We can also plot the Attack Methods by Terrorists 
    plt.subplots(figsize=(15,6))
    sns.countplot(x='AttackType',data=terror,palette='inferno',order=terror['AttackType'].value_counts().index)
    plt.xticks(rotation=90)
    plt.title('Attacking Methods by Terrorists')
    plt.show()
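    To see what these two charts are counting, it can help to compute the same numbers by hand. A count plot draws one bar per distinct value, with the bar height equal to the number of rows holding that value. The sketch below does this with Pandas and plain Matplotlib on made-up rows (demo is hypothetical, not the real dataset); it illustrates the idea and is not how Seaborn is implemented internally.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the 'Year' and 'AttackType' columns.
demo = pd.DataFrame({
    "Year": [1970, 1970, 1971, 1973, 1973, 1973],
    "AttackType": ["Bombing", "Assault", "Bombing", "Bombing", "Hijacking", "Assault"],
})

# Chart 1: one bar per year, in chronological order.
per_year = demo["Year"].value_counts().sort_index()

# Chart 2: value_counts() sorts by descending count, so its index is the
# "most frequent first" ordering that the second chart passes to order=.
per_type = demo["AttackType"].value_counts()

fig, (ax_year, ax_type) = plt.subplots(1, 2, figsize=(10, 3))
ax_year.bar(per_year.index.astype(str), per_year.values)
ax_year.set_title("Rows per Year")
ax_type.bar(per_type.index, per_type.values)
ax_type.set_title("Rows per AttackType")
fig.tight_layout()
plt.show()
```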

    Full Code Block

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    """
    Created on Tue Feb  5 16:55:13 2019
    
    @author: oscarbarillas
    """
    
    # Import Required Python Dependencies 
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    # Read in the file 
    terror = pd.read_csv('globalterrorismdb_0718dist.csv', encoding='ISO-8859-1', low_memory=False)
    
    # Rename Columns for better readability
    terror.rename(columns={'iyear':'Year','imonth':'Month','iday':'Day','country_txt':'Country','region_txt':'Region','attacktype1_txt':'AttackType','target1':'Target','nkill':'Killed','nwound':'Wounded','summary':'Summary','gname':'Group','targtype1_txt':'Target_type','weaptype1_txt':'Weapon_type','motive':'Motive'},inplace=True)
    
    # Select columns we are most interested in analyzing
    terror=terror[['Year','Month','Day','Country','Region','city','latitude','longitude','AttackType','Killed','Wounded','Target','Summary','Group','Target_type','Weapon_type','Motive']]
    
    # Calculate the casualties (Both Killed + Wounded)
    terror['casualties']=terror['Killed']+terror['Wounded']
    
    print(terror.head(4))
    
    
    
    plt.style.use('fivethirtyeight')
    
    # Using Seaborn we can plot the Terrorist attacks by Year 
    
    plt.subplots(figsize=(15,6))
    sns.countplot(x='Year',data=terror,palette='RdYlGn_r',edgecolor=sns.color_palette('dark',7))
    plt.xticks(rotation=90)
    plt.title('Number Of Terrorist Activities Each Year')
    plt.show()
    
    
    # We can also plot the Attack Methods by Terrorists 
    plt.subplots(figsize=(15,6))
    sns.countplot(x='AttackType',data=terror,palette='inferno',order=terror['AttackType'].value_counts().index)
    plt.xticks(rotation=90)
    plt.title('Attacking Methods by Terrorists')
    plt.show()

    Additional Reading

    https://seaborn.pydata.org/

    https://pandas.pydata.org/pandas-docs/stable/