Details for iMoodJournal_visualization.ipynb

Published by Wenqiu999

Description

Visualize the data exported from iMoodJournal.

Tags & Data Sources

visualization mood change mood variance iMoodJournal

Notebook
Last updated 3 days, 2 hours ago

This notebook analyzes how mood changes over time, using a data file exported from iMoodJournal. You can upload your own mood data and reuse the code for your own analysis. The notebook is free to reuse and adapt, and is distributed under an MIT license: https://opensource.org/licenses/MIT

Let's import packages first.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import time

Load the data.

In [2]:
Mood = pd.read_csv('mood-Jul 8, 2019.csv', index_col=False)
In [3]:
date_strings = ['%d. %b %Y', '%b %d, %Y']
date_format = None
datetime_format = None

Mood['Time'] = Mood['Hour'].map(str).str.cat(Mood['Minute'].map(str), sep = ':')
Mood['Date'] = Mood['Date'].map(str)
Mood['DateTime'] = Mood['Date'].str.cat(Mood['Time'], sep=' ')


while date_strings:
    date_format_test = date_strings.pop()
    datetime_string = '{} %H:%M'.format(date_format_test)
    try:
        Mood['DateTime']=pd.to_datetime(Mood['DateTime'], format=datetime_string)
        date_format = date_format_test
        datetime_format = datetime_string
        break
    except ValueError:
        continue

if not datetime_format:
    raise Exception('Failed to parse datetime - maybe we need another datetime_string?')


mood = Mood.set_index('DateTime', drop= False)
DateTime = mood.pop('DateTime')
mood.insert(0, 'DateTime', DateTime)
Time = mood.pop('Time')
mood.insert(1, 'Time', Time)
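
As a side note, the format-fallback loop above can be tried on its own with a couple of made-up rows. This is only a minimal sketch: the helper name `parse_with_fallback` and the sample strings are assumptions for illustration and do not come from the exported file.

```python
import pandas as pd

def parse_with_fallback(values, formats):
    """Try each candidate format in turn; return the parsed Series and the format that worked."""
    for fmt in formats:
        try:
            return pd.to_datetime(values, format=fmt), fmt
        except ValueError:
            continue  # this format does not match the data, try the next one
    raise ValueError('None of the candidate formats matched the data')

# Hypothetical rows written in the 'Jul 8, 2019' style, followed by hour:minute.
sample = pd.Series(['Jul 8, 2019 9:05', 'Jul 9, 2019 21:30'])
parsed, matched = parse_with_fallback(
    sample, ['%d. %b %Y %H:%M', '%b %d, %Y %H:%M'])
print(matched)
print(parsed)
```

If your export uses yet another date style, adding its format string to the list should be enough.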

First, let's start with a line plot of the full time period to show the changes over time.

In [4]:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(rc={'figure.figsize':(11, 4)})

begin_date = mood.iloc[[0]]['Date'].apply(lambda x: datetime.strptime(x,date_format))
end_date = mood.iloc[[-1]]['Date'].apply(lambda x: datetime.strptime(x,date_format))

mood['Level'].plot(linewidth=1);
plt.title('Figure 1. Mood Change From {} to {}'.format(begin_date[0],end_date[0]))
plt.show()

The chart above shows how your mood changed over the whole period. Now, let's look at how it changes from day to day by calculating the average mood level for each day.

In [5]:
moodlevel_daily= mood['Level'].resample('D')
moodlevel_daily_mean = moodlevel_daily.mean()

moodlevel_daily_mean.plot(linewidth=1);
plt.title('Figure 2. Daily Average of the Mood From {} to {}'.format(begin_date[0],end_date[0]))
plt.show()
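
To see what `resample('D').mean()` does, here is a small sketch on made-up entries (the timestamps and levels are assumptions, not real data). Note that a day without any entry still gets a row, with NaN as its mean, which is why the daily line can show gaps.

```python
import pandas as pd

# Hypothetical log: three entries on 1 July, none on 2 July, one on 3 July.
idx = pd.to_datetime(['2019-07-01 08:00', '2019-07-01 13:30',
                      '2019-07-01 22:15', '2019-07-03 10:00'])
levels = pd.Series([4, 6, 8, 5], index=idx, name='Level')

# Group the entries into calendar days and average each group.
daily_mean = levels.resample('D').mean()
print(daily_mean)
```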

Then, compare the daily average mood with the mood log.

In [6]:
fig, ax = plt.subplots(figsize=(30, 14))
ax.plot(mood['Level'],
        marker='.', linestyle='-', linewidth=0.5, label='Mood Level')
ax.plot(moodlevel_daily_mean,
        marker='o', markersize=8, linestyle='-', label='Daily Mean')
ax.legend()  # show the labels defined above so the two lines can be told apart
plt.title('Figure 3. Comparison of Real Mood Change and Daily Average Mood Change')
plt.show()

From the chart above, we can see the difference between your actual mood data and the daily average.

To explore how your mood changes every day and to compare days across the whole period, let's use a heat map, which shows the data in a more colorful way. First, we need to convert the incomplete list of logged times into a complete hourly time list. The mood level is recorded as NaN for any hour that has no data point.

In [7]:
mood.loc[mood.Minute>30,'Hour']= mood['Hour'] + 1        
mood.head(10)
hourly_mood = mood[['Date','Day of week','Hour','Level']]
In [8]:
def get_date_list(begin_date,end_date):
    date_list = [x.strftime(datetime_format) for x in list(pd.date_range(start=begin_date, end=end_date, freq='H'))]
    return date_list

begin_date = Mood.iloc[[0]]['Date'].apply(lambda x: datetime.strptime(x,date_format))
end_date = Mood.iloc[[-1]]['Date'].apply(lambda x: datetime.strptime(x,date_format))
Time_list = pd.DataFrame({'DateTime':get_date_list(begin_date[0],end_date[len(Mood)-1])})
Time_list['DateTime'] = pd.to_datetime(Time_list['DateTime'], format=datetime_format)
Time_list['Time'] = Time_list['DateTime'].apply(lambda x: x.strftime(date_format)) + ' ' + Time_list['DateTime'].apply(lambda x: x.strftime('%H'))
Time_list['Date']= Time_list['DateTime'].apply(lambda x: x.strftime(date_format) )

hourly_mood = hourly_mood.copy()  # work on a copy to avoid SettingWithCopyWarning
hourly_mood['Date'] = pd.to_datetime(hourly_mood['Date'], format=date_format)
# Zero-pad the hour so the key matches the '%H' format used for Time_list['Time']
hourly_mood['Time'] = hourly_mood['Date'].apply(lambda x: x.strftime(date_format)) + ' ' + hourly_mood['Hour'].apply(lambda x: '{:02d}'.format(int(x)))
hourly_mood = hourly_mood.drop(['Date'],axis=1)
In [9]:
Hourly_mood = pd.merge(Time_list, hourly_mood, how='left', on='Time' )
Hourly_mood_heatmap = Hourly_mood.drop(['Time','Date','Day of week','Hour'], axis=1)
Hourly_mood_heatmap = Hourly_mood_heatmap.set_index('DateTime')
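
The idea of filling a complete hourly grid can also be sketched with `reindex` instead of a merge. This is an alternative shown on made-up data, not the approach used in the cells above; the entries are assumed to sit exactly on the hour.

```python
import pandas as pd

# Hypothetical log with two entries that fall exactly on the hour.
idx = pd.to_datetime(['2019-07-01 08:00', '2019-07-01 11:00'])
logged = pd.Series([4, 7], index=idx, name='Level')

# Every hour of that day, whether or not something was logged.
full_hours = pd.date_range('2019-07-01', periods=24, freq=pd.Timedelta(hours=1))

# Hours without an entry become NaN.
complete = logged.reindex(full_hours)
print(complete.head(12))
```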

Now we have our complete hourly mood log. Let's reshape it into a matrix with one column per day and one row per hour.

In [10]:
groups = Hourly_mood_heatmap.groupby(pd.Grouper(freq='D'))
Hourly_mood_heatmap = pd.concat([pd.DataFrame(x[1].values) for x in groups], axis=1)
Hourly_mood_heatmap= pd.DataFrame(Hourly_mood_heatmap)
Hourly_mood_heatmap.columns = Hourly_mood['Date'].drop_duplicates(keep='first', inplace=False)
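
For comparison, an hour-by-day matrix can also be built with `pivot_table`. The sketch below uses made-up entries and helper columns named `hour` and `day`, which are assumptions for this example. Unlike the cells above, it truncates each entry to its hour rather than rounding to the nearest hour, and it averages entries that share an hour.

```python
import pandas as pd

# Hypothetical log: two entries in the same hour on day one, one entry on day two.
idx = pd.to_datetime(['2019-07-01 08:10', '2019-07-01 08:50', '2019-07-02 21:05'])
log = pd.DataFrame({'Level': [4, 6, 7]}, index=idx)
log['hour'] = log.index.hour
log['day'] = log.index.date

# Rows are hours, columns are days; entries in the same cell are averaged.
matrix = log.pivot_table(index='hour', columns='day', values='Level', aggfunc='mean')

# Make sure all 24 hours are present, with NaN where nothing was logged.
matrix = matrix.reindex(range(24))
print(matrix.dropna(how='all'))
```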

Let's use a heatmap to provide a more intuitive, left-to-right data layout, with each row representing an hour of the day and each column representing a day. Red stands for a good mood (the redder, the better), blue stands for a bad mood, and green means your mood was so-so at that time.

In [11]:
plt.matshow(Hourly_mood_heatmap, interpolation=None, cmap='jet', vmin=1, vmax=8)
plt.xlabel('Date',fontsize=14)
plt.ylabel('Time of a Day',fontsize=14)
plt.xticks(np.arange(Hourly_mood_heatmap.shape[1]),Hourly_mood_heatmap.columns, rotation=90)
plt.title('Figure 4. Heatmap of Mood in A Day', y=1.5)
plt.show()

We can also further explore your mood changes within one day. Select one date and plot your mood log of that day.

In [12]:
oneday = '2019-6-19'
Daily_mood = mood.loc[oneday]
In [13]:
Daily_mood['Level'].plot(linewidth=1)
plt.title('Figure 5. Mood Change on {}'.format(oneday))
plt.show()

To further explore the variation in mood during this period, we will use a Pareto chart to highlight the most representative mood levels over the whole period. First, we need to calculate the frequency of each mood level.

In [14]:
Levels = mood.groupby('LevelText', as_index=False)[['Date']].count()
#Levels['LevelText'] = Levels['LevelText'].apply(str)
Levels
Out[14]:
   LevelText  Date
0        Bad    20
1       Good   127
2      Great     3
3        Meh    21
4       Okay   130
5      So-so    58
6   Very bad     1
7  Very good    30

Then let's plot the frequencies in a bar chart.

In [15]:
Levels.plot(kind='bar', x='LevelText', y='Date', legend=None, title='Frequency of Mood levels')
plt.title('Figure 6. Frequency of Mood Levels')
plt.show()

Now, we can use a Pareto chart to represent both the frequency and the cumulative percentage of the mood levels.

In [16]:
def create_pareto_plot(df, x=None, y=None, title=None, show_pct_y=False, pct_format='{0:.0%}'):
    xlabel = x
    ylabel = y
    tmp = df.sort_values(y, ascending=False)
    x = tmp[x].values
    y = tmp[y].values
    weights = y.cumsum() / y.sum()
    
    
    fig, ax1 = plt.subplots()
    ax1.bar(x, y)
    ax1.set_xlabel(xlabel)
    ax1.set_ylabel(ylabel)

    ax2 = ax1.twinx()
    ax2.plot(x, weights, '-ro', alpha=0.5)
    ax2.set_ylabel('', color='r')
    ax2.tick_params('y', colors='r')
    
    vals = ax2.get_yticks()
    ax2.set_yticklabels(['{:,.2%}'.format(x) for x in vals])
    
    formatted_weights = [pct_format.format(x) for x in weights]
    for i, txt in enumerate(formatted_weights):
        ax2.annotate(txt, (x[i], weights[i]), fontweight='heavy')    
 
    if not show_pct_y:
        ax2.set_yticks([])
        
    if title:
        plt.title(title)
    
    plt.tight_layout()
    plt.show()
In [17]:
create_pareto_plot(Levels, x='LevelText', y='Date', title='Figure 7. Pareto Chart of Mood Level Frequency')
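
The numbers behind the red cumulative line can be checked without plotting. The sketch below re-enters the counts from the frequency table above by hand; the variable names are only for illustration.

```python
import pandas as pd

# Counts copied from the frequency table shown earlier in this notebook.
counts = pd.Series({'Bad': 20, 'Good': 127, 'Great': 3, 'Meh': 21,
                    'Okay': 130, 'So-so': 58, 'Very bad': 1, 'Very good': 30})

# A Pareto chart sorts the categories from most to least frequent...
ordered = counts.sort_values(ascending=False)
# ...and accumulates each category's share of the total.
cumulative_share = ordered.cumsum() / ordered.sum()
print(cumulative_share.round(3))
```

With these counts, the two most frequent levels (Okay and Good) together cover about 66% of all entries.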

The cumulative percentage may vary based on the type of your mood. Let's also plot two separate Pareto Charts for negative mood and positive mood.

In [18]:
BadMoodText = ['So-so','Meh', 'Bad', 'Very bad']
GoodMoodText = ['Okay', 'Good', 'Very good', 'Great']
BadMoodLevels = Levels.loc[Levels['LevelText'].isin(BadMoodText)]
GoodMoodLevels = Levels.loc[Levels['LevelText'].isin(GoodMoodText)]
In [19]:
create_pareto_plot(BadMoodLevels, x='LevelText', y='Date', title='Figure 8. Pareto Chart of Bad Mood Level Frequency')
In [20]:
create_pareto_plot(GoodMoodLevels, x='LevelText', y='Date', title='Figure 9. Pareto Chart of Good Mood Level Frequency')

Events that happen in your daily life can influence your mood. Let's highlight a period and add annotations to help you understand your own mood. Please enter the time period and the event that happened during it below.

In [21]:
time_periods = [
    {
        "start": "2019-07-01",
        "end": "2019-07-04",
        "label": "something happened",
    },
]

Now, we can grab this period and the event.

In [22]:
period_selected = mood[time_periods[0]["start"]:time_periods[0]["end"]]
period_selected_event = time_periods[0]["label"]
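
One detail worth knowing about this kind of selection: when a datetime index is sliced with date strings, the end date is included in full, up to its last entry of the day. Here is a small sketch on made-up rows (the timestamps and levels are assumptions); it also shows how `idxmax` and `idxmin` can then locate the highest and lowest entries within the slice.

```python
import pandas as pd

# Hypothetical log with one entry just before and one just after the period.
idx = pd.to_datetime(['2019-06-30 23:00', '2019-07-01 09:00',
                      '2019-07-04 20:09', '2019-07-05 00:30'])
log = pd.DataFrame({'Level': [5, 7, 3, 6]}, index=idx)

# Both endpoints are inclusive, so the evening entry on 4 July is kept.
period = log.loc['2019-07-01':'2019-07-04']

# Timestamps of the highest and lowest mood level inside the period.
best_time = period['Level'].idxmax()
worst_time = period['Level'].idxmin()
print(period)
print(best_time, worst_time)
```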

Let's take a look at the peak and nadir of your mood during this period.

In [23]:
mood_period_max = period_selected['Level'].max()
mood_period_max_idx = period_selected['Level'].idxmax(axis=0, skipna=True)
print('The moment that you felt best during the period:',mood_period_max_idx)
#mood_period_max_event = input('please input the event that happened when you felt best during the period:')

mood_period_min = period_selected['Level'].min()
mood_period_min_idx= period_selected['Level'].idxmin(axis=0, skipna=True)
print('The moment that you felt worst during the period:',mood_period_min_idx)
#mood_period_min_event = input('please input the event that happened when you felt worst during the period:')
The moment that you felt best during the period: 2019-07-02 12:20:00
The moment that you felt worst during the period: 2019-07-04 20:09:00

Based on the time points above, you can enter the events that happened at those moments below.

In [24]:
events = [
    {
        "event_mood_max": "thing A",
        "event_mood_min": "thing B"
    },
]

Now, we can highlight the period you selected and add your notes to the plot.

In [25]:
fig, ax = plt.subplots(figsize=(30, 14))
ax.plot(mood['Level'],marker='.', linestyle='-', linewidth=0.5, label='Mood Level')
ax.plot(moodlevel_daily_mean,marker='o', markersize=8, linestyle='-', label='Daily Mean')
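# Convert the date strings to timestamps so they match the datetime x-axis
ax.axvspan(pd.Timestamp(time_periods[0]["start"]), pd.Timestamp(time_periods[0]["end"]), color=sns.xkcd_rgb['grey'], alpha=0.5)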
ax.set_title('Figure 10. Mood Changes Over Time with Annotations')

ax.set_ylabel('Mood')
ax.set_xlabel('Date')

ax.legend(loc='upper left', fontsize=11, frameon=True).get_frame().set_edgecolor('blue')  

bbox_props0 = dict(boxstyle='square, pad=0.6', fc='mediumvioletred', ec='r', alpha=.4, lw=.5)

# Use a timestamp for the x position so the text lands on the datetime x-axis
ax.text(pd.Timestamp(time_periods[0]["start"]), 9, 'Event happened during this period:\n{}'.format(period_selected_event) , size=12,ha='left',
        family = 'serif', color='yellow', style = 'italic', weight = 'bold', bbox = bbox_props0)

bbox_props1 = dict(boxstyle='round4, pad=0.6', fc='cyan', ec='b', lw=.5)

ax.annotate('Mood Max = {}\nEvent = {}\nDate = {}'
                 .format(mood_period_max, events[0]["event_mood_max"], mood_period_max_idx.strftime('%a, %Y-%m-%d')),
            fontsize=12,
            fontweight='demi',
            xy=(mood_period_max_idx, mood_period_max),  
            xycoords='data',
            xytext=(-150, -30),      
            textcoords='offset points',
            arrowprops=dict(arrowstyle="->"), bbox=bbox_props1)    

ax.annotate('Mood Min = {}\nEvent = {}\nDate = {}'
                 .format(mood_period_min, events[0]["event_mood_min"], mood_period_min_idx.strftime('%a, %Y-%m-%d')),
            fontsize=12,
            fontweight='demi',
            xy=(mood_period_min_idx, mood_period_min),  
            xycoords='data',
            xytext=(-150, 30),      
            textcoords='offset points',
            arrowprops=dict(arrowstyle="->"), bbox=bbox_props1) 
plt.tight_layout()

Let's take a look at the tags you added now.

In [26]:
tags = list(mood.columns.values)[10:]
In [27]:
tag_sum = pd.DataFrame(mood[tags].apply(lambda x: x.sum()))
tag_sum['tags'] = tag_sum.index.values 
tag_sum.columns = ['frequency','tags']

First, let's plot a bar chart to explore the frequencies of the tags you used.

In [28]:
tag_sum.plot(kind='bar',x='tags',y='frequency', legend=None, title='Figure 11. Frequencies of Mood Tags')
plt.show()
In [29]:
mood_good = mood[mood["Level"]>=6]
mood_bad = mood[mood["Level"]<6]
In [30]:
goodmood_tag_sum = pd.DataFrame(mood_good[tags].apply(lambda x: x.sum()))
goodmood_tag_sum['tags'] = goodmood_tag_sum.index.values 
goodmood_tag_sum.columns = ['frequency','tags']
badmood_tag_sum = pd.DataFrame(mood_bad[tags].apply(lambda x: x.sum()))
badmood_tag_sum['tags'] = badmood_tag_sum.index.values 
badmood_tag_sum.columns = ['frequency','tags']
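
The good/bad split above can also be computed in one step with `groupby`. This is only a sketch on made-up data: the tag names `work` and `sleep` are hypothetical, and it assumes the tag columns hold 1 when a tag was used and 0 otherwise.

```python
import pandas as pd

# Hypothetical log with two 0/1 tag columns.
log = pd.DataFrame({'Level': [8, 7, 3, 5, 6],
                    'work':  [1, 0, 1, 1, 0],
                    'sleep': [0, 1, 0, 1, 1]})
tags = ['work', 'sleep']

# Same threshold as above: a level of 6 or more counts as a good mood.
is_good = log['Level'] >= 6

# Sum each tag column separately for good-mood and bad-mood entries.
tag_counts = log.groupby(is_good)[tags].sum()
tag_counts = tag_counts.rename(index={True: 'good', False: 'bad'})
print(tag_counts)
```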
In [31]:
fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1, sharex=True, sharey=True)
goodmood_tag_sum.plot(kind='bar',x='tags',y='frequency', legend=None, ax=ax1)
ax1.set_title('Frequencies of Good Mood Tags',loc= 'right')
badmood_tag_sum.plot(kind='bar',x='tags',y='frequency', legend=None, ax=ax2) 
ax2.set_title('Frequencies of Bad Mood Tags', loc = 'right')
plt.suptitle('Figure 12. Frequencies of Mood Tags')
plt.show()

To present the frequencies of the tags in another way, we can use a word cloud. The bigger the font of a tag, the more often you used it.

In [32]:
!pip install wordcloud
In [33]:
from PIL import Image, ImageSequence
from wordcloud import WordCloud

def DrawWordcloud(df):
    wc = WordCloud(background_color = 'White',width=1000, height=860, margin=2)
    name = list(df.tags)
    value = df.frequency
    for i in range(len(name)):
        name[i] = str(name[i])
    dic = dict(zip(name, value))
    wc.generate_from_frequencies(dic)
    plt.imshow(wc)
    plt.axis("off")
    plt.title('Figure 13. Wordcloud of the Mood Tags')
    plt.show()
    wc.to_file('Wordcloud.png')

DrawWordcloud(tag_sum)

To compare your mood trends on weekdays and weekends, let's plot your actual mood data points in a run chart, along with lines representing the hourly average and rolling means.

In [34]:
mood_perminute = mood   
mood_perminute['Day of week'] = mood_perminute.index.day_name()  # weekday_name was removed in pandas 1.0
mood_perminute['Date'] = mood_perminute.index.date
mood_perminute['Time'] = mood_perminute.index.time
mood_perminute.loc[mood_perminute.Minute>30,'Hour']= mood_perminute['Hour'] - 1   
mood_perminute['TimeStamp'] = mood_perminute['Hour'] + mood_perminute['Minute']/60
In [35]:
weekday = ['Monday','Tuesday','Wednesday','Thursday','Friday']
weekend = ['Saturday','Sunday']
weekday_mood = mood_perminute.loc[mood_perminute['Day of week'].isin(weekday)]
weekend_mood = mood_perminute.loc[mood_perminute['Day of week'].isin(weekend)]

weekday_mood_hourly_mean = weekday_mood.groupby('Hour')['Level'].mean()
weekend_mood_hourly_mean = weekend_mood.groupby('Hour')['Level'].mean()

weekday_rolling = weekday_mood_hourly_mean.rolling(3, center=True).mean()
weekend_rolling = weekend_mood_hourly_mean.rolling(3, center=True).mean()
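
To see what the centered rolling mean does to the hourly averages, here is a small sketch with made-up values (the hours and levels are assumptions). With a window of 3 and `center=True`, each point becomes the average of itself and its two neighbours, so the first and last hours have no complete window and come out as NaN.

```python
import pandas as pd

# Hypothetical hourly averages for five consecutive hours.
hourly_mean = pd.Series([4.0, 6.0, 5.0, 7.0, 6.0], index=[8, 9, 10, 11, 12])

# Average of each hour with the hour before and the hour after it.
smoothed = hourly_mean.rolling(3, center=True).mean()
print(smoothed)
```

Passing `min_periods=1` to `rolling` would fill the two edge hours with partial-window averages instead of NaN.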
In [36]:
fig,