In our previous article we showed you how to access the Moodle Logs at a course level. In this article we will examine the data that can be extracted from the logs and look at how you can start to understand this data using Excel.
Sample Logs Data
In this article we will examine anonymised data from an actual course. The sample dataset we are working with has 3213 rows of data. We will review the data using Excel to see what data is available. We will also use some simple features of Excel to understand this data better. The image below shows the columns that are available in the log data. Note that for this dataset we have removed the Origin and IP Address columns. Each entry in the dataset will be one action performed by one user. Actions in the dataset are referred to as ‘events’ which is the terminology we will use throughout this article.
You can see that we have the following columns of data available to us:
- Time the event occurred
- The Name of the user performing the event and the user they performed this event on (if applicable)
- Where the event occurred
- The component type that the event took place in
- The type of event that occurred
- A description of the event that occurred.
Wow, there is a fair bit of data here. Our first task is to get a feel for what the dataset holds. We can do this by looking at the contents of each column. An easy way to do this in Excel is to place a filter on each column and then view the unique values in that column.
Filtering on the User full name column will give us a list of the users in the dataset:
This shows us that we have five unique users, one administrator and one tutor in the dataset. If we wanted to look at the data for a specific user, filtering on that user would give us that view. It is a good idea when looking at a dataset to try filtering each column and seeing what unique values are available. The next step is to filter on one value in a column and then look at how this affects the remaining columns – what values are related to the filtered item in the first column?
We can check which event contexts Administrators operate in by filtering on the User full name column (selecting only Administrator 1) and then filtering the Event context column. Doing so shows us the following unique values:
This shows us that in this course there are only four Event contexts in which an Administrator is operating. What if we wanted to know what events an administrator is performing? For this we can filter the Event name column to see which unique events are listed.
Now we have a view of the different actions an administrator is taking in the course. By applying filtering to the dataset we are able to build up an understanding of what the dataset holds and how the data is related.
One of the more interesting columns in the logs is the Description column. This column describes the event that has occurred. The values that are returned include various id numbers from within the Moodle database, so they may not be that useful to you, but you are able to decode the descriptions to understand what is taking place. Let’s walk through the following entries to understand what the Description column is telling us:
This entry shows us that a specific user has viewed a given discussion in a forum in the course. Here’s another:
This entry tells us that a specific user viewed a page activity in the course. One more:
Here a specific user started a lesson. The pattern here is as follows:
A user performs an action on a component in a course module.
You can use the description column to understand the event that has taken place and to look for specific events that interest you.
We have used simple filtering in Excel to get a better understanding of what the data holds. This is achieved by looking at each column and understanding what values it holds and what these values tell us. Our next step is to decide what we want to know about the dataset. We can do this by formulating some questions, based on our initial investigation. What questions do you think you could answer using this dataset? Here are some I can think of:
- Which user has created the most forum posts?
- On which days has Administrator X been active?
- What time of day sees the most activity on the site?
- A user states that they viewed the required material. Do the logs back this claim up?
- A user claims that they have been active in the course across a period of time. Do the logs prove this statement?
- Which page in the course is the most popular?
- Are there any pages, lessons or materials that users access multiple times?
There are plenty more questions that you could look to answer using the data available in the logs. You may be able to answer simple questions, such as if a given user has viewed a given page, simply by filtering to find the required entries. Likewise, to find the first day a user accessed the course you could just filter on that user and sort by date. Most of your questions though will probably require a greater level of analysis. Your starting point is to understand what the dataset contains and in this article we have done this using some basic functionality available in Excel.
In future articles we will explore this dataset further using some basic analysis techniques in Excel. We will then look at some more advanced techniques and then at the sort of analysis we can complete using Python, a programming language popular with data analysts and data scientists. Stay tuned!