Have you ever opened a spreadsheet and stared blankly at a wall of text and numbers? Have you been given a dataset and not known where to start? Have you pulled down a large dataset when you were only interested in a small part of it? To make sense of data, you often need to remove the noise and focus on just what interests you. One way to do this is through filtering.
Filtering data involves either selecting the data you are interested in or removing the data you are not interested in. You can see this distinction with the filter option in Excel. You can filter data by deselecting all of the data and then selecting the values you are interested in:
This is useful when you are only interested in a small number of the available items; you are selecting the data you are interested in. When you want most, but not all, of the available items, it can be easier to just remove the values you don’t want:
Here you remove the values you don’t need and are left with just the data that interests you.
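The same two approaches carry over to code. The following is a minimal Python sketch, assuming a small, made-up list of records; the field names and values are hypothetical and serve only to show the difference between selecting what you want and removing what you don’t.

```python
# Hypothetical records; the field names and values are made up for illustration.
rows = [
    {"user": "alice", "region": "North"},
    {"user": "bob", "region": "South"},
    {"user": "carol", "region": "East"},
    {"user": "dave", "region": "West"},
]

# Selecting: keep only the values you are interested in.
wanted = {"North"}
selected = [row for row in rows if row["region"] in wanted]

# Removing: drop the values you do not want and keep everything else.
unwanted = {"West"}
remaining = [row for row in rows if row["region"] not in unwanted]

print([row["user"] for row in selected])
print([row["user"] for row in remaining])
```

Selecting is the shorter condition when you want only a few values, while removing is shorter when you want almost everything.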
Reasons to Filter
There are several situations where it makes sense to filter your data rather than work with the full dataset, including when:
- You are interested in a specific subset of the dataset, such as data between two specific dates or from a specific region
- You have been provided the full dataset and need to reduce it yourself to the data of interest
- You cannot extract the full dataset due to its size or system-enforced constraints
- There is too much data to manage effectively with the tools you have available
- You know that parts of the data are inaccurate and want to remove them
- You wish to explore the data to get a feel for it before beginning your analysis
- You want to quickly compare different subsets of the data
- You want to remove data that you consider to be noise or that could be hiding an underlying trend.
What to Filter On
What you choose to filter on will depend on why you are filtering the data, what you want to know, and what data you have available. Let’s consider some Moodle-related examples of what you could filter on:
- A specific course or class
- A specific user
- Users enrolled in a given month
- Scores below 80%
- Users who have not logged in for over a month
- Users who have completed a required piece of training.
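To make these examples concrete, here is a rough Python sketch of how several of them could be written as filter conditions. It assumes a hypothetical export of learner records; the field names, course names, and values are invented for illustration and do not reflect Moodle’s actual database schema or report format. "Over a month" is approximated as more than 30 days, and the current date is fixed so the example is repeatable.

```python
from datetime import date, timedelta

# Hypothetical export; these field names are illustrative, not Moodle's schema.
records = [
    {"user": "alice", "course": "Safety 101", "enrolled": date(2024, 1, 15),
     "score": 92, "last_login": date(2024, 3, 1), "completed": True},
    {"user": "bob", "course": "Safety 101", "enrolled": date(2024, 2, 3),
     "score": 74, "last_login": date(2024, 1, 20), "completed": False},
    {"user": "carol", "course": "Data Basics", "enrolled": date(2024, 1, 28),
     "score": 65, "last_login": date(2024, 3, 10), "completed": True},
]

today = date(2024, 3, 15)  # assumed "current" date, fixed for repeatability


def users(rows):
    """Return just the user names from a list of records."""
    return [row["user"] for row in rows]


# A specific course
in_course = [r for r in records if r["course"] == "Safety 101"]

# Users enrolled in a given month (January 2024 here)
enrolled_in_january = [
    r for r in records
    if (r["enrolled"].year, r["enrolled"].month) == (2024, 1)
]

# Scores below 80%
below_80 = [r for r in records if r["score"] < 80]

# Users who have not logged in for over a month (assumed: more than 30 days)
inactive = [r for r in records if today - r["last_login"] > timedelta(days=30)]

# Users who have completed the training
completed = [r for r in records if r["completed"]]

print(users(in_course), users(below_80), users(inactive))
```

Each filter is just a condition applied to every record, so conditions can be combined with `and` or `or` when you need more than one at a time.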
In this article we have looked at why you might want to filter your data and provided some Moodle-related examples of what you could filter on. In future articles we will look at some ways you could filter data that go beyond the Excel examples shown here. These will include filtering reports from within Moodle and filtering data using languages such as SQL and Python.