Balancing how data improves and deteriorates our online experiences
We’re growing more and more conscious of how we are being exploited through social media and advertisements that use our data to increase their bottom line.
It’s no surprise then that the market for VPNs (Virtual Private Networks) is forecast to almost triple in the next 6 years, from $25 billion to $70 billion. As people grow more conscious of their data footprint, solutions to prevent “big brother” from spying on their every move will only become more popular.
So what is it that companies are doing that we should be so scared of? Is there any positive side to all of this data collection or is it only to fatten the pockets of those exploiting us?
As a Data Engineer, my job is heavily focussed on collecting and processing data. I’ve seen first-hand what can be done, and it’s not all bad. In this post, I’m going to share both the good and bad of data collection in the hope of improving your understanding of why it’s being done in the first place.
Imagine every time you left your house and visited a store someone took a pen and paper and wrote down the date and time, which store you visited, what you picked up and looked at, how long you browsed for, what you purchased, how you paid and finally what time you got home.
Would you be creeped out? Because this is what is happening every time you visit almost any website.
In a lot of cases, you’ve already given the company your name, email and home address when you signed up online. Everything you do from that point on is tracked against you and is therefore like the creepy example above. You don’t even need to have an account with some services for them to track you. Facebook manages to track you even if you don’t have a Facebook account.
Each and every interaction is logged and stored somewhere and is being used in a myriad of ways, from improving your experience on the site to shoving advertisements down your throat.
The EU’s GDPR (General Data Protection Regulation) is a solid first step in giving power back to you. It allows you to gain insight into what companies are storing about you and requires them to remove that data upon request.
There are steps you can take to limit what is being tracked, from changing your browser to limiting which types of cookies can be used on the page (where possible, only allow functional cookies when the site asks). Apart from not using the internet at all, there aren’t really any ways to stop tracking altogether. Whether you’re shopping online, browsing social media or watching Netflix, they can still store data about your interactions.
Let’s take a look at the ways your data can be used for both good and bad.
I’m a firm believer in empowering people to take more control over their data. I think it’s especially important for companies to be transparent and give users control over what is and isn’t tracked.
There are a lot of positive uses for data and a lot of them are focussed around particular scientific initiatives. However, as a data engineer, I also understand the huge benefits of collecting and using data by the services you use every day. Not only for the companies that collect it but for you as the user of a website or service. Rather than focus on the projects that are using data for amazing things (seriously you should read about them), I’m going to focus on everyday services that affect a lot of us like Netflix and Spotify.
Cast your mind back to a time before Netflix and Spotify. It’s almost difficult to imagine for anyone under the age of thirty. They’ve become so embedded in our day to day life that it feels like they were there all along. Netflix and Spotify are both services that users pay for (if you pay for Spotify that is). They in return provide us with content to stream over the internet.
Netflix tracks every scroll, browse and hover as you look for what to watch next. They track what you watch, how often you watch it and at what times. From this, they can categorise you as a binge TV watcher, film nerd or anything in between. They take this data and in return, they not only provide you suggestions based on what they think you would like, but they also change the order in which you see things and what you see.
If you were to open Netflix now and take a screenshot of your home screen and get your friend to do the same thing at exactly the same time, I can guarantee you will both see different things. They are providing a hyper-personalised experience that’s designed firstly to keep you using and paying for Netflix, but also to ensure the content you likely want to see is exposed to you. They even use this data to decide which TV shows to make in the future based on their users’ viewing habits.
Spotify does exactly the same thing but with music. I’ve personally listened to artists I’d never have heard of if it wasn’t for Spotify. They’ve logged data about what I’ve listened to and exposed similar artists and styles of music back to me. All of this again keeps me using Spotify, yet it also improves my experience. It also provides exposure to smaller artists, which improves their prospects too.
This is important to understand as in both of these examples there is a mutual benefit to both the company and you as a user.
Data can be used for good.
As someone who wants to find interesting things to watch and new music to listen to, I would say that this data is being used for good. Netflix and Spotify also benefit from it as I continue to use their service. It’s a win-win and nobody gets hurt.
My personal opinion is that data is at its best when it’s being used to enhance your experience and provide real, tangible benefits to you. Data tracking doesn’t have to be some evil scheme used for social disruption.
In the examples above, your data is just one small droplet in a lake of millions of other users’ data. Only when all of that is combined does it become useful. Yes, it feels weird to know that your every move on these services is being monitored, but at the same time, the data engineers aren’t meticulously going through each user’s data one by one to understand who they are.
They are processing combinations of these data points to spot patterns and matching that with other users who have similar patterns. From there they know which users you have similar tastes to and then show you some of the stuff those users like. All of this is automated and done across millions of users and billions of data points and when you put it like that, your data seems fairly insignificant.
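To make that pattern-matching idea a little more concrete, here is a minimal sketch of one common way to do it, usually called user-based collaborative filtering. It is not how Netflix or Spotify actually work (their systems are far more involved and not public in this detail). The user names, titles, hours and the `recommend` function are all invented for illustration.

```python
from math import sqrt

# Hypothetical watch history (invented data): user -> {title: hours watched}
HISTORY = {
    "you":   {"Space Drama": 5, "Crime Doc": 1},
    "alice": {"Space Drama": 4, "Crime Doc": 1, "Robot Thriller": 6},
    "bob":   {"Crime Doc": 5, "Baking Show": 4},
}


def cosine_similarity(a, b):
    """Score how alike two users' viewing patterns are (0 = nothing in common)."""
    shared = set(a) & set(b)
    dot = sum(a[title] * b[title] for title in shared)
    norm_a = sqrt(sum(hours * hours for hours in a.values()))
    norm_b = sqrt(sum(hours * hours for hours in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, history, top_n=3):
    """Suggest titles the user hasn't seen, weighted by how similar other users are."""
    mine = history[user]
    scores = {}
    for other, theirs in history.items():
        if other == user:
            continue
        similarity = cosine_similarity(mine, theirs)
        for title, hours in theirs.items():
            if title not in mine:
                # Titles loved by similar users get the highest scores
                scores[title] = scores.get(title, 0.0) + similarity * hours
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]


print(recommend("you", HISTORY))
```

In this toy data, "alice" watches much the same things as "you", so her extra title ranks above anything from "bob", whose tastes barely overlap. Nobody looked at any individual's history by hand; the ranking falls out of the arithmetic.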
So that’s the good side. I think most of us can agree that services that use data to provide us with a better experience have definitely benefited us in one way or another. So what’s so bad about it?
We’ve seen data being used for our benefit but in most cases, you hear about data being used negatively. The bad press around digital tracking and companies exploiting you for their own personal gain should not be ignored.
Where there is good, there is also bad and unfortunately, people straight up prefer bad news to good which is why the bad stories about data are more prevalent. Without looking it up, I’d bet that the most talked-about company when it comes to data scandals is Facebook.
If you haven’t heard about how election campaigns have used Facebook to target and manipulate voters, then I highly recommend watching The Great Hack on Netflix. Using personal data to manipulate voters and spread misinformation is one of the saddest things to come out of the internet since its inception, in my opinion.
Examples of these extreme cases can be found on every news site and I don’t want to bore you with another post about “how Facebook is bad and should be shut down”. Seriously, go watch The Great Hack and you’ll learn everything you need to know.
Instead, I wanted to focus on the scenarios where even if it seems like data is being used for good, it could be doing harm. I want to go back to the Netflix example above and dive into how they too could be having a negative impact.
The biggest problem I think we face when it comes to our digital future is digital addiction.
Let’s look at Netflix again. They’re logging data about my interactions with their service to provide me with relevant recommendations. They also use this to understand what types of TV shows people are most likely to watch and therefore what original content to create.
All sounds great, right? However, you could look at it from a more sinister viewpoint and say that what they’re actually doing is finding more ways to keep you hooked on watching TV. Whilst you’re watching TV, you’re neglecting other things in your life. You become addicted to binging through shows and before you know it you’ve wasted 2 hours of your life each day.
Heck, they even autoplay the next episode giving you only 5 seconds to stop it. This keeps you watching and the 5 seconds is just long enough for you to consider stopping but not long enough that you can actually find the remote. There is so much content that is so perfectly geared towards you and your preferences that it sometimes feels impossible to stop.
Also, because the services are hyper-personalised, it’s a constant feedback loop of similar content. Yes, it may open you up to new content you might have ordinarily missed, but it’s likely similar to content you’ve already seen. This takes all the thinking out of the equation. It stops us broadening our horizons and keeps us in a closed loop.
Spotify has recently tried to do something about this by creating playlists from other genres that they think you’d still enjoy. However, examples like this are unfortunately hard to find.
If you go down the rabbit hole of watching conspiracy theory “documentaries” on Netflix, you’ll keep seeing the conspiracy theory videos and not the other side of the coin, further reinforcing your beliefs down one path. This is how some people can be so convinced that the earth is flat. Once you start looking at “flat earther” documentaries and websites, that data is used to regurgitate the same stuff back at you over and over again, until it’s all you see and anything to the contrary seems like a lie.
These are the dangers posed by seemingly good uses of your data. The intentions start off good and, in fact, often are good. However, if left unmoderated, they can cause addiction and reinforce self-perpetuating beliefs as we’re shown what the algorithms “think” we want to see, rather than a broad mix of ideologies and viewpoints.
There’s no easy answer to the question of what we can do about all this. The first step would be to gain a better understanding of what data is being collected and, where possible, limit it.
Second, self-moderation is key. Remember that these online services are designed specifically to keep you using them for as long as possible. Until companies decide (or are forced) to take our digital wellbeing seriously, you should set rules around how much you use services. You could set alarms or even schedule your WiFi to be turned off at certain times. You can also use the tools built into Android and iOS to understand how much time you’re spending on certain apps and to warn you when you reach a certain amount.
Finally, we could all benefit from using our brains a bit more. Don’t let these services dictate your every move. They all have search functions, so look for new and different content and points of view. At some point in our lives, we stopped being children that ate what we were given and chose what to eat for ourselves. We need to do this same thing online. And if a service makes it difficult for you to do this, you should seriously ask yourself whether it’s worth it at all.