Itinerary of a software company until the GDPR deadline - part 1

With T minus 3 days until effectively the whole world needs to be GDPR compliant, we’re putting the last checks in place ourselves. This series is meant to illustrate how we got there.

In case you’ve been residing underneath a rock for most of 2018, the General Data Protection Regulation (GDPR) is a new privacy & data protection law that will be coming into effect for all of the EU on the 25th of May 2018. The regulation focuses on the use and processing of personal data, broadly defined as any information that could be used to identify a natural person. This won’t be another article on what the regulation entails; much better posts have been written on that subject.

Rather, in this blog series we’d like to start by taking inventory of which of our dependencies need to be compliant and what needs an overhaul before May 25.

Data register: spreadsheets!

One of the first things we needed to do was create a list of all the systems and services we use that could possibly contain or process user data. This is a project in itself, and it included, but was not limited to:

  1. Recalling things from memory
  2. Sifting through credit card statements to find paid services
  3. Digging in onboarding documentation
  4. Analyzing our in-house developed systems

We created a list of all internal systems and self-hosted instances like GitLab, Confluence and Jenkins, along with their use, data categories and applicable data subjects (customer or employee?). We also recorded which systems and third parties the data is exported to, how long the data is kept and how the data is protected (i.e. hashed, encrypted, anonymized/pseudonymized).
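To make the shape of such a register concrete, here is a minimal sketch of one row as a Python dataclass, written out as CSV so it can live in a spreadsheet. The field names, the `RegisterEntry` and `to_csv` names and the example values are all assumptions made for illustration; they are not our actual register.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class RegisterEntry:
    """One row of a data register (hypothetical field names)."""
    system: str           # e.g. a self-hosted wiki or CI server
    purpose: str          # what the system is used for
    data_categories: str  # kinds of personal data it holds
    data_subjects: str    # "customer" or "employee"
    exported_to: str      # other systems or third parties receiving the data
    retention: str        # how long the data is kept
    protection: str       # hashed, encrypted, anonymized/pseudonymized, ...


def to_csv(entries):
    """Serialize register entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(RegisterEntry)]
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Illustrative values only, not taken from a real register.
example = RegisterEntry(
    system="example-wiki",
    purpose="internal documentation",
    data_categories="name; email address",
    data_subjects="employee",
    exported_to="none",
    retention="duration of employment",
    protection="encrypted at rest",
)

print(to_csv([example]))
```

Keeping every row in the same fixed set of columns is what makes the list comparable across internal systems and external providers.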

We then created a similar list of external service providers, including hosting providers like Hetzner and AWS, as well as SaaS products like Trello, Zapier and PayPal. For this list we included all the categories listed above, and meticulously tracked each provider's compliance status. An overwhelming number of services claim they will ultimately be compliant by the 25th of this month, joining the stare down game.

As it turns out, we hold a lot more data on our employees than we do on our customers (and even less on our open source users). A case in point is the telemetry functionality (opt-out and completely anonymous) we recently added to the Fuse Panel - the command center for Passenger - to supplement user feedback with usage data and make better decisions regarding feature implementations for the panel. We actually know shockingly little about our customers.

Another example: a single customer might have an open source Passenger installation, have one of their tweets picked up by our Twitterbot, and have contacted support using a different email address than the one listed for their enterprise account. We’re not making that link, because we believe it’s none of our business. Linking those records would also make data exports a lot more expensive.

Low hanging fruit

We found that data collection projects you don’t have the resources to follow up on can easily be parted with, bringing you closer to your goal. An example is a list of started checkout processes your Sales team could chase.

We can also recommend using the GDPR deadline as an incentive for a 'spring cleaning'. Get rid of those lingering subscriptions to various SaaS products. It might even save you a buck!

Decreasing the log retention for our self-hosted services paid off in another way as well. Some logs were preserved indefinitely without a good reason and caused our server to run out of disk space. But no more!
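As an illustration of the idea, a minimal sketch of age-based log pruning might look like the following. It is not the setup described above: the `prune_old_logs` name, the `*.log` pattern and the 30-day limit are assumptions, and many installations would rely on a tool like logrotate or on the service's own retention setting instead.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def prune_old_logs(log_dir, max_age_days, now=None, dry_run=True):
    """Return log files older than max_age_days; delete them unless dry_run.

    Age is judged by each file's last-modified time. Only files directly
    inside log_dir that match *.log are considered (an assumed convention).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    expired = []
    for path in sorted(Path(log_dir).glob("*.log")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            expired.append(path)
    return expired


if __name__ == "__main__":
    # Dry run against a hypothetical directory: lists files, deletes nothing.
    for old_log in prune_old_logs("/var/log/example-service", max_age_days=30):
        print(f"would delete {old_log}")
```

Defaulting to a dry run is a deliberate choice here: you can review what would be removed before setting a retention period that actually deletes data.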

Next steps

In part 2 of this series we’ll expand on what tech and communication we needed to touch in order to ensure compliance.

We’d love to hear about the steps you’ve taken to ensure your product/service’s compliance!