29 June, 2018

Private data switch: GDPR compliance and personal location data protection [UPD: Aug 20, 2018]

A ready-made solution to hide personal location data and make your business GDPR compliant.

There's a lot of buzz about personal location data protection these days, and it has intensified further with the introduction of the GDPR. Businesses that rely on location data are seriously concerned, since failing to comply incurs fines and damages their reputation. Employees using corporate transport should feel safe and secure when running their private errands. This is exactly what the business-private switch is for: it stops sending your location data when you are not at work.

Task at hand

Say we have a channel that receives data from tracking devices. Each tracking device has a switch connected to one of its digital inputs. If the switch is turned on, the device's position info must NOT reach the business application. For example, this typical telemetry message:

{"channel_id":4321,
"ident":"device_imei",
"timestamp":"current_time",
"gsm.cellid.1":36611,
"gsm.lac.1":102,
"gsm.mcc.1":257,
"gsm.mnc.1":1,
"position.altitude":307.899994,
"position.direction":0,
"position.latitude":53.905892,
"position.longitude":27.456909,
"position.satellites":10,
"position.speed":42,
"vehicle.mileage":399999,
"can.fuel.level":81,
"din":4}

must be transformed to:

{"channel_id":4321,
"ident":"device_imei",
"timestamp":"current_time",
"vehicle.mileage":399999,
"can.fuel.level":81,
"din":4}

Note that I also excluded GSM data (since a tracker’s location can be determined from the GSM cell ID) and speed and direction data (since a tracker’s position can be estimated by reconstructing its track from this secondary positioning data). Is it worth mentioning that if the private data switch is on, a driver must look at the rearview mirror more often to detect and avoid surveillance and pursuit? :)
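To make the transformation concrete, here is a minimal Python sketch that applies it to the example message above. The names strip_location and PRIVATE_PREFIXES are illustrative and are not taken from flespi_pipeline; the only thing borrowed from the task is the rule itself: drop every parameter whose name starts with "position" or "gsm".

```python
import json

# Parameter name prefixes that reveal location, directly or indirectly.
PRIVATE_PREFIXES = ("position", "gsm")


def strip_location(message: dict) -> dict:
    """Return a copy of the message without location-revealing parameters."""
    return {
        name: value
        for name, value in message.items()
        if not name.startswith(PRIVATE_PREFIXES)  # startswith accepts a tuple
    }


telemetry = {
    "channel_id": 4321,
    "ident": "device_imei",
    "timestamp": "current_time",
    "gsm.cellid.1": 36611,
    "gsm.lac.1": 102,
    "gsm.mcc.1": 257,
    "gsm.mnc.1": 1,
    "position.altitude": 307.899994,
    "position.direction": 0,
    "position.latitude": 53.905892,
    "position.longitude": 27.456909,
    "position.satellites": 10,
    "position.speed": 42,
    "vehicle.mileage": 399999,
    "can.fuel.level": 81,
    "din": 4,
}

# The original dict is left untouched; only the copy loses the private keys.
print(json.dumps(strip_location(telemetry), indent=1))
```

Returning a new dict instead of deleting keys in place keeps the raw message available, for example for storage in a channel that the business application cannot read.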

The solution provided in this article relies on the MQTT pipeline:

  1. An MQTT client subscribes to a special topic to receive new messages from the tracking devices
  2. A simple algorithm strips position data from the received messages (or leaves them untouched if the private data switch is off)
  3. The MQTT client publishes the processed message to another special topic
  4. The business application receives the processed messages
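The flow of these four steps can be sketched without any broker at all. In the hedged example below, the MQTT client is replaced by a plain publish callable so that the logic can be run and tested locally. The names make_pipeline and switch_is_on, the topic name, and the din == 1 rule are hypothetical and chosen only for illustration.

```python
import json
from typing import Callable


def make_pipeline(
    publish: Callable[[str, str], None],
    out_topic: str,
    switch_is_on: Callable[[dict], bool],
) -> Callable[[bytes], None]:
    """Build a handler that performs step 2 and hands the result to step 3."""

    def handle(payload: bytes) -> None:
        message = json.loads(payload.decode("utf-8"))  # message from step 1
        if switch_is_on(message):
            # Private mode: drop everything that reveals the location.
            message = {
                name: value
                for name, value in message.items()
                if not name.startswith(("position", "gsm"))
            }
        publish(out_topic, json.dumps(message))  # step 3

    return handle


# Stand-in for the broker: collect published messages in a list.
published = []
handle = make_pipeline(
    publish=lambda topic, body: published.append((topic, json.loads(body))),
    out_topic="processed/messages",  # hypothetical topic name
    switch_is_on=lambda msg: msg.get("din") == 1,  # hypothetical rule
)

handle(b'{"ident": "a", "din": 1, "position.speed": 42, "can.fuel.level": 81}')
handle(b'{"ident": "b", "din": 0, "position.speed": 42}')

for topic, body in published:
    print(topic, body)
```

Passing publish in as a parameter is a design choice made for this sketch: the same handler could be attached to a real MQTT client's message callback without changing the filtering logic.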

Looks simple, doesn’t it? Let’s see how it works.

Solution

To emulate this case I decided to do the following:

  • I’ve got the WiaTag app on my smartphone. This is my GPS tracker. In WiaTag I can create custom statuses, that is, customized parameters sent with the telemetry data from my phone. I created two statuses named custom.din (din stands for Digital INput, as if my phone had a physical private data switch): public and private. So if my Private Data Switch is turned on, I receive the special parameter “custom.din_index” = 1. (NOTE! On most GPS tracking devices the switch parameter will most likely be named just "din".)
  • I’ve got a WiaTag channel and set up my WiaTag app to send data to the channel’s IP:port.

[Screenshot: WiaTag setup]

  • I’ve got the MQTT channel that will subscribe to a special topic and receive processed messages. Below is the screenshot of the MQTT channel configuration:

  • And last, but not least: the engine that analyzes the content of messages in the WiaTag channel, excises the private data, and pushes the result to the MQTT channel. Meet flespi_pipeline.

Essence

flespi_pipeline is based on gmqtt library and is 80% the example from gmqtt’s Readme. Everything I added was:

  1. Configuration. You must write the channel_id to read messages from, Authorization Token, and Private Data Switch parameter name and value (I've used the name "custom.din_index" to test it with Wiatag but on most tracking devices it would be just "din"). You can modify some secondary stuff but basically, this is it.
  2. Function to process received messages:
import json

# pds_parameter_name, pds_check_type and publish_topic come from the configuration

def on_message(client, topic, payload, qos, properties):
    try:
        message_json = json.loads(payload.decode("utf-8"))
    except ValueError:
        # also covers payloads that are not valid UTF-8
        print('invalid JSON received: %r' % payload)
        return

    # decide whether the message needs pipeline processing
    if pds_parameter_name in message_json and pds_check_type(message_json[pds_parameter_name]):
        processed_msg = {}
        # iterate over message parameters and exclude position info
        for parameter in message_json:
            # skip parameters whose names start with "position" or "gsm"
            if not (parameter.startswith('position') or parameter.startswith('gsm')):
                processed_msg[parameter] = message_json[parameter]
        payload = json.dumps(processed_msg)

    # publish the processed (or untouched) message
    client.publish(publish_topic, payload)

The above lines of code analyze the Private Data Switch parameter and eliminate the position info from messages.

Note that the decision of whether to process a message with the PDS algorithm is made by the function pds_check_type. There are three functions available for the check (you specify the required function name as a string at the configuration step):

  • check_value: simply checks whether the PDS parameter is equal to the value specified in the configuration variable pds_turn_on_value;
  • check_bit_set / check_bit_not_set: check whether the bit specified in the configuration is 1 or 0. This is useful for trackers that pack the binary state of several digital inputs into one decimal value. For example, din = 9 is 0b00001001 in binary, which means that digital inputs 1 and 4 are high. So if your PDS toggle switch is connected to digital input 4 and drives it high when position data is to be muted, use the function check_bit_set with pds_turn_on_value = 4.
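The three checks are small enough to sketch in full. The version below is an illustration rather than the code shipped in flespi_pipeline: the two-argument signatures and the CHECKS lookup table are assumptions, and input numbers are taken to be 1-based (input 1 is the lowest bit), which is what the din = 9 example above implies.

```python
def check_value(value, turn_on_value):
    """True when the parameter equals the configured value."""
    return value == turn_on_value


def check_bit_set(value, input_number):
    """True when the bit for the given 1-based digital input is 1."""
    return bool((int(value) >> (input_number - 1)) & 1)


def check_bit_not_set(value, input_number):
    """True when the bit for the given 1-based digital input is 0."""
    return not check_bit_set(value, input_number)


# Resolve the check from its name, as a string-based configuration would.
CHECKS = {
    "check_value": check_value,
    "check_bit_set": check_bit_set,
    "check_bit_not_set": check_bit_not_set,
}

din = 9  # 0b00001001: inputs 1 and 4 are high
check = CHECKS["check_bit_set"]
print([n for n in range(1, 9) if check(din, n)])  # prints [1, 4]
```

With this numbering, a switch wired to input 4 is detected by check_bit_set(din, 4), while check_bit_not_set(din, 4) suits a switch that pulls the input low in private mode.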

Outcome

What do we have as a result? A flespi channel with messages stored according to the private data switch position.

What can we do with this channel? Send processed messages to the target business platform.

What is the right way to do this? Create a special ACL token that is authorized only to get data from the channel with processed messages and to publish messages to the topic the MQTT channel is subscribed to. This matters because the usual all-allowed token would let your business application see all messages (including private data!). In my case it looks like this:

To make the solution reliable, you need to host this script on a stable machine, most likely some kind of cloud Linux server (BTW, it should also work on Windows with minor modifications; let us know if interested). A detailed description of how to run the script automatically on system start and make it a system service can be found here, along with all the configuration and service files required for the most popular Linux systems.

Yes, this is not a magic button embedded into your application. But flespi plus 15 lines of code give you a fully functional solution to a task that has been on your mind for quite a while. Not a huge time investment to show appreciation to your employees and stay in sync with the latest GDPR requirements, is it?