Monday, January 8, 2018

Blog 1 of 2018 / Markets for Consumer Data & User-Beware

Hello readers, and thanks for joining us for our first blog of the new year.

This year, in addition to our typical musings on financial markets, we'll be writing a little more on consumer data.

Hacks have been all the rage for a while already (Equifax and Yahoo! more recently; Target in 2013, Sony PlayStation in 2011 and TJ Maxx in 2007 are not-too-distant memories).

But our interest lies somewhere else: in what is happening behind the scenes with our data.

hiQ v LinkedIn

hiQ is a San Francisco-based startup that was doing something pretty interesting: it was scraping data from LinkedIn and then selling that data, or analyses built on that data.

LinkedIn didn't much like this: hiQ's automated robots ("bots") were bypassing LinkedIn's security measures to scrape the data, which LinkedIn felt undermined its privacy commitments to its members.

They battled it out in court, with a federal court ruling in August 2017 (on a preliminary injunction), probably reasonably, that LinkedIn could not stop hiQ from scraping the publicly viewable information on LinkedIn's website.  In fairness, LinkedIn doesn't own its users' data -- it's our data! -- and therefore couldn't limit hiQ's ability to access or study it.

The ruling is interesting for several reasons, including some of the First Amendment-type arguments made by hiQ to support its right to scrape.

“To choke off speech and the precursor of speech, the gathering of facts and the analysis of information, is a dangerous path down which we should not go,”
         -- Harvard law professor Laurence Tribe, representing hiQ, reportedly told the judge.

“hiQ believes that public data must remain public, and innovation on the internet should not be stifled by legal bullying or the anti-competitive hoarding of public data by a small group of powerful companies, ... It is important to understand that hiQ doesn’t analyze private sections of LinkedIn – we only review public profile information. We don’t republish or sell the data we collect. We only use it as the basis for the valuable analysis we provide to employers.”
         -- hiQ said in a statement

Okay, meh. But how about what happens next?  Like Microsoft (which owns LinkedIn), firms such as Facebook, Amazon, eBay and Google control and study copious amounts of customer data (and they sometimes get hacked and lose control of it).  But importantly, they also sell it (as does hiQ).  We might like to think that they only sell aggregate data, but how would we know?

Personal, Personnel Information

What interests us is that hiQ sells information about LinkedIn users to those users' bosses, including information, generated from scraping LinkedIn, about the likelihood of an employee leaving. hiQ's clients reportedly include companies like Capital One and GoDaddy, and its products include Keeper, which flags for employers when their employees are at risk of leaving for another job. (For example, when employees are "looking around," they tend to make connections on LinkedIn.)

So that's a sale not necessarily of the more obvious, vanilla, personal information (name, address, date of birth), but of an employee's tendencies and movements. And it's almost certainly not aggregated: if it were aggregated, it would be worthless to an employer trying to spot which individual is about to leave.  Sure, they're not selling the individual's vanilla data itself, but they've done a basic analysis of an individual's behavior and are selling the analysis.  Aggregated, it is not.

hiQ would have no greater ownership interest in our data than LinkedIn would have.  But hiQ's bots create a simple work-around for monetizing it (imagine, for example, that LinkedIn were simply to buy a stake in hiQ).  If we knew that LinkedIn could "tell on us" to our employers -- and make money doing so -- would we have signed up?

We have the Latin expressions caveat emptor and caveat venditor as shorthand for the principles of buyer-beware and seller-beware when entering into transactions.  In 2018, the awkward expression caveat utilitor -- user-beware -- might just become part of our lexicon.

The value of Johnny's house, or Sandy's choice of handbags, is information that would help advertisers target Johnny or Sandy more appropriately.  If Johnny's house price is on the low end, then, all else equal, one wouldn't push Maserati ads at him.  If Sandy is buying Louis Vuitton bags, well, maybe she would like this newly released Prada bag or another Louis Vuitton bag.  But whose data is that, and do companies have a right to sell and profit from that information?  And how do we separate data, the sale of which may be limited on an individual basis, from analysis of data, which seems to be fair game?  Are these two both analyses?

  • Johnny's house was purchased for $200K this year.
  • Last year, Sandy bought two of Brand X's bags and three of Brand Y's bags.

All the best for 2018.  Keep watching the Watchmen.

~ PF2