
Tuesday 21 May 2013

Dissertation Series - Resistances to Data Mining - Privacy


The introduction of new technologies, or the adaptation of existing technologies, within an organisation can bring with it resistance from the different layers of the organisation, from management resisting the introduction to operational staff resisting the use or uptake.  There are many reasons why management and operational staff resist change in an organisation, and many approaches to mitigating that resistance (Davidson, 2009), so the focus here will be on those specific to data mining.
Privacy
Data mining against individuals inevitably makes use of large amounts of personal data (Busovky, 2011). This brings with it concerns about data privacy, heightened by the high profile data breaches reported in the media (BBC, 2009).
Wilder and Soat (Wilder et al, 2001) cite the example of N2H2, a Seattle-based company that provides internet content filtering software to schools and that planned to sell the data it collected in anonymised, aggregated form.
“N2H2 began marketing the data, called Class Clicks, that its filtering tools collected on the website usage trends of elementary and high school students.  The data contained no names or personal information and complied with the new federal Children’s Online Privacy Protection Act.  Yet N2H2’s new line of business brought such loud howls of protest from online privacy advocates that the company scrapped the effort”
A fictitious example is given by Wang and Liu (Wang et al, 2011) to illustrate the real privacy concerns that could exist when mining a medical database.
“released mining output can also be leveraged to uncover some combinations of symptoms that are so special that only rare people match them”, “which qualifies as a severe threat to individuals’ privacy”
Many countries have legislation in place to protect individuals and to ensure that organisations put safeguards and controls in place to protect personal data. The main act in the United Kingdom is the Data Protection Act 1998, which covers many areas of data protection.  Specific to privacy, the seventh principle of the act applies:
“Appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data” (Information Commissioners Office A, 2012).
There are techniques to prevent unauthorised disclosure of personal data through data mining:
Anonymisation
Privacy can be protected by anonymising data. However, simply removing customer reference numbers or names is not always sufficient in itself, as discussed by Vaidya et al (2005 p.8): “just because the individual is not identifiable in the data is not sufficient; joining the data with other sources must not enable identification”.
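To make the point about joining data with other sources concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the field names (postcode, birth_year, sex), the two small data sets and the reidentify function are assumptions of mine, not taken from Vaidya et al. It shows how a record with no name attached can still be tied to a person when a second source shares the same attributes.

```python
from collections import defaultdict

def reidentify(anonymised, public, keys):
    """Return {row index: name} for anonymised rows that match
    exactly one person in the public data on the given keys."""
    index = defaultdict(list)
    for person in public:
        index[tuple(person[k] for k in keys)].append(person["name"])

    matches = {}
    for i, row in enumerate(anonymised):
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match means the row is re-identified
            matches[i] = names[0]
    return matches

# Hypothetical "anonymised" extract: names removed, other attributes kept.
anonymised = [
    {"postcode": "LS1 4AB", "birth_year": 1979, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "LS2 7CD", "birth_year": 1961, "sex": "M", "diagnosis": "diabetes"},
]

# Hypothetical second source that still carries names.
public = [
    {"name": "A. Example", "postcode": "LS1 4AB", "birth_year": 1979, "sex": "F"},
    {"name": "B. Sample", "postcode": "LS2 7CD", "birth_year": 1961, "sex": "M"},
    {"name": "C. Sample", "postcode": "LS2 7CD", "birth_year": 1961, "sex": "M"},
]

found = reidentify(anonymised, public, ["postcode", "birth_year", "sex"])
print(found)
```

In this made-up data the first record is re-identified because only one person shares its attributes, while the second is not, because two people share them.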
An established approach to ensuring that data is truly anonymised is “k-anonymity”, a process that involves grouping individuals together within the data (Vaidya et al, 2005 p.8).
Suppression can also be introduced to hide groups or data that consist of small and easily identified sample sizes. This requires footnotes and an accompanying narrative explaining that it has been done, to prevent any misunderstanding of the summarised data (Vaidya et al, 2005 p.8).
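As a rough illustration of the grouping idea, the Python sketch below treats a data set as k-anonymous when every combination of the chosen identifying attributes is shared by at least k records. The records, the field names and the generalise function (which bands ages by decade and keeps only the first part of the postcode) are hypothetical choices of mine and only one of many possible ways to form the groups.

```python
from collections import Counter

def generalise(row):
    """Coarsen a record: band the age by decade, keep only the postcode area."""
    decade = (row["age"] // 10) * 10
    return {
        "age_band": f"{decade}-{decade + 9}",
        "postcode_area": row["postcode"].split()[0],
        "condition": row["condition"],
    }

def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same identifying values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return min(counts.values(), default=0)

def is_k_anonymous(rows, quasi_identifiers, k):
    return smallest_group(rows, quasi_identifiers) >= k

# Hypothetical records with names already removed.
rows = [
    {"age": 34, "postcode": "LS1 4AB", "condition": "asthma"},
    {"age": 37, "postcode": "LS1 9XY", "condition": "diabetes"},
    {"age": 52, "postcode": "LS2 7CD", "condition": "asthma"},
    {"age": 58, "postcode": "LS2 3EF", "condition": "flu"},
]
grouped = [generalise(r) for r in rows]

print(is_k_anonymous(rows, ["age", "postcode"], 2))
print(is_k_anonymous(grouped, ["age_band", "postcode_area"], 2))
```

On the raw records every individual is unique, so the check fails for k = 2. After generalising, each group holds two people and the check passes.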
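A minimal Python sketch of suppression in a summary table follows. The threshold of 5, the sample values and the wording of the note are assumptions for illustration only. Counts below the threshold are withheld, and a note is returned alongside the table so the reader knows that this has been done.

```python
from collections import Counter

def summarise_with_suppression(values, threshold):
    """Count each group, hiding any count below the threshold.

    Returns (table, note). Suppressed groups have the value None,
    and note is an empty string when nothing was suppressed.
    """
    counts = Counter(values)
    table = {}
    suppressed = False
    for group, n in sorted(counts.items()):
        if n < threshold:
            table[group] = None  # hide the small, easily identified group
            suppressed = True
        else:
            table[group] = n
    note = f"Counts below {threshold} have been suppressed." if suppressed else ""
    return table, note

# Hypothetical list of conditions, one entry per patient.
conditions = ["asthma"] * 12 + ["diabetes"] * 7 + ["rare disorder"] * 2

table, note = summarise_with_suppression(conditions, threshold=5)
for group, n in table.items():
    print(group, "*" if n is None else n)
print(note)
```

The two patients with the rare disorder would be easy to single out, so their count is replaced with a marker and explained by the note.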
Clearly defined use of data
Another way to address concerns about privacy is to explain clearly to the data subjects, at the point of data collection, what the data will be used for and how they will benefit.
This is evidenced by the success of the Tesco Clubcard scheme, the changed perception of it amongst its customers, and the separation of its mailings from what was previously seen as “dumb” junk mail.
“research consistently suggests that customers perceive the quarterly mailing from Tesco clubcard not as ‘junk mail’, but as personal mail” (Humby et al, 2004 p.116).
An example of poor understanding between the data subject and the organisation carrying out the data mining is the case of pharmacies in the US that were selling data gathered from prescriptions to pharmaceutical companies to be data mined.  The pharmaceutical companies then used that data to target marketing and sales at specific doctors, based on the prescriptions they had written (Silverman, 2008).  The data subjects in this case (the doctors), represented by the American College of Physicians, have opposed the use of this data for marketing (Walker, 2011).
However, the same example also describes the data being used for other purposes:
“direct safety messages to doctors, to track disease progression, to aid law enforcement, to implement risk-mitigation programs, and to do post-marketing surveillance required by the FDA” (Walker, 2011)
Where the data subject benefits from, and has consented to, the organisation's use of data mining, resistance to the data being mined is less likely.
