Wednesday, December 4, 2019
Privacy Threats in Big Data
Question: Discuss the privacy threats in big data.

Answer:

Introduction

The advent of big data has created numerous opportunities for businesses and organizations; in the process, volumes of data have been generated that exceed the capacity of commonly used software tools for proper capture, management, and timely analysis and use. Every two years, the quantity of data to be analyzed is expected to double. Most of this data is unstructured and comes from various inputs, including sensors, social media, surveillance, scientific applications, image and video archives, medical records, internet searches and indexes, system logs, and business transactions (Kerr & Earle, 2016). The number of devices connected to the Internet of Things continues to increase to unprecedented levels, generating large amounts of data that require processing to be understood and used productively. It has also become popular and cost effective to use on-demand, cloud-based computing and processing power to analyze and gain insights into this data.

As big data expands, the traditional security and privacy protocols tailored to private computing systems, such as demilitarized zones and firewalls, are no longer effective (Kache, 2015). In big data, security protocols are expected to work across heterogeneous hardware, network domains, and operating system components. The collection and use of people's data in big data applications has met stiff resistance from consumers, with growing concerns expressed over the methods organizations use to collect and use private data and information (Martin, 2015; Le VPN, 2017). The potential impact of privacy and security breaches is highlighted by the recent Facebook case, in which there was a massive breach of privacy and security with regard to user data (The Economic Times, 2018).
This paper discusses the issue of privacy in big data, first by reviewing related work, then by discussing the challenges and issues faced, the research methodology, and a proposed approach, before evaluating performance and drawing conclusions.

Related Work

According to Lu, Zhu, Liu, Liu, and Shao (2014), big data has received great attention in recent times because it can generate new, useful knowledge with economic and technical benefits, and because of its high volume, high velocity, and variety challenges (the 3Vs). Apart from the 3V challenges, security and privacy have also emerged as important issues in big data: if data is not authentic, the mined information is unconvincing, and if privacy is not properly addressed, there may be reluctance or resistance to data sharing. The authors therefore propose an efficient, algorithm-based privacy-preserving mechanism to guarantee security in big data.

Kaisler, Armour, Espinosa, and Money (2013), in a systematic review of literature, discuss the concept of big data and the issues and challenges facing it moving forward. The issues they discuss include storage, volume, processing, transportation, and ownership, providing a basis from which to understand big data.

Xu, Jiang, Wang, Yuan, and Ren (2014), through a review of literature and of data mining methods, specifically the Knowledge Discovery in Databases (KDD) process, discuss the techniques used in KDD in light of big data privacy and security risks. By analyzing the KDD process, the authors identify the steps that can eventually result in data breaches or loss of privacy, including data integration, data selection, and data transformation. Further, the authors identify the types of users involved in KDD applications: data providers, data collectors, data miners, and decision makers.
Following this review, the authors propose methods to ensure privacy and data protection while undertaking data mining. The proposed approaches include privacy-preserving association rule mining, privacy-preserving classification of data, decision trees, Naive Bayesian classification, and data provenance. These methods apply to different players in data mining.

Moura and Serro (2015) point to the increased use and sharing of personal data and information on public clouds and social networks from a variety of devices, which makes data privacy and security an important and pressing issue, especially in the context of big data. The authors also argue that traditional methods for enhancing data security, including demilitarized zones and firewalls, are not suitable for the computing systems that support big data. By reviewing existing literature, discussing some of the sources and causes of risks to data security in big data, and using case studies, the authors propose Software Defined Networking (SDN) as a novel approach to implementing security in big data and addressing data privacy concerns.

Narayanan, Huey, and Felten (2016) argue that once data is released to the public, it is not possible to take it back. With time, additional datasets become public, and further analytics can reveal more about the original data, including personally identifiable information (PII). This makes big data increasingly vulnerable to re-identification, especially because the ad hoc de-identification methods presently in use are prone to exploitation by adversaries. It is not possible to know the probability of data being re-identified in future, so the authors call for a precautionary approach to securing privacy in big data. They add that risks to data privacy go beyond stereotypical re-identification, and that it is impossible to know for certain the privacy risk for data protected using ad hoc de-identification.
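To make the re-identification concern concrete, the following is a minimal sketch, not a method taken from any of the cited papers. It measures what fraction of records in a "de-identified" table are unique on a combination of remaining attributes, since a unique combination can be linked to another public dataset that carries names. The function name `unique_fraction`, the field names, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Return the fraction of records that are unique on the given attributes.

    A record that is the only one with its combination of attribute values
    could be matched against an outside dataset sharing those attributes.
    """
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# Hypothetical toy table: names removed, other attributes left in place.
toy_records = [
    {"zip": "10001", "birth_year": 1980, "sex": "F"},
    {"zip": "10001", "birth_year": 1980, "sex": "F"},
    {"zip": "10002", "birth_year": 1975, "sex": "M"},
    {"zip": "10003", "birth_year": 1990, "sex": "F"},
]

# Fraction of records singled out by all three attributes together.
risk_all = unique_fraction(toy_records, ["zip", "birth_year", "sex"])
# Fraction of records singled out by a single coarse attribute.
risk_sex = unique_fraction(toy_records, ["sex"])
```

In this toy table, combining attributes singles out more records than any one attribute alone, which is the effect that grows as more datasets become public.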
According to Tene and Polonetsky (2013), big data, data mining, and data analytics play a huge and critical role; data can be mined and analyzed in its raw form without the need to store it in, and access it from, structured databases. However, this comes with data privacy concerns that could provoke a backlash and lead to regulation that stifles the benefits of big data. The researchers propose that policy makers balance the benefits of big data against privacy concerns, paying particular attention to what is defined as personally identifiable information (PII).

Sagiroglu and Sinanc (2013) discuss the concept of big data and its various aspects, including sources of data and their transmission, storage, and mining, and then discuss in detail the privacy issues and concerns in big data. In an extensive review of literature, the authors show that keeping data in a single place increases the chances of a breach because it becomes a target for attacks. They propose controlled storage management with encryption, restricted access to data, and securing the networks through which big data is managed.

Terzi, Terzi, and Sagiroglu (2015) provide a fresh perspective on big data security and privacy, arguing that extra security measures must be put in place.
The authors suggest, based on their research and literature review, that extra security must be placed on big data networks: data should be encrypted, access to devices and to network resources should be controlled, data should be made anonymous before being analyzed, communications should proceed over secure channels, and networks should be monitored continuously for threats.

Methodology

This paper uses a critical systematic review of literature, in which clearly formulated questions guide explicit and systematic approaches to identifying, selecting, and critically appraising relevant research, and to collecting and analyzing data from those studies, in order to generate novel solutions to the issue of privacy in big data.

Challenges and Issues

As more data is collected from connected devices and systems, existing security measures such as firewalls and DMZs are becoming increasingly irrelevant as means of ensuring big data security. The present issues in big data security and privacy fall into four main areas: infrastructure, data privacy, data management and integrity, and reactive security (Kaisler, Armour, Espinosa, & Money, 2013). With regard to infrastructure, the main issues include secure distributed data processing and best security and privacy practices for non-relational databases. With regard to data privacy, the main issues include data analysis through data mining methods that preserve privacy, the use of cryptography for data security and privacy, and granular access control. The challenges in data management and integrity relate to granular audits, secure storage of data and transaction logs, and data provenance. Reactive security and privacy issues concern validation and end-to-end filtering, and real-time supervision of privacy and security levels. The Internet of Things (IoT) is a major area of concern for privacy and security in big data.
It has become difficult to do anything in present-day life without one's identity being associated with the task, from surfing the web to making social media comments and engaging in e-commerce. Security is also greatly compromised by vulnerabilities such as insecure web interfaces, insufficient authentication and authorization, lack of encryption, insecure cloud and mobile device interfaces, inadequate security configurability, insecure firmware and software, and poor physical security. In addition, companies track and collect user data without users' knowledge and pass it on to third parties, such as marketers, for commercial gain, exposing private user data without consent.

Proposed Approach

A novel approach is proposed that combines several methods, tools, and techniques to ensure data privacy and security are maintained in big data use. The limitations of traditional techniques for ensuring data privacy and security can be overcome using modern approaches that include Fully Homomorphic Encryption (FHE), Secure Function Evaluation (SFE), and Functional Encryption (FE). FHE is an encryption approach that allows computations to be carried out on ciphertext and produces encrypted results that, when decrypted, match the results of the same operations performed on the plaintext. (Partially homomorphic schemes, such as unpadded RSA, support only one type of operation.) This enables database queries to be encrypted and keeps user information private from the location where the data is stored. FHE also allows private, encrypted queries to search engines and helps ensure that private user information remains private. Searches can also be conducted on encrypted data, such as encrypted social media data, which helps keep identities private.

The second element is the use of open rights management systems, specifically OpenSDRM, a system architecture that allows different content business models to be implemented. The architecture is shown below.

[Figure: OpenSDRM architecture]

This approach, together with FHE, will ensure social media information is mined with privacy and anonymity retained.
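The idea of computing on ciphertext, described above for FHE, can be illustrated with a much simpler scheme. The sketch below uses unpadded ("textbook") RSA, which is only partially homomorphic: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. It is not FHE, the primes are toy values chosen for illustration, and the helper names (`encrypt`, `decrypt`, `multiply_encrypted`) are hypothetical. It is not secure and is not the implementation evaluated in this paper.

```python
# Toy textbook RSA parameters: illustrative only, far too small to be secure.
P, Q = 61, 53
N = P * Q                      # public modulus
PHI = (P - 1) * (Q - 1)        # Euler's totient of N
E = 17                         # public exponent, coprime with PHI
D = pow(E, -1, PHI)            # private exponent (modular inverse, Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer 0 <= message < N with the public key."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(ciphertext, D, N)


def multiply_encrypted(c1: int, c2: int) -> int:
    """Multiply two ciphertexts without ever seeing the plaintexts."""
    return (c1 * c2) % N


# A party holding only ciphertexts computes on them...
a, b = 7, 12
encrypted_product = multiply_encrypted(encrypt(a), encrypt(b))

# ...and only the key holder can read the result, which equals a * b mod N.
assert decrypt(encrypted_product) == (a * b) % N
```

A fully homomorphic scheme extends this property to both addition and multiplication, and therefore to arbitrary computations, which is what makes encrypted database queries and encrypted searches possible in principle.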
The proposed approach assumes initial registration of system services on the platform, meaning that each of the different services has to be registered individually. Unique credentials are assigned to each service. The rights management platform manages user-generated content (UGC) and enables secure storage of that content in locations that have been configured. When social media users upload UGC, it remains protected, and the permissions, rights, and restrictions on the content are user defined, which helps retain privacy. Content generators and those wishing to use such content, such as data mining firms, are registered and authenticated on the social network platform as well as on the rights management platform. Because users wishing to access UGC on the platform must be registered and authenticated, and because UGC is presented in a special URI form, user privacy is achieved: the special URI is intercepted by the rights management platform, which then runs a secure access process.

Another element of the approach is an intelligent intrusion detection and prevention system (IDS/IPS) based on a software defined network (SDN). A Kinetic module controls the IDS/IPS behavior using the Kinetic language, a framework for controlling SDN in which network policies can be defined as finite state machines (FSMs). Several types of dynamic events can trigger transitions between FSM states. The IDS/IPS security module ensures that traffic from hosts that are both infected and non-privileged is dropped; traffic from a host that is infected but privileged is automatically redirected to a garden wall host, where corrective measures are taken on the infected host; and a non-infected host has its traffic directed to the intended destination.

Performance Evaluation

Evaluating the two approaches using a simulation in Linux showed promising outcomes in ensuring that private user data is secured.
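For reference, the IDS/IPS forwarding policy described in the proposed approach can be sketched as a small state machine. This is plain Python, not the Kinetic language, and it is not the simulation used in the evaluation; the class names, event names, and the reading that only hosts which are both infected and non-privileged are dropped are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DROP = "drop"
    REDIRECT_TO_GARDEN_WALL = "redirect_to_garden_wall"
    FORWARD = "forward"


@dataclass
class Host:
    privileged: bool
    infected: bool = False  # FSM state; every host starts as not infected


def handle_event(host: Host, event: str) -> None:
    """Apply a dynamic event to the host, moving it between FSM states."""
    if event == "infection_detected":      # hypothetical event name
        host.infected = True
    elif event == "host_cleaned":          # hypothetical event name
        host.infected = False
    else:
        raise ValueError(f"unknown event: {event}")


def policy(host: Host) -> Action:
    """Choose what to do with a host's traffic given its current state."""
    if not host.infected:
        return Action.FORWARD                  # clean host: normal delivery
    if host.privileged:
        return Action.REDIRECT_TO_GARDEN_WALL  # infected but privileged
    return Action.DROP                         # infected and non-privileged


# Usage: a privileged host is infected, remediated, then forwarded again.
server = Host(privileged=True)
handle_event(server, "infection_detected")
action_while_infected = policy(server)
handle_event(server, "host_cleaned")
action_after_cleanup = policy(server)
```

Keeping the state transition (`handle_event`) separate from the decision (`policy`) mirrors the FSM framing in the text: events change state, and the state alone determines how traffic is handled.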
The use of FHE together with IDS/IPS not only ensures that private user data is protected both in databases and in internet search queries, but also that the information remains secure from intrusion and unauthorized access, such as attacks undertaken using hacking techniques.

Conclusion

The increased use of big data and of interconnected devices, as well as technological advancements, has led to massive data volumes being generated. The generation and use of big data has several economic and technical innovation benefits, but it also raises the risk of data privacy breaches, along with the 3Vs challenges. In this paper, past approaches have been evaluated through a systematic review of literature, and a combined approach using FHE and IDS/IPS is proposed to ensure that personal user data remains private and secure, even when insights are drawn from it for big data analytics. An evaluation of the approach shows that the proposed methods are highly promising for ensuring big data privacy and security.

References

Kaisler, S., Armour, F., Espinosa, A., & Money, W. (2013). Big Data: Issues and Challenges Moving Forward. In 46th Hawaii International Conference on System Sciences (pp. 995-1003). Hawaii: IEEE Computer Society.

Kache, F. (Ed.). (2015). Dealing with digital information richness in supply chain management: A review and a Big Data analytics approach. Kassel: Univ.-Press.

Kerr, I., & Earle, J. (2016, August 10). Prediction, Preemption, Presumption | Stanford Law Review. Retrieved from https://www.stanfordlawreview.org/online/privacy-and-big-data-prediction-preemption-presumption/

Xu, L., Jiang, C., Wang, J., Yuan, J., & Ren, Y. (2014). Information Security in Big Data: Privacy and Data Mining. IEEE Access, 2, 1149-1176. https://dx.doi.org/10.1109/access.2014.2362522

Le VPN. (2017, October 10). Why Do Companies Collect Big Data and Store Personal Data? | Le VPN. Retrieved from https://www.le-vpn.com/why-companies-collect-big-data/

Lu, R., Zhu, H., Liu, X., Liu, J. K., & Shao, J. (2014). Toward efficient and privacy-preserving computing in big data era. IEEE Network, 28(4), 46-50. doi:10.1109/mnet.2014.6863131

Martin, K. E. (2015). Ethical Issues in Big Data Industry. MIS Quarterly Executive, 4(2), 67-85. Retrieved from https://www.researchgate.net/publication/273772472_Ethical_Issues_in_Big_Data_Industry

Moura, J., & Serro, C. (2015). Security and Privacy Issues of Big Data. Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, 3(1), 20-52. https://dx.doi.org/10.4018/978-1-4666-8505-5.ch002

Narayanan, A., Huey, J., & Felten, E. (2016). A Precautionary Approach to Big Data Privacy. Data Protection on the Move, 24, 357-385. https://dx.doi.org/10.1007/978-94-017-7376-8_13

Tene, O., & Polonetsky, J. (2013). Big Data for All: Privacy and User Control in the Age of Analytics. Northwestern Journal of Technology and Intellectual Property, 11(5).

Sagiroglu, S., & Sinanc, D. (2013, May 1). Big data: A review. In 2013 International Conference on Collaboration Technologies and Systems (CTS 2013) (pp. 42-47). Ankara; Hawaii: IEEE Computer Society.

Terzi, D., Terzi, R., & Sagiroglu, S. (2015). A Survey on Security and Privacy Issues in Big Data. In The 10th International Conference for Internet Technology and Secured Transactions (pp. 202-206). London: International Conference for Internet Technology and Secured Transactions.

The Economic Times. (2018, April 11). Mark Zuckerberg apologises to Congress over massive Facebook breach. Retrieved from https://economictimes.indiatimes.com/tech/internet/mark-zuckerberg-apologises-to-congress-over-massive-facebook-breach/articleshow/63704093.cms