This post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be valid.
Blockchain technology represents a fundamental shift in how data can be shared openly and transparently. At the core of blockchain’s appeal is the assurance to all parties involved that the data they send and receive has not been changed or tampered with. While this may sound significant mainly at a technical level, there are high-level business ramifications that executives should not ignore, such as the ability to help prevent data leaks and ensure safe transfers of financial or medical data.
Blockchain technology allows data to be recorded in a manner that makes tampering evident and extremely difficult, which enhances both the security and the transparency of the data. Given the increasingly massive amount of data being processed today, adding this technology enables greater security and governance. By outlining the advantages of blockchain and how it can blend into an existing big data environment, we can see why blockchain is an important advance.
Transparency for All
While the technological details of how blockchain applications are implemented involve complex algorithms, the high-level concept is simple. If you are working with a set of bank transactions, for instance, and you want to store the data in a distributed, decentralized manner, blockchain makes it obvious if any data is changed or mutated.
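To make the concept concrete, here is a minimal Python sketch of the hash-chaining idea that underlies this kind of tamper evidence. It is an illustration only, not a production blockchain: there is no network, consensus, or digital signing, and the function names (block_hash, build_chain, first_invalid_block) and the sample bank transactions are hypothetical. Each block stores the SHA-256 hash of the block before it, so changing any earlier record breaks the links that follow.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, transaction, prev_hash):
    """Return the SHA-256 fingerprint of a block's contents."""
    payload = json.dumps(
        {"index": index, "transaction": transaction, "prev_hash": prev_hash},
        sort_keys=True,  # stable key order so the same data always hashes the same
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(transactions):
    """Link each transaction to the hash of the block before it."""
    chain = []
    prev_hash = GENESIS_HASH
    for index, tx in enumerate(transactions):
        tx_copy = dict(tx)  # keep the ledger independent of the input list
        current = block_hash(index, tx_copy, prev_hash)
        chain.append(
            {"index": index, "transaction": tx_copy,
             "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev_hash = GENESIS_HASH
    for block in chain:
        expected = block_hash(block["index"], block["transaction"], block["prev_hash"])
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return block["index"]
        prev_hash = block["hash"]
    return None


# Hypothetical bank transactions, used only for illustration.
transactions = [
    {"from": "alice", "to": "bob", "amount": 100},
    {"from": "bob", "to": "carol", "amount": 25},
    {"from": "carol", "to": "alice", "amount": 10},
]

ledger = build_chain(transactions)
print(first_invalid_block(ledger))  # None: the untouched ledger verifies

ledger[1]["transaction"]["amount"] = 2500  # someone quietly edits a record
print(first_invalid_block(ledger))  # 1: the altered block no longer matches its hash
```

Even if someone edits a record and recomputes that block’s own hash, the next block still points at the old hash, so the mismatch surfaces one block later. In a decentralized setting, many parties hold copies of the chain, which is what makes quietly rewriting all of it impractical.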
The decentralized nature of today’s data makes blockchain a complementary technology to help ensure security. It guarantees that it’s not possible to mutate data without anyone knowing about it, and this is vital in any industry that relies on the idea of a “single source of truth.” According to the CPA Journal, “the transactional data in blockchain could provide valid evidence showing any potential irregularities,” helping businesses combat financial reporting fraud such as the overstatement of revenue. This kind of transparency can have a direct impact on a business’s bottom line.
Blockchain Meets Big Data
Poorly managed data presents real problems for many enterprises: inaccurate hospital and patient data can lead to loss of life, and stolen data can mushroom into mass identity theft. A massive data breach can undermine public confidence in a company and have serious financial repercussions. These considerations make it essential that executives be on top of their companies’ security concerns.
A first step in modernizing data assets is often to implement a Hadoop-based big data platform. By implementing a big data system, companies gain access to powerful distributed computing and one or more scalable “data lakes” to store all their data. Blockchain adds a key component to this already transformative approach.
Blockchain ensures data objectivity, that is, a single source of truth. Blockchain also represents a security layer that helps ensure that data is encrypted in such a way that only the people you authorize can read it. It makes it next to impossible for people to corrupt or manipulate the data, or even gain wrongful access to it, because the system raises an instant red flag when a problem occurs and relies on cryptographic techniques to secure the data.
Blockchain is both reactive, alerting users to changes, and proactive, preventing the security threat. And even if the data is somehow breached, it still can’t be used. The effects have already been seen in the healthcare industry, where technologies using blockchain have provided the proper balance of security and governance for people’s health data.
Data Scientists Need Blockchain
Many industries have advanced analytics use cases. Insurance companies, for example, are consumers of large, highly personal data sets: they use advanced analytics to offer real-time personalized price points, and they need data sets that are both accurate and highly secure. Work with data this sensitive could benefit from the addition of blockchain technology.
Implementing blockchain can help increase the quality of the data that powers many of these predictive models, and in so doing can have a direct business impact. Compromised data leads to bad results, and blockchain helps businesses detect corrupt or compromised data. Advanced algorithms are often applied to data sets that suffer from data quality issues. Small data sets and messy data (old, inaccurate data, or data that has become corrupted) often prove to be among the largest limiting factors in gaining business advantages with data science.
In a large enterprise, the process often looks something like this: data is created in multiple distributed systems, then transformed and reingested into various downstream systems, and through these complex transformations it can become out of date or be modified. For the sake of transparency, safety, and accuracy, your business simply can’t afford not to look into adding blockchain to its processes.
Find out more about what implementing a modern data warehouse can do for you and your business.