We live in an age where cheaper storage and greater processing power have made almost every business function data-driven, making data the most valuable asset of any organization. With data being generated, transmitted, and stored at an exponential rate, managing it is becoming more of a challenge as it ends up confined to silos spread across disparate sources. Organizations increasingly find it difficult to collect and integrate information from this heterogeneous data to make real-time business decisions. Data Virtualization is a radical approach to this data management challenge: it integrates disparate sources as abstractions in a logical layer to deliver real-time information for business decision-making and analytics.
Data virtualization (DV) is a data management method that allows applications to retrieve and manipulate data without needing low-level details about it, such as its physical location, data format, access methods, or access language.
The data virtualization layer forms a logical data layer that integrates all enterprise data across disparate systems and formats, applies centralized security and governance to the unified data, and delivers it to business users in real time.
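To make the idea of hiding location and format concrete, here is a minimal Python sketch. It is not Denodo code: the source contents, the `CATALOG` mapping, and the `query` function are all hypothetical names invented for illustration. It only shows the principle that a consumer asks for a logical name and never sees how or where the data is stored.

```python
import csv
import io
import json

# Hypothetical raw sources in two different formats (stand-ins for real systems).
CSV_SOURCE = "id,name\n1,Ada\n2,Grace\n"
JSON_SOURCE = '[{"id": 3, "name": "Edsger"}]'


def read_csv_source():
    """Adapter: parse CSV text into a list of dict rows."""
    reader = csv.DictReader(io.StringIO(CSV_SOURCE))
    return [{"id": int(row["id"]), "name": row["name"]} for row in reader]


def read_json_source():
    """Adapter: parse JSON text into a list of dict rows."""
    return json.loads(JSON_SOURCE)


# The "virtual layer": a catalog that maps a logical view name to an adapter.
CATALOG = {
    "customers_eu": read_csv_source,
    "customers_us": read_json_source,
}


def query(view_name):
    """Consumers pass a logical name only; format and location stay hidden."""
    return CATALOG[view_name]()


eu_rows = query("customers_eu")
us_rows = query("customers_us")
```

Both calls return rows in the same shape, even though one source is CSV and the other is JSON. A real platform adds query pushdown, security, and optimization on top of this basic indirection.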
Data Virtualization’s capabilities allow an organization's IT and business teams to innovate. Some of the benefits of Data Virtualization include the following:
Various data management products provide Data Virtualization capabilities. For the purpose of this blog, we will use the Denodo Platform, which has been identified as a leader in the Gartner Magic Quadrant for data management tools three years in a row.
Denodo is a data virtualization platform that has been helping customers with data management challenges for close to two decades, making it a mature product for a method that has only recently seen wider adoption. It provides features such as a robust query optimizer (rule-based and cost-based), advanced caching, and self-service data discovery that help serve data to consumers in real time and in a wide variety of formats.
The typical data ecosystem around the Denodo Platform can be divided into three groups, or layers. Although this may look like a simple product grouping, it also groups the significant features that the Platform addresses. The layers are Connect, Combine, and Consume.
Let’s look at each of these layers a little more closely.
The Connect layer consists of all the data systems that are to be abstracted into the virtual layer. These range from traditional data warehouses and application databases to distributed systems, SaaS solutions, and more. Beyond different storage solutions, this layer also deals with different types of data structures, from standard tuples and columnar structures to semi-structured formats like JSON and XML. The Connect layer is also location-agnostic in principle: the source can be on-premises, in the cloud, or hybrid. With close to 150 out-of-the-box adapters and Java-based extensible sources, secured by industry-standard measures like Kerberos, OAuth, and TLS, virtually any data store can be connected securely.
Once the data structures in this wide variety of data stores are abstracted as virtual models called “Views”, working across them becomes as easy as operating on standard relational data structures such as tables. This is the Combine layer. The standard operating model for all objects in this layer, and the relative ease of operating on them, is what makes the Platform powerful, dynamic, and agile. Apart from the integration features, the layer also comes with a suite of tools to manage security, governance, self-service, and AI/ML recommendations.
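The following Python sketch illustrates what "operating relationally" on non-relational data can look like. It uses the standard-library `sqlite3` module as a stand-in engine, and the table names, view name, and sample data are hypothetical. Note one important difference: this sketch copies the data into an in-memory database, whereas a data virtualization platform would leave the data in its source systems.

```python
import json
import sqlite3

# Hypothetical inputs: a JSON payload from a CRM and tuples from an ERP.
crm_json = '[{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]'
erp_rows = [(1, 250.0), (1, 100.0), (2, 75.5)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crm_customer (customer_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE erp_order (customer_id INTEGER, amount REAL)")

# Expose both sources through the same relational shape.
conn.executemany(
    "INSERT INTO crm_customer VALUES (:customer_id, :name)", json.loads(crm_json)
)
conn.executemany("INSERT INTO erp_order VALUES (?, ?)", erp_rows)

# A combined view: consumers query it like any other table.
conn.execute(
    """
    CREATE VIEW customer_totals AS
    SELECT c.customer_id, c.name, SUM(o.amount) AS total
    FROM crm_customer AS c
    JOIN erp_order AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    """
)

rows = conn.execute(
    "SELECT customer_id, name, total FROM customer_totals ORDER BY customer_id"
).fetchall()
```

Once both sources share a relational shape, a join across them is a single SQL statement, and the consumer does not need to know that one side started as JSON.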
The integrated models are immediately available to consumers through a wide variety of access methods. The relational modeling that the Connect and Combine layers provide for non-relational sources allows the models to be accessed as relational sources using “VQL”, an offshoot of SQL, over both JDBC and ODBC. Apart from these, the Platform also lets you consume the data through popular formats and access methods like REST, SOAP, GraphQL, OData, and GeoJSON.
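As a rough illustration of serving one model in several formats, the Python sketch below renders the same rows as JSON and as XML using only the standard library. The column names, sample rows, and the `as_json` and `as_xml` helpers are hypothetical, and this is not how the Denodo Platform publishes services. It only shows that the delivery format can be decided at the edge while the underlying model stays the same.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical result of querying an integrated view.
COLUMNS = ("customer_id", "name")
ROWS = [(1, "Acme"), (2, "Globex")]


def as_json(columns, rows):
    """Render rows as a JSON array of objects keyed by column name."""
    return json.dumps([dict(zip(columns, row)) for row in rows])


def as_xml(columns, rows):
    """Render rows as a simple <rows><row>...</row></rows> XML document."""
    root = ET.Element("rows")
    for row in rows:
        row_el = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_el, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


json_payload = as_json(COLUMNS, ROWS)
xml_payload = as_xml(COLUMNS, ROWS)
```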
Twenty20 Systems and Denodo have partnered to combine expert consulting services with the industry’s leading data virtualization platform, providing organizations with impactful, data-driven insights that accelerate speed to market.
Working together, Twenty20 and Denodo enable organizations to:
A modernized data architecture leads to immediate cost and time savings for businesses, as well as future value in not accumulating technical debt through unnecessary data replication over time.
Our combined solution with Denodo as the data virtualization platform helps customers realize accelerated ROI and address immediate speed to market requirements for their business.
Twenty20's MissionImpact offering, which delivers impactful, mission-critical outcomes to businesses in a matter of weeks, aligns well with the architecture and capabilities of Denodo's data virtualization platform. It eliminates the need for data replication and the associated increase in data infrastructure spending while keeping the focus on driving speed and impact.
One of the common challenges in the data management industry is the need to integrate data sources without replicating the data, and to make the data accessible to applications on demand, in the formats they expect. The Twenty20 Solution Blueprint addresses these challenges by strategically placing the Denodo Platform's semantic-layer abstraction, multi-protocol access points, and powerful optimizer, as visualized in the architecture below.
The solution is organized into three logical functional areas:
Integration Services: This logical layer deals with the abstractions, or views, that connect and combine the different data sources from systems of record such as data warehouses, data lakes, and Excel files. It builds a generic, holistic view of the data per functional area. A Customer view, for example, collects, cleans, and combines data from more than one system, such as Salesforce and an ERP.
Engagement Services: This logical layer deals with direct abstractions or further customizations of generic entities like Customer to serve data for a specific requirement. For example, this could be a request to get a customer's orders for a specific region. The result can be delivered as relational tuples by defining a “Derived View” in the Denodo Platform, or as JSON/XML on demand by “Publishing” it as a REST or SOAP service. Irrespective of the mode of delivery, the purpose of this layer is to serve specific requests.
Real-Time Data Consumption Services: This logical layer handles the configuration of consumers and publishers for real-time, event-based data flows such as Kafka. It integrates the incoming real-time data with the Integration or Engagement Services and pushes the result to consumers of the same nature, that is, event-based listeners like JMS queues or Kafka topics.
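The three areas above can be sketched end to end in a few lines of Python. Everything here is a hypothetical stand-in: the sample records, the function names, and the use of `queue.Queue` in place of Kafka topics or JMS queues. It is meant only to show how the layers build on one another, not how they are configured in the Denodo Platform.

```python
import queue

# Hypothetical source records; the system names echo the examples in the text.
SALESFORCE = [{"email": " Ann@Example.com ", "name": "Ann Lee"}]
ERP = [
    {"email": "ann@example.com", "region": "EMEA", "order_id": 10},
    {"email": "ann@example.com", "region": "APAC", "order_id": 11},
]


def customer_view():
    """Integration Services: clean and combine sources into one Customer entity."""
    names = {r["email"].strip().lower(): r["name"] for r in SALESFORCE}
    return [
        {
            "email": order["email"],
            "name": names.get(order["email"], ""),
            "region": order["region"],
            "order_id": order["order_id"],
        }
        for order in ERP
    ]


def orders_for_region(region):
    """Engagement Services: a derived view narrowed to one specific request."""
    return [row for row in customer_view() if row["region"] == region]


def enrich_events(inbound, outbound):
    """Real-time services: join each incoming event to the view, then publish."""
    by_order = {row["order_id"]: row for row in customer_view()}
    while not inbound.empty():
        event = inbound.get()
        outbound.put({**event, **by_order.get(event["order_id"], {})})


# In-process queues stand in for event topics in this sketch.
inbound, outbound = queue.Queue(), queue.Queue()
inbound.put({"order_id": 11, "status": "shipped"})
enrich_events(inbound, outbound)
enriched = outbound.get()
emea_orders = orders_for_region("EMEA")
```

The derived view and the event enrichment both reuse `customer_view()` rather than going back to the raw sources, which mirrors how the Engagement and Real-Time layers sit on top of the Integration layer.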
Applications in the User Experience area can consume data from one or more of these layers, from relational access for a dashboard to a mobile app that depends on a GraphQL query to present data to end users.
To better understand how Data Virtualization with Denodo works, consider a scenario in which our global health insurance client wanted to improve its data landscape while minimizing disruption to its core functions, make enterprise-focused enhancements to support these improvements, and avoid the problems that arise with an ill-equipped data landscape.
The key business need was to build a single, unified Common Data Warehouse consisting of all relevant data extracted from the two existing data warehouses that came from the client's two merging entities.
Further, as part of data cleansing, the company took a hard look at the state of its data landscape and decided to consolidate patient and client data into a single view.
Finally, the company needed to make these improvements while also reducing time to market for delivering analytical views to business users, minimizing disruption to core functions, and leveraging existing enhancement capabilities and accelerators.
To improve the state of their data landscape, we built virtual views of both of the company's legacy data warehouses and combined them into a unified view. This view was then cached in Denodo to improve performance. The unstructured nature of the company’s data meant that it needed to be cleaned up. For this, we built a virtual view in Denodo using Extract, Transform, and Load (ETL) concepts.
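The pattern described here, a unified view over two warehouses with a cache in front of it, can be sketched conceptually in Python. This is not the Denodo cache, which is configured inside the platform. The `CachedView` class, the warehouse functions, and the five-minute time-to-live are all assumptions made for illustration.

```python
import time


def warehouse_a():
    """Hypothetical rows from the first legacy warehouse."""
    return [{"member_id": "A-1", "plan": "gold"}]


def warehouse_b():
    """Hypothetical rows from the second legacy warehouse."""
    return [{"member_id": "B-7", "plan": "silver"}]


def unified_view():
    """Combine both warehouses into one logical result (a simple union)."""
    return warehouse_a() + warehouse_b()


class CachedView:
    """Serve a view from memory until the cached copy is older than ttl_seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._rows = None
        self._loaded_at = 0.0
        self.loads = 0  # how many times the sources were actually queried

    def rows(self):
        now = self._clock()
        if self._rows is None or now - self._loaded_at >= self._ttl:
            self._rows = self._loader()
            self._loaded_at = now
            self.loads += 1
        return self._rows


view = CachedView(unified_view, ttl_seconds=300)
first = view.rows()
second = view.rows()  # served from the cache; the warehouses are not queried again
```

The trade-off is the usual one for caching: repeated reads avoid hitting the legacy warehouses, at the cost of results that can be up to one time-to-live interval out of date.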
To improve efficiency and optimize existing development workflows, we built a bespoke DevOps automation process. To address security concerns, we set up security and governance based on Lightweight Directory Access Protocol (LDAP) and Secure Sockets Layer (SSL). To further improve their development workflows, we also helped set up a Continuous Integration/Continuous Deployment (CI/CD) process in line with best practices.
We hope this blog provided an in-depth explanation of how Data Virtualization with Denodo works. If you have any questions or feedback related to this article, please feel free to reach out to us using the “Contact Us” page.
SP is our passionate data architect based in Bangalore. He's a true tech enthusiast who leverages his extensive knowledge and experience to design and implement cutting-edge data solutions. He is also a people person who brings a patient and detail-oriented approach to every project. In his free time, he can be found reading his favorite DC Comics. With his expertise in designing cutting-edge data solutions, SP is an invaluable member of our team.
Sowbhagya is one of the software engineers based out of our Bangalore office. An artist in her free time, Sowbhagya loves to draw anime characters. She can often be found spending quality time with her beloved dog. In addition to caring for her dog, she also prioritizes her own health and wellness by working out regularly, whether it's hitting the gym or working out at home. She is also passionate about cooking and enjoys experimenting with new recipes in the kitchen.