Use Cases for Cloud-Based OT Data Management and Systems Integration
The data lake concept emerged around 2015, as IT and operations sought to aggregate unstructured data in its native format for analytics and other data science purposes. Data lakes are non-governed cloud sandboxes, free from the data cleansing rigor required of classic data warehouses (e.g., historians). The data lake environment offers data scientists and analysts a container to load raw data, in its native format, from multiple systems so that they can aggregate and store data, develop data mining parameters, and deploy analytics rapidly.
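To make "native format" concrete, the following is a minimal sketch of landing a raw payload in a lake without cleansing or schema enforcement. It assumes a plain file system stands in for cloud object storage; the function name `land_raw`, the folder layout, and the hash-based file name are illustrative assumptions, not part of any specific product.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def land_raw(lake_root: Path, source: str, payload: bytes, ext: str,
             now: Optional[datetime] = None) -> Path:
    """Write a payload unchanged under <root>/<source>/<YYYY>/<MM>/<DD>/."""
    ts = now or datetime.now(timezone.utc)
    folder = lake_root / source / f"{ts:%Y}" / f"{ts:%m}" / f"{ts:%d}"
    folder.mkdir(parents=True, exist_ok=True)
    # Name the file by content hash so re-landing the same bytes is idempotent.
    digest = hashlib.sha256(payload).hexdigest()[:16]
    target = folder / f"{digest}.{ext}"
    # The bytes are stored as received: no parsing, cleansing, or schema checks.
    target.write_bytes(payload)
    return target
```

Partitioning by source system and date is only one possible convention; the point is that interpretation of the data is deferred to the analytics step rather than enforced at load time.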
Many customers are already applying data lake capabilities to several use cases:
- A corporate initiative forces the local site to move decades of historian data to the cloud
- Data scientists, either locally or at Corporate, want consolidated OT data for operational analytics
- Corporate tasks IT with integrating volumes of industrial data on the cloud for reporting purposes
- Remote sites need the ability to move multiple OT data streams through one data connection
- Corporate reliability management wants to monitor asset conditions enterprise-wide
- Global IT wants to Store and Forward (SaF) data from enterprise OT systems if connections are lost
Data Lakes, Data Warehouses & OT Data Defined
Data lakes store unstructured data in its native form from multiple sources (e.g., historians, asset monitoring, and process data); a data warehouse stores well-defined, structured information (e.g., ERP, CRM, and MRP data). Both are cloud solutions. Below are truncated definitions from Gartner:
- Data lake – collects unrefined data (that is, data in its native form, with limited transformation and quality assurance) and events captured from a diverse array of source systems.
- Data warehouse – houses well-known and structured data; supports predefined and repeatable analytics needs that can be scaled across many users.
- Operational technology (OT) – hardware and software that detects or causes a change by directly monitoring and/or controlling industrial equipment, assets, processes, and events. Examples of OT systems and data sources include:
- Programmable logic controllers (PLCs)
- Distributed control systems (DCS)
- Safety instrumented systems (SIS)
- Computerized maintenance management systems (CMMS)
- Operational/plant historians (e.g., OSI PI)
- Manufacturing execution systems (MES)
- Asset management systems (AMS)
- Continuous emissions monitoring systems (CEMS)
- Wired/wireless sensor data with software/online condition monitoring (IIoT)
Figure 1 – OT Users Often Waste Time Accessing Stranded Data Across Multiple Systems and Silos
IT/OT Data Management Costs and Interoperability Challenges
The depth and breadth of OT data found in most plants pose challenges to organizations seeking IT/OT integration. OT data is diverse and lives in many different databases. Dedicated point-to-point interfaces between user applications create costly, complex architectures with dozens to hundreds of interfaces across the enterprise.
Figure 2 – Costly Interoperability Paths Are Required to Integrate OT Point Solutions to the Cloud
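The growth from dozens to hundreds of interfaces can be illustrated with simple arithmetic. The sketch below compares a worst case, in which every system exchanges data with every other system, against a single shared platform where each system needs only one connection. The full-mesh assumption and the function names are illustrative; real plants typically fall somewhere between the two.

```python
def point_to_point_links(n: int) -> int:
    """Interfaces needed if each of n systems connects directly to every other."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Interfaces needed if each of n systems connects once to a shared platform."""
    return n


for n in (5, 10, 20):
    print(f"{n} systems: {point_to_point_links(n)} point-to-point vs {hub_links(n)} via hub")
# 5 systems: 10 point-to-point vs 5 via hub
# 10 systems: 45 point-to-point vs 10 via hub
# 20 systems: 190 point-to-point vs 20 via hub
```

Point-to-point interfaces grow roughly with the square of the number of systems, while a hub grows linearly, which is why interface cost escalates quickly as more OT point solutions are integrated.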
Connecting to the data that resides in OT systems poses the most difficult challenge. Most customers want to move data to the cloud at enterprise scale but do not know how to connect to these systems without disrupting them. Cloud connectivity for point solutions does not scale effectively, gets expensive quickly, and is often difficult to manage from a central location. Some of the questions we encounter include:
- How do I tunnel DCS data across firewalls (L2-L5 security) and land it in an enterprise cloud?
- How do I ensure I don’t break or disrupt any of my OT systems?
- How do I access stranded data and combine multiple OT data streams from a single site?
- How do I transfer the data through a single TCP connection across multiple firewalls?
- When network connections are lost, how do I Store and Forward (SaF) data from my OT systems?
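The last three questions (combining streams, using a single connection, and Store and Forward) can be sketched together. The example below is a minimal illustration, not a description of any vendor's implementation: it assumes records from several OT streams are tagged with a stream name, buffered durably in a local SQLite table, and forwarded in order through one injected `send` function that stands in for the single outbound connection. The class name `StoreAndForward`, the `outbox` table, and the JSON framing are all hypothetical.

```python
import json
import sqlite3
from typing import Callable


class StoreAndForward:
    """Buffer records in SQLite and forward them in order when the link is up."""

    def __init__(self, db_path: str, send: Callable[[bytes], None]):
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, frame BLOB NOT NULL)"
        )
        self._db.commit()
        self._send = send  # stands in for the single outbound connection

    def publish(self, stream: str, record: dict) -> None:
        """Store first: tag the record with its stream and persist it locally."""
        frame = json.dumps({"stream": stream, "record": record}).encode("utf-8")
        self._db.execute("INSERT INTO outbox (frame) VALUES (?)", (frame,))
        self._db.commit()

    def flush(self) -> int:
        """Forward buffered frames oldest first; stop at the first send failure."""
        sent = 0
        rows = self._db.execute("SELECT id, frame FROM outbox ORDER BY id").fetchall()
        for row_id, frame in rows:
            try:
                self._send(frame)
            except OSError:
                break  # link is down; keep this and later frames for the next attempt
            self._db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            self._db.commit()
            sent += 1
        return sent

    def pending(self) -> int:
        """Number of frames still waiting to be forwarded."""
        return self._db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
```

A frame is deleted only after `send` returns, so this design gives at-least-once delivery: if the process stops between sending and deleting, the frame is sent again on the next flush, and the receiving side would need to tolerate duplicates.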
Consider a Data Lake Solution
Data lakes offer a robust, cost-effective platform that is simple to deploy and acts as a single source of truth for integrating, connecting, and analyzing OT data. If your IT/OT teams are considering moving OT data to the cloud, be sure that your solution encompasses:
- A single platform designed for IT/OT integration that includes interoperability software components
- Future-proof architecture and database that can sustain your needs for decades
- Compatibility with current cloud providers (e.g., Microsoft Azure, AWS, or Google Cloud)
- Adaptive, flexible industrial data integration that will not cause software or production outages
- Built-in systems interoperability components that reduce or eliminate additional licensing costs
Figure 3 – Emerson’s Optics Data Lake Solution Integrates and Structures OT Data on a Single Platform
Emerson’s data lake solution closes the IT/OT data management gap by offering an OT data-centric, single platform compatible with enterprise cloud providers. The solution includes the interoperability software IT and OT leaders will need to achieve their integration requirements. Finally, sites can implement Emerson’s data lake solution without disrupting their OT systems and mission-critical production processes.