One of the most exciting developments in safety is the application of data science to occupational health and safety (OHS) information. Data analytics has transformed the financial, retail and insurance sectors, providing unparalleled insights into customer choices and behaviour. Advances in data analytics, along with a major increase in the availability of OHS data, will change how practitioners use data to understand and manage safety performance.
Rich in information, poor in insight
The profession's use of data has historically been dominated by injury and ill health data, primarily because this information has been readily available. However, even the best analysis of incident data tells only half the story, because it ignores potentially influential factors such as changes to work type, hazards or exposure periods. Many of these factors are not OHS data at all, and much of the relevant information sits in other business systems or external sources, often in different formats, which makes it very difficult to access. Advances in data science enable us to access and mine these different sources at speeds and with methods not previously possible.
Improved analytical techniques enable other variables such as productivity, weather or scheduling to be analysed alongside OHS data to identify system-related factors that could be influencing performance. Data science also enables us to identify and evaluate the effectiveness of different interventions on OHS outcomes. This will allow a more systemic perspective to be taken and more evidence-based interventions to be made.
From words to meaning
One of the techniques set to transform OHS information is natural language processing (NLP), which transforms textual information into normalized, structured data that can be interpreted and analysed. OHS has many different types of data but much of it is textual, such as an individual's description of what happened. Conventional analytical techniques require information to be gathered in a structured way using pre-defined categories, such as the type of incident or injury location. The most valuable information, however, is often in the free text, and analysing this across thousands of reports has so far been impractical. NLP will enable this previously siloed information to be studied, allowing us to "read between the lines". NLP also enables voice recognition to be used to gather information, making it easier for people to report and engage with OHS.
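To make the idea of turning free text into structured data concrete, here is a deliberately minimal sketch in Python. It uses hand-written keyword patterns rather than a trained language model, so it only illustrates the input and output of the step, not how any production NLP tool works. The category names, keyword lists and the `structure_report` function are all hypothetical, invented for this example.

```python
import re

# Hypothetical keyword patterns, invented for illustration only.
# A real NLP system would use a trained language model instead.
HAZARD_PATTERNS = {
    "slip_trip_fall": r"\b(slip|slipped|trip|tripped|fell|fall)\b",
    "manual_handling": r"\b(lift|lifted|lifting|carry|carried|carrying)\b",
    "vehicle": r"\b(forklift|truck|van|vehicle|reversing)\b",
}
BODY_PARTS = ["back", "hand", "wrist", "ankle", "knee", "head", "shoulder"]


def structure_report(text: str) -> dict:
    """Turn a free-text incident description into a structured record."""
    lowered = text.lower()
    # Keep every hazard category whose pattern appears in the description.
    hazards = [name for name, pattern in HAZARD_PATTERNS.items()
               if re.search(pattern, lowered)]
    # Keep every body part mentioned as a whole word.
    body_parts = [part for part in BODY_PARTS
                  if re.search(rf"\b{part}\b", lowered)]
    return {"hazards": hazards, "body_parts": body_parts}


report = ("Operator slipped on a wet floor while carrying a box "
          "and twisted her ankle.")
print(structure_report(report))
```

Once every report has been reduced to a record like this, thousands of descriptions can be counted, filtered and compared in the same way as data captured through pre-defined categories.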
The value of analytics
Advances have also been made in analytical techniques that will improve how we evaluate OHS data. There are four different types of data analytics, each progressively deriving more value. Greater insight generally requires larger amounts of data, time and specialist skills.
Descriptive analytics describes what happened and is the most basic type, one that has long been used in OHS. It uses historic data to identify patterns and enable interpretation of past events. It is typically conducted with single data sources and provides a simple summary of past events using performance trends, such as changes in the number or type of accidents. While this addresses basic questions including what happened, how often and where, it tells us very little about why.
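As a small illustration of descriptive analytics, the sketch below summarises a single data source by counting incidents per month and per type. The records are made up for the example and are not real data.

```python
from collections import Counter

# Illustrative (month, incident type) records; not real data.
incidents = [
    ("2023-01", "slip"), ("2023-01", "manual handling"),
    ("2023-02", "slip"), ("2023-02", "slip"),
    ("2023-03", "vehicle"),
]

# How often did incidents happen, and of what type?
by_month = Counter(month for month, _ in incidents)
by_type = Counter(kind for _, kind in incidents)

for month in sorted(by_month):
    print(month, by_month[month])
print("Most common type:", by_type.most_common(1))
```

The output answers what happened and how often, but nothing in it explains why slips dominate, which is the limitation described above.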
Diagnostic analytics explains why something happened by examining relationships between two or more data sets. Correlating OHS information with operational data, such as vehicle telemetry, working hours or HR demographics, can help identify patterns and reveal factors that are influencing performance. It will also enable more evidence-based interventions to be taken, reducing the risk of an investment or change failing. However, there is a risk of apophenia, the tendency to perceive meaningful patterns in unrelated data, which leads to incorrect conclusions being drawn from the results. This matters because data analytics does not prove causation; it merely shows a pattern that requires interpretation through professional experience. LR SafetyScanner™ is one tool built for this kind of analysis, helping its users identify the direct causes hiding within the text, at scale. Developed by LR's digital innovation practice, SafetyScanner uses Artificial Intelligence and NLP to quickly and easily collect raw accident description data from multiple formats, transforming it into meaningful insights.
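The correlation step at the heart of diagnostic analytics can be sketched in a few lines of Python. The example below computes a Pearson correlation coefficient between weekly overtime hours and incident counts. The figures are invented for illustration, the `pearson` helper is our own, and this is a generic statistical sketch rather than a description of how SafetyScanner or any other product works.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Illustrative weekly figures; not real data.
overtime_hours = [10, 25, 40, 55, 70, 85]
incident_counts = [1, 2, 2, 4, 5, 6]

r = pearson(overtime_hours, incident_counts)
print(f"Correlation between overtime and incidents: r = {r:.2f}")
```

A value of r close to 1 shows that the two series rise together, but it does not show that overtime causes incidents. A third factor, such as seasonal workload, could drive both, which is why the result still needs professional interpretation.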
Predictive analytics tells us what is likely to happen based on current events. This type of analytics enables the prediction of an event, such as an increase in falls if cleaning is reduced, or the estimation of when an event is likely to happen, such as an increase in road accidents when snow is forecast. Predictive analytics typically involves analysing two co-dependent variables using data mining and statistical techniques to produce a mathematical model that forecasts the probabilities. LR's SafetyScanner is now starting to produce predictive insight, as it is amassing enough data to identify and project patterns.
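As a rough sketch of what a mathematical model built from two related variables can look like, the example below fits a straight line by ordinary least squares to invented figures for snowfall and road incidents, then uses it to forecast. The data, the `fit_line` and `predict` helpers and the choice of a simple linear model are all assumptions made for illustration. Real predictive models are usually more sophisticated, and this is not a description of SafetyScanner's method.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Forecast y for a new value of x using the fitted line."""
    return slope * x + intercept


# Illustrative daily figures; not real data.
snowfall_cm = [0, 2, 5, 8, 10]
road_incidents = [1, 2, 3, 5, 6]

slope, intercept = fit_line(snowfall_cm, road_incidents)
forecast = predict(6, slope, intercept)
print(f"Expected incidents with 6 cm of snow forecast: {forecast:.1f}")
```

A model like this is only as good as the data behind it: with five data points the forecast is a rough indication at best, which is why predictive insight depends on amassing enough data first.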
With performance gains becoming increasingly difficult and the rates of serious injuries and fatalities plateauing, data science provides new insights to address these challenges. By applying the science of analytics to their data, practitioners can understand more about the performance of their OHS programme and more accurately predict the outcomes of decisions they make.
Learn more about LR's innovation practice
LR helps organizations in asset-intensive sectors to harness digital technologies to improve safety, gain greater operational insights and efficiencies, and solve pressing challenges. To learn more, visit our innovation section or get in touch.