In this solution accelerator, we demonstrate how to use the Databricks Lakehouse Platform to better understand and quantify the holistic ESG impact of any investment in a company or business in order to generate alpha, mitigate reputation risk, and maintain the trust of both clients and shareholders.
ESG is a data and AI challenge. The world of ESG is highly unstructured by nature. If we look at the 40 most commonly disclosed policies, only 10 are metrics, and only 10 are hard numbers. The rest are purely policies, initiatives, mainly texts coming from different systems.
So how do we apply AI to quantify the unquantifiable and compare organizations in a much more data-driven way than a subjective score? And that feeds into many different use cases down the line from responsible investing, to supply chain resilience, carbon footprint reduction, and even reputational risk.
In the first series of notebooks, we show you how to extract those ESG initiatives from unstructured PDF documents using AI, and then bridge the gap between what a company says and what a company does using alternative data. We apply that model to news analytics and social media to build a holistic view of an ESG practice that you can trust and act upon.
This first notebook shows practitioners how to programmatically extract text from PDF documents, isolate each and every sentence and initiative, and apply natural language processing and AI to understand what those statements are about. Using an ESG-specific vocabulary, with themes such as diversity and inclusion, code of conduct, supporting communities, and green energy, the analysis learns these themes from the data. This lets us move away from lengthy, verbose policies and summarize complex PDF documents into 24 machine-learned policies that we can compare. We can also compare organizations side by side based on how much they disclose in each of those categories.
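As a rough illustration of the sentence-level tagging idea, here is a minimal sketch that assumes the text has already been extracted from the PDF. It uses a small, hypothetical keyword vocabulary (`THEME_KEYWORDS`) as a stand-in for the learned themes; the accelerator itself learns its themes with machine learning, so this keyword lookup is a simplification and not the method used in the notebooks.

```python
import re
from collections import Counter

# Hypothetical keyword vocabulary, for illustration only.
# The accelerator learns its themes from data rather than from a fixed list.
THEME_KEYWORDS = {
    "diversity and inclusion": {"diversity", "inclusion", "inclusive"},
    "code of conduct": {"conduct", "ethics", "compliance"},
    "supporting communities": {"community", "communities", "volunteering"},
    "green energy": {"renewable", "solar", "wind", "emissions"},
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tag_sentence(sentence):
    """Return the theme with the most keyword hits, or None if nothing matches."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    hits = {theme: len(words & kws) for theme, kws in THEME_KEYWORDS.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None

def theme_counts(text):
    """Count how many sentences fall under each theme."""
    tags = (tag_sentence(s) for s in split_sentences(text))
    return Counter(t for t in tags if t is not None)

# Made-up report text, not taken from any real disclosure.
report = (
    "We invested in solar and wind projects. "
    "Our code of conduct training reached all staff. "
    "Revenue grew this year."
)
print(theme_counts(report))
```

Sentences that match no theme, such as the revenue statement above, are simply left out of the counts.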
If we take an example for a specific industry, or across your investments, your competitors, or your suppliers, we can look at how much Company A says it values employees compared to Company B, how much more Company C invested in renewable energy compared to Company D, and so on. This analytical framework helps us better understand these organizations and how they differ in their ESG activities.
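A side-by-side comparison of this kind can be sketched as follows, assuming each company has already been reduced to a count of statements per theme. The function names, company names, and numbers below are made up for illustration.

```python
def disclosure_shares(counts):
    """Turn raw statement counts per theme into shares of the whole report."""
    total = sum(counts.values())
    return {theme: n / total for theme, n in counts.items()} if total else {}

def compare(a, b):
    """Per-theme difference in disclosure share (positive means `a` says more)."""
    shares_a, shares_b = disclosure_shares(a), disclosure_shares(b)
    themes = sorted(set(a) | set(b))
    return {t: shares_a.get(t, 0.0) - shares_b.get(t, 0.0) for t in themes}

# Made-up counts of statements per theme for two hypothetical companies.
company_a = {"valuing employees": 30, "renewable energy": 10}
company_b = {"valuing employees": 10, "renewable energy": 30}

diff = compare(company_a, company_b)
for theme, delta in diff.items():
    print(f"{theme}: {delta:+.2f}")
```

Normalizing to shares rather than raw counts keeps a long report from looking stronger on every theme just because it contains more sentences.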
By applying this machine learning model to news analytics, we show you how to bring in 100 million news articles in real time to understand not only what a company is saying, but also what a company does in reality. We also find out how the media and social networks received it.
With news and media analysis as a proxy, we show you how to extract each and every article across the E, the S, and the G, and bring that real-time view of ESG instead of waiting for year-long CSR disclosures. This can be applied to every business, large or medium, in financial services or healthcare, and helps us understand the influence one business may have on another in a global market.
Questions arise: what about organizations that you do business with that may positively or negatively affect their ESG practice? Moreover, how can you act on those different insights using a market risk calculation, looking at this from a reputational risk or supply chain resilience perspective?
In this example, we show you how to create a simple synthetic portfolio and tie those ESG insights into the market risk framework. It is no longer only about how good a company looks or how much good a company does, but also about how performant this company may be. We have seen that companies that say, and actually do, a lot on ESG tend to have lower market volatility. We show, in the context of value at risk, that an all-synthetic portfolio is two times more volatile due to bad ESG practices.
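To make the value at risk idea concrete, here is a minimal sketch of a historical (empirical quantile) value at risk calculation. It is one common way to estimate value at risk and is not necessarily the approach used in the accelerator. The two return series are made-up numbers for illustration, not results from the notebooks.

```python
def historical_var(returns, confidence=0.95):
    """Historical value at risk: the loss at the lower (1 - confidence) tail
    of the observed return distribution, reported as a positive number."""
    if not returns:
        raise ValueError("returns must not be empty")
    ordered = sorted(returns)  # worst returns first
    # Position of the tail quantile; rounding guards against float noise.
    k = int(round((1 - confidence) * len(ordered), 9))
    k = min(k, len(ordered) - 1)
    return -ordered[k]

# Made-up daily returns for two hypothetical portfolios (20 days each).
portfolio_a = [0.004, -0.006, 0.002, -0.010, 0.007, 0.001, -0.003, 0.005,
               -0.008, 0.003, 0.006, -0.002, 0.000, -0.004, 0.009, -0.012,
               0.002, 0.004, -0.001, 0.003]
portfolio_b = [0.010, -0.015, 0.006, -0.022, 0.012, -0.004, -0.009, 0.014,
               -0.018, 0.005, 0.011, -0.007, 0.001, -0.012, 0.016, -0.030,
               0.003, 0.008, -0.002, 0.007]

var_a = historical_var(portfolio_a)
var_b = historical_var(portfolio_b)
print(f"95% VaR, portfolio A: {var_a:.3f}")
print(f"95% VaR, portfolio B: {var_b:.3f}")
```

Comparing the two figures for portfolios grouped by ESG score is one way to express the link between ESG practice and market volatility as a single number.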
We are therefore able to bridge the gap between the data science and engineering world and a BI and AI dashboard. We combine all those insights into one platform to understand how much a company says about E, S, and G versus how much a company does across each of those 24 machine-learned policies. This real-time view of an ESG practice informs us about news events that may positively or negatively affect the ESG of every single company and, in turn, the impact they may have on market performance.
Ready to get started with ESG analytics? Click on the link in the description below to go right to our full write-up for this solution and gain a deeper understanding of how the Databricks Lakehouse uniquely solves the challenges associated with going from batch to streaming and BI to AI.
Or visit the Databricks Solution Accelerator Hub to see all our available accelerators as well as keep up to date with new launches.