A couple of weeks ago, dbt Labs made a big splash at their annual conference by announcing the brand new dbt Semantic Layer. This was a big deal, spawning excited tweets, in-depth thinkpieces, and celebration from partners like us.
The term “semantic layer” (also called a “metrics layer”) has been around for decades. dbt didn’t invent the concept, nor the phrase, though their version is definitely worth paying attention to.
“But Austin, what is a semantic layer then?” So glad you asked.
In this article, I’ll break down what a semantic layer is in simple terms and why you should care about dbt’s Semantic Layer.
What is a semantic layer, and where did it come from?
Semantic layer is a very literal term: it’s the “layer” in a data architecture that uses “semantics” (words) that the business user will actually understand. Sometimes it’s called the “business layer” or the “metrics layer”.
Instead of raw tables with column names like “A000_CUST_ID_PROD”, data teams build a semantic layer and rename that column “Customer”. Semantic layers help to hide complex code from business users. This code can get quite complex as data teams try to capture the business logic for key metrics, dimensions, and schemas.
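To make the renaming idea concrete, here is a minimal Python sketch of a semantic layer as a simple lookup from raw column names to business-friendly labels. Only “A000_CUST_ID_PROD” and “Customer” come from the example above; the second column, the sample row, and the function name are made up for illustration, and a real semantic layer does far more than rename columns.

```python
# Hypothetical mapping from raw warehouse column names to business-friendly labels.
# Only the first entry comes from the article; the second is invented for illustration.
SEMANTIC_NAMES = {
    "A000_CUST_ID_PROD": "Customer",
    "A001_ORD_AMT_USD": "Order Amount",
}


def to_business_view(raw_row: dict) -> dict:
    """Relabel a raw row using SEMANTIC_NAMES; unmapped columns keep their raw name."""
    return {SEMANTIC_NAMES.get(column, column): value for column, value in raw_row.items()}


raw_row = {"A000_CUST_ID_PROD": "ACME-42", "A001_ORD_AMT_USD": 129.5}
print(to_business_view(raw_row))
```

The business user only ever sees “Customer” and “Order Amount”, while the cryptic warehouse names stay hidden behind the mapping.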
So where did this idea come from? Back in the day (I’m talking about the ’90s and early 2000s), we had pretty basic data tech. It was very slow and very hard to use if you didn’t have a deep IT background.
Big companies like IBM, SAP, and Oracle built Business Intelligence (BI) tools like Cognos, Business Objects, and Oracle BI, which would take smaller chunks of data from a clunky data warehouse and let IT people build these semantic layers for business users. Essentially, they were more human-readable data layers for business users.
The problem with early semantic layers
This business-friendly layer sounds like a “nice to have” improvement, but it was really a necessity because trying to run even a basic report across an entire data warehouse could take hours or even days. (Yes, days.)
Enter the first problem: old-school semantic layers took wayyyyy too long to build, since people relied on IT to set up and modify them. To make things worse, they were cumbersome to maintain since business needs were always changing.
The business users’ solution… export to Excel!
Enter fancy new BI tools like Tableau, Qlik, and Power BI. The theory was that if we empower the business users to “self-serve” with low-code or no-code BI tools, the IT bottleneck will go away and analytics will officially be democratized! At least, that was the idea.
Enter the second problem: we abandoned the semantic layer concept for years, in favor of agility.
Unlike old IT tools, more personas could buy and use these new BI tools. Instead of 1 BI tool using 1 semantic layer, built by 1 team from 1 data warehouse, we had multiple BI tools, being used by all sorts of teams with no real semantic layer.
Just picture this scenario, which probably seems all too real to most data people. I bring my Tableau dashboard to a meeting, someone else brings their Excel workbook, and someone else brings a Power BI dashboard. We all then show different numbers for “total revenue last quarter”. Uh oh!
After years of alternately ignoring and chasing the self-service BI dream, this topic blew up in the data world again. (We even flagged this as one of the six big ideas from 2022 in our Future of the Modern Data Stack report.)
This started in January, when Base Case proposed “Headless Business Intelligence”, a new approach to solving problems with business metrics and terms. A couple of months later, Benn Stancil talked about the “missing metrics layer” in today’s data stack.
That’s when things really took off. Airbnb announced that it had been building a home-grown metrics platform called Minerva to solve this issue. Other prominent tech companies soon followed suit, including LinkedIn, Uber, and Spotify. Then dbt opened a PR hinting at a metrics or semantics layer, which included links to those foundational blogs by Benn and Base Case.
This was such a hot topic that one of our Great Data Debates was all about the metrics layer, with a fiery discussion between Drew Banin from dbt Labs and Nick Handel from Transform.
The result has been a big open question in the data and analytics world: how can we bring back all the great things that IT loved about semantic layers (consistency, clear governance, and trusted, reliable data) without compromising the agility that analysts and business users demand?
Now, less than two years after this debate kicked off, it seems that the future of the semantic layer has finally become a reality.
The dbt Semantic Layer
Enter dbt Labs and its new Semantic Layer!
The dbt Semantic Layer is the interface between your data and your analyses: a platform for compiling and accessing dbt assets in downstream tools.
Data practitioners can define metrics in their dbt projects, then data consumers can query consistently defined metrics in downstream tools.
Cameron Afzal, Product Manager for the dbt Semantic Layer
The core concept behind dbt’s Semantic Layer is: define things once, use them anywhere.
Why does that make people happy? This brings the concept of a semantic layer and its universal metrics into dbt’s transformation layer. As dbt Labs put it, “Data practitioners can define metrics in their dbt projects, then data consumers can query consistently defined metrics in downstream tools.”
Data teams can build these models and metrics in dbt, and then tie them into their other developer tools like version control and release management with the Semantic Layer.
Regardless of what BI tool they use, analysts and business users can then grab data and go into that meeting, confident that their answer will be the same because they pulled the metric from a centralized place.
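The “define once, use anywhere” idea can be sketched in a few lines of Python. This is not dbt’s syntax or API, just a toy illustration under assumed names: the `Metric` class, the `query_metric` helper, the sample orders, and the rule that revenue counts only completed orders are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative order records; the field names and values are invented.
ORDERS = [
    {"amount": 120.0, "status": "completed"},
    {"amount": 80.0, "status": "refunded"},
    {"amount": 200.0, "status": "completed"},
]


@dataclass(frozen=True)
class Metric:
    """A metric defined once: a name plus the single agreed-upon calculation."""
    name: str
    compute: Callable[[list], float]


# The one shared definition: revenue counts completed orders only (an assumed rule).
METRICS = {
    "total_revenue": Metric(
        name="total_revenue",
        compute=lambda rows: sum(r["amount"] for r in rows if r["status"] == "completed"),
    )
}


def query_metric(name: str, rows: list) -> float:
    """Every downstream consumer calls this instead of re-implementing the logic."""
    return METRICS[name].compute(rows)


# Two hypothetical consumers, standing in for different BI tools.
dashboard_value = query_metric("total_revenue", ORDERS)
spreadsheet_value = query_metric("total_revenue", ORDERS)
print(dashboard_value == spreadsheet_value)
```

Because both consumers go through the same definition, there is no room for one of them to quietly include refunds and the other to exclude them, which is exactly the “different numbers in the same meeting” problem described earlier.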
dbt + Atlan
The dbt Semantic Layer is great in its own right, but what makes it even more exciting is how it ties in with key tools across the modern data stack… and we’re one of them!
Alongside the dbt keynote, we announced our partnership with dbt Labs and our integration with the Semantic Layer. With this, joint customers will have access to an end-to-end governance framework for data models and metrics in the modern data stack.
The dbt Semantic Layer created a standard way to define metrics across your transformations and models. Now our integration brings these rich metrics into the rest of the data stack.
With this integration, dbt metrics and models are first-class assets in Atlan. This means that they’re searchable and discoverable through our platform and part of auto-generated, column-level lineage, just like any Snowflake table, Fivetran pipeline, or Looker dashboard.
Our native dbt Cloud integration ingests all dbt metrics and metadata about dbt models, merges it with metadata from all other tools in the data stack, creates column-level lineage from source to BI, and sends that unified context back into tools like Snowflake and the BI tools where people work daily.
With powerful impact and root cause analysis, modern data teams finally have the tools they need for end-to-end data governance and change management at every stage of the data lifecycle.