Welcome to this week's 80 new members (credits)

Tomorrow we'll enter the last quarter of the year. It's crazy how time flies. By the end of the year my freelancing activity will have become my most significant professional experience. But at the same time I feel like I only started yesterday.

I'm so happy to see how the newsletter is turning out these days. I really like getting feedback from you, so do not hesitate to reach out if you have something to say, it helps me a lot. I plan to write more original content that will be for members only (free and paid), but I struggle to find the time to do it. I need to rethink my time management and prioritisation. I'm super bad at it. How do you do it?

Data Fundraising 💰

Unlike the last 2 weeks, fundraising is back this week. Money is coming back. But first, some bad news: Docusign is laying off 9% of its staff.

Are tables data products?

At its root, a data mesh initiative brings domain ownership to data teams. Put simply, the major change is organisational: it puts technical teams closer to their business. In this case you may want to look at Conway's law to define your team topologies.

To get your teams ready for the big change you'll need to identify the data products every team will deliver. Data products are entities to which you apply product principles. Data products, among other things, have to be interoperable, discoverable, shareable, bounded and owned.

And this applies very well to tables. Tables are highly interoperable, discoverable and shareable. OK, it depends on your storage/engine, but still it's more than decent. Also, with some processes you can easily make tables bounded and owned. So yes, we can say that tables can be considered a sufficient data product. BUT, not every table in the warehouse should be considered as such. LinkedIn decided to name these data products Super Tables.

At LinkedIn, Super Tables are units of work like the jobs or the ads_event table. For instance, their jobs table consolidates more than 57 sources into 158 columns. That obviously means a lot: 57 sources in one table is probably more than the average data team uses in a whole warehouse. Every Super Table (ST) should enforce an SLA to reach 99%+ availability. This creates datasets that everyone in the company can trust and use in downstream data flows.
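To make the 99%+ target concrete, here is a minimal sketch of how you could check it on your side. It is not LinkedIn's implementation: I assume availability means "the share of scheduled refreshes that landed before their deadline", and the `Delivery` record, `availability` and `meets_sla` names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Delivery:
    """One scheduled refresh of a table (hypothetical record shape)."""
    due_at: datetime                # deadline promised to consumers
    landed_at: Optional[datetime]   # when the data arrived, None if it never did


def availability(deliveries: List[Delivery]) -> float:
    """Share of refreshes that landed on or before their deadline.

    Assumption: this is only one possible definition of availability.
    """
    if not deliveries:
        raise ValueError("no deliveries to evaluate")
    on_time = sum(
        1
        for d in deliveries
        if d.landed_at is not None and d.landed_at <= d.due_at
    )
    return on_time / len(deliveries)


def meets_sla(deliveries: List[Delivery], target: float = 0.99) -> bool:
    """True when the measured availability reaches the target."""
    return availability(deliveries) >= target
```

With daily refreshes, a 99% target leaves room for roughly one missed or late delivery every 100 days, which gives an idea of how demanding the commitment is.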

LinkedIn's move from Source-of-Truth tables to Super Tables (image from the source article)

Creating a Super Table is not an easy task. You'll need to clearly identify why people need the data in order to create a common asset that delivers value to the stakeholders. With domain data teams it's easier, because teams are closer to their sources and dedicated to one business area, so they should know better what's needed.

But still, once you have all the requirements you'll need to apply data modeling super skills.

As a data modeler you can help leadership bring in millions of dollars in revenue by adjusting a few lines of code.

As a final note on this, everyone is speaking about Kimball but no one has read him (I confess, myself included). Justin wrote a post about the 4-step dimensional design process every data modeler should follow to create well-architected tables.
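For reference, Kimball's four steps are: select the business process, declare the grain, identify the dimensions, identify the facts. Below is a small sketch of how you could write a design down and check that no step was skipped. The `DimensionalDesign` class and the job applications example are hypothetical, they are not taken from Justin's post or from LinkedIn.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DimensionalDesign:
    """A design written down in the order of Kimball's four steps."""
    business_process: str                                 # step 1
    grain: str                                            # step 2
    dimensions: List[str] = field(default_factory=list)   # step 3
    facts: List[str] = field(default_factory=list)        # step 4

    def missing_steps(self) -> List[str]:
        """Return the steps that have not been filled in yet, in order."""
        missing = []
        if not self.business_process.strip():
            missing.append("select the business process")
        if not self.grain.strip():
            missing.append("declare the grain")
        if not self.dimensions:
            missing.append("identify the dimensions")
        if not self.facts:
            missing.append("identify the facts")
        return missing


# Hypothetical example: the facts have not been chosen yet.
design = DimensionalDesign(
    business_process="job applications",
    grain="one row per application submitted",
    dimensions=["date", "member", "job"],
)
print(design.missing_steps())
```

Here the check reports that only the last step, identifying the facts, is still to be done. The point is less the code than the order: the grain comes before dimensions and facts, because it decides what one row means.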

ML Friday 🤖

Going back in time within my warehouse (credits)

Fast News ⚡️