
Data News — Week 24.40

Data News #24.40 — Back in Paris, Forward Data Conference program is out, OpenAI and Meta new stuff, DuckCon and a lot of things.

Christophe Blefari
Back in Paris (credits)

Hey, hey, hey. I'm so sorry for this small break in the news. I was in the middle of starting my new company, nao, and moving back from Berlin to Paris. Still, I hope this edition finds you well. It will be a mix of personal news, the OpenAI saga and the usual data engineering stuff that I enjoy reading.

First things first, yes, I'm co-founding a company. We called the company nao and you can see it as a no-code semantic layer. I'm keeping a dedicated post about it for later, but if you're interested, hmu.

Then, my girlfriend and I decided to move back from Berlin to Paris after 2 years there. It's a professional move for both of us. We will miss Berlin, to be honest, but a big part of our social life is in Paris. Being in Paris will also make all the events and IRL stuff I attend or organise easier.

Forward Data Conference ✨

As a reminder, on November 25th I'm organising the Forward Data Conference. It will be a day to shape the future of the data community, where teams can come to learn and grow together. There are still tickets left: we have sold around 80% of them.

This week we announced the program, which you can find on our website. I really like the program we put together: it's a mix of Engineering and Strategic / Vision talks.

The conference will be held in French and English. A few talks will be given in French, but we will subtitle them live, and we will also make sure there is always something in English running in parallel for English speakers.

You can use the BLEF_FWD24 promo code to get a 15% discount on your ticket.

PS: dear readers, if you proposed a talk to the FDC and it was rejected, I'm so sorry you did not get a detailed explanation. We received a lot of proposals and I wasn't able to write a personal message for every rejected talk. Still, if you're wondering why, reach out to me and I will explain.

AI News 🤖

  • OpenAI is our best saga about drama and tech. When is the Netflix show coming out?
    • DevDay recap — OpenAI DevDay was the developer conference to announce features, models and stuff about their product. The "biggest" announcement was the Realtime API, targeting speech-to-speech applications.

      In addition, they introduced prompt caching to save on token costs and the possibility to fine-tune vision for GPT-4o. The last thing is Canvas, a new way to interact with the models. I'd say it's a mix of Notion and Anthropic's better UI. OpenAI has to improve and diversify its public UI/UX in order to compete with large app ecosystems.
    • Advanced Voice not available in EU — Advanced Voice is a Siri-like interface on top of ChatGPT capabilities. The unavailability in the EU is lobbying at its finest, playing on the fear that the AI Act or GDPR could harm innovation. Explain to me why companies with the best engineers in the world can't find a way to make things legal.
    • They raised $6.6b at a $157b valuation (and $4b in debt). That's another $10b after the first one in Jan 2023.
  • Meta: if there was a race, Meta would be well positioned. Who would have thought, after the Metaverse choices?
    • Meta Movie Gen — Meta announced new research on movie generation models. Let's be honest, for the moment it just feels unreal, like a video game or something in virtual reality. But in the end, this is maybe what we need?
    • New hardware (powered with AI) — Two promising products have been demonstrated: a pair of glasses and a wristband that allows you to interact with virtual interfaces through finger movements.
    • SAM 2, Segment Anything Model 2 can run on-device on Apple CoreML — A demo of image segmentation that runs 100% offline and on-device. Industrial applications might easily follow from this.
    • Mark Zuckerberg says leaders should have technical skills if they want to call themselves a tech company. Yes, but technical leaders are also sometimes not the best ones, maybe the crazy ones, so other skills are required.
  • Introducing contextual retrieval — Anthropic introduced a new way to do RAG with more context, which performs better than standard RAG.
  • Meta and Google announced automatic dubbing for Reels and YouTube videos respectively, and this is something. Translation looks like a use-case that is almost solved with LLMs. It unlocks a world where languages are no longer barriers, giving us instant access to content and discussions from all around the world, especially if it can run on-device, cheaply.
  • Web browser automation through agentic workflows — A GitHub repo with a demo using Gemini and Selenium to automate browser actions.
  • New AutoGen architecture — AutoGen is an open-source programming framework for agentic workflows. They designed a new architecture (to be honest I don't know what it means).
  • Klarna drama — Klarna's CEO announced he will shut down Salesforce and Workday and replace them with internal initiatives + AI. Let's see where it goes.
  • Paris police wants to keep AI surveillance in place post-Olympics — Who could have predicted?
  • Malt AI report — Malt is a French / European freelance marketplace and they dropped their new AI report. Below are a few things I noted going through the report.
    • Snowflake demand has largely increased and it's close to Databricks in volume, tho Hadoop demand is still larger 🙃
    • The biggest demand concerns AI-related skills like LLM, Deep Learning, Machine Learning, scikit-learn, etc. In 2024 there are 16k AI freelancer profiles.
    • dbt pops out as a specific skill on freelancer profiles
    • AI engineers and scientists have an average daily rate around 500€, which is 100€ more than the general tech and data category.
    • AI supply is half data scientists, half all other tech positions (DA, DE, Back-end, SE, DevOps).

Build the foundations (credits)

Fast News ⚡️

  • CfP for DuckCon in Amsterdam on January 31, 2025 — In January next year, DuckCon will take place, and the call for papers is open until Oct 18th. I might propose something about yato (?).
  • dlt goes 1.0.0 — dlt announced their 1.0.0 version, as well as 1000 open-source customers in production. This version brings stability and marks a new milestone for the library.

    Side note, I'm a dltHub investor.
  • Airbyte is also going 1.0 — Following dlt (?), Airbyte is also going 1.0 with 3 objectives: more use-cases, reliability and better throughput performance.
  • ❤️ NO SLIDES conference — Be careful before clicking on this link, you might lose yourself in a rabbit hole. Recently Timo organised a NO SLIDES conference, a conference where people would only share their screen and no slides. I participated to demo nao, but the demo failed, so the recording does not exist anymore (oops). Still, I've watched a few other talks and really enjoyed them.
  • ELT with Kestra, DuckDB, dbt, Neon and Resend — How you can use Kestra to create declarative data pipelines that move data using the trendy libraries.
  • DuckDB is the foundation.
  • Fast feedback when SQL writing — A nice experiment showcasing what writing SQL could look like tomorrow. Imagine getting results directly while typing to have a faster iteration loop.
  • BigQuery jobs explorer refreshed — Google team released a fresh new explorer for BigQuery Jobs.
  • Coursera and Joe Reis launched a Data Engineering Professional Certificate — I can't recommend Joe enough, he's one of the best when it comes to capturing what the data engineering job is, and the syllabus is great.
  • Current state of Databricks SQL — "The best data warehouse is a lakehouse", lmao. Episode 21425325 in the competition between Snowflake and Databricks.
  • The data death cycle — 5 traps you wanna avoid to deliver value with Data & AI products: the tech trap, the doing trap, the project trap, the silo trap and the performance-first trap. And follow-up about silos by Hugo.

No comments

Mainly because of time and length of this issue.

Data Economy 💰


See you soon ❤️
