AI+Data Weekly (AI, Data, Iceberg, Polaris, Streamlit, Flink, Kafka, Python, Java, NiFi)
#167 - 09-December-2024
https://bsky.app/profile/paasdev.bsky.social
Big Announcement Coming
Happy Krampusnacht to all those who celebrate.
AWS Updates
- S3 Tables for Iceberg
- AWS re:Invent 2024 Announcements
- AWS Trainium Chips
The Coolness this week
- Apache Polaris + Iceberg Quickstart
- How to extract tables from PDFs
- Microsoft 1-bit LLM BitNet
- Verifying Kafka Transactions Entry 2
- Fluss: Streaming Storage
- Fluss -> Flow for Flink Real-Time Analytics
- Tableflow - Iceberg / Kafka
- Snowflake Cortex AI + Slack
- DoorDash: Flink, Kafka, Snowflake
- Prompt Stack -- all in one
- spaCy Layout for PDF
- Responsible AI Pathways
- MegaParse documents in Python
- Time Series LLM
- Generate Synthetic Data in Snowflake
- LLMs and GenAI - When to use them
- Flink Observability with Prometheus
- New SQL GUI
- TDD for GenAI
- Open Source Agent Framework for Production
- Cedit command line editor
- ServiceNow AgentLab
- Snowflake Lessons Learned in Replication
- Privastead
- Back up iCloud with Node.js on Linux
- Back up Google with Node.js on Linux
- HuggingFace macOS chat source code
- Ollama working with structured output
- DSPy AI how-to
- Piazza updater
- Building a financial report with LangGraph
- ColPali Notebook with Qwen2-VL
New Models
- Open Source Video Foundation Model by Hunyuan
- Marco-o1
- Amazon Foundation Models - Nova
- Snowflake Arctic Instruct
- Large Scale World Model Google
- PaliGemma Google SmOl Vision
- Ollama 3.3
Upcoming
- Dec 19: Conf42 IoT 2024: Virtual: https://www.conf42.com/Internet_of_Things_IoT_2024_Tim_Spann_opensource_build
Recent Tim Stuff
- XTremePython 2024 - LLM
- PyData NYC
- Advanced RAG Techniques @ All Things Open Raleigh 2024
- Building Real-Time LLM Models
- Big Data Conference EU Talk on Open Source Real-Time AI
- CloudX AI Real-Time
- BuildStuff - Adding Generative AI
- Conf42 Prompt Engineering
- 06 Nov 2024 AI Alliance Talk in Manhattan
- 08 Nov 2024 PyData NYC slides
Apps, Demos, Examples, Models, Notebooks and Projects
- RAG 101
- Milvus Knowledgebase
- AIM Ghosts
- Unstructured Data - Ghosts - Part 1
- Multimodal RAG is not Scary Ghosts
- Advanced RAG Techniques
Technologies
CODE + COMMUNITY
© 2020-2024 Tim Spann https://www.youtube.com/@FLaNK-Stack
(AI + Vectors + LLM + Streaming + IoT)