How are businesses using the Lakehouse, and why should you care?

If you’re familiar with the term data warehouse, you’ll know that it’s a system for storing structured data for business intelligence and reporting purposes. However, as businesses have started to appreciate the value of unstructured data, such as images, videos, and voice recordings, a new type of framework called the data lake has emerged. While the data lake is a powerful and flexible infrastructure for storing unstructured data, it lacks certain critical features, such as transaction support and data quality enforcement, leading to data inconsistency.

To address these issues, a hybrid architecture was needed that could store both structured and unstructured data. This led to the development of the data lakehouse, which unifies structured and unstructured data in a single repository. Organizations that work with unstructured data can benefit from having a single data repository instead of requiring both a warehouse and a lake architecture.

Data lakehouses allow structure and schema, like those used in a data warehouse, to be applied to the unstructured data that is typically stored in a data lake. This enables data users, such as data scientists, to access information more quickly and efficiently. Intelligent metadata layers can also be used to categorize and classify the data, enabling it to be cataloged and indexed like structured data.

Data lakehouses are particularly well-suited to inform data-driven operations and decision-making by organizations that want to move from business intelligence (BI) to artificial intelligence (AI). They are cheaper to scale than data warehouses and can be queried from anywhere using any tool, rather than being limited to applications that can only handle structured data, such as SQL.

As more organizations recognize the value of using unstructured data together with AI and machine learning, data lakehouses are becoming increasingly popular. They represent a step up in maturity from the combined data lake and data warehouse model, and they will likely become simpler, more cost-efficient, and more capable of serving diverse data applications over time.

Headless Plex Client using HiFiBerry and Raspberry Pi 3

Plex Media Server is a great way to manage a personal media collection (mp3 music, family videos and photos). I’ve been using Plex Media Server running on a Synology NAS for quite some time to manage a sizeable collection of music, videos and photos.

I stream my personal collection, especially some of my favorite Indian musicians, through the living room stereo and the bedroom stereo (powered by a Class T amplifier, the Trends Audio Class-T TA 10.1, connected to a pair of Axiom speakers). This stereo system had been running “Rasplex” for quite some time on a Raspberry Pi with a HifiBerry DAC+ Pro board, but lately I’ve been experiencing frustrating issues and bugs. So I was looking for alternatives that would work seamlessly with Plex to stream audio to my stereo. Recently, I stumbled across a project promoted and implemented by the Plex CTO to run PlexAmp on a Raspberry Pi.

This guide shows how I set up a headless PlexAmp using the latest build from Plex.

Hardware

HifiBerry Board

Install Ubuntu Server in Raspberry Pi

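  • Install the OS in server mode, with no graphical display. I used the “bullseye” release; note that bullseye is the codename of Debian 11, on which Raspberry Pi OS is based, not an Ubuntu version number.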
    • Follow the steps on this page (Link) or here to install Ubuntu Server on a Raspberry Pi 3.
  • Install the SSH server on the Ubuntu server:
    • sudo apt-get install openssh-server
    • Then enable it by running the following, so that the SSH server starts whenever the system reboots:
      • sudo systemctl enable --now ssh

Configure and Enable HifiBerry Board

  • Follow the instructions here to configure and enable the HifiBerry board in Ubuntu.
    • HifiBerry drivers are already included in the Linux kernel for Raspberry Pi OS, so you just need to follow the instructions below.
    • Edit the /boot/config.txt file (for example, with sudo vi /boot/config.txt):
      • Comment out the “dtparam=audio=on” line (put a “#” in front of it).
      • Add audio=off to the “Enable DRM VC4 V3D driver” section:
        • dtoverlay=vc4-kms-v3d,audio=off (no space after the comma; this turns off the HDMI audio that the vc4 driver would otherwise provide)
      • Add the following lines to the same file:
        • dtoverlay=hifiberry-dacplus
        • force_eeprom_read=0
  • Then create the /etc/asound.conf file with the following contents:
                   pcm.!default {
                      type hw card 0
                   }
                   ctl.!default {
                      type hw card 0
                   }
  • Reboot: sudo reboot
  • Run “aplay -l”.
    • You should see output like the following:
    • card 0: sndrpihifiberry….
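
The configuration steps above can be sketched as a small shell script. This is a minimal sketch, not a tested installer: the function names (configure_hifiberry, write_asound_conf, has_hifiberry_card) are made up for illustration, and it assumes the KMS line in config.txt reads exactly dtoverlay=vc4-kms-v3d with no other parameters. Back up /boot/config.txt before trying anything like this.

```shell
#!/bin/sh
# Sketch of the HifiBerry steps. All function names are hypothetical.

# Apply the config.txt edits to the file given as $1.
# Assumption: the KMS overlay line has no parameters yet.
configure_hifiberry() {
    cfg="$1"
    # Comment out on-board audio and turn off HDMI audio on the KMS overlay.
    sed -e 's/^dtparam=audio=on/#dtparam=audio=on/' \
        -e 's/^dtoverlay=vc4-kms-v3d$/dtoverlay=vc4-kms-v3d,audio=off/' \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
    # Append the HifiBerry lines only if they are not already present.
    for line in 'dtoverlay=hifiberry-dacplus' 'force_eeprom_read=0'; do
        grep -qxF "$line" "$cfg" || printf '%s\n' "$line" >> "$cfg"
    done
}

# Write the ALSA defaults from this guide to the path given as $1.
write_asound_conf() {
    cat > "$1" <<'EOF'
pcm.!default {
  type hw card 0
}
ctl.!default {
  type hw card 0
}
EOF
}

# Read "aplay -l" output on stdin; succeed if the HifiBerry card is listed.
has_hifiberry_card() {
    grep -q 'sndrpihifiberry'
}

# Example usage on the Pi, as root:
#   cp /boot/config.txt /boot/config.txt.bak
#   configure_hifiberry /boot/config.txt
#   write_asound_conf /etc/asound.conf
#   reboot
#   aplay -l | has_hifiberry_card && echo "HifiBerry detected"
```

Because each function takes the target path as an argument, the edits can be tried on a copy of config.txt first and compared against the original before touching the real file.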

Install PlexAmp

Now that the OS is installed and the HifiBerry is enabled and configured, the next step is to install and enable the PlexAmp client. As far as I can tell, there is no guide to cleanly install and debug the client. I believe this software is still actively developed, but I was able to get it working with simple tweaks. Here is what I did:

  • To install PlexAmp (in server mode with no graphics), I followed the instructions here.
  • First, install Node.js:
    • sudo apt install nodejs
  • Download the latest headless PlexAmp client for Raspberry Pi.
  • Untar the file:
    • tar -xvf Plexamp….tar.bz2
  • Go to the “plexamp” directory:
    • cd plexamp
  • Now run the Node.js server:
    • node js/index.js
    • This step starts the PlexAmp client scripts, which run under Node.js.
  • At this point, make sure you can see the Raspberry Pi in your iPhone or Android Plex client. Check that “hifiberry” shows up in the list of clients.
  • Once you confirm that the PlexAmp client shows up in the iPhone app, you know the install and configuration were successful.
  • The next step is to make sure the PlexAmp client starts automatically when Raspberry Pi OS boots up. To do that, you need to run PlexAmp as a service:

sudo cp plexamp.service /lib/systemd/system/
sudo systemctl daemon-reload 
sudo systemctl enable plexamp
sudo systemctl start plexamp

That’s it. You should now be able to stream audio from the Plex app on your iPhone: select PlexAmp as the client to play audio through your home stereo. Enjoy!
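
If the service does not come up after a reboot, one thing worth checking is whether the user and paths in plexamp.service match where you unpacked PlexAmp. The helper below is only a sketch: show_unit_paths is a made-up name, and it assumes the unit file uses the standard systemd directives User=, WorkingDirectory= and ExecStart=, which I have not confirmed for the file shipped with PlexAmp.

```shell
#!/bin/sh
# Print the lines of a systemd unit file that usually carry the user and paths.
# show_unit_paths is a hypothetical helper; $1 is the path to the unit file.
show_unit_paths() {
    grep -E '^(User|WorkingDirectory|ExecStart)=' "$1"
}

# Example usage, from the plexamp directory before copying the file:
#   show_unit_paths plexamp.service
# After starting the service, systemd's own tools show its state and logs:
#   systemctl status plexamp
#   journalctl -u plexamp -n 50
```

If the printed paths point somewhere other than your plexamp directory, edit the unit file before copying it to /lib/systemd/system/ and run sudo systemctl daemon-reload again.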

If you are curious, here is my setup:

Databricks vs Snowflake: Performance blog war!

Healthy and open competition is great for any industry, especially the technology industry, which is super conscious of price/performance. Databricks and Snowflake are two great companies that have recently been duking it out in this price/performance game. They’ve published blogs with claims and counterclaims. Both blogs provide great insights and are an interesting read as well. If you are too rushed to read the entire blog(s), the images below give you a snapshot of what is said in them. Enjoy!

Earlier this month, Databricks published a blog claiming “world record” performance for processing the 100TB TPC-DS benchmark. It also said the result was corroborated by the Barcelona Supercomputing Center (BSC). And it was squarely aimed at Snowflake! https://databricks.com/blog/2021/11/02/databricks-sets-official-data-warehousing-performance-record.html

Chart 1: Elapsed time for test derived from TPC-DS 100TB Power Run, by Barcelona Supercomputing Center.

Source link

Chart 2: Price/Performance for test derived from TPC-DS 100TB Power Run, by Barcelona Supercomputing Center.

Source link

Yesterday, Snowflake published a counter blog (written by the founders, Benoit and Thierry). It basically says you don’t need a third party: just do it yourself on our cloud platform, it is so simple to verify! https://www.snowflake.com/blog/industry-benchmarks-and-competing-with-integrity/

Nikola Motors Fraud

How to Lie Your Way to $34 Billion. If you want to understand what happened to Tesla’s challenger in the EV trucking industry, this short video is a great start. It is astonishing that even after a federal lawsuit and proof that the entire thing was fake, this company is still worth $5B! Hopefully they have something substantial that is not known to the general public.