Q: 12
Which file format is used for storing Delta Lake tables?
Options
Discussion
Option A. The official guide and the Databricks docs both confirm that Parquet is the storage format for Delta Lake.
Option A. Under the hood, Delta Lake tables are stored as Parquet files, just with extra metadata and a transaction log. Pretty sure about this; I've seen it in the official guide and labs.
A, tbh. Delta Lake adds features, but the actual data files on storage are still Parquet. The "Delta" part is mostly about the transaction log and metadata. Anyone disagree or see a case where that's not true?
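For anyone who wants to check the "transaction log and metadata" point themselves, here is a minimal sketch. As far as I know, each commit file under `_delta_log` is newline-delimited JSON, and the `metaData` action carries a `format.provider` field. The function name `delta_data_format` is made up for illustration, and the sketch assumes a table path readable as a local directory whose JSON commit files are still present (not only checkpoints).

```python
import json
from pathlib import Path


def delta_data_format(table_path):
    """Return the data format recorded in a Delta table's transaction log.

    Hypothetical helper: scans the JSON commit files under _delta_log and
    returns format.provider from the latest metaData action, or None if
    no metaData action is found.
    """
    log_dir = Path(table_path) / "_delta_log"
    provider = None
    # Commit files are zero-padded version numbers, so name order is version order.
    for commit in sorted(log_dir.glob("*.json")):
        if not commit.stem.isdigit():
            continue  # skip anything that is not a plain numbered commit
        with commit.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                action = json.loads(line)  # one action per line
                meta = action.get("metaData")
                if meta is not None:
                    provider = meta.get("format", {}).get("provider", provider)
    return provider
```

Calling `delta_data_format("/path/to/table")` on a local copy of a table should show which format the log itself declares for the data files.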
Does "file format" here mean what's on disk, or the type defined in table creation? Little unclear how they're framing it.
I don't think it's B. A is correct here since Delta Lake uses Parquet files for actual storage, and the word "Delta" in option B is more about how Databricks handles transactions and versioning on top. Easy to get tripped up by the naming though: B looks tempting, but if you check the filesystem, you'll see Parquet files. Anyone prefer B for another reason?
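The "check the filesystem" step can be sketched roughly like this. It is just an illustration, assuming the table directory is reachable as a local path; `summarize_table_files` is a hypothetical helper name, not part of any Delta or Databricks API.

```python
from collections import Counter
from pathlib import Path


def summarize_table_files(table_path):
    """Count files by extension, split into data files and _delta_log files.

    Hypothetical helper: walks the table directory (including partition
    subdirectories) and returns two dicts, (data_counts, log_counts).
    """
    root = Path(table_path)
    data, log = Counter(), Counter()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Anything under _delta_log belongs to the transaction log.
        bucket = log if "_delta_log" in rel.parts else data
        bucket[path.suffix or "(none)"] += 1
    return dict(data), dict(log)
```

On a typical table the first dict should be dominated by `.parquet` and the second by `.json`, which is the split people in this thread are describing.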
Not gonna lie, this always confused me at first too. It's A in this case since the files themselves are Parquet on disk. "Delta" refers more to the transaction/log layer. If anyone has seen it work differently in the newer Databricks updates, I would love a sanity check.
Wow Databricks and their naming, always trips people up. It's actually A, since Delta Lake tables store data as Parquet files under the hood. Makes sense if you've looked at the filesystem. But I get why B looks tempting.
It's A. The trap here is B, since you work with 'Delta' tables in Databricks but the underlying storage is Parquet files. I've seen similar wording in practice exams, and it always points to Parquet as the actual file type. Could be confusing if you only think about what users see. Let me know if anyone thinks otherwise!