Databricks: Thank you for the tremendous response at Databricks AI Summit 2025. Your interest in our work inspires us to keep pushing the boundaries of data innovation.
Qbeast announces investment from PeakXV
The most advanced open-table optimization for data lakes
Store your data where you want, analyze it how you like — our platform works out of the box with leading object storage solutions and is fully compatible with your favorite BI tools and data processing engines.
Works in any Data Lake
Compatible with all BI and Transformation Tools
How we organize our data
Built on the Delta Lake format, Qbeast adds the information needed to query your data efficiently
Using an index avoids reading the entire dataset, which reduces data transfer and speeds up the query. Qbeast lets you index your data on as many columns as you need and select only the files required to answer the query.
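To make the idea of filtering files concrete, here is a minimal sketch of generic metadata-based file skipping in Python. It is not Qbeast's actual index structure: the `FileStats` class, its fields, and the file names are hypothetical and exist only for illustration. Each file carries summary statistics for two columns, and a query with predicates on both columns reads only the files that could contain matching rows.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file metadata for two indexed columns."""
    path: str
    min_age: int
    max_age: int
    cities: frozenset  # distinct city values stored in the file


def files_to_read(files, min_age_exclusive, city):
    """Keep only files that may hold rows with age > min_age_exclusive and the given city."""
    return [
        f.path
        for f in files
        if f.max_age > min_age_exclusive and city in f.cities
    ]


files = [
    FileStats("part-0.parquet", 18, 20, frozenset({"Barcelona", "Madrid"})),
    FileStats("part-1.parquet", 21, 35, frozenset({"Barcelona"})),
    FileStats("part-2.parquet", 30, 60, frozenset({"Paris"})),
]

# Equivalent in spirit to: WHERE age > 20 AND city = 'Barcelona'
selected = files_to_read(files, 20, "Barcelona")
print(selected)  # ['part-1.parquet']
```

Only one of the three files is opened. The other two are ruled out from metadata alone, which is where the reduction in data transfer comes from.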
Approximate Queries
Qbeast enables approximate queries: approximate answers at a fraction of the cost of executing the full query. With Qbeast-Spark, you can access a statistically representative sample of the dataset and return the result of the query within a margin of error.
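The trade-off between sample size and accuracy can be illustrated with a short Python sketch. It assumes a uniform random sample and a normal-approximation 95% confidence interval. The function name `approximate_mean` and the synthetic `ages` data are made up for this example, and the sample is drawn from an in-memory list rather than through Qbeast's index.

```python
import math
import random
import statistics


def approximate_mean(values, fraction, seed=0):
    """Estimate the mean from a uniform random sample.

    Returns (estimate, margin), where margin is the 95% margin of error.
    """
    rng = random.Random(seed)
    sample_size = max(2, int(len(values) * fraction))
    sample = rng.sample(values, sample_size)
    estimate = statistics.fmean(sample)
    # Standard error of the mean; 1.96 is the z-value for 95% confidence.
    margin = 1.96 * statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, margin


# Synthetic data for illustration only.
data_rng = random.Random(42)
ages = [data_rng.randint(18, 80) for _ in range(100_000)]

estimate, margin = approximate_mean(ages, fraction=0.01)
exact = statistics.fmean(ages)
print(f"estimate={estimate:.2f} +/- {margin:.2f}, exact={exact:.2f}")
```

The estimate is computed from 1% of the rows, and the margin of error shrinks with the square root of the sample size.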
File Optimization
Writing new data can degrade the file layout, producing many small files or excessively large ones, which makes it harder to retrieve results without reading irrelevant data. Optimization fixes the overflowed areas and increases the share of useful data each query reads by organizing records into more fine-grained files.
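As a rough illustration of the small-files half of this problem, the Python sketch below plans a generic compaction: it greedily groups small files into batches that stay at or below a target size. This is a simplified stand-in, not Qbeast's optimization algorithm, and it does not cover splitting oversized files. The function name `plan_compaction` and the sizes are hypothetical.

```python
def plan_compaction(file_sizes_mb, target_mb):
    """Group small files (sizes in MB) into batches of at most target_mb each.

    Files at or above the target are left alone, and a batch is only
    emitted when it merges at least two files.
    """
    batches = []
    current, current_size = [], 0

    def flush():
        # Rewriting a single file on its own gains nothing, so skip it.
        if len(current) > 1:
            batches.append(list(current))

    for size in sorted(file_sizes_mb):
        if size >= target_mb:
            continue  # already large enough; not rewritten in this sketch
        if current and current_size + size > target_mb:
            flush()
            current, current_size = [], 0
        current.append(size)
        current_size += size
    flush()
    return batches


plan = plan_compaction([64, 4, 200, 16, 8, 50, 30, 64], target_mb=128)
print(plan)  # [[4, 8, 16, 30, 50], [64, 64]]
```

Seven small files become two rewritten files, so a query touching this data opens fewer objects.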
Easy to Deploy
Qbeast works with any data lake storage (S3, Azure and GCS) and is compatible with the BI/ML tools of your choice. It takes only 10 minutes to deploy and start enjoying the benefits of querying Qbeast tables.
CREATE TABLE purchases ( id INT, user_id INT, product_id STRING ) USING qbeast OPTIONS ('columnsToIndex'='user_id,product_id');
INSERT INTO TABLE purchases SELECT id, user_id, product_id FROM raw_purchases;
Examples
Seamlessly integrate Qbeast with Databricks, Snowflake and more. Automate your data workflows and unlock faster, sharper insights so your team can focus on what matters.
Multicolumn Filtering
SELECT * FROM customers WHERE age > 20 AND city = 'Barcelona'
With an index on age and city, Qbeast filters the files directly and reads only those that can contain matching rows, rather than scanning the entire dataset. This reduces data transfer and speeds up the query.
Approximate Queries
SELECT avg(age) FROM customers TABLESAMPLE (1 PERCENT) WHERE city = 'Barcelona'
Qbeast answers approximate queries like this one at a fraction of the cost of executing the full query. It reads a statistically representative sample of the dataset, 1 percent here, and returns the result within a margin of error.
Optimization
QbeastTable.forPath(spark, tmpDir).optimize()
As your table grows, Qbeast optimization dynamically adjusts to the shape and density of your data. Our unique index delivers balanced file sizes, with records grouped by the dimensions that matter to your business and use cases.