Lakehouse Storage (Coming Soon)

Our Managed Lakehouse Storage is built on MinIO, a high-performance, scalable, open-source object storage system designed for simplicity. Its API is compatible with AWS S3, so applications that already talk to S3 can integrate with it with little or no change. It supports versioning, bitrot protection, and encryption to protect data integrity and security. MinIO is also built for cloud-native environments: it is scalable, highly available, and runs under orchestration systems such as Kubernetes, which makes it a good fit for modern, microservice-based architectures. It is language-agnostic, with SDKs available in multiple languages, including Go, Java, Python, and JavaScript. Its easy setup, thorough documentation, and active open-source community make it a solid choice for object storage.
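
Because the API is S3 compatible, an existing S3 client can usually be pointed at the storage endpoint by changing only its connection settings. The following is a minimal sketch in Python using the third-party boto3 library rather than the MinIO SDK. The endpoint URL, credentials, bucket name, and the helper names s3_client_kwargs and make_client are all placeholders invented for illustration; the real values for the managed service are not stated here.

```python
from urllib.parse import urlparse


def s3_client_kwargs(endpoint_url, access_key, secret_key, region="us-east-1"):
    """Build keyword arguments for boto3.client("s3", ...).

    Hypothetical helper: the only difference from a plain AWS setup is that
    endpoint_url points at the S3-compatible storage instead of AWS.
    """
    parsed = urlparse(endpoint_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"endpoint_url must be an http(s) URL, got {endpoint_url!r}")
    return {
        "endpoint_url": endpoint_url.rstrip("/"),
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        # Assumed default; S3 clients expect some region name to sign requests.
        "region_name": region,
    }


def make_client(endpoint_url, access_key, secret_key):
    """Create an S3 client for the given endpoint (requires `pip install boto3`)."""
    import boto3  # imported lazily so the helper above works without boto3
    from botocore.config import Config

    return boto3.client(
        "s3",
        # Path-style addressing (endpoint/bucket/key) is a common choice for
        # self-hosted S3-compatible endpoints; assumed here, not confirmed.
        config=Config(s3={"addressing_style": "path"}),
        **s3_client_kwargs(endpoint_url, access_key, secret_key),
    )
```

With a reachable endpoint and valid credentials, usage would look like client = make_client("https://storage.example.com", "ACCESS_KEY", "SECRET_KEY") followed by client.upload_file("events.parquet", "my-bucket", "raw/events.parquet"), where every value shown is a placeholder.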

Apache Iceberg

Our Managed Lakehouse Storage is designed to work with data of almost any kind, which is why we offer it in combination with Apache Iceberg. Apache Iceberg is an open-source table format for big data processing that provides reliable, scalable, and efficient data management. It addresses the challenges of storing and processing large datasets with features such as schema evolution, time travel, and transactional consistency. Iceberg allows concurrent reads and writes while giving every reader a consistent view of the data. It supports several file formats (Parquet, Avro, and ORC) and integrates with popular big data processing frameworks such as Apache Spark and Apache Hive. Iceberg's metadata lets query engines plan and optimize queries efficiently, which suits analytical workloads and data lake architectures. Its design focuses on scalability, performance, and data reliability, making it a valuable tool for organizations dealing with large and evolving datasets.
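
To make time travel and schema evolution more concrete, here is a toy model in plain Python. It is not Iceberg's API or on-disk format: the ToyTable and Snapshot classes are invented for illustration and keep everything in memory. It only sketches the underlying idea that every commit produces a new immutable snapshot, so a reader can scan either the current state or any earlier one, and rows written before a column was added read back with an empty value for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One immutable table state (hypothetical, for illustration only)."""
    snapshot_id: int
    schema: tuple  # column names visible in this snapshot
    rows: tuple    # row dicts; never modified after the snapshot is created


class ToyTable:
    """In-memory stand-in for a snapshot-based table."""

    def __init__(self, columns):
        # Snapshot 0 is the empty table; ids match positions in the list.
        self._snapshots = [Snapshot(0, tuple(columns), ())]

    def current(self):
        return self._snapshots[-1]

    def _commit(self, schema, rows):
        snap = Snapshot(self.current().snapshot_id + 1, schema, rows)
        self._snapshots.append(snap)
        return snap.snapshot_id

    def append(self, new_rows):
        """Commit new rows as a new snapshot; older snapshots stay unchanged."""
        cur = self.current()
        return self._commit(cur.schema, cur.rows + tuple(dict(r) for r in new_rows))

    def add_column(self, name):
        """Schema evolution: add a column without rewriting existing rows."""
        cur = self.current()
        if name in cur.schema:
            raise ValueError(f"column {name!r} already exists")
        return self._commit(cur.schema + (name,), cur.rows)

    def scan(self, snapshot_id=None):
        """Read the current snapshot, or an earlier one (time travel)."""
        if snapshot_id is None:
            snap = self.current()
        elif 0 <= snapshot_id < len(self._snapshots):
            snap = self._snapshots[snapshot_id]
        else:
            raise KeyError(f"unknown snapshot {snapshot_id}")
        # Columns missing from older rows are returned as None.
        return [{col: row.get(col) for col in snap.schema} for row in snap.rows]
```

For example, after t = ToyTable(["id", "name"]), an append, an add_column("email"), and a second append, calling t.scan() returns both rows with an email column, while t.scan(1) still returns the table exactly as it looked after the first append.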