About Onehouse
Onehouse delivers a new bedrock for your data through a cloud-native managed lakehouse service built on open, interoperable, industry-proven technology. Founded by a former Uber data architect and the creator of Apache Hudi, Onehouse accelerates the inevitable transition of the data lake into a lakehouse, unlocking incremental processing to replace old-school batch processing on the lake. Onehouse makes it possible to blend the ease of use of a warehouse with the scale of a data lake in a fully managed product. Engineers can build data lakes in minutes, process data in seconds, and own data in open source formats instead of being locked in to individual vendors.
https://www.onehouse.ai
Job Description
Are you a passionate Software Engineer who wants to reinvent the future of data systems for the entire industry? Have you always wanted to dig deeper and contribute directly to open source projects like Apache Hudi or Apache Spark/Flink and work on database, data lake or data warehouse technologies? As a software engineer at Onehouse, you will contribute directly to Apache Hudi and the surrounding open source ecosystem, while deploying and operating these technologies at massive scale for our customers.
Responsibilities
Build systems that enable users to manage petabytes of data with a fully managed cloud service.
Build functionality that enables data systems to be cloud native (self managed), scalable (auto scaling) and secure (different levels of access control).
Build scalable job management on Kubernetes to ingest, store, manage and optimize petabytes of data on cloud storage.
Design systems that help scale and streamline metadata and data access from different query/compute engines.
Exhibit full ownership of product features, including design and implementation, from concept to completion.
Be passionate about designing for future scale and high availability, while possessing a deep understanding of common failure patterns and their remediations.
Uphold a high engineering bar for the platform's code, monitoring, operations, automated testing, and release management.
Qualifications
3+ years of experience as a software engineer, including developing distributed systems.
Strong object-oriented design and coding skills (C/C++ and/or Java, preferably on a UNIX or Linux platform).
Experience with inner workings of distributed (multi-tiered) systems, algorithms, and relational databases.
Ability to deal well with ambiguous or undefined problems, think abstractly, and articulate technical challenges and solutions.
Speed and hustle: ability to prioritize across feature development and tech debt.
Ability to solve complex programming/optimization problems.
Ability to quickly prototype optimization solutions and analyze large/complex data.
Clear communication skills.
Bonus skills
Experience working on database systems, query engines, or Spark codebases.
Experience working on cloud-based (data-focused) services.
Deep understanding of Spark, Flink, Presto, Hive, or Parquet internals.
Hands-on experience with open source projects like Hadoop, Hive, Delta Lake, Hudi, NiFi, Drill, Pulsar, Druid, Pinot, etc.
Who We Are
At Onehouse, our mission is to help companies of all sizes supercharge their data engineering and data science by automating painful data infrastructure buildout. We are a team of self-driven, inspired, and seasoned builders who have created large-scale data systems and globally distributed platforms that sit at the heart of some of the most well-known companies out there, including Uber, LinkedIn, Confluent, and Microsoft. We have set out on an ambitious goal: to build the world's best fully managed and self-optimizing data lake platform. We are very well funded and backed by top-tier VCs in Silicon Valley, as well as numerous well-known angel investors from top Silicon Valley companies.
Why join us
Fun team, challenging problems! One day, we will be managing the largest database in existence!
Contribute directly to open source, including an exciting and growing data project: Apache Hudi.
Create instant impact by contributing to Hudi, which is already in use by numerous large enterprises globally.
Experienced team with numerous staff-level engineers to learn from and grow with.
Early opportunity in a very active space: the next few years are widely expected to reshape the data landscape.
The founding team created a large, fast-growing technology category: transactional data lakes.
We are growing fast and looking for rising talent who can grow with us to become future leaders of the team. Come help build this unicorn-to-be!