There is no AI Strategy without a Data Strategy. Getting GenAI to work is mission-critical for most companies, but 90% of AI projects have not been deployed. Why? Poor data quality: it is the #1 obstacle companies face in getting GenAI projects into production.
We've helped leading brands such as Amazon, Mayo Clinic, AmFam, and Nespresso solve their data issues and deploy their AI strategy with Day 1 ROI.
Simply put, Shelf unlocks AI readiness. We provide the core infrastructure that enables GenAI to be deployed at scale. We help companies deliver more accurate GenAI answers by eliminating bad data in documents and files before it reaches an LLM and produces bad answers.
Shelf partners with Microsoft, Salesforce, Snowflake, Databricks, OpenAI, and other big tech players who are bringing GenAI to the enterprise.
Our mission is to empower humanity with better answers everywhere.
As a Backend Engineer at Shelf, you will focus on building robust backend services for large-scale data processing. We use Python (and Node.js) to create data pipelines and handle data from diverse storage solutions. Your work will center on ensuring data flows efficiently, remains well orchestrated, and can operate seamlessly at scale. You’ll be tackling complex data ingestion, transformation, and orchestration challenges, building the core infrastructure that powers our platform.
But we're not just moving data; we're focused on solving the crucial data quality problems that underpin successful AI initiatives. Shelf is uniquely positioned to address these challenges head-on, as we provide data quality solutions and data enrichment capabilities that are key to building accurate and trustworthy AI systems. We're not simply building a platform; we're building the very foundation for the next generation of AI. This means your work will directly impact the accuracy, reliability, and, ultimately, the usefulness of AI across the enterprise landscape.
- Do you enjoy crafting efficient, testable code and want to be part of the engine behind advanced data processing?
- Do you have a passion for building truly robust and accurate systems?
- Are you looking for fast professional growth in a very demanding and challenging environment?
If you can answer these three questions confidently with “Yes!”, then this might just be the role for you: a unique opportunity to build products that have a huge impact on real-world AI applications.
- Design, implement, and optimize our distributed ETL pipeline, focusing on background processing logic, data transformation, and scalability.
- Develop modular and composable components capable of efficiently processing large-scale data across a diverse range of storage solutions, including S3, RDS/PostgreSQL, Elasticsearch, DynamoDB, data warehouses, and data lakes.
- Implement ML model integrations within the data pipeline, working closely with Data Scientists on model deployment and data flow.
- Develop clean, maintainable code in Python, adhering to best practices in observability, cost-efficiency, and robust error handling.
- Proactively identify and address performance bottlenecks and inefficiencies in current systems, proposing solutions to improve scalability and reliability, while ensuring continuous production stability through thorough testing and monitoring practices.
- Share your knowledge, participate in code reviews, and advocate for best practices to advance our backend development standards.
- Over 4 years of professional software engineering experience, including more than 1 year specializing in Python.
- Deep understanding of distributed systems, concurrency patterns, and ETL-oriented workflows.
- Comfortable working with diverse data stores (SQL and NoSQL), including schema design and performance tuning at scale. Experience with cloud-based data lakes and data warehouses is a plus.
- Experience with event-driven architectures, distributed processing techniques, and CQRS.
- Proven experience building scalable backend applications on either AWS or Azure, including a strong understanding of their respective services for compute, storage, and data processing.
- Ability to write well-structured, testable code with thoughtful abstractions and interfaces.
- Strong problem-solving skills, including the ability to troubleshoot performance bottlenecks and legacy code.
- Upper-Intermediate or better English skills for technical communication and documentation.
- Any hands-on experience with NLP, unstructured data processing, Node.js/TypeScript, or RAG pipelines is a significant plus.
- Ability to effectively present work both verbally and visually, and create clear, well-structured documentation.
- B2B contract.
- Company Stock Options.
- Hardware: MacBook Pro.
- Modern technical stack and the opportunity to develop open-source software.
- GitHub Copilot subscription.
- LLM credits for other OSS AI code assistants and internal AI tools.
- Our Leadership Team has deep knowledge management and AI domain expertise, plus the enterprise SaaS background to execute on the company vision.
- We love our customers, and our customers love us. Ask a Shelf customer why, and they’ll tell you it’s because of our innovative capabilities, our rock-solid reliability, and how much they enjoy working with our people, but most of all because of the improvements they see in their business KPIs.
- We have raised over $60 million in funding, and our investors include Tiger Global, Insight Partners, Base10, and others.
- We have high-velocity growth powered by the most innovative product in our category: 3X growth for 3 years in a row.
- We now have over 100 employees in multiple U.S. states and European countries, and we have ambitious hiring goals over the next few months.