Company Overview
We are developing an innovative HW/SW computing platform for DL inference acceleration that sets a new bar for performance and cost efficiency. Our platforms target cloud and enterprise (on-prem) environments.
Join a dynamic team focused on developing core AI/ML pipelines.
Role Summary:
- Developing, optimizing, and analyzing diverse workloads (from open-source projects, client input, or internal development) on advanced computing platforms, with a strong emphasis on natural language processing, computer vision, and large language models (LLMs).
- Evaluating and comparing workload performance across various inference platforms.
- Collaborating with clients to address their needs, ensuring seamless and efficient deployment of their AI workloads.
- Identifying challenges and proposing hardware or software enhancements to boost performance and streamline integration.
This position offers the opportunity to work on innovative and emerging technologies spanning deep learning, algorithm optimization, and computer systems architecture.
Requirements:
- Bachelor’s or Master’s degree in Computer Science or Computer Engineering.
- 2–3 years of hands-on experience with Python and deep learning frameworks.
- Expertise in creating inference pipelines that integrate trained models with pre- and post-processing components.
- Experience with CUDA, OpenCL, C, or C++.
- Familiarity with hardware systems or embedded platforms is an advantage.
- Hybrid work model: two days a week in the office.