Data Engineer
Warszawa +4 locations
Transition Technologies MS
Join our AI Emerging Technology & External Collaborations team, a strategic group focused on delivering AI-enabled solutions and harnessing emerging technologies to advance the Pharma and DIA Partnering business. We are at the forefront of innovation: building scalable data infrastructure, applying advanced data engineering practices, and integrating cutting-edge external capabilities to accelerate decision-making and unlock business value.
Our work spans multiple high-impact initiatives, from developing intelligent data products to enabling end-to-end AI workflows. Whether it's structuring complex data ecosystems or collaborating with external research institutions, this is a unique opportunity to help shape the future of data and AI at the company by directly supporting key strategic decisions across our global Partnering organization.
Your responsibilities:
As a Data Engineer, you will play a critical role in architecting and implementing the data infrastructure to support our AI initiatives. You will collaborate with data scientists, business stakeholders, and external collaborators to enable clean, accessible, and well-modeled datasets that fuel advanced analytics and machine learning solutions.
Design and Build Scalable Data Pipelines: Develop and optimize pipelines to ingest, transform, and curate structured and unstructured data from both internal and external sources.
Data Profiling, Mapping & Standardization: Profile data, identify quality issues, and align disparate datasets. Define data models and standardization frameworks to support scalable, reusable, and AI/ML-ready data products.
Data Product Engineering & API Development: Build, manage, and document secure, reusable data assets and APIs to power advanced analytics and machine learning use cases.
Architect and Operationalize Data Infrastructure: Contribute to the architecture and implementation of scalable data platforms, including the Partnering Data Insight Hub, leveraging AWS-native services.
AI/ML Enablement: Collaborate with data scientists and AI/ML engineers to ensure data solutions are optimized for downstream AI applications, including support for LLM and AI agent workflows.
Metadata Management & Data Governance: Integrate data lineage, governance, and metadata management practices into all solutions to ensure compliance and traceability.
Monitoring and Event Frameworks: Design and implement event-driven monitoring systems to track changes in key datasets, enabling real-time alerts for critical updates (e.g., clinical trial data, research releases).
Container and Workflow Orchestration: Deploy and manage scalable, portable data processing workloads using containerization (e.g., Docker) and orchestration frameworks such as Amazon EKS. Support orchestration of AI workflows using tools like Google Agent Development Kit (ADK) and similar frameworks.
Continuous Improvement: Evaluate and evolve data engineering tools and practices to improve performance, maintainability, and scalability of our data solutions.
We are looking for you, if you have:
Basic Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related technical field.
5+ years of experience in data engineering or a similar role, preferably in a complex enterprise environment.
Proven experience building and maintaining scalable data pipelines and architecture in the cloud (preferably AWS).
Strong proficiency in SQL, Python, and data modeling.
Solid understanding of data quality, integration, transformation, and curation best practices.
Experience with structured and unstructured data sources and working with APIs.
Strong communication and collaboration skills – able to work across teams and functions.
Solid understanding of data governance, data security, and data privacy principles.
Preferred Qualifications:
Experience defining data standardization and data modeling for complex data ecosystems.
Hands-on experience with AWS data services (e.g., Glue, Redshift, Lake Formation, Lambda, S3, Athena).
Familiarity with data cataloging, metadata management, and governance frameworks.
Experience working in support of AI/ML pipelines and data science workflows.
Understanding of healthcare/life sciences or partnership/business development domains is a plus.
Experience working in an Agile development environment.
We offer:
Interesting and challenging projects
Flexible working hours
Friendly, non-corporate atmosphere
Stable working conditions (CoE or B2B)
Possibility for self-development and promotion in the company
Rich benefits package
Possibility to work remotely
We reserve the right to contact only selected candidates.
We are a rapidly growing IT company with global reach. We deal with IT outsourcing and implementation projects in flexible cooperation models, providing access to competence and experts in technologies from mainstream to cloud. TTMS' greatest strength is its skilled professionals, so people are at the heart of our organisational culture.