Data Scientist
Pharma
Start: ASAP / to be determined
100% remote
B2B up to 50 EUR/h net + VAT
The Data Scientist in RAG & Document Intelligence designs and implements AI solutions that make knowledge more accessible within the organization, transforming complex enterprise documents into actionable insights.
Main Responsibilities:
Optimize RAG pipelines by experimenting with various strategies for parsing, chunking, and retrieval to improve answer quality and reduce errors.
Extract structured information from unstructured content, ensuring high-quality input for processing.
Design and conduct experiments to evaluate models based on accuracy, latency, and cost, and derive insights from the data.
Implement NLP techniques to solve real-world problems, enhancing user experience through effective query handling.
Monitor and evaluate the performance of AI models and adjust them as needed to keep them cost-efficient.
Key Requirements:
Strong Python skills with experience in machine learning and generative AI workflows.
Solid understanding of NLP principles like text representation and semantic search.
Experience in designing and optimizing RAG pipelines for unstructured documents.
Proficiency with document parsing and handling diverse formats.
Familiarity with evaluation frameworks for LLMs and defining specific quality metrics.
Knowledge of multi-agent AI frameworks.
Experience with vector databases and cloud services (Azure/AWS).
Strong analytical skills with an experimental approach to problem solving.
Fluent in English (written and spoken).
Nice to Have:
Hands-on experience with Databricks GenAI products.
API development and integration skills.
Familiarity with Model Context Protocol.
Background in knowledge management and taxonomy design.
Other Details:
Impact Level: Greenfield AI Initiative
Collaboration: Work with various business units across the organization.