Requirements:
• Must have 5+ years of experience.
• Strong proficiency in data engineering concepts and practices.
• Extensive experience in applying data science and machine learning techniques.
• Working knowledge of extracting information from unstructured data sources, particularly PDFs.
• Hands-on experience with Large Language Models (LLMs) such as GPT and Gemini.
• Knowledge of prompt engineering and fine-tuning techniques.
• Practical experience in building Retrieval-Augmented Generation (RAG) systems.
• Experience in processing large volumes of unstructured data, preferably in the insurance, legal, or healthcare sectors.
• Proven experience in extracting valuable insights from diverse data sets.
• Familiarity with vector databases, Azure Cloud, LangChain, LlamaIndex, and OCR tools and techniques.
• Working knowledge of PDF-to-text extraction tools such as PDFMiner, PyMuPDF, or PDFPlumber, as well as OCR tools.
• Skilled in machine learning, deep learning, computer vision, natural language processing (NLP), and generative AI models.
Good to have:
Working knowledge of Python and libraries such as NumPy, pandas, scikit-learn, and TensorFlow/PyTorch.
Preferred Qualifications:
BE/MS/PhD in Computer Science, Data Science, Machine Learning, or a related field.
Interested candidates can share their resumes at varsha@hiehq.com
Note: We are looking for passionate candidates who can join immediately.