At Agile Defense we know that action defines the outcome and new challenges require new solutions. That’s why we always look to the future and embrace change with an unmovable spirit and the courage to build for what comes next.
Our vision is to bring adaptive innovation to support our nation's most important missions through the seamless integration of advanced technologies, elite minds, and unparalleled agility—leveraging a foundation of speed, flexibility, and ingenuity to strengthen and protect our nation’s vital interests.
Requisition #: 1282
Job Title: Senior Data Scientist
Job Title for Careers Page: Senior Data Scientist
Location: Crystal City, VA
Clearance Level: Active TS/SCI
SUMMARY
The candidate will design, develop, and implement scalable data pipelines and ETL processes using Apache Airflow, with a focus on data integration and AI applications. Responsibilities include developing messaging solutions with Kafka to support real-time streaming and event-driven architectures, building and maintaining high-performance data retrieval solutions using Elasticsearch/OpenSearch, and optimizing Python-based data processing solutions. The role involves integrating batch and streaming data techniques to enhance data availability, deploying and managing cloud-based infrastructure for scalable solutions, and ensuring compliance with security requirements when handling classified data. The candidate will work closely with cross-functional teams to define data strategies and develop technical solutions aligned with mission objectives.
JOB DUTIES AND RESPONSIBILITIES
• Design, develop, and implement scalable data pipelines and ETL/ELT processes using Apache Airflow, integrating batch and streaming techniques to enhance data availability and accessibility.
• Develop and optimize Python-based data processing solutions, with strong programming skills in Python and proficiency in multiple scripting languages.
• Develop messaging solutions utilizing Kafka to support real-time data streaming and event-driven architectures, including microservices development.
• Build and maintain high-performance data retrieval and indexing solutions using Elasticsearch/OpenSearch, ensuring efficient data storage, retrieval, and processing.
• Deploy and manage cloud-based infrastructure and services (e.g., AWS, MinIO) to support scalable and resilient data solutions.
• Work closely with cross-functional teams to define data strategies, develop technical solutions aligned with mission objectives, and ensure adherence to security and compliance requirements for classified data.
EDUCATION, BACKGROUND, AND YEARS OF EXPERIENCE
• 7–10 years of relevant experience.
SKILLS & QUALIFICATIONS
• Demonstrated experience with data wrangling, data visualization, and analytics, including use of Apache Superset and GIS tools.
• Hands-on experience with containerization and orchestration tools such as Docker and Kubernetes.
• Experience with R, C++, Python, JavaScript, SQL, Gephi, scikit-learn, and other analytical or visualization frameworks.
• Experience with natural language processing, machine learning, network analysis concepts, and vector databases or embedding models for AI applications.
• Understanding of LLM prompt engineering and associated ETL applications.
• Expertise in workflow orchestration, data pipeline automation, and database optimization techniques.
• Working knowledge of Linux environments and version control with Git.
• Exposure to Apache Spark for large-scale data processing.
• Preferred: experience with data integration processes (ETL/ELT) and a strong understanding of modern data architectures supporting AI and analytics.