We’re looking for an experienced DevSecOps Engineer for an initial 3-month contract with strong scope for extension. The role is primarily remote but will require occasional site visits, and due to the nature of the work consultants must be UK nationals willing to undergo Security Clearance (or already hold it).
What you’ll need:
Proven experience in a DevOps, Platform Engineering, or Infrastructure Engineering role supporting production environments
Strong experience working with Linux-based systems, including shell scripting (Bash) and VM administration
Hands-on experience running and supporting containerised services using Docker, with the ability to read, understand, and modify Docker configurations
Ability to support and troubleshoot AI/LLM-powered applications in production or pre-production environments
Experience operating multi-container environments supporting backend services such as databases, vector stores, or inference engines
Solid understanding of CI/CD pipelines, with practical experience using GitLab, including pipeline configuration and environment variables
Strong working knowledge of Git and source code management best practices
Experience configuring and maintaining NGINX (or equivalent) as a reverse proxy for backend services and web applications
Experience standing up and managing infrastructure using Infrastructure as Code tools (e.g. Terraform)
Proficiency in Python, including supporting backend services, scripts, or LLM application frameworks
Experience working with relational databases (e.g. PostgreSQL), including deployment, configuration, and operational support
Comfortable working across multiple projects at different stages of delivery, supporting shared platforms used by multiple teams
Strong problem-solving skills with the ability to debug infrastructure, CI/CD, and application-level issues
Ability to collaborate effectively with Data Science, AI, and Engineering teams, understanding boundaries between infrastructure, inference, and model training responsibilities
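To give a flavour of the pipeline-configuration and environment-variable work mentioned above, here is a minimal Python sketch of the kind of helper a platform script might use to read settings injected by a CI/CD job. The variable names (DATABASE_URL, VLLM_ENDPOINT) are illustrative assumptions, not taken from this posting.

```python
import os

def load_settings(env: dict[str, str], required: list[str]) -> dict[str, str]:
    """Return the required settings from an environment mapping.

    Raises KeyError listing every missing variable, so a pipeline job
    fails fast with a clear message instead of partway through.
    """
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"missing required variables: {', '.join(missing)}")
    return {name: env[name] for name in required}

# Illustrative usage inside a CI job (variable names are hypothetical):
# settings = load_settings(dict(os.environ), ["DATABASE_URL", "VLLM_ENDPOINT"])
```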
What you’ll be doing:
Support and scale the AI platform to run multiple LLM-based projects across varying levels of maturity and technical stacks
Own and maintain shared infrastructure used by multiple AI/LLM projects, ensuring stability, performance, and efficient use of space and storage
Provide day-to-day DevOps and platform support to an increasing number of teams onboarding onto the new environment
Design, deploy, and manage Linux-based virtual machines running containerised services for AI workloads
Operate and support Dockerised core services, including Qdrant, LLDAP, PostgreSQL, and vLLM, confidently reading and modifying Docker configurations as required
Manage and configure NGINX reverse proxy services to expose project front ends and internal tools securely
Support and contribute to CI/CD pipelines (GitLab), including pipeline configuration, environment variables, and repository management
Collaborate closely with AI Lab teams to support project onboarding, troubleshooting, and platform evolution as tooling and architectural patterns mature
Deploy and support open-source LLMs (e.g. Hugging Face models such as Intern Neural 7B) and associated inference tooling
Support LLM application frameworks (LangChain, vLLM) and Python-based services; model training itself remains with the Data Science teams
Stand up and manage cloud and infrastructure-as-code components using tools such as Terraform
Provide general backend and infrastructure support, with some overlap into CI/CD and DevOps best practices
Ensure platform reliability, security, and scalability as visibility and demand increase following successful AI initiatives
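Much of the day-to-day work above is operational glue around Dockerised services. As a minimal sketch, the Python below parses the JSON-lines output of `docker compose ps --format json` (one JSON object per line in Compose v2) into a service-to-state map and flags anything not running; the service names in the example are illustrative, not from this posting.

```python
import json

def parse_compose_ps(output: str) -> dict[str, str]:
    """Map each Compose service name to its reported state.

    Expects the JSON-lines format emitted by
    `docker compose ps --format json` (Compose v2).
    """
    states: dict[str, str] = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        states[record["Service"]] = record["State"]
    return states

def unhealthy(states: dict[str, str]) -> list[str]:
    """Return the services whose state is anything other than 'running'."""
    return [svc for svc, state in states.items() if state != "running"]
```

A check like this slots naturally into a scheduled GitLab pipeline job that alerts when a core service (e.g. Qdrant, PostgreSQL, vLLM) has stopped.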