Generative AI Engineer (High-Tech)

  • 40 hours
  • Eindhoven
  • Freelance / Interim
  • 01-04-2026
  • Partially On-Site

At the AI Competence Center of a client in the high-tech sector, we are looking for an AI Engineer who is passionate about Generative AI and Agentic AI systems and who thrives on optimizing models for efficient on-device deployment. You will work on large language models (LLMs), large multimodal models (LMMs), and Vision-Language-Action (VLA) models, ensuring they run reliably and efficiently on NPU-based platforms.

Your mission will be to translate cutting-edge research into production-ready solutions, focusing on model compression, system optimizations, and agentic capabilities such as function calling and tool orchestration. Experience with designing secure and reliable agentic workflows, including guardrails and safe tool invocation, is considered a strong plus.

If you are inspired by deeply understanding the inner workings of LLMs, designing system-level optimizations, and building agentic systems under resource constraints, then you’ll want to join a growing AI Competence Center within the high-tech industry.


What You’ll Do

Optimize LLMs and multimodal models for on-device deployment

  • Investigate, develop and apply advanced quantization (8-bit, 4-bit, mixed precision), pruning, and distillation techniques to derive optimized models for NPU targets.

Accelerate inference performance

  • Investigate, develop and implement system optimizations such as speculative decoding and other efficient decoding algorithms tailored for edge environments.

Engineer agentic AI capabilities for tiny agents

  • Investigate methods for improving the performance of small language models so that they can power tiny agents at the edge, while ensuring these agents follow safety principles.

Work with inference engines and deployment frameworks

  • Deploy optimized models using Ollama, llama.cpp, ONNX Runtime, and TFLite for efficient NPU inference.

Benchmark LLMs and agentic systems

  • Design benchmarking pipelines for assessing the performance of Generative and Agentic AI systems on-device.

Develop demonstrators and proof-of-concepts

  • Build technology PoCs that showcase key technologies in relevant use cases such as industrial safety monitoring, in-cabin sensing, and other edge AI applications.

Move key technologies from research into product solutions

  • Translate advanced optimization techniques and agentic AI features into production-ready implementations and collaborate with product teams to integrate these features into the client’s software and hardware portfolio.


Your Profile

  • MSc, PhD or EngD in a technical specialism, such as Computer Science or a related field.

  • 5+ years of experience in software/AI engineering with deep exposure to LLMs, VLMs, and systems performance.

  • Experience with LLM quantization techniques (e.g., SmoothQuant, SpinQuant, QuaRot), pruning techniques (e.g., Wanda, SparseGPT), and system optimizations such as speculative decoding.

  • Track record of working with AI frameworks such as PyTorch or TensorFlow.

  • Experience with Agentic AI frameworks (e.g., LangChain, Google ADK, smolagents).

  • Understanding of safety and security considerations for agentic systems (guardrails, policy enforcement, secure function calling) is a plus.

  • Experience with AI toolchains and inference engines (CUDA, TensorRT, TFLite, ONNX, Ollama, etc.) preferred.

  • Affinity for and experience with embedded systems and NPU accelerators are required.

  • Experience with embedded software architecture, build systems and version control.

  • Experience with GNU/Linux, embedded systems, development boards and processors.

  • Familiarity with MLOps environments (MLflow, ClearML, etc.).

  • Knowledge of Yocto / OpenEmbedded and cross-compilation toolchains for ARM is beneficial.

  • Solid programming experience in C, C++, Python and Bash on Linux systems.

  • Excellent communication skills in English and experience working in multi-site and multicultural teams.

Contact

Sabrina van Boxmeer
Account Manager
Phone: 06 159 55 781

Phone: 085 0250045