Deep Learning Engineer (Full-time - Remote)
PerPlant
Deep Learning Engineer for Edge Applications
PerPlant is a startup on a mission to democratize AI in agriculture by offering a cost-effective, plug-and-play AI-based camera sensor that optimizes pesticide and fertilizer use and increases harvest yield. Our goal is to support sustainable farming practices with innovative technology and actionable data. In this role, you will develop and implement cutting-edge computer vision algorithms for edge applications. You will focus on real-time plant counting, crop type and growth-stage classification, non-crop area classification and localization (sand spots, stones, waterlogged areas, tractor tracks, field boundaries), weed detection, and other related tasks.
As a Deep Learning Engineer, you will play a critical role in the development of our product’s real-time perception stack and launch of advanced features.
Key Responsibilities
In this role, your key responsibilities will include:
Designing and developing the complete deep learning pipeline (data acquisition, pre-processing, training, validation, deployment) for real-time image and video analysis.
Implementing and optimizing models on Edge devices with limited computational resources.
Collaborating with cross-functional teams to define project requirements and deliverables.
Analyzing and evaluating the performance of deep learning models and algorithms.
Improving and optimizing existing models and algorithms.
Qualifications
To be successful in this role, you will need:
3+ years of proven experience developing production-level deep learning multi-task perception models.
Proven experience deploying models on edge devices with limited computational resources (NVIDIA Jetson or similar).
Strong proficiency in C++, CUDA, and TensorRT (for model deployment, re-simulation, and pre/post-processing), as well as Python, TensorFlow, PyTorch, and other deep learning frameworks.
Familiarity with sensor fusion, localization, ROS, and 3D reconstruction, or broader experience in robotics perception or automated driving systems, is nice to have.
Experience with software documentation, issue tracking (e.g., Jira), and version control tools.
What we offer:
Competitive salary and employee stock options in a fast-growing company.
Danish work-life balance.
Because the team is small, you will be involved in a wide range of tasks and have a big say in what you work on.
A flexible workplace with a young founder team.
Opportunities to learn and adopt new skills.
Application process:
Deadline: NA
The place of employment is wherever you prefer; the role is fully remote.
If you are passionate about using technology to make a positive impact on the planet and the environment, and you are looking for an exciting and challenging opportunity, we encourage you to apply for this role. We are excited to hear from you and look forward to working with you.
Interview Questions
1. Optimizing Deep Learning Models for Edge Devices
Question: How would you approach optimizing a deep learning model for deployment on an Nvidia Jetson device, considering constraints like memory, latency, and power consumption?
Follow-up Questions:
Can you walk me through how you would prune and quantize a deep learning model?
Explain how TensorRT optimization works and when you would choose FP16 over INT8.
How do you balance the trade-off between model accuracy and performance when applying these techniques?
Expected Answer: Candidates should discuss techniques like pruning, quantization, and converting models to ONNX format. They should understand how to use TensorRT’s builder to create an optimized inference engine with reduced precision like FP16 or INT8, and explain how to profile model layers to identify bottlenecks.
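The pruning and quantization techniques the answer mentions can be sketched in a few lines. This is a minimal, framework-free illustration using NumPy (not TensorRT's actual API): magnitude pruning zeroes the smallest weights, and symmetric per-tensor INT8 quantization maps floats to an 8-bit grid, with the dequantization error showing the accuracy cost.

```python
import numpy as np

def prune_weights(w, sparsity=0.5):
    """Magnitude pruning: zero out the fraction `sparsity` of smallest-magnitude weights."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= thresh] = 0.0
    return pruned

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: w ≈ scale * q, q in [-127, 127]."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

# Toy weight tensor (illustrative values only)
w = np.array([0.02, -1.5, 0.8, -0.01, 0.4, 2.0])
pruned = prune_weights(w, sparsity=0.5)
q, scale = quantize_int8(pruned)
dequant = q.astype(np.float32) * scale  # reconstruction error reflects the accuracy cost
```

In TensorRT the equivalent steps happen inside the builder (precision flags plus a calibration dataset for INT8), but the candidate should be able to explain the underlying arithmetic at this level.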
2. Multi-Task Perception Models
Question: How do you address conflicts between tasks (e.g., object detection vs. semantic segmentation) when designing a multi-task learning model? What loss function strategies do you use?
Follow-up Questions:
How do you design the shared backbone and task-specific heads in such models?
What approaches do you use to dynamically adjust loss weights during training?
Expected Answer: A solid response should include an understanding of weighted loss functions, multi-task learning strategies (e.g., shared backbone with task-specific heads), and potential use of techniques like GradNorm to balance gradients. Candidates should show an understanding of when tasks might benefit from shared features and when they need separate representations.
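One concrete loss-weighting strategy a strong candidate might cite is homoscedastic uncertainty weighting (Kendall et al.), where each task's loss is scaled by a learned log-variance term. A minimal sketch, with illustrative loss values standing in for real detection/segmentation losses:

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with learned uncertainty (Kendall et al.):
    L = sum_i( exp(-s_i) * L_i + s_i ), where s_i is the log-variance of task i.
    A task with high uncertainty is automatically down-weighted."""
    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        total += math.exp(-s) * loss + s
    return total

# Hypothetical detection and segmentation losses on one batch
losses = [2.0, 0.5]
log_vars = [0.0, 0.0]   # start balanced; in practice these are trainable parameters
combined = uncertainty_weighted_loss(losses, log_vars)
```

In a real PyTorch model the `log_vars` would be `nn.Parameter`s updated by the optimizer, so the balance between the shared backbone's tasks adjusts dynamically during training.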
3. Sensor Fusion and Robotics Perception
Question: In a multi-sensor setup involving cameras, LiDAR, and IMUs, how do you handle synchronization and data fusion? Explain how you manage sensor noise and inaccuracies in such systems.
Follow-up Questions:
How do you implement Kalman or Extended Kalman Filters for sensor fusion?
Can you discuss any challenges you’ve faced with time synchronization between sensors?
Expected Answer: The candidate should talk about time synchronization strategies (e.g., using timestamps, message filters in ROS), methods for compensating for sensor drift, and implementing sensor fusion algorithms like Kalman filters. They should also understand how to handle outlier data and sensor noise.
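The Kalman filter the follow-up asks about reduces, in its simplest scalar form, to a two-step predict/correct loop. This sketch tracks a single constant quantity from noisy measurements; a real camera/LiDAR/IMU fusion stack uses the multivariate (or extended) version with a motion model, but the gain computation is the same idea:

```python
class Kalman1D:
    """Minimal scalar Kalman filter: constant state, noisy measurements."""
    def __init__(self, x0, p0, q, r):
        self.x, self.p = x0, p0   # state estimate and its variance
        self.q, self.r = q, r     # process and measurement noise variances

    def update(self, z):
        self.p += self.q                   # predict: uncertainty grows over time
        k = self.p / (self.p + self.r)     # Kalman gain: trust in the measurement
        self.x += k * (z - self.x)         # correct: blend prediction and measurement
        self.p *= (1 - k)                  # corrected estimate is less uncertain
        return self.x

kf = Kalman1D(x0=0.0, p0=1.0, q=0.001, r=0.1)
# Noisy measurements of a true value of 1.0 (illustrative numbers)
estimates = [kf.update(z) for z in [1.1, 0.9, 1.05, 0.98, 1.02]]
```

The noise variances `q` and `r` are where sensor-specific drift and inaccuracy get encoded; a good candidate should explain how they would tune these from sensor datasheets or residual statistics.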
4. CUDA and C++ in Model Deployment
Question: How do you optimize CUDA kernels for deep learning inference? What techniques would you use to maximize parallelism on the GPU?
Follow-up Questions:
How would you write custom CUDA kernels for non-standard operations in TensorRT or PyTorch?
Explain the importance of memory coalescing and shared memory in CUDA.
Expected Answer: A strong candidate should explain how to maximize GPU parallelism by optimizing thread/block configurations, reducing memory transfer overhead, and using shared memory. They should be familiar with writing custom CUDA kernels and explain concepts like memory coalescing, occupancy, and leveraging Tensor Cores in mixed-precision arithmetic.
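The thread/block configuration and occupancy reasoning can be checked with back-of-the-envelope arithmetic before ever launching a kernel. The sketch below (plain Python, with illustrative per-SM limits; real figures vary by GPU architecture, so treat the defaults as assumptions) shows the ceil-divide grid sizing and the resource-limited occupancy calculation a candidate should be able to do:

```python
def launch_config(n_elements, threads_per_block=256):
    """One thread per element: ceil-divide so every element is covered."""
    blocks = (n_elements + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block

def occupancy(threads_per_block, regs_per_thread, smem_per_block,
              max_threads_sm=2048, regs_sm=65536, smem_sm=49152, max_blocks_sm=16):
    """Rough occupancy estimate: resident blocks per SM are limited by
    threads, registers, and shared memory; occupancy = resident / max threads."""
    by_threads = max_threads_sm // threads_per_block
    by_regs = regs_sm // (regs_per_thread * threads_per_block)
    by_smem = smem_sm // smem_per_block if smem_per_block else max_blocks_sm
    blocks = min(by_threads, by_regs, by_smem, max_blocks_sm)
    return blocks * threads_per_block / max_threads_sm

grid, block = launch_config(1000, threads_per_block=256)
occ = occupancy(threads_per_block=256, regs_per_thread=32, smem_per_block=0)
```

A candidate who can reason like this will also understand why heavy register or shared-memory use caps occupancy, and when lower occupancy is acceptable if it enables memory coalescing or Tensor Core use.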
5. Model Interpretability and Debugging
Question: When deploying a perception model in a robotics application, how do you diagnose and mitigate issues like false positives or localization drift in real-time?
Follow-up Questions:
What techniques do you use to evaluate and visualize feature maps or intermediate activations in real-time?
How do you handle edge cases where the model is uncertain or produces incorrect predictions?
Expected Answer: Candidates should discuss strategies like using Grad-CAM for visualizing feature importance, deploying shadow models for A/B testing, and employing fallback mechanisms or ensemble models to handle uncertain predictions. They should also be able to explain methods for debugging model outputs, such as examining class activation maps or conducting failure mode analysis.
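The fallback mechanism for uncertain predictions can be as simple as confidence gating on the softmax output: if no class is confident enough, the system declines to act and routes the frame to a safe default (e.g., skip the spraying decision and log the frame for review). A minimal sketch with hypothetical logits:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify_with_fallback(logits, threshold=0.7):
    """Return (class_index, confidence), or (None, confidence) to trigger
    a fallback path when the model is too uncertain to act on."""
    probs = softmax(logits)
    conf = max(probs)
    idx = probs.index(conf)
    return (idx, conf) if conf >= threshold else (None, conf)

confident = classify_with_fallback([5.0, 0.1, 0.1])    # clear winner: act
uncertain = classify_with_fallback([1.0, 1.0, 1.0])    # ambiguous: fall back
```

Softmax confidence is a crude uncertainty proxy; stronger answers might add calibration, ensembles, or Monte Carlo dropout on top of the same gating pattern.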
6. Real-Time Systems and ROS
Question: In a real-time robotics perception system, how do you ensure low latency in processing sensor data and model inference? What role does ROS play in managing real-time constraints?
Follow-up Questions:
How do you design a low-latency communication pipeline in ROS?
What are the limitations of ROS for real-time systems, and how would you mitigate them?
Expected Answer: The candidate should explain using ROS2 or custom real-time kernels for managing time-sensitive data. They should discuss strategies like minimizing message latency through efficient topic publishing/subscribing, using nodelets for zero-copy transport, and managing process priority. Additionally, they might mention using real-time scheduling (e.g., SCHED_FIFO) and techniques like direct memory access (DMA) for faster data transfers.
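One latency-management pattern worth probing for is dropping stale data instead of queueing it: a perception node should process the newest frame, not a backlog. This framework-free sketch (the same effect is achieved in ROS with a queue size of 1 on the subscriber) shows a thread-safe single-slot "latest only" buffer:

```python
import threading

class LatestOnlyQueue:
    """Single-slot buffer: the consumer always sees the newest message.
    Stale frames are overwritten rather than queued, which keeps
    end-to-end latency bounded even when inference falls behind."""
    def __init__(self):
        self._lock = threading.Lock()
        self._item = None

    def put(self, item):
        with self._lock:
            self._item = item      # overwrite: the older frame is dropped

    def get(self):
        with self._lock:
            item, self._item = self._item, None
            return item

q = LatestOnlyQueue()
for frame_id in range(5):       # producer outruns the consumer
    q.put(frame_id)
latest = q.get()                # only the newest frame survives
```

A candidate should be able to contrast this with deep queues (throughput-friendly but latency-hostile) and relate it to ROS QoS settings and zero-copy transport.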
7. Edge AI and Power Efficiency
Question: When deploying deep learning models on edge devices with limited power resources, how do you balance power efficiency and model performance?
Follow-up Questions:
How do you use dynamic voltage and frequency scaling (DVFS) in edge devices?
What techniques would you apply to reduce inference time without sacrificing much accuracy?
Expected Answer: The candidate should discuss using DVFS, optimizing models for reduced precision, and employing lightweight architectures like MobileNet or EfficientNet. They should also understand power-aware computing strategies like disabling non-essential cores during low-computation tasks or using neural architecture search (NAS) for optimizing models specific to hardware constraints.
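The power/performance balance ultimately comes down to choosing the best model variant that fits the device's budgets. A minimal selection sketch, with entirely illustrative accuracy, latency, and power numbers (not measurements of real models):

```python
def pick_model(variants, latency_budget_ms, power_budget_w):
    """Choose the most accurate variant that fits both budgets.
    `variants`: list of (name, accuracy, latency_ms, power_w) tuples."""
    feasible = [v for v in variants
                if v[2] <= latency_budget_ms and v[3] <= power_budget_w]
    return max(feasible, key=lambda v: v[1])[0] if feasible else None

# Hypothetical profiling results for three deployment candidates
variants = [
    ("resnet50_fp32",     0.91, 45.0, 14.0),
    ("resnet50_int8",     0.89, 18.0,  9.0),
    ("mobilenetv2_int8",  0.85,  7.0,  5.0),
]
choice = pick_model(variants, latency_budget_ms=20.0, power_budget_w=10.0)
```

In practice the budgets themselves shift with the device's power mode (DVFS), so a strong answer ties this selection to on-device profiling under each power profile rather than desktop benchmarks.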
8. Data Handling and Preprocessing for GIS in Robotics
Question: How do you handle large geospatial datasets in edge devices? What preprocessing steps do you take to optimize data for real-time use?
Follow-up Questions:
How do you integrate geospatial data with ROS for navigation and path planning?
What techniques do you use for efficient data storage and retrieval in constrained environments?
Expected Answer: A qualified candidate would talk about downsampling large datasets, compressing point clouds, and using optimized storage formats like GeoTIFF for raster data. They should also explain spatial indexing (e.g., R-trees) and how to leverage ROS plugins for handling large map data.
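The spatial-indexing idea can be demonstrated without a GIS library: a uniform grid index is a lightweight stand-in for an R-tree on memory-constrained edge devices, turning radius queries over field features into lookups of a few nearby cells instead of a full scan. A minimal sketch:

```python
import collections

class GridIndex:
    """Uniform grid index for 2D points: each point is bucketed by cell,
    so a radius query only scans cells overlapping the query circle."""
    def __init__(self, cell=10.0):
        self.cell = cell
        self.cells = collections.defaultdict(list)

    def _key(self, x, y):
        return (int(x // self.cell), int(y // self.cell))

    def insert(self, x, y, payload):
        self.cells[self._key(x, y)].append((x, y, payload))

    def query(self, x, y, radius):
        """Return payloads within `radius` of (x, y)."""
        r_cells = int(radius // self.cell) + 1
        cx, cy = self._key(x, y)
        hits = []
        for i in range(cx - r_cells, cx + r_cells + 1):
            for j in range(cy - r_cells, cy + r_cells + 1):
                for px, py, payload in self.cells.get((i, j), []):
                    if (px - x) ** 2 + (py - y) ** 2 <= radius ** 2:
                        hits.append(payload)
        return hits

idx = GridIndex(cell=10.0)
idx.insert(5.0, 5.0, "stone")           # hypothetical field features
idx.insert(50.0, 50.0, "water_patch")
near = idx.query(4.0, 4.0, radius=5.0)  # finds only the nearby feature
```

A candidate might reasonably reach for an R-tree library instead; the point is recognizing that spatial indexing, downsampling, and compact storage formats are what make large geospatial layers usable in real time on-device.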
9. Model-Driven Development and Testing
Question: How do you ensure the robustness and reliability of a perception model deployed on a robotic platform? Explain how you handle model updates and versioning.
Follow-up Questions:
What’s your approach to testing and validating models before deployment in production environments?
How do you manage continuous integration/continuous deployment (CI/CD) for models in a robotics application?
Expected Answer: The candidate should mention unit testing, simulation environments (e.g., Gazebo), and using real-world scenarios for model validation. They should also discuss implementing CI/CD pipelines for models, automated regression testing, and strategies for rolling out model updates without service disruption.
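The regression-testing step in such a CI/CD pipeline can be reduced to a simple gate: a candidate model is approved only if none of its tracked metrics regresses beyond a tolerance against the deployed baseline. A minimal sketch with hypothetical metric names and values:

```python
def regression_gate(candidate_metrics, baseline_metrics, max_drop=0.01):
    """Approve a model update only if no tracked metric drops more than
    `max_drop` below the currently deployed baseline. Returns (ok, failures)."""
    failures = {
        name: (baseline_metrics[name], value)
        for name, value in candidate_metrics.items()
        if value < baseline_metrics[name] - max_drop
    }
    return len(failures) == 0, failures

# Illustrative metrics for the deployed model vs. a retrained candidate
baseline  = {"weed_f1": 0.88, "crop_stage_acc": 0.92}
candidate = {"weed_f1": 0.90, "crop_stage_acc": 0.905}
ok, failures = regression_gate(candidate, baseline, max_drop=0.01)
```

Here the candidate improves weed detection but regresses crop-stage accuracy past the tolerance, so the gate blocks the rollout; in a real pipeline this check runs automatically on a held-out validation set before any staged deployment.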
These deeper questions will give you a better sense of the candidate's expertise in advanced topics relevant to robotics perception, edge AI, and real-time systems.