About Mecka AI
Mecka AI is building the data infrastructure layer for robotics and embodied AI. We partner with leading AI labs and robotics companies to deliver high-quality, real-world datasets used to train, evaluate, and deploy robotic systems—where model performance is dictated by data quality.
The Role
While our existing perception division handles state estimation and spatial mapping, this role is dedicated to one of the most critical bottlenecks in embodied AI: dexterous manipulation and human-object interaction. We are hiring a Research Scientist to architect and train, from scratch, proprietary foundation models for 3D hand tracking and articulated pose estimation.
Your core mandate is twofold: building our in-house equivalents to cutting-edge 3D hand and mesh recovery architectures, and developing highly robust interaction models tailored for the heavily occluded, chaotic domain of egocentric manipulation. Beyond these core pillars, you will serve as a lead problem-solver for emergent perception challenges as our hardware and downstream robotics needs evolve.
To make this possible, we provide a massive, continuous stream of high-quality, proprietary ground-truth manipulation data captured by our infrastructure. You will use this data advantage to train networks that surpass current public baselines, and you will own the complete hand-object perception loop for our data engine.
What You'll Work On
Architecting Proprietary Articulation Models
Zero-to-One Model Development: Design, implement, and train state-of-the-art networks for 3D hand pose estimation, dense mesh recovery, and kinematic tracking.
Large-Scale Distributed Training: Scale multi-view and temporal ML architectures across multi-GPU clusters to handle massive, multi-modal datasets of human hands in action.
Loss & Architecture Innovation: Push the boundaries of current paradigms by developing novel loss functions that enforce biomechanical constraints, temporal smoothness, and physical plausibility.
Egocentric Hand-Object Interaction (HOI)
Egocentric Manipulation Modeling: Build and train custom architectures capable of handling the extreme motion blur, severe self-occlusion, and rapid rotations inherent in first-person object manipulation.
Dynamic Scene Understanding: Use your models to track objects through complex grasps, segment tools from hands, and map contact points and forces to provide rich regularization for downstream action-conditioned robotics models.
Emergent Perception R&D
Rapid Prototyping: Tackle novel, unmapped AI challenges as they arise. You will rapidly prototype and deploy new models for tasks spanning tactile-visual fusion, fine-grained action segmentation, and novel hardware sensor integrations.
Agile Problem Solving: Pivot to resolve sudden algorithmic bottlenecks in the data engine, adapting the latest research to unblock new product capabilities for our robotics customers.
Dense Contact & Physics-Aware Tracking
Interaction Integration: Feed the outputs of your foundational tracking models into highly optimized pipelines that reason about physical contact surfaces and object affordances, bridging the gap between human video data and robotic control policies.
Who You Are
Required Background
Deep expertise in deep learning and 3D computer vision, specifically articulated tracking and hand pose estimation.
Proven experience training large-scale vision models from scratch, not just running inference or fine-tuning existing checkpoints.
Strong theoretical and practical understanding of parametric hand models, inverse kinematics, and dense mesh estimation.
Mastery of PyTorch and deep learning scaling frameworks.
Experience handling and curating massive, multi-terabyte image and video datasets for training.
Comfort operating in a fast-paced environment where priorities can shift rapidly to capitalize on new research or hardware capabilities.
Warning: Research Scientist positions require hyper-specific expertise. Please limit your applications to one research role. Applying to multiple Research Scientist positions suggests a lack of focus and may result in the rejection of all submissions. You may, however, apply to other non-research roles alongside your research application.
Strong Signals
First-author publications in top-tier venues (CVPR, ICCV, ECCV, NeurIPS) focusing on 3D hand tracking, hand-object interaction (HOI), dexterous manipulation, or human mesh recovery.
Specific experience working with large-scale egocentric and hand-object manipulation datasets (e.g., Ego4D, Ego-Exo4D, DexYCB, EPIC-KITCHENS) and solving the unique optimization challenges they present.
Experience writing custom CUDA kernels to accelerate 3D operations, differentiable rendering of meshes, or collision/contact computation.
Why This Role?
The Data Advantage: You will have access to a scale and quality of proprietary spatial and temporal ground truth for human manipulation that most academic researchers only dream of.
Pure R&D & Model Ownership: You are not maintaining legacy systems; you are given a blank slate and the compute resources to build the state of the art.
High Impact: The kinematic priors and interaction models you architect will directly define how the next generation of embodied AI agents learn to physically grasp and manipulate the world.
