Bengaluru, Karnataka, India
32K followers 500+ connections

About

Hey, I'm Dr. Arjun Jain.

I grew up in Jiaganj on the Ganges. My mom became a…

Experience & Education

  • Fast Code AI

Publications

  • VRU Pose-SSD: Multiperson pose estimation for automated driving

    Proceedings of the AAAI Conference on Artificial Intelligence

    We present a fast and efficient approach for joint person detection and pose estimation optimized for automated driving (AD) in urban scenarios. We use a multitask weight-sharing architecture to jointly train detection and pose estimation. This modular architecture allows us to accommodate different downstream tasks in the future. Through systematic large-scale experiments on the Tsinghua-Daimler Urban Pose Dataset (TDUP), we obtain multiple models with varying accuracy-speed trade-offs. We then quantize and optimize our network for deployment and present a detailed analysis of the efficacy of the algorithm. We introduce a two-stage evaluation strategy, which is more suitable for AD, and achieve a significant performance improvement in comparison to state-of-the-art approaches. Our optimized model runs at 52 fps on full HD images and still reaches a competitive performance of 32.25 LAMR. We are confident that our work serves as an enabler to tackle higher-level tasks like VRU intention estimation and gesture recognition, which rely on stable pose estimates and will play a crucial role in future AD systems.

  • Multiview-consistent semi-supervised learning for 3D human pose estimation

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

    The best-performing methods for 3D human pose estimation from monocular images require large amounts of in-the-wild 2D and controlled 3D pose-annotated datasets, which are costly and require sophisticated systems to acquire. To reduce this annotation dependency, we propose the Multiview-Consistent Semi-Supervised Learning (MCSS) framework, which uses similarity in pose information from unannotated, uncalibrated but synchronized multi-view videos of human motions as an additional weak supervision signal to guide 3D human pose regression. Our framework applies hard-negative mining based on temporal relations in multi-view videos to arrive at a multi-view consistent pose embedding. When jointly trained with limited 3D pose annotations, our approach improves the baseline by 25% and the state of the art by 8.7%, whilst using substantially smaller networks. Lastly, but importantly, we demonstrate the advantages of the learned embedding and establish view-invariant pose retrieval benchmarks on two popular, publicly available multi-view human pose datasets, Human3.6M and MPI-INF-3DHP, to facilitate future research.

  • Theano: A Python framework for fast computation of mathematical expressions

    arXiv e-prints

    Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models.

  • Joint training of a convolutional network and a graphical model for human pose estimation

    Advances in neural information processing systems

    This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field. We show how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images. The architecture can exploit structural domain constraints such as geometric relationships between body joint locations. We show that joint training of these two model paradigms improves performance and allows us to significantly outperform existing state-of-the-art techniques.

Patents

  • Computer-implemented method and apparatus for tracking and reshaping a human shaped figure in a digital world video

    US9191579 B2

    The invention concerns a computer-implemented method for tracking and reshaping a human-shaped figure in a digital video, comprising the steps of acquiring a body model of the figure from the digital video, adapting a shape of the body model, modifying frames of the digital video based on the adapted body model, and outputting the digital video.

  • Method and System for Triggering an Event in a Vehicle

    EP3895064

    The invention as defined relates to a method for triggering an event in a vehicle, using a hand gesture.

  • Method for Identifying a Hand Pose in a Vehicle

    WO2020048814

    Embodiments of the present disclosure relate to a method for identifying a hand pose in a vehicle and a system for performing an event in a vehicle. For the identification, a hand image of a hand in the vehicle is first extracted from a vehicle image of the vehicle. A plurality of contextual images of the hand image is obtained based on the single point. Further, each of the plurality of contextual images is processed using one or more layers of a neural network to obtain a plurality of contextual features associated with the hand image. A hand pose associated with the hand is identified based on the plurality of contextual features using a classifier model.

  • System and method for deployment of airbag based on head pose estimation

    INA201911039220

    An intelligent airbag deployment control system implemented in a vehicle is disclosed. An input unit receives input images of an occupant in a vehicle from an image sensor unit. A processing unit processes the images to determine and track head localization information based on amplitude and depth parameters of the image. Further, the head localization information is predicted to determine the future position and orientation of the passenger's head. The future head localization information is predicted by processing the determined head localization information using a Long Short-Term Memory (LSTM) neural network architecture. The processing unit generates a control signal to indicate the direction of removal of the flap of an airbag and the amount of pressure in the airbag during deployment of the airbag.

Projects

  • Automatic recognition of advertising trademarks in sports videos

    This project was conducted in collaboration with Sport System Europe s.r.l. (www.sportsystem.com). We developed a semi-automatic system for automatic recognition and annotation of logos in sports videos. I was involved in the project both for scientific research and development of the logo recognition prototype; we used C/C++ and the OpenCV library.

Recommendations received

9 people have recommended Arjun
