“I had the privilege of studying with Prem. He has a fantastic ability to simplify complex problems. Prem combines a very high intellect with the ability to come up with practical solutions. Deceptively simple, with a wacky sense of humour, and a great human being.”
Activity
-
Bryan Vorndran from the Federal Bureau of Investigation (FBI) kicks off our Executive dinner at RSA Conference followed by Hany Farid at GetReal, and…
Liked by Prem Natarajan, PhD
-
We are proud to share that Mr. Birdev Siddhappa Dhone, an alumnus of COEP Technological University, Civil Engineering Batch of 2020, has achieved an…
Liked by Prem Natarajan, PhD
-
It is so exciting to see Capital One's Chat Concierge take the market by storm, and I am proud that we were mentioned as one of the most advanced…
Liked by Prem Natarajan, PhD
Publications
-
A Constrained Optimization Approach to Combining Multiple Non-Local Means Denoising Estimates
Signal Processing
There is an ongoing need to develop image denoising approaches that suppress noise while maintaining edge information. The non-local means (NLM) algorithm, a widely used patch-based method, is a highly effective edge-preserving technique but is sensitive to parameter tuning. We use a variational approach to combine multiple NLM estimates, seeking a solution that balances positivity constraints and gradient penalties against Stein's Unbiased Risk Estimate (SURE). This method greatly reduces parameter sensitivity and improves denoising performance vs. other NLM variants.
-
Robust named entity detection from optical character recognition output
International Journal on Document Analysis and Recognition (IJDAR) June 2011, Volume 14, Issue 2, pp 189-200
In this paper, we focus on information extraction from optical character recognition (OCR) output. Since the content from OCR inherently has many errors, we present robust algorithms for information extraction from OCR lattices instead of merely looking them up in the top-choice (1-best) OCR output. Specifically, we address the challenge of named entity detection in noisy OCR output and show that searching for named entities in the recognition lattice significantly improves detection accuracy over 1-best search. While lattice-based named entity (NE) detection improves NE recall from OCR output, there are two problems with this approach: (1) the number of false alarms can be prohibitive for certain applications and (2) lattice-based search is computationally more expensive than 1-best NE lookup. To mitigate the above challenges, we present techniques for reducing false alarms using confidence measures and for reducing the amount of computation involved in performing the NE search. Furthermore, to demonstrate that our techniques are applicable across multiple domains and languages, we experiment with optical character recognition systems for videotext in English and scanned handwritten text in Arabic.
-
Stochastic Segment Modeling for Offline Handwriting Recognition
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
In this paper, we present a novel approach for incorporating structural information into the hidden Markov modeling (HMM) framework for offline handwriting recognition. Traditionally, structural features have been used in recognition approaches that rely on accurate segmentation of words into smaller units (sub-words or characters). However, such segmentation-based approaches do not perform well on real-world handwritten images, because breaks and merges in glyphs typically create new connected components that are not observed in the training data. To mitigate the problem of having to derive accurate segmentation from connected components, we present a novel framework where the HMM-based recognition system trained on shorter-span features is used to generate the 2D character images (the "stochastic segments"), and then another classifier that uses structural features extracted from the stochastic character segments generates a new set of scores. Finally, the scores from the HMM system and from structural matching are used in combination to generate a hypothesis that is better than the results from either the HMM or from structural matching alone. We demonstrate the efficacy of our approach by reporting experimental results on a large corpus of handwritten Arabic documents.
-
A Wearable Headset Speech-to-Speech Translation System
Proc. ACL 2008 Workshop on Mobile Language Processing
We present a wearable, headset-integrated eyes- and hands-free speech-to-speech (S2S) translation system. It employs an n-gram speech recognition engine, a rudimentary phrase-based translator for translating recognized text, and a rudimentary text-to-speech (TTS) synthesis engine for playing back the English translation. [Pretty good for 2008, if I do say so myself.]
-
Character-Stroke Detection for Text Localization and Extraction
9th International Conference on Document Analysis and Recognition (ICDAR 2007)
We present a new approach for analysis of images for text localization and extraction. Our approach puts very few constraints on the font, size and color of text and is capable of handling both scene text and artificial text well. In this paper, we exploit two well-known features of text: approximately constant stroke width and local contrast, and develop a fast, simple, and effective algorithm to detect character strokes. We also show how these can be used for accurate extraction and motivate some advantages of using this approach for text localization over other color-space segmentation-based approaches. We analyze the performance of our stroke detection algorithm on images collected for the robust-reading competitions at ICDAR 2003.
-
Optimal Estimation of Rejection Thresholds for Topic Spotting
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
In many applications of topic spotting technology, especially those that require a human review of in-topic documents, a low false alarm rate is a key requirement. Topic spotting techniques typically include a rejection scheme to filter out off-topic documents. In this paper we present a robust methodology for rejecting off-topic messages that, in addition to modeling the topics of interest, uses a so-called alternate model for topics that are not included in the set of topics of interest. Specifically, we introduce two novel techniques for estimating topic-specific rejection thresholds: a parametric technique that can be viewed as a transformation of topic-independent thresholds, and a nonparametric technique based on constrained optimization of false rejections subject to a pre-specified number of false acceptances. Our experiments on newsgroup messages demonstrate that, when adequate training data is available, topic-specific threshold estimation techniques can outperform topic-independent thresholds in terms of the ROC curve.
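As a rough illustration of the nonparametric idea in this abstract (not the paper's actual algorithm, which the abstract does not spell out), the sketch below picks a per-topic threshold that keeps false acceptances at or below a fixed budget while rejecting as few in-topic documents as possible. The function names, the "accept if score > threshold" rule, and the sample scores are all assumptions made for the example.

```python
import math
from typing import Sequence, Tuple


def topic_threshold(off_topic_scores: Sequence[float], max_false_accepts: int) -> float:
    """Lowest threshold that accepts at most `max_false_accepts` off-topic documents.

    Assumed decision rule (hypothetical): accept a document when score > threshold.
    False rejections can only grow as the threshold rises, so the lowest
    threshold that satisfies the false-acceptance budget also minimizes them.
    """
    if max_false_accepts < 0:
        raise ValueError("max_false_accepts must be non-negative")
    ranked = sorted(off_topic_scores, reverse=True)
    if max_false_accepts >= len(ranked):
        return -math.inf  # budget covers every off-topic document
    # Only the documents ranked strictly above this score are accepted.
    return ranked[max_false_accepts]


def count_errors(
    in_topic_scores: Sequence[float],
    off_topic_scores: Sequence[float],
    threshold: float,
) -> Tuple[int, int]:
    """Return (false_rejections, false_acceptances) at the given threshold."""
    false_rejections = sum(1 for s in in_topic_scores if s <= threshold)
    false_acceptances = sum(1 for s in off_topic_scores if s > threshold)
    return false_rejections, false_acceptances


# Made-up held-out scores for a single topic.
in_topic = [0.95, 0.8, 0.5, 0.3]
off_topic = [0.9, 0.4, 0.35, 0.1]

threshold = topic_threshold(off_topic, max_false_accepts=1)
errors = count_errors(in_topic, off_topic, threshold)
```

With these sample scores the threshold lands on the second-highest off-topic score, so one off-topic document is accepted and one in-topic document is rejected. Running the same procedure separately for each topic is what makes the thresholds topic-specific.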
Patents
-
Home Call Router
Issued US 8340262
Disclosed is a method and system for routing telephone calls within a household. In the disclosed home call routing system, a head of the household or other person with administrative authority within the home can control the routing of telephone calls by establishing and modifying call system parameters such as call priorities, traffic times, caller identities, routing rules, etc. through a home computer.
-
Multiframe Videotext Recognition
Issued US 8290273
Multi-frame persistence of videotext is exploited to mitigate challenges posed by varying characteristics of videotext across frame instances to improve OCR techniques. In some examples, each frame of video is processed to form multiple binary images, and one or more text hypotheses are formed from each binary image. In some examples, one or more combined images are formed from multiple frames processed to form a binary image and a corresponding text hypothesis. The text hypotheses are combined to yield an overall text recognition output.
-
Method and apparatus for training an automated speech recognition-based system
Issued US 7346507
A method and apparatus for building a training set for an automated speech recognition-based system, which determines the statistically optimal number of frequently requested responses to automate in order to achieve a desired automation rate. The invention may be used to select the appropriate tokens and responses to train the system and to achieve a desired "phrase coverage" for all of the many different ways human beings may phrase a request that calls for one of a plurality of frequently-requested responses. The invention also determines the statistically optimal number of tokens (spoken requests) required to train a speech recognition-based system to achieve the desired phrase coverage and optimal allocation of tokens over the set of responses that are to be automated.
-
Unsupervised Training in Natural Language Call Routing
Issued US 7092888
A method of training a natural language call routing system using an unsupervised trainer is provided. The unsupervised trainer is adapted to tune performance of the call routing system on the basis of feedback and new topic information. The method of training comprises: storing audio data from an incoming call as well as associated unique identifier information for the incoming call; applying a highly accurate speech recognizer to the audio data from the waveform database to produce a text transcription of the stored audio for the call; forwarding outputs of the second speech recognizer to a training database, the training database being adapted to store text transcripts from the second recognizer with respective unique call identifiers as well as topic data; for a call routed by the call router to an agent: entering a call topic determined by the agent into a form; and supplying the call topic information from the form to the training database together with the associated unique call identifier; and for a call routed to automated fulfillment: querying the caller regarding the true topic of the call; and adding this topic information, together with the associated unique call identifier, to the training database; and performing topic identification model training and statistical grammar model training on the basis of the topic information and transcription information stored in the training database.
Recommendations received
2 people have recommended Prem
More activity by Prem
-
A great podcast for every enterprise thinking about their AI strategy. Prem Natarajan, PhD, Chief Scientist, EVP and Head of AI at Capital One…
Liked by Prem Natarajan, PhD
-
Thanks, Noah Kravitz and NVIDIA for having me on your AI podcast. I had a great time talking about the work we’re doing at Capital One to drive…
Shared by Prem Natarajan, PhD
-
Grateful to be part of extraordinary initiatives at Capital One!
Liked by Prem Natarajan, PhD
-
It was great to meet and chat with Prof Satish Tripathi, President of the University at Buffalo, after 28 years at the Empire AI Symposium today at…
Liked by Prem Natarajan, PhD
-
We're pleased to share that we've received all regulatory approvals for the acquisition of Discover Financial Services and look forward to closing…
Liked by Prem Natarajan, PhD
-
It has been a blast to watch the explosive growth of Capital One Software and their latest product launch of Databolt - a new tokenization solution…
Shared by Prem Natarajan, PhD
-
Agentic AI is one of Silicon Valley’s hottest topics, but what does that really mean? Capital One’s Chief Scientist and Head of Enterprise AI, Prem…
Liked by Prem Natarajan, PhD
-
I couldn't resist the latest AI trend...
Liked by Prem Natarajan, PhD