Fu et al., 2019 - Google Patents
Embodied one-shot video recognition: Learning from actions of a virtual embodied agentFu et al., 2019
View PDF- Document ID
- 1661407110054970067
- Author
- Fu Y
- Wang C
- Fu Y
- Wang Y
- Bai C
- Xue X
- Jiang Y
- Publication year
- Publication venue
- Proceedings of the 27th ACM international conference on multimedia
External Links
Snippet
One-shot learning aims to recognize novel target classes from few examples by transferring knowledge from source classes, under a general assumption that the source and target classes are semantically related but not exactly the same. Based on this assumption, recent …
- 230000003416 augmentation 0 abstract description 28
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
- G06F17/30247—Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00711—Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
- G06K9/00718—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Fu et al. | Embodied one-shot video recognition: Learning from actions of a virtual embodied agent | |
| Wang et al. | Unsupervised deep representation learning for real-time tracking | |
| Fu et al. | Depth guided adaptive meta-fusion network for few-shot video recognition | |
| Peng et al. | Visda: The visual domain adaptation challenge | |
| Wang et al. | Ai coach: Deep human pose estimation and analysis for personalized athletic training assistance | |
| Zhu et al. | Dense feature aggregation and pruning for RGBT tracking | |
| Huang et al. | Tracknet: A deep learning network for tracking high-speed and tiny objects in sports applications | |
| Hurault et al. | Self-supervised small soccer player detection and tracking | |
| Jiang et al. | SoccerDB: A large-scale database for comprehensive video understanding | |
| Zhang et al. | Object-aware aggregation with bidirectional temporal graph for video captioning | |
| Zhang et al. | Synthetic data generation for end-to-end thermal infrared tracking | |
| Yan et al. | Multi-clue fusion for emotion recognition in the wild | |
| Liu et al. | PKU-MMD: A large scale benchmark for skeleton-based human action understanding | |
| Lin et al. | Exploring explicit domain supervision for latent space disentanglement in unpaired image-to-image translation | |
| Liu et al. | Human pose estimation in video via structured space learning and halfway temporal evaluation | |
| Liu et al. | Deep learning based basketball video analysis for intelligent arena application | |
| Yuan et al. | Unsupervised video summarization via deep reinforcement learning with shot-level semantics | |
| Wu et al. | Synthetic data supervised salient object detection | |
| JP2023129179A (en) | Method and apparatus for summarization of unsupervised video with efficient key frame selection reward functions | |
| Zhang et al. | Object discovery from a single unlabeled image by mining frequent itemsets with multi-scale features | |
| Liu et al. | Spatiotemporal graph neural network based mask reconstruction for video object segmentation | |
| Yan et al. | Fine-grained video captioning via graph-based multi-granularity interaction learning | |
| Liu et al. | Self-supervised motion perception for spatiotemporal representation learning | |
| Pang et al. | Human action adverb recognition: Adha dataset and a three-stream hybrid model | |
| Zheng et al. | Multi-spectral vehicle re-identification with cross-directional consistency network and a high-quality benchmark |