Writing
Reading
Not everything will be listed here; most of it will probably go under Experimental Tracking.
Human Pose Estimation
Super-Resolution
- Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling
- Local Texture Estimator for Implicit Representation Function
- Learning Continuous Image Representation with Local Implicit Image Function
Action Recognition
- TDN: Temporal Difference Networks for Efficient Action Recognition
- STM: SpatioTemporal and Motion Encoding for Action Recognition
- TEA: Temporal Excitation and Aggregation for Action Recognition
- TEINet: Towards an Efficient Architecture for Video Recognition
- Recognize Actions by Disentangling Components of Dynamics
Convolution
- Searching Central Difference Convolutional Networks for Face Anti-Spoofing
- InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Object Detection
- Dynamic Context-Sensitive Filtering Network for Video Salient Object Detection
- Motion Guided Attention for Video Salient Object Detection
- TF-Blender: Temporal Feature Blender for Video Object Detection
Human-centric Visual Analysis
Person Re-Identification
Face Recognition
Mamba
- Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
- VMamba: Visual State Space Model
- U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
- Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation
- U-shaped Vision Mamba for Single Image Dehazing
- VM-UNet: Vision Mamba UNet for Medical Image Segmentation
- VM-UNET-V2: Rethinking Vision Mamba UNet for Medical Image Segmentation