[Paper Review] Language models are few-shot learners
Language models are few-shot learners Brown, Tom, et al. “Language models are few-shot learners.” Advances in neural information processing systems 33 (20...
Language models are unsupervised multitask learners Radford, Alec, et al. “Language models are unsupervised multitask learners.” OpenAI blog 1.8 (2019): 9...
Exploring the limits of transfer learning with a unified text-to-text transformer Raffel, Colin, et al. “Exploring the limits of transfer learning with a ...
Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension Lewis, Mike, et al. “Bart: Denoising seq...
Deep contextualized word representations Matthew E. Peters, Mark Neumann, et al. “Deep contextualized word representations” NAACL 2018.
Efficient dialogue state tracking by selectively overwriting memory Kim, Sungdong, et al. “Efficient dialogue state tracking by selectively overwriting me...
Transferable multi-domain state generator for task-oriented dialogue systems Wu, Chien-Sheng, et al. “Transferable multi-domain state generator for task-o...
Dense passage retrieval for open-domain question answering Karpukhin, Vladimir, et al. “Dense passage retrieval for open-domain question answering.” arXiv...
Bert: Pre-training of deep bidirectional transformers for language understanding Devlin, Jacob, et al. “Bert: Pre-training of deep bidirectional transform...
Improving language understanding by generative pre-training Radford, Alec, et al. “Improving language understanding by generative pre-training.” (2018).
Attention is all you need Vaswani, Ashish, et al. “Attention is all you need.” Advances in neural information processing systems 30 ...
Enriching word vectors with subword information Bojanowski, Piotr, et al. “Enriching word vectors with subword information.” Transactions of the associati...
Efficient Estimation of Word Representations in Vector Space Mikolov, Tomas, et al. “Efficient estimation of word representations in vector space.” arXiv ...
Two-stream action recognition-oriented video super-resolution Zhang, Haochen, Dong Liu, and Zhiwei Xiong. “Two-stream action recognition-oriented video su...
TDAN: Temporally-Deformable Alignment Network for Video Super-Resolution Tian, Yapeng, et al. “TDAN: Temporally-Deformable Alignment Network for Video Sup...
Fast spatio-temporal residual network for video super-resolution Li, Sheng, et al. “Fast spatio-temporal residual network for video super-resolution.” Pro...
Edvr: Video restoration with enhanced deformable convolutional networks Wang, Xintao, et al. “Edvr: Video restoration with enhanced deformable convolution...
Recurrent back-projection network for video super-resolution Haris, Muhammad, Gregory Shakhnarovich, and Norimichi Ukita. “Recurrent back-projection netwo...
Frame-recurrent video super-resolution Sajjadi, Mehdi SM, Raviteja Vemulapalli, and Matthew Brown. “Frame-recurrent video super-resolution.” Proceedings o...
Selective Refinement Network for High Performance Face Detection Chi, Cheng, et al. “Selective refinement network for high performance face detection.” Pr...
DSFD: dual shot face detector Li, Jian, et al. “DSFD: dual shot face detector.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognit...
Pyramidbox: A context-assisted single shot face detector Tang, Xu, et al. “Pyramidbox: A context-assisted single shot face detector.” Proceedings of the E...
RetinaFace: Single-stage Dense Face Localisation in the Wild Deng, Jiankang, et al. “Retinaface: Single-stage dense face localisation in the wild.” arXiv...
WIDER FACE: A Face Detection Benchmark Yang, Shuo, et al. “Wider face: A face detection benchmark.” Proceedings of the IEEE conference on computer vision...
Abd-net: Attentive but diverse person re-identification Chen, Tianlong, et al. “Abd-net: Attentive but diverse person re-identification.” Proceedings of t...
FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking
Deep learning for person re-identification: A survey and outlook Ye, Mang, et al. “Deep learning for person re-identification: A survey and outlook.” arX...
A strong baseline and batch normalization neck for deep person re-identification Luo, Hao, et al. “A strong baseline and batch normalization neck for deep...
Mask r-cnn He, Kaiming, et al. “Mask r-cnn.” Proceedings of the IEEE international conference on computer vision. 2017.
Fast r-cnn Girshick, Ross. “Fast r-cnn.” Proceedings of the IEEE international conference on computer vision. 2015.
You only look once: Unified, real-time object detection Redmon, Joseph, et al. “You only look once: Unified, real-time object detection.” Proceedings of t...
Spatial transformer networks Jaderberg, Max, Karen Simonyan, and Andrew Zisserman. “Spatial transformer networks.” Advances in neural information processi...
Learning spatiotemporal features with 3d convolutional networks Tran, Du, et al. “Learning spatiotemporal features with 3d convolutional networks.” Procee...
Deformable convolutional networks Dai, Jifeng, et al. “Deformable convolutional networks.” Proceedings of the IEEE international conference on computer vi...
Segnet: A deep convolutional encoder-decoder architecture for image segmentation Badrinarayanan, Vijay, Alex Kendall, and Roberto Cipolla. “Segnet: A deep...
Fully convolutional networks for semantic segmentation Long, Jonathan, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segm...
Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs Chen, Liang-Chieh, et al. “Deeplab: Semant...
Support Vector Guided Softmax Loss for Face Recognition Wang, Xiaobo, et al. “Support vector guided softmax loss for face recognition.” arXiv preprint arX...
Focal Loss for Dense Object Detection Lin, Tsung-Yi, et al. “Focal loss for dense object detection.” Proceedings of the IEEE international conference on c...
Video inpainting by jointly learning temporal structure and spatial details Wang, Chuan, et al. “Video inpainting by jointly learning temporal structure a...
Deep video inpainting Kim, Dahun, et al. “Deep video inpainting.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
Character region awareness for text detection Baek, Youngmin, et al. “Character region awareness for text detection.” Proceedings of the IEEE/CVF Conferen...
Efficient and accurate arbitrary-shaped text detection with pixel aggregation network Wang, Wenhai, et al. “Efficient and accurate arbitrary-shaped text d...
Star-net: a spatial attention residue network for scene text recognition Liu, Wei, et al. “Star-net: a spatial attention residue network for scene text re...
PlugNet: Degradation Aware Scene Text Recognition Supervised by a Pluggable Super-Resolution Unit Mou, Yongqiang, et al. “PlugNet: Degradation Aware Scene...
Spatio-temporal filter adaptive network for video deblurring Zhou, Shangchen, et al. “Spatio-temporal filter adaptive network for video deblurring.” Proce...
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution Xiang, Xiaoyu, et al. “Zooming Slow-Mo: Fast and Accurate One-Stage Space-T...
Contextnet: Improving convolutional neural networks for automatic speech recognition with global context Han, Wei, et al. “Contextnet: Improving convoluti...