Kneron, Inc.

Senior AI and Deep Learning Architect – Model Compression and Quantization

Kneron, Inc., San Diego, California, United States, 92189

Senior AI and Deep Learning Architect – Model Compression and Quantization

Research and develop state-of-the-art model compression techniques including QAT, model distillation, pruning, quantization, model binarization, and others for deep learning models.Implement novel deep neural network architectures and develop advanced training algorithms to support model structure training, auto pruning, and low-bit quantization.Apply and optimize model compression and quantization techniques to a variety of models in computer vision applications, audio applications, and others.Research and optimize model compression and quantization techniques for Kneron AI accelerator and jointly optimize hardware architecture for compressed models.Requirements

M.S./PhD in Computer Science, Machine Learning, Mathematics or similar field (Ph.D. is preferred)3+ years of industry/academia experience with deep learning algorithm development and optimization.3-5 years of software engineering experience in an academic or industrial setting.Research experience on any model compression and model quantization technique including model distillation, pruning, post-train quantization, quantization aware retrain, model binarization, and NAS.Experience on model accuracy loss analysis for model compression and quantization is a strong plus. Noise modeling and noise analysis are strong plus.Strong experience in C/C++ programming is a plus.Hands-on experience in computer vision and deep learning frameworks, e.g., OpenCV, TensorFlow, Keras, PyTorch, and Caffe.Ability to quickly adapt to new situations, learn new technologies, and collaborate and communicate effectively.Experience with parallel computing, GPU/CUDA, DSP, and OpenCL programming is a plus.Top-tier conference publication records, including but not limited to CVPR, ICCV, ECCV, NIPS, ICML, are strong plus.Location#J-18808-Ljbffr