Title:
METHOD AND APPARATUS FOR QUANTIZING NEURAL NETWORK MODEL, AND COMPUTING DEVICE AND MEDIUM
Document Type and Number:
WIPO Patent Application WO/2024/021361
Kind Code:
A1
Abstract:
Embodiments of the present disclosure relate to a method and apparatus for quantizing a neural network model, a computing device, and a medium. The method comprises: updating a neural network model on the basis of a training data set; adjusting a first set of parameters of a first portion of the updated neural network model to be within a first range; and adjusting a second set of parameters of a second portion of the updated neural network model to be within a second range, wherein the second range is wider than the first range. The method further comprises quantizing the adjusted first set of parameters by using a first number of bits, and quantizing the adjusted second set of parameters by using a second number of bits, wherein the second number is greater than the first number. Based on this method, combined with differential quantization of the parameters of the neural network model during the training process, the compression efficiency and the execution efficiency of the neural network model are improved while the parameter precision and the model performance are maintained.
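
The range adjustment and two-precision quantization described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not state how parameters are adjusted into a range or which quantization scheme is used, so clamping and symmetric uniform quantization are assumptions here, as are the function names, the example parameter values, the range limits (1.0 and 8.0), and the bit widths (4 and 8).

```python
def clip(params, limit):
    """Clamp each parameter into [-limit, limit] (assumed form of 'adjusting to a range')."""
    return [max(-limit, min(limit, p)) for p in params]


def quantize(params, limit, bits):
    """Symmetric uniform quantization (an assumption) of values in [-limit, limit].

    Returns signed integer codes in [-(2**(bits-1) - 1), 2**(bits-1) - 1]
    and the scale needed to map the codes back to real values.
    """
    qmax = 2 ** (bits - 1) - 1
    scale = limit / qmax
    return [round(p / scale) for p in params], scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real-valued parameters."""
    return [c * scale for c in codes]


# Hypothetical parameters of two portions of a model after a training update.
first_part = [0.9, -0.2, 0.05, -1.4]
second_part = [3.2, -0.7, 12.5, -9.1]

# First portion: narrower range, fewer bits. Second portion: wider range, more bits.
first_limit, first_bits = 1.0, 4
second_limit, second_bits = 8.0, 8

first_clipped = clip(first_part, first_limit)
second_clipped = clip(second_part, second_limit)

first_q, first_scale = quantize(first_clipped, first_limit, first_bits)
second_q, second_scale = quantize(second_clipped, second_limit, second_bits)

print(first_q, dequantize(first_q, first_scale))
print(second_q, dequantize(second_q, second_scale))
```

In this sketch the narrower range keeps the 4-bit step size small for the first portion, while the second portion keeps a wider dynamic range and compensates with more bits. In a training loop, the clamping step would presumably be repeated after each parameter update, but the abstract leaves that schedule unspecified.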

Inventors:
JIANG YONGSEN (CN)
Application Number:
PCT/CN2022/130432
Publication Date:
February 01, 2024
Filing Date:
November 07, 2022
Assignee:
DOUYIN VISION CO LTD (CN)
International Classes:
G06N3/08
Foreign References:
CN113610709A  2021-11-05
CN111582432A  2020-08-25
CN110799994A  2020-02-14
CN110717585A  2020-01-21
CN114139683A  2022-03-04
US20190340492A1  2019-11-07
Other References:
See also references of EP 4336412A4
Attorney, Agent or Firm:
KING & WOOD MALLESONS (CN)