While going through the source code of NVDLA, I noticed it was generated with ness version 2.0 backend=verilog. “ness” is an NVIDIA in-house tool.
Each weight update is one training iteration, and it relies on backpropagation. An iteration can be broken down into 4 distinct steps:
- Forward pass
In this first training step, the training data is passed through the neural network with its current weights to produce a prediction.
- Loss function
The training loss is calculated from a loss function, e.g. mean squared error, which measures the difference between the predicted label and the training label. On the first iteration, the loss will be high. Then, iteration after iteration, the neural network learns and the training loss decreases.
- Backward pass
To minimize the loss, the derivative (gradient) of the loss with respect to each weight is computed. It indicates in which direction, and by how much, each weight should be adjusted.
- Weight update
The weights of each layer are then updated using the gradients calculated in the backward pass, typically scaled by a learning rate.
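To make the four steps concrete, here is a minimal sketch in plain Python. It assumes a toy linear model (y = w*x + b) instead of a real neural network, a mean squared error loss, and plain gradient descent. The function names, the learning rate and the data are all made up for illustration and do not come from any framework.

```python
def forward(w, b, xs):
    """Forward pass: compute predictions with the current weights."""
    return [w * x + b for x in xs]


def mse_loss(preds, ys):
    """Loss function: mean squared error between predictions and labels."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def backward(preds, xs, ys):
    """Backward pass: derivative of the MSE loss with respect to w and b."""
    n = len(ys)
    grad_w = sum(2 * (p - y) * x for p, x, y in zip(preds, xs, ys)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
    return grad_w, grad_b


def train(xs, ys, lr=0.05, iterations=500):
    """Run the four steps repeatedly; lr and iterations are assumed values."""
    w, b = 0.0, 0.0
    losses = []
    for _ in range(iterations):
        preds = forward(w, b, xs)                 # 1. forward pass
        losses.append(mse_loss(preds, ys))        # 2. loss function
        grad_w, grad_b = backward(preds, xs, ys)  # 3. backward pass
        w -= lr * grad_w                          # 4. weight update
        b -= lr * grad_b
    return w, b, losses


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b, losses = train(xs, ys)
```

On this toy data the loss starts high and shrinks with each iteration, while w and b drift towards 2 and 1. A real network repeats the same loop with many more weights and uses the chain rule to propagate the derivative through each layer.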
Quick news about the development of the VHDL language.
It seems that the IEEE P1076 VHDL Standards Working Group has been busy preparing VHDL-2017. It is said to be almost ready, and it is time for the VHDL community to prepare to ballot. Voting on the upcoming standard will be conducted between May and July 2017.
Since Cadence added Azuro’s ccopt by default in SoC Encounter, I think it is time to take a deeper look at clock concurrent optimization. The technique was pioneered by a company called Azuro, which was acquired by Cadence in July 2011.
Clock concurrent optimization merges clock tree synthesis (CTS) with physical optimization, simultaneously building clocks and optimizing logic delays based directly on a propagated clocks model of timing. Thus, the clock tree synthesis (CTS) becomes timing-driven and tightly coupled with placement and logic sizing. This is different from traditional techniques of sequential optimization and useful skew.
CTS is done by combining the benefits of pre-route layer-aware optimization and useful skewing.
Buffer insertion is done not only to reduce skew but also to enable time borrowing and to improve the overall speed of the design.
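The time-borrowing idea can be illustrated with a small worked example. This is only a sketch of the textbook setup-slack arithmetic, not Cadence's actual algorithm. It assumes a hypothetical three-flop pipeline (FF1 -> logic A -> FF2 -> logic B -> FF3), made-up delays in picoseconds, zero clock-to-Q and setup times, and it ignores hold checks.

```python
def setup_slack(period, launch_latency, capture_latency, data_delay, setup=0):
    """Textbook setup slack in ps; a positive value means timing is met."""
    return period + capture_latency - launch_latency - data_delay - setup


PERIOD = 800   # hypothetical clock period (ps)
DELAY_A = 900  # hypothetical logic delay FF1 -> FF2 (ps)
DELAY_B = 500  # hypothetical logic delay FF2 -> FF3 (ps)


def slacks(ff2_latency):
    """Slack of both stages when only FF2's clock latency is changed."""
    slack_a = setup_slack(PERIOD, 0, ff2_latency, DELAY_A)
    slack_b = setup_slack(PERIOD, ff2_latency, 0, DELAY_B)
    return slack_a, slack_b


balanced = slacks(0)   # zero skew: stage A fails by 100 ps
skewed = slacks(200)   # FF2 clock delayed by 200 ps: both stages pass
```

With a perfectly balanced clock tree, stage A misses timing by 100 ps while stage B has 300 ps to spare. Delaying the clock of FF2 by 200 ps lets stage A borrow time from stage B, and both end up with 100 ps of positive slack at the same clock period.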
What are the benefits of using CCopt?
- Increased chip speed or reduced chip area and power
- Reduced IR drop
Following an update of macOS to Sierra, the Arduino toolchain failed with an MSpanList_Insert error message. An update of Arduino was required to fix this compatibility issue.
path_adjust is used to either relax or tighten timing constraints (even those defined by path_delay).
e.g. to relax the constraints by 100 ps:
rc:>/ path_adjust -from [all_inputs] -to [all_outputs] -delay 100 -name pa_i2o
e.g. to tighten the constraints by 100 ps (note the negative delay):
rc:>/ path_adjust -from [all_inputs] -to $all_registers -delay -100 -name pa_i2r
path_adjust is thus used to guide RTL Compiler to focus on register-to-register timing instead of optimizing input-to-output paths.
When such path_adjust exceptions are created, they are located at
“relative C threshold” and “total C threshold” in RC extraction are set by
After a change of my server’s certificates, my git configuration needed to be updated. Otherwise, git pulls fail with the following error message:
SSL: certificate verification failed
The fix itself is simple: first download a copy of the new X.509 certificate, then update the sslCAinfo field in the
$realtime is the de facto system function I use when writing SystemVerilog assertions. The limitation of $time lies in its 64-bit integer value: anything smaller than one time unit is rounded away.
$realtime enables SystemVerilog assertions to be re-used in gate-level simulations. I’m glad that Ben Cohen, the renowned SVA expert, agrees with me on this.
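The difference matters as soon as delays are smaller than the time unit, which is typical of gate-level netlists. The sketch below is a small Python model of the two system functions, not simulator code. It assumes the simulation time is already held as an integer number of picoseconds, and the function names and the 1 ns time unit are made up for illustration.

```python
def sv_time(sim_time_ps, time_unit_ps=1000):
    """Model of $time: scale to the time unit, then round to an integer."""
    # Integer round-half-up, to avoid floating-point surprises.
    return (2 * sim_time_ps + time_unit_ps) // (2 * time_unit_ps)


def sv_realtime(sim_time_ps, time_unit_ps=1000):
    """Model of $realtime: scale to the time unit and keep the fraction."""
    return sim_time_ps / time_unit_ps


# Two hypothetical gate-level events, 200 ps apart, with a 1 ns time unit.
t_req = 10200
t_ack = 10400

delta_time = sv_time(t_ack) - sv_time(t_req)
delta_realtime = sv_realtime(t_ack) - sv_realtime(t_req)
```

In this model both events land on the same $time value, so an assertion measuring the gap with $time would see a delay of 0, whereas $realtime still reports roughly 0.2 time units.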