Public-key cryptography (PKC) has shifted significantly toward elliptic curve cryptography (ECC) because of its efficiency and security. At the same time, emerging technologies and applications push their components to maximum efficiency, particularly in performance. This dissertation studies several high-performance hardware architectures for elliptic curve-based cryptographic processors (ECC processors).
In the first part of this dissertation, we present a generic, low-complexity, high-performance multiplier architecture. The multiplier is built on a novel variant of Karatsuba multiplication suited to hardware implementation: it parallelizes at the digit-multiplication level with lower complexity while avoiding long propagation delays. The presented formula also works with digit multipliers whose inputs are asymmetric, making it suitable for modern FPGAs with asymmetric Digital Signal Processor (DSP) blocks. Furthermore, the proposed multiplier architecture can be applied to a wide range of other cryptographic schemes.
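For orientation, the following is a minimal sketch of the textbook Karatsuba recursion, which replaces four half-size products with three. It is not the novel variant proposed in this dissertation, and it does not model asymmetric DSP inputs or hardware parallelism. The function name `karatsuba` and the `digit_bits` parameter (standing in for the width of a single digit multiplier) are assumptions made for illustration.

```python
def karatsuba(a: int, b: int, digit_bits: int = 16) -> int:
    """Multiply non-negative integers with the textbook Karatsuba recursion.

    Illustrative sketch only: digit_bits is a hypothetical stand-in for the
    width of one hardware digit multiplier.
    """
    # Base case: at least one operand fits in a single digit, so one
    # ordinary multiplication is used.
    if a < (1 << digit_bits) or b < (1 << digit_bits):
        return a * b

    # Split both operands at half the larger bit length.
    half = max(a.bit_length(), b.bit_length()) // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask

    # Three sub-products instead of four.
    low = karatsuba(a_lo, b_lo, digit_bits)
    high = karatsuba(a_hi, b_hi, digit_bits)
    mid = karatsuba(a_lo + a_hi, b_lo + b_hi, digit_bits) - low - high

    # Recombine: a*b = high * 2^(2*half) + mid * 2^half + low.
    return (high << (2 * half)) + (mid << half) + low
```

The three sub-products are mutually independent, which is what makes digit-level parallelism possible in hardware. The additions `a_lo + a_hi` and `b_lo + b_hi` are one source of the carry propagation that a hardware-oriented variant would aim to keep short.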
We then present two solutions for high-performance ECC processor architectures: one heavily optimized for a particular prime modulus form, and one generic for arbitrary prime moduli, which allows greater flexibility in choosing curve domain parameters. With both implementation strategies, the proposed ECC processors outperform all related works in the literature in terms of throughput as well as Area×Time efficiency.
The first ECC processor architecture exploits the performance advantage of elliptic curves defined over a prime of a specific form (e.g., a Solinas prime). We propose a high-performance ECC processor architecture over Curve448, which has recently grown in popularity. We demonstrate how the multiplier architecture introduced in the first part can significantly improve overall performance. Building on it, we propose an interleaved fast reduction technique that takes full advantage of the modulus form as well as of the multiplier based on the asymmetric variant of the Karatsuba formula. The proposed architecture also includes side-channel attack countermeasures such as scalar blinding, base-point randomization, and continuous randomization.
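To illustrate why a special prime form enables fast reduction, the sketch below reduces a product modulo the Curve448 prime p = 2^448 - 2^224 - 1 using only shifts and additions, based on the identity 2^448 ≡ 2^224 + 1 (mod p). This is a generic folding reduction written under our own assumptions; it is not the interleaved technique proposed in the dissertation. The names `P448` and `reduce_p448` are hypothetical.

```python
# Curve448 field prime: a Solinas prime, p = 2^448 - 2^224 - 1.
P448 = 2**448 - 2**224 - 1

def reduce_p448(c: int) -> int:
    """Reduce a non-negative integer modulo P448 by folding the high part.

    Illustrative sketch only, not the interleaved reduction of the
    dissertation. Uses 2^448 = 2^224 + 1 (mod p), so the bits above
    position 448 are added back in at offsets 0 and 224.
    """
    mask = (1 << 448) - 1
    # Fold until the value fits in 448 bits; each pass strictly shrinks c.
    while c >> 448:
        hi, lo = c >> 448, c & mask
        c = lo + hi + (hi << 224)
    # Now c < 2^448 < 2p, so at most one subtraction is needed.
    if c >= P448:
        c -= P448
    return c
```

For example, `reduce_p448(a * b)` agrees with `(a * b) % P448` for field elements `a` and `b`, without any division or general-purpose modular multiplier.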
The second ECC processor architecture offers greater hardware flexibility because it does not depend on a specific modulus form. We present a high-performance, generic, and unified ECC processor architecture for Weierstrass curves over an arbitrary prime modulus. For the underlying field arithmetic, we propose a technique that eliminates the need for conditional correction throughout the Elliptic Curve Point Multiplication (ECPM) operation by carefully defining the upper bounds of the inputs and outputs. Accordingly, we propose a novel and efficient pipelined Montgomery Modular Multiplier (pMMM) built from a pipelined Multiplier-Accumulator (pMAC), which is in turn constructed from a multiplier architecture based on a novel variant of the Karatsuba formula. The proposed ECC processor can also be used for curves that are birationally equivalent to a Weierstrass curve.
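The idea of removing the conditional correction by bounding inputs and outputs can be sketched with a well-known form of Montgomery multiplication: if R = 2^k is chosen with R > 4N and both inputs are below 2N, the result is again below 2N, so no final subtraction is needed between chained multiplications. The exact bounds and parameters used in the dissertation are not stated here, so the choice k = bitlen(N) + 2 and the names `montgomery_setup` and `mont_mul` are assumptions for illustration only.

```python
def montgomery_setup(n: int):
    """Return (k, n_prime) for an odd modulus n, with R = 2^k > 4n.

    Assumed parameter choice for this sketch: k = bit_length(n) + 2.
    """
    k = n.bit_length() + 2
    r = 1 << k
    # n_prime = -n^(-1) mod R (pow with exponent -1 needs Python 3.8+).
    n_prime = (-pow(n, -1, r)) % r
    return k, n_prime

def mont_mul(a: int, b: int, n: int, k: int, n_prime: int) -> int:
    """Montgomery product a*b*R^(-1) mod n without a final subtraction.

    For inputs a, b < 2n and R > 4n, the result t satisfies
    t < (4n^2 + R*n) / R < 2n, so it is a valid input for the next call.
    """
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask   # makes t + m*n divisible by R
    return (t + m * n) >> k             # exact division by R
```

Values are kept in the redundant range [0, 2N) for the whole computation and brought back to [0, N) only once at the end, for instance when converting out of the Montgomery domain.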