Simon & Speck algorithms in Secure Smart Home Applications
1. Introduction
Embedded systems influence every aspect of our lives, in areas such as automotive, avionics, smart homes and wearables, ranging from small form-factor devices to the physically large ones used in industrial applications.
The security of computer networks is and always will be a hot topic, but the application-specific characteristics of embedded systems make conventional security procedures unsuitable for them.
These characteristics include resource constraints (limited computational capability, memory and power), operation in hostile environments and the need for time-critical response.
Another thing to consider when discussing embedded systems security is the fact that these devices, more often than the classical computer systems, interact with the physical world. This could lead, in a situation generated by a security incident, to asset damage, human injury or even death.
The continuous decrease in size of these devices, driven by the huge advances in the semiconductor industry, has given rise to many new use cases: smart devices, wearables, home automation etc. At the same time, these new technologies have also created new security challenges.
The most popular and powerful cryptographic algorithms that exist today were designed without taking into consideration the power and complexity of the processors on which they run.
These algorithms are not well suited to low-power, low-cost applications, where the complexity of the device needs to be kept to a minimum to meet the power consumption and cost targets.
Security is nevertheless very important for many of these devices, as most of them are used in critical applications: an attacker should not be able to take control of your home appliances, door locks, HVAC systems or even your car.
That is why it is very important to design cryptographic algorithms that are both efficient and secure, have a small memory footprint and are easy to implement and deploy on multiple platforms. Lightweight cryptography is the research field that tries to find the right balance between these requirements.
Lightweight cryptography applications span multiple areas: mobile devices, RFID tags, electronic locks, sensor networks in smart homes etc.
The need for security in low-power embedded systems has been tackled by numerous lightweight cryptographic algorithms, to name only a few: Present, Piccolo, Klein, Twine, Katan and Ktantan, LED, HIGHT and CLEFIA.
Clearly, there are quite a few cryptographic algorithms “in the wild”, but the problem they all share is that they were designed to perform well on a single platform and were not meant to provide the same performance across a range of devices.
All these lightweight algorithms are block ciphers, the block cipher being an extremely versatile cryptographic primitive.
A lot of research has been done on security for constrained devices, and lightweight cryptography addresses the issue by developing block ciphers for such scenarios.
1.1 Introducing Simon and Speck
Simon and Speck are two lightweight block ciphers proposed by the NSA in 2013. They offer very good performance and a small memory footprint, and outperform most of the existing lightweight ciphers in terms of efficiency and compactness.
Moreover, the designs are very simple, using only basic operations such as modular addition, XOR, bitwise AND and bit rotation.
The purpose behind the Simon and Speck algorithms is to fill the need for secure, flexible lightweight block ciphers.
Each offers excellent performance on hardware and software platforms, and is flexible enough to admit a variety of implementations on a given platform.
Both perform exceptionally well across the full spectrum of lightweight applications, but Simon is tuned for optimal performance in hardware, and Speck for optimal performance in software.
It should be said, before moving further, that a block cipher does not deliver a certain level of security by itself. Different use cases have different security needs, and in every case a thorough analysis must be done to achieve the desired level of security. Still, the block cipher is one of the most versatile cryptographic primitives, and the belief is that any lightweight scenario can be based on an appropriately sized block cipher.
The obvious question that should arise now is: Why not use AES for such low-power applications?
AES has been proposed for, and is used in, some low-power scenarios. However, for the most constrained environments AES is not the right choice: in hardware, for example, the consensus in the academic literature is that area should not exceed 2000 gate equivalents (GE), while the smallest available implementation of AES requires 2400 GE.
Some of the existing block ciphers designed for constrained devices and applications perform well on dedicated Application-Specific Integrated Circuits (ASICs) and can be realized by small circuits with minimal power requirements, while others perform well on devices with minimal flash, SRAM or power. Unfortunately, no algorithm performs well in all these scenarios, because a design choice that optimizes for one platform has a negative impact on another.
Simon and Speck are two lightweight block-ciphers that are meant to provide the best performance and flexibility across all of the constrained scenarios, in software or hardware implementation.
The NSA researchers claim that Simon and Speck outperform both the best comparable hardware algorithms (in terms of the area required to achieve a given throughput) and the best comparable software algorithms (in terms of code size and memory usage). This helps application developers match their application requirements with their security needs without sacrificing performance.
Table 1.1 shows hardware and software figures for Simon, Speck and a few other popular lightweight cryptographic algorithms.
The table lists minimal-area hardware implementations achieving a throughput of at least 12 kilobits per second (kbps) at 100 kHz, and software implementations optimized for what the algorithm designers call the balanced performance metric:
A so-called balanced implementation is the one that scores best on the balanced performance metric among the available implementations.
In the table below, one can see a side-by-side comparison between a balanced implementation of Simon and Speck and other popular lightweight cryptographic algorithms.
For the algorithms compared against Simon and Speck, the implementation with the maximal balanced performance metric was used. Another thing to keep in mind when analyzing the results is that the code size refers to the encryption implementation only, not decryption (where the published data for a comparison algorithm included decryption, its size was subtracted).
For the sake of completeness, if the decryption code were included in the table, approximately 100 more bytes would have to be added for Speck (as a side note, a smaller size could be achieved by exploiting the similarities between encryption and decryption).
For Simon the cost is negligible because the decryption algorithm is the encryption algorithm (up to swaps of words and reordering of round keys).
The software implementations of Simon and Speck were coded in assembly on an Atmel ATmega128 8-bit microcontroller running at 16 MHz. The hardware implementations were written in VHDL and synthesized using Synopsys Design Compiler 11.09-SP4 to target the ARM SAGE-X v2.0 standard cell library for IBM’s 8RF 130 nm (CMR8SF-LPVT) process (the DC supply voltage for the process is 1.2 V, and the throughput values given assume a clock speed of 100 kHz). [1]
Table 1.1 Performance comparisons. Size is block size/key size; hardware refers to an ASIC implementation, and software to an implementation on an 8-bit microcontroller; clock speeds are 100 kHz (hardware) and 16 MHz (software). The best performance for a given size is indicated in red, the second best in blue. Numbers in brackets are NSA estimates; “–” means these values were unavailable at the time of writing. [1]
Looking over the results, it can be seen that the best balanced implementations of Simon and Speck algorithms were achieved when no SRAM was used.
This was achieved by keeping the encryption algorithm in flash together with the pre-expanded key, which removes the need for a key schedule at run time (this process is explained later).
The main conclusions from the results are:
– Present-80 looks like the best hardware-oriented lightweight algorithm, requiring only 1030 GE while delivering a throughput of 12.4 kbps at 100 kHz. By comparison, Simon 64/96 and Speck 64/96 (which both use a key 16 bits longer) achieve an even higher throughput with hardware implementations of only 838 and 984 GE, respectively.
– AES is one of the best block ciphers for 8-bit microcontrollers, making it the obvious choice for many lightweight software implementations. Simon and Speck 128/128 can replace AES in extremely constrained applications and in hardware implementations, where the 2400 GE of AES is too much: Simon and Speck 128/128 can be implemented in half that size. The most significant issue with AES-128 in lightweight applications, however, is that such applications do not usually need a 128-bit block. This is where the versatility of Simon and Speck comes in, providing a range of block sizes with minimal loss in security strength.
2. Lightweight Cryptographic Mechanisms
Most of the embedded devices used in the large variety of existing applications share limitations in terms of memory, storage and power. The cryptographic algorithms needed by these devices to deliver a good level of security have an impact on the system, for example:
– On size (memory takes up a large part of the device surface)
– On cost (linked to size and complexity)
– On speed (optimized code produces results faster)
– On power consumption (the quicker the CPU returns to the idle state after executing instructions, the less power is used)
Legacy cryptographic implementations focused on providing the best security without taking into consideration the device complexity and power usage needed to achieve it. Lightweight cryptography studies devices that are severely constrained in power, computational capability etc., yet still need to provide a good level of security.
The solutions proposed by researchers in this field are mostly hardware implementations, which are more suitable for these devices, but software and hybrid implementations exist as well.
Hardware implementation proposals focus on reducing the number of logic gates required to implement the cipher by eliminating redundant components, helped by an algorithm design that facilitates this.
The gate equivalent (GE) count should be as small as possible while still providing a good level of security. A small gate count makes the circuit cheap and ensures it consumes little power (as stated earlier, 2000 GE is the accepted threshold for constrained devices, but a 1000 GE threshold is considered for even smaller devices, such as 4-bit microcontrollers).
Energy and power consumption are two other important metrics taken into consideration when designing an algorithm for highly constrained devices.
The metrics used when designing with software implementation in mind are memory size (flash and RAM) and processing requirements (CPU complexity).
The advantage of software implementations over hardware ones is portability. To get the best of both worlds, hybrid solutions have been developed, in which the hardware takes care of the basic cipher functionality while the software handles data and communication.
Cryptographic co-processors fall into this category, with the drawback that the throughput is affected by the communication delay between the software and hardware components. These solutions are used in specific applications, such as RFID tags or portable devices.
2.1 Lightweight Block Cipher Design Consideration
Lightweight, used in a cryptographic context, refers to an algorithm that is designed to be used on constrained platforms. An algorithm that behaves well on a microcontroller, for example, will not necessarily perform as well when implemented in hardware. The main idea behind the design considerations used for these lightweight algorithms is to be as platform independent as possible, so no attempts were made to optimize for a specific use case.
Design ideas that gave the best performance on ASICs and on 8-bit microcontrollers were preferred, on the expectation that good performance there would carry over to other platforms as well: FPGAs, 4/16-bit microcontrollers, 32-bit processors etc.
So, two main goals emerged when designing a lightweight algorithm: a small hardware implementation, and software implementations on low-power microcontrollers with the least possible flash and SRAM usage.
Because most lightweight applications do not need high throughput, low-area hardware designs can use low-complexity round functions, with the drawback that many rounds are then required.
When designing Simon and Speck, based on values from the literature, the minimum throughput requirement for low-frequency hardware implementations was set to 12 kilobits per second (kbps) at 100 kHz.
Another thing to keep in mind when designing a lightweight algorithm is that it should be efficiently implementable on a variety of platforms and should also support a variety of implementations on the same platform. For example, in hardware, the smallest implementations in terms of GE should be achievable for ultra-constrained devices, but the design should also scale easily to larger implementations that provide higher throughput when hardware resources are not so scarce.
In software, on the other hand, implementations with small flash and SRAM usage should be achievable, as should high-throughput, low-energy designs.
Returning to hardware, another consideration is the level at which the algorithm can be serialized. An implementation that updates one bit per cycle is fully serialized, while one that updates a whole block per cycle is unserialized. Bit-serial algorithms translate into very small hardware implementations, but not necessarily fast ones, while increasing the throughput of such algorithms can lead to unnecessary use of chip area.
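The trade-off between serialization level and throughput can be illustrated with a rough calculation. The sketch below is a simplified model, not a figure from Table 1.1: it assumes that one round costs block_bits / bits_per_cycle clock cycles and ignores the cycles spent loading and unloading data. The function name throughput_kbps and the example cipher (a 64-bit block with 44 rounds, the round count the Simon pseudocode later lists for n = 32, m = 4) are chosen only for illustration.

```python
def throughput_kbps(block_bits, rounds, bits_per_cycle, clock_hz=100_000):
    """Rough throughput of an iterated cipher datapath (simplified model).

    Assumes each round needs ceil(block_bits / bits_per_cycle) cycles and
    that loading/unloading overhead is zero.
    """
    cycles_per_round = -(-block_bits // bits_per_cycle)  # ceiling division
    cycles_per_block = rounds * cycles_per_round
    return block_bits * clock_hz / cycles_per_block / 1000.0


if __name__ == "__main__":
    # Hypothetical 64-bit block, 44 rounds, 100 kHz clock.
    for width in (1, 8, 64):
        print(f"{width:2d} bit(s)/cycle: {throughput_kbps(64, 44, width):.2f} kbps")
```

Under these assumptions the fully bit-serial datapath stays below the 12 kbps target mentioned above, while the 8-bit and 64-bit datapaths exceed it at the price of a wider, and therefore larger, circuit.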
Most symmetric block ciphers are based on S-boxes, which makes it impossible to serialize them at a level below the S-box width. This can be seen clearly in Table 1.1: AES, being built around 8-bit S-boxes, delivers a throughput much higher than 12 kbps.
Lightweight algorithms should be more flexible than standard ones: because they need to support a wide range of applications and devices, the same variety should be available in block and key sizes. While block sizes of 64 and 128 bits are most common in high-power applications, block sizes of 48 or 96 bits are frequently found in low-power applications.
The key size should match the required level of security: a low-cost, low-power device could make do with a 64-bit key, while other applications need a longer key (128 or 256 bits).
Simon and Speck were designed to provide this flexibility: each is a family of block ciphers supporting block sizes of 32, 48, 64, 96 and 128 bits, with up to three key sizes for each block size. Table 2.1 below lists the block/key sizes currently supported by the two algorithms.
Table 2.1 Simon and Speck parameters [1]
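The supported block/key size pairs can also be enumerated programmatically. The sketch below derives them from the word sizes n and the allowed numbers of key words m listed in the Simon and Speck pseudocode later in this document (block size = 2n, key size = mn). The names KEY_WORDS and variants are illustrative, not part of any published implementation.

```python
# Allowed key-word counts m for each word size n, as listed in the pseudocode
# for Simon and Speck (block size is 2n bits, key size is m*n bits).
KEY_WORDS = {16: (4,), 24: (3, 4), 32: (3, 4), 48: (2, 3), 64: (2, 3, 4)}


def variants():
    """Return every supported (block_bits, key_bits) pair."""
    return [(2 * n, m * n) for n, ms in KEY_WORDS.items() for m in ms]


if __name__ == "__main__":
    for block_bits, key_bits in variants():
        print(f"{block_bits}/{key_bits}")
```

This yields ten variants per family, from 32/64 up to 128/256, with one, two or three key sizes depending on the block size.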
The last idea to emphasize before ending this chapter is that, although Simon and Speck have been subjected to some cryptanalysis, it does not yet match the level of scrutiny applied to AES or DES. When choosing the security parameters for an application, a thorough analysis is therefore needed to pick the right block/key size pair.
The designers are confident that the cryptographic community will turn its attention to the analysis of Simon and Speck as well; the simple round functions around which the algorithms are designed should help with that.
Simon and Speck were designed to resist attacks that encrypt and decrypt large amounts of data, as well as attacks based on flipping key bits. No effort was made to resist attacks in the open-key model, however, and the algorithms have not been validated for use as hashes.
3. The SIMON family of Block Ciphers
The Simon block cipher with an n-bit word (2n-bit block) is referred to as Simon2n, where n must be 16, 24, 32, 48, or 64. Simon2n with an m-word (mn-bit) key is referred to as Simon2n/mn.
Simon64/128, for example, refers to the version of Simon that operates on 64-bit plaintext blocks and uses a 128-bit key. Simon and Speck are built around the Feistel rule of motion. Simon is designed for the best performance in hardware, being easy to serialize; software implementations of Simon require extra care to avoid performance issues.
3.1 Round Functions
Encryption and decryption with Simon2n use the following operations on n-bit words:
• bitwise XOR, ⊕,
• bitwise AND, &, and
• left circular shift, S^j, by j bits.
For k ∈ GF(2)^n, the key-dependent Simon2n round function is the two-stage Feistel map
R_k : GF(2)^n × GF(2)^n → GF(2)^n × GF(2)^n defined by:
R_k(x, y) = (y ⊕ f(x) ⊕ k, x),
where f(x) = (S^1 x & S^8 x) ⊕ S^2 x and k is the round key. The inverse of the round function, used for decryption, is:
R_k^{-1}(x, y) = (y, x ⊕ f(y) ⊕ k),
where x and y are the two n-bit words of the current plaintext/ciphertext block.
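The round function and its inverse translate almost directly into code. The following is a minimal sketch, assuming words are held in Python integers and the word size n is passed explicitly; the helper names rol, f, simon_round and simon_round_inv are illustrative.

```python
def rol(x, r, n):
    """Left circular shift S^r of an n-bit word x (0 <= r < n)."""
    return ((x << r) | (x >> (n - r))) & ((1 << n) - 1)


def f(x, n):
    """Simon's non-linear function f(x) = (S^1 x & S^8 x) XOR S^2 x."""
    return (rol(x, 1, n) & rol(x, 8, n)) ^ rol(x, 2, n)


def simon_round(x, y, k, n):
    """R_k(x, y) = (y XOR f(x) XOR k, x)."""
    return y ^ f(x, n) ^ k, x


def simon_round_inv(x, y, k, n):
    """Inverse round: (x, y) -> (y, x XOR f(y) XOR k)."""
    return y, x ^ f(y, n) ^ k


if __name__ == "__main__":
    # Arbitrary 16-bit example values; one round forward, then back again.
    x, y, k = 0x1234, 0xABCD, 0x0F0F
    assert simon_round_inv(*simon_round(x, y, k, 16), k, 16) == (x, y)
```

Applying simon_round_inv to the output of simon_round with the same round key returns the original pair, because the two f terms and the two key terms cancel under XOR.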
The Simon key schedule takes a key and generates from it a sequence of T key words k_0, ..., k_{T-1}, where T is the number of rounds. The encryption map is then the composition R_{k_{T-1}} ◦ ··· ◦ R_{k_1} ◦ R_{k_0}, read from right to left.
Figure 3.1.1 shows the effect of the round function R_{k_i} on the two words of the subcipher (x_{i+1}, x_i) at the i-th step of this process.
Figure 3.1.1 Feistel stepping of the Simon round function [1]
From the figure one can notice that Simon does not include any whitening of the plaintext or ciphertext. There is a good reason for that: whitening could significantly increase circuit size, so it was left out. As a consequence, the first and last rounds serve only to bring in the first and last round keys and do nothing else cryptographically.
3.2 Key Schedules
All Simon rounds are identical apart from the round keys, and the operations are symmetric with respect to the circular shift map on n-bit words. To eliminate slide properties and circular shift symmetries, the key schedule uses a sequence of 1-bit round constants.
Five sequences z_0, ..., z_4 are defined to provide cryptographic separation between Simon versions having the same block size.
The sequences are defined in terms of the following period-31 sequences:
u = u_0 u_1 u_2 ... = 1111101000100101011000011100110 ...,
v = v_0 v_1 v_2 ... = 1000111011111001001100001011010 ...,
w = w_0 w_1 w_2 ... = 1000010010110011111000110111010 ... .
The first two are z_0 = u and z_1 = v. The last three, z_2, z_3 and z_4, have period 62 and are obtained by a bitwise XOR of the period-2 sequence t = t_0 t_1 t_2 ... = 01010101 ... with u, v and w, respectively:
z_2 = (z_2)_0 (z_2)_1 (z_2)_2 ... = 10101111011100000011010010011000101000010001111110010110110011 ...,
z_3 = (z_3)_0 (z_3)_1 (z_3)_2 ... = 11011011101011000110010111100000010010001010011100110100001111 ...,
z_4 = (z_4)_0 (z_4)_1 (z_4)_2 ... = 11010001111001101011011000100000010111000011001010010011101111 ...,
where (z_i)_j is the j-th bit of z_i.
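The relation between the five constant sequences can be checked with a few lines of code. The sketch below takes the period-31 sequences u, v and w as given above and XORs two periods of each with t = 0101... to obtain one 62-bit period of z_2, z_3 and z_4. The names U, V, W, xor_with_t and Z are illustrative.

```python
# Period-31 sequences u, v, w as bit strings (leftmost character is bit 0).
U = "1111101000100101011000011100110"
V = "1000111011111001001100001011010"
W = "1000010010110011111000110111010"


def xor_with_t(seq31):
    """XOR two periods of a period-31 sequence with t = 0101...

    Bit i of t is i mod 2, so the result is one full period of 62 bits.
    """
    doubled = seq31 * 2
    return "".join(str(int(bit) ^ (i & 1)) for i, bit in enumerate(doubled))


# z_0 and z_1 are u and v themselves (written out over 62 bits),
# z_2, z_3 and z_4 are u, v and w XORed with t.
Z = [U * 2, V * 2, xor_with_t(U), xor_with_t(V), xor_with_t(W)]

if __name__ == "__main__":
    for j, z in enumerate(Z):
        print(f"z_{j} = {z}")
```

Because t has period 2 and u, v and w have the odd period 31, the second half of each of z_2, z_3 and z_4 is the bitwise complement of its first half.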
To obtain u, v and w we define the following 5×5 matrices U, V, and W over GF(2):
The i-th element of each sequence is obtained by initializing a 5-bit LFSR (linear feedback shift register) to 00001, stepping it i times with the corresponding matrix and extracting the right-hand bit. For u, for example:
(u)_i = (0, 0, 0, 0, 1) U^i (0, 0, 0, 0, 1)^t
If we define c = 2^n − 4 = 0xff···fc, then for Simon2n with m key words (k_{m-1}, ..., k_1, k_0) and constant sequence z_j, the round keys are generated as follows, for 0 ≤ i < T − m (I denotes the identity map on n-bit words):
k_{i+2} = c ⊕ (z_j)_i ⊕ k_i ⊕ (I ⊕ S^{-1}) S^{-3} k_{i+1}, if m = 2,
k_{i+3} = c ⊕ (z_j)_i ⊕ k_i ⊕ (I ⊕ S^{-1}) S^{-3} k_{i+2}, if m = 3,
k_{i+4} = c ⊕ (z_j)_i ⊕ k_i ⊕ (I ⊕ S^{-1}) (S^{-3} k_{i+3} ⊕ k_{i+1}), if m = 4.
Figure 3.2.1 shows the key schedules. The key words are used as the first m round keys, being loaded into the shift register (SR) with k_0 on the right and k_{m-1} on the left.
Figure 3.2.1 Key expansion for Simon two, three and four word [1]
Table 3.2.1 shows the version-dependent choice of the constant sequence z_j.
Table 3.2.1 Simon parameters [1]
The simplicity of the Simon design can be seen in the pseudocode below:
n = word size (16, 24, 32, 48, or 64)
m = number of key words (must be 4 if n = 16,
                         3 or 4 if n = 24 or 32,
                         2 or 3 if n = 48,
                         2, 3, or 4 if n = 64)
z = [11111010001001010110000111001101111101000100101011000011100110, 10001110111110010011000010110101000111011111001001100001011010, 10101111011100000011010010011000101000010001111110010110110011, 11011011101011000110010111100000010010001010011100110100001111, 11010001111001101011011000100000010111000011001010010011101111]
(T, j) = (32,0) if n = 16
       = (36,0) or (36,1) if n = 24, m = 3 or 4
       = (42,2) or (44,3) if n = 32, m = 3 or 4
       = (52,2) or (54,3) if n = 48, m = 2 or 3
       = (68,2), (69,3), or (72,4) if n = 64, m = 2, 3, or 4
x, y = plaintext words
k[m-1]..k[0] = key words
Key expansion:
for i = m..T-1
    tmp ← S^{-3} k[i-1]
    if (m = 4) tmp ← tmp ⊕ k[i-3]
    tmp ← tmp ⊕ S^{-1} tmp
    k[i] ← ~k[i-m] ⊕ tmp ⊕ z[j][(i-m) mod 62] ⊕ 3
end for
Encryption:
for i = 0..T-1
    tmp ← x
    x ← y ⊕ (S^1 x & S^8 x) ⊕ S^2 x ⊕ k[i]
    y ← tmp
end for
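The pseudocode above maps almost line for line onto Python. The following is a minimal, unoptimized sketch for experimentation only, not a reference or side-channel-hardened implementation. The function names (expand_key, encrypt, decrypt), the PARAMS table layout and the convention of passing key words as [k0, k1, ..., k(m-1)] are choices made for this example. The decrypt function illustrates the earlier remark that Simon decryption is encryption up to swaps of words and reordering of round keys.

```python
# One 62-bit period of each constant sequence z[0..4], copied from the pseudocode.
Z = [
    "11111010001001010110000111001101111101000100101011000011100110",
    "10001110111110010011000010110101000111011111001001100001011010",
    "10101111011100000011010010011000101000010001111110010110110011",
    "11011011101011000110010111100000010010001010011100110100001111",
    "11010001111001101011011000100000010111000011001010010011101111",
]

# (word size n, key words m) -> (rounds T, constant sequence index j)
PARAMS = {
    (16, 4): (32, 0),
    (24, 3): (36, 0), (24, 4): (36, 1),
    (32, 3): (42, 2), (32, 4): (44, 3),
    (48, 2): (52, 2), (48, 3): (54, 3),
    (64, 2): (68, 2), (64, 3): (69, 3), (64, 4): (72, 4),
}


def rol(x, r, n):
    """Left circular shift of an n-bit word by r bits (0 < r < n)."""
    return ((x << r) | (x >> (n - r))) & ((1 << n) - 1)


def ror(x, r, n):
    """Right circular shift of an n-bit word by r bits (0 < r < n)."""
    return rol(x, n - r, n)


def expand_key(key_words, n):
    """Expand [k0, k1, ..., k(m-1)] into the T round keys."""
    m = len(key_words)
    T, j = PARAMS[(n, m)]
    mask = (1 << n) - 1
    k = list(key_words)
    for i in range(m, T):
        tmp = ror(k[i - 1], 3, n)
        if m == 4:
            tmp ^= k[i - 3]
        tmp ^= ror(tmp, 1, n)
        z_bit = int(Z[j][(i - m) % 62])
        # ~k XOR 3 on n bits is the same as k XOR c with c = 2^n - 4.
        k.append((~k[i - m] & mask) ^ tmp ^ z_bit ^ 3)
    return k


def encrypt(x, y, round_keys, n):
    """Apply one Simon round per round key to the word pair (x, y)."""
    for rk in round_keys:
        x, y = y ^ (rol(x, 1, n) & rol(x, 8, n)) ^ rol(x, 2, n) ^ rk, x
    return x, y


def decrypt(x, y, round_keys, n):
    """Decrypt by reusing encrypt: swap words, reverse round keys, swap back."""
    a, b = encrypt(y, x, round_keys[::-1], n)
    return b, a


if __name__ == "__main__":
    # Simon32/64 with an arbitrary key and plaintext, followed by a round trip.
    rks = expand_key([0x0100, 0x0908, 0x1110, 0x1918], 16)
    ct = encrypt(0x6565, 0x6877, rks, 16)
    print([hex(w) for w in ct], decrypt(*ct, rks, 16) == (0x6565, 0x6877))
```

Pre-computing the round keys once with expand_key and storing them corresponds to the pre-expanded-key approach described earlier for the flash-only implementations.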
4. The SPECK family of Block Ciphers
Speck was designed to provide good performance in both hardware and software, but it gives its best results when implemented on microcontrollers.
4.1 Round Functions
Speck2n is built around the following basic operations on n-bit words:
• bitwise XOR, ⊕,
• addition modulo 2^n, +, and
• left and right circular shifts, S^j and S^{-j}, respectively, by j bits.
For k ∈ GF(2)^n, the key-dependent Speck2n round function is the map
R_k : GF(2)^n × GF(2)^n → GF(2)^n × GF(2)^n defined by:
R_k(x, y) = ((S^{-α} x + y) ⊕ k, S^β y ⊕ (S^{-α} x + y) ⊕ k),
with rotation amounts α = 7 and β = 2 if n = 16 (block size 32), and α = 8 and β = 3 otherwise.
The inverse of the round function, used for decryption, uses modular subtraction instead of modular addition:
R_k^{-1}(x, y) = (S^α((x ⊕ k) − S^{-β}(x ⊕ y)), S^{-β}(x ⊕ y)).
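As with Simon, the Speck round function and its inverse can be written in a few lines. The sketch below assumes n-bit words stored in Python integers, with n, α and β passed explicitly; the names rol, ror, speck_round and speck_round_inv are illustrative.

```python
def rol(x, r, n):
    """Left circular shift S^r of an n-bit word (0 < r < n)."""
    return ((x << r) | (x >> (n - r))) & ((1 << n) - 1)


def ror(x, r, n):
    """Right circular shift S^-r of an n-bit word (0 < r < n)."""
    return rol(x, n - r, n)


def speck_round(x, y, k, n, alpha, beta):
    """R_k(x, y): rotate, add modulo 2^n, XOR the key, then mix into y."""
    mask = (1 << n) - 1
    x = ((ror(x, alpha, n) + y) & mask) ^ k
    y = rol(y, beta, n) ^ x
    return x, y


def speck_round_inv(x, y, k, n, alpha, beta):
    """Inverse round: undo the mix into y, then subtract modulo 2^n."""
    mask = (1 << n) - 1
    y = ror(x ^ y, beta, n)
    x = rol(((x ^ k) - y) & mask, alpha, n)
    return x, y


if __name__ == "__main__":
    # Arbitrary 16-bit example with (alpha, beta) = (7, 2).
    x, y, k = 0x1234, 0xABCD, 0x0F0F
    out = speck_round(x, y, k, 16, 7, 2)
    assert speck_round_inv(*out, k, 16, 7, 2) == (x, y)
```

Masking with 2^n − 1 after the addition and the subtraction is what implements arithmetic modulo 2^n on Python's unbounded integers.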
The Speck parameters are listed in Table 4.1.1 below:
Table 4.1.1 Speck parameters [1]
The T key words k_0, ..., k_{T-1} are generated from an initial key during the key schedule step, where T is the number of rounds.
Figure 4.1.1 shows the effect of the single round function R_{k_i}; the composition R_{k_{T-1}} ◦ ··· ◦ R_{k_1} ◦ R_{k_0}, read from right to left, is the encryption map.
Figure 4.1.1 Speck round function; (x_{2i+1}, x_{2i}) denotes the subcipher after i steps of encryption [1]
4.2 Key Schedules
The Speck key schedule reuses the round function to generate the round keys k_i. Let K be a key for a Speck2n block cipher, written K = (ℓ_{m-2}, ..., ℓ_0, k_0), where ℓ_i, k_0 ∈ GF(2)^n, for a value of m in {2, 3, 4}. The sequences k_i and ℓ_i are defined by:
ℓ_{i+m−1} = (k_i + S^{-α} ℓ_i) ⊕ i and
k_{i+1} = S^β k_i ⊕ ℓ_{i+m−1}.
The value k_i is the i-th round key, for 0 ≤ i < T, as seen in Figure 4.2.1.
Figure 4.2.1 Speck key expansion. R_i is the round function with i acting as the round key [1]
Just as with Simon, the design simplicity of Speck is best seen in the pseudocode:
n = word size (16, 24, 32, 48, or 64)
m = number of key words (must be 4 if n = 16,
                         3 or 4 if n = 24 or 32,
                         2 or 3 if n = 48,
                         2, 3, or 4 if n = 64)
T = number of rounds = 22 if n = 16
                     = 22 or 23 if n = 24, m = 3 or 4
                     = 26 or 27 if n = 32, m = 3 or 4
                     = 28 or 29 if n = 48, m = 2 or 3
                     = 32, 33, or 34 if n = 64, m = 2, 3, or 4
(α, β) = (7,2) if n = 16
         (8,3) otherwise
x, y = plaintext words
ℓ[m-2]..ℓ[0], k[0] = key words
Key expansion:
for i = 0..T-2
    ℓ[i+m-1] ← (k[i] + S^{-α} ℓ[i]) ⊕ i
    k[i+1] ← S^β k[i] ⊕ ℓ[i+m-1]
end for
Encryption:
for i = 0..T-1
    x ← (S^{-α} x + y) ⊕ k[i]
    y ← S^β y ⊕ x
end for
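The Speck pseudocode can be turned into runnable Python in the same way. This is a minimal sketch for experimentation only, not a reference or constant-time implementation. The function names, the ROUNDS table layout and the convention of passing the key as [k0, ℓ0, ℓ1, ..., ℓ(m-2)] are choices made for this example.

```python
# (word size n, key words m) -> number of rounds T, from the pseudocode above.
ROUNDS = {
    (16, 4): 22,
    (24, 3): 22, (24, 4): 23,
    (32, 3): 26, (32, 4): 27,
    (48, 2): 28, (48, 3): 29,
    (64, 2): 32, (64, 3): 33, (64, 4): 34,
}


def rol(x, r, n):
    """Left circular shift of an n-bit word by r bits (0 < r < n)."""
    return ((x << r) | (x >> (n - r))) & ((1 << n) - 1)


def ror(x, r, n):
    """Right circular shift of an n-bit word by r bits (0 < r < n)."""
    return rol(x, n - r, n)


def rotations(n):
    """Rotation amounts (alpha, beta) for word size n."""
    return (7, 2) if n == 16 else (8, 3)


def expand_key(key_words, n):
    """Expand [k0, l0, l1, ..., l(m-2)] into the T round keys."""
    m = len(key_words)
    T = ROUNDS[(n, m)]
    alpha, beta = rotations(n)
    mask = (1 << n) - 1
    k = [key_words[0]]
    l = list(key_words[1:])
    for i in range(T - 1):
        # Same shape as the round function, with the counter i as "round key".
        l.append(((k[i] + ror(l[i], alpha, n)) & mask) ^ i)
        k.append(rol(k[i], beta, n) ^ l[i + m - 1])
    return k


def encrypt(x, y, round_keys, n):
    """Apply one Speck round per round key to the word pair (x, y)."""
    alpha, beta = rotations(n)
    mask = (1 << n) - 1
    for rk in round_keys:
        x = ((ror(x, alpha, n) + y) & mask) ^ rk
        y = rol(y, beta, n) ^ x
    return x, y


def decrypt(x, y, round_keys, n):
    """Apply the inverse round for each round key, in reverse order."""
    alpha, beta = rotations(n)
    mask = (1 << n) - 1
    for rk in reversed(round_keys):
        y = ror(x ^ y, beta, n)
        x = rol(((x ^ rk) - y) & mask, alpha, n)
    return x, y


if __name__ == "__main__":
    # Speck32/64 with an arbitrary key and plaintext, followed by a round trip.
    rks = expand_key([0x0100, 0x0908, 0x1110, 0x1918], 16)
    ct = encrypt(0x6574, 0x694c, rks, 16)
    print([hex(w) for w in ct], decrypt(*ct, rks, 16) == (0x6574, 0x694c))
```

Unlike Simon, Speck needs a separate decryption routine here, since the inverse round uses modular subtraction instead of addition; this is consistent with the extra code size noted earlier for Speck decryption.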
5. IOT and Smart Homes Security
IoT, or the Internet of Things, refers to objects, or “things”, that are interconnected through digital networks.
Most of the electronics in one’s home are already connected to the network, and the list of devices that could be part of such a network is only growing (from your TV to the fridge, the washing machine or even the HVAC system).
The idea of a Smart Home is not new, and Home Automation solutions were deployed to some extent in the past as well, but at greater cost and with less flexibility.
In 1975, Pico Electronics of Glenrothes, Scotland developed X10, allowing compatible products to talk to each other over the already existing electrical wires of a home.
Communicating over electrical lines is not always reliable because the lines get “noisy” from powering other devices.
An X10 device could interpret electronic interference as a command and react, or it might not receive the command at all. Only recently, with the latest developments in the semiconductor industry producing smaller, cheaper and less power-hungry chips, have Smart Home solutions taken off, becoming more accessible to the general public and more flexible, and communicating mostly over radio.
Using a wireless network for communication provides more flexibility in placing devices, but introduces some drawbacks.
One of the most important drawbacks is the need for security: the information travels through an open medium, the radio spectrum, and it is easy to imagine what would happen if an attacker could inject false data that appears to come from the sensor network.
While security at the central node (gateway node) is not an issue, since these devices are more complex and powerful by design, the real challenge is on the end-node side: to keep these devices cheap and low-power (such nodes should run on a small coin battery for years), their complexity must be kept small.
Implementing a well-known, strong symmetric encryption algorithm (like AES or DES/3DES) on these devices is not an option, mostly because the software and especially the hardware implementations of these algorithms are very complex, so lightweight cryptographic algorithms are more suitable for such use cases.
For example, Simon 64/128 (block size/key size) has a hardware footprint of 1000 GE (gate equivalents) at a throughput of 16.7 kbps, while its software implementation needs 282 bytes of code (flash) and reaches a throughput of 515 kbps.
5.1 Secure Wireless Sensor Networks
A wireless sensor network is an autonomous, distributed set of sensors that monitor physical and environmental conditions (temperature, pressure, humidity etc.) and transmit that data to a central node, which can act as a gateway node.
Every sensor node is made up of a few main parts: a wireless transceiver, a microcontroller, a circuit used to interface with the sensor and a power source (usually a small coin battery).
The Gateway Node is usually more complex, and acts as a bridge between the sensor network and other IP networks. An example can be seen in Figure 5.1.1.
Figure 5.1.1 Wireless Sensor network [#TODO-> add link to bio]