The Intel Software Defined Silicon - A brief overview

Producing a general-purpose CISC processor for a myriad of potential cloud customers is neither easy nor cheap for Intel. The x86 instruction set is vast, consisting of thousands of instructions, and it continues to grow. At the same time, the range of products offered in the cloud provider market has been diversifying significantly and has enormous potential for expansion in the coming years. That makes life a real nightmare for Intel engineers. Whether for commercial, legal or security reasons, it is not possible for Intel to predict every scenario in which some instructions may need to be disabled in the future.

In some cases, when such circumstances can be foreseen in time, it is possible to disable certain instructions using fuses or by providing a model-specific register that the customer can use to disable them. However, there are many situations in which it is desirable to disable individual instructions, or even entire families of instructions, that no CPU designer could have predicted. Furthermore, it would not be viable to add, for example, a custom configuration option for each instruction in the current x86 set. That would be too complicated, requiring countless considerations and verifications against legacy cases, perhaps even requiring the creation of a new platform from scratch.

There are also situations in which the disabling of certain instructions takes place at the request of a customer. From an economic point of view, it is interesting for cloud providers, for example, to provision a number of nearly identical machines with processors of the same level of capability and value, ensuring that only those who need the more expensive features are paying for them. Thus, the cloud service provider might want to hide certain instructions so that it can update machines in its cloud services without giving additional functionality to customers who do not wish to pay for it.

Therefore, seeking a general answer to all these issues, Intel has implemented its software-defined silicon (SDSi) mechanism in the Sapphire Rapids CPUs, which can be programmed to flexibly enable or disable any instruction within the x86 ISA while maintaining backward compatibility with legacy uses.

The full ISA flag registers

Block diagram of a processor containing the full ISA flag registers.

The basic idea behind the Intel SDSi mechanism is to use a bitmap array covering some or all of the instructions supported in the x86 ISA. This bitmap is provided by the full ISA flag registers, which are non-volatile registers within the processor but could eventually be provided in cache, in firmware, or in some other memory structure. In some cases, such registers could be used to divide the instruction set into regions, so that each region can be loaded into the register and checked in real time, allowing the mechanism to be applied in smaller and less expensive processors. In that specific case of budget processors, there may be a noticeable performance loss when swapping groups of instructions in and out of the cache. The reader should therefore be aware that this mechanism has the potential to be applied beyond its current objective of meeting the needs of cloud providers using the expensive Xeon Scalable family processors, acting as a general-purpose alternative for handling legacy instructions across the entire x86 platform.

The bitmap array often has more entries than are needed for the known instructions, providing flexibility to support instructions added to the platform later. When the system is first started, all instruction entries are enabled.
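As a rough illustration of the bitmap described above, the sketch below models it in Python. The class name `IsaFlagBitmap`, the default capacity of 4096 entries, the instruction indices, and the convention that a set bit means "disabled" are all assumptions made for the example; the real register layout is not described here.

```python
class IsaFlagBitmap:
    """Toy model of a full ISA flag bitmap: one bit per instruction slot.

    Hypothetical convention: bit i == 1 means instruction slot i is disabled.
    """

    def __init__(self, capacity=4096):
        # The capacity is assumed to exceed the number of known instructions,
        # leaving spare slots for instructions added to the platform later.
        self.capacity = capacity
        # All bits start at zero, so every instruction is enabled at startup.
        self.bits = bytearray((capacity + 7) // 8)

    def _check(self, index):
        if not 0 <= index < self.capacity:
            raise IndexError(f"instruction index {index} out of range")

    def disable(self, index):
        self._check(index)
        self.bits[index // 8] |= 1 << (index % 8)

    def enable(self, index):
        self._check(index)
        self.bits[index // 8] &= ~(1 << (index % 8)) & 0xFF

    def is_enabled(self, index):
        self._check(index)
        return not (self.bits[index // 8] >> (index % 8)) & 1
```

Storing the flags in a packed byte array keeps the structure small even when it is sized well beyond the current instruction count, which is the property the spare entries rely on.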

If privileged software, such as a hypervisor, wants to disable a given instruction, it writes a disable bit to the entry in the bitmap table for that instruction. At runtime, the microcode flow checks whether the instruction is enabled or disabled before executing it. If the instruction is disabled, then depending on the system configuration, it may be treated as a no-op (NOP), or alternatively an exception such as #UD is raised.

Flowchart illustrating the method to disable flag checking.

The microcode will also perform legacy checks if no actions are triggered by the bitmap entry. If both the NOP and the #UD flag are unset, microcode will check legacy mechanisms such as legacy registers, fuses, microcode, firmware, or other mechanisms that may be used to disable specific instructions. If no flags are set, and no legacy disabling is set, then the microcode simply performs the intended instruction. Thus, the full ISA flag register provides an additional verification layer, meaning that any legacy mechanisms for disabling instructions do not suffer any interference.
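The order of checks described above can be sketched as a small decision function. This is only an illustration of the flow as the text presents it, not Intel's microcode: the names `FlagEntry`, `Outcome` and `dispatch` are hypothetical, the legacy mechanisms are collapsed into a single boolean, and the case where both flags are set at once is not handled here.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    EXECUTE = "execute"
    NOP = "nop"
    UNDEFINED = "#UD"
    LEGACY_BLOCKED = "blocked by a legacy mechanism"


@dataclass
class FlagEntry:
    nop: bool = False  # treat the instruction as a no-op
    ud: bool = False   # treat the instruction as non-existent (#UD)


def dispatch(entry, legacy_disabled):
    """Decide what happens to one instruction, checking the bitmap first."""
    if entry.nop:
        return Outcome.NOP
    if entry.ud:
        return Outcome.UNDEFINED
    # Neither flag is set: fall through to the legacy checks
    # (fuses, legacy registers, firmware, ...), modeled here as one boolean.
    if legacy_disabled:
        return Outcome.LEGACY_BLOCKED
    return Outcome.EXECUTE
```

Because the legacy check runs only when no bitmap flag fires, the existing disabling mechanisms keep working exactly as before, which is the additional-layer property the paragraph describes.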

A new flexible locking mechanism

Two bitmap entries are provided for each instruction: one indicating that the instruction should be treated as a NOP, and another indicating that the instruction should be treated as non-existent, so that any attempt to execute it raises the #UD exception. It is therefore interesting to analyze the possible effects when these flags are set.

Block diagram illustrating the flow for privileged software to set bitmaps for enabling or disabling instructions.

For example, if software tries to use an instruction that is not available, its effect can be emulated in software, or other instructions can be used instead. Likewise, when the NOP flag is set for an instruction such as a prefetch, the instruction is simply ignored, affecting only the performance of the software, not its output. In the case of a cloud customer who has paid for more capacity, it may be more useful to throw a #UD exception if there is an attempt to access an instruction that is disabled for that specific customer.

But what happens if both bits are set simultaneously? This is where the locking mechanism comes in. Since an instruction should not simultaneously be a NOP and nonexistent, the processor can treat the case of both bits being set as a flag that the setting is locked.

The action that will be taken can then be based on the first bit that was set. If the operating system first sets the NOP bit and then sets the #UD bit, that means the instruction is NOP locked. When the instruction is locked, it can only be unlocked by manually clearing the two bits in the correct order. An attempt to clear the NOP bit without first clearing the #UD bit will result in an exception.
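The locking rule can be modeled as a small state machine. The sketch below is an assumption-laden illustration: `LockableEntry` and `LockedFlagError` are hypothetical names, and remembering which bit was set first in a separate field is a modeling convenience, not something the text specifies.

```python
class LockedFlagError(Exception):
    """Raised when the flags of a locked entry are cleared in the wrong order."""


class LockableEntry:
    FLAGS = ("nop", "ud")

    def __init__(self):
        self.nop = False
        self.ud = False
        self.first = None  # which flag was set first: "nop", "ud", or None

    @property
    def locked(self):
        # Both bits set is the (otherwise meaningless) state used as the lock.
        return self.nop and self.ud

    @property
    def action(self):
        # The first flag that was set decides the behavior; None means enabled.
        return self.first

    def set_flag(self, name):
        if name not in self.FLAGS:
            raise ValueError(name)
        if self.first is None:
            self.first = name
        setattr(self, name, True)

    def clear_flag(self, name):
        if name not in self.FLAGS:
            raise ValueError(name)
        # While locked, the bit that was set second must be cleared first.
        if self.locked and name == self.first:
            raise LockedFlagError(f"clear the other bit before clearing {name!r}")
        setattr(self, name, False)
        if not self.nop and not self.ud:
            self.first = None
```

For instance, after `set_flag("nop")` followed by `set_flag("ud")`, the entry is NOP-locked: `clear_flag("nop")` raises `LockedFlagError`, while `clear_flag("ud")` followed by `clear_flag("nop")` unlocks and re-enables the instruction.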

An illustration of the various locking mechanisms that can be provided. As you can see, a separate blocking field is provided per instruction, although the methods illustrated here can be extended to a version in which the second set of flags is treated as a blocking flag.
An illustration of the data structure in which a global lock flag is provided for the entire data structure. This can be useful when it is desirable to lock or unlock instruction flags globally.
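The global-lock variant mentioned in the caption can be sketched in a few lines. The class `GloballyLockableTable`, its field names, and the choice of raising `PermissionError` on a rejected write are assumptions for illustration only.

```python
class GloballyLockableTable:
    """Toy flag table guarded by a single lock flag for the whole structure."""

    def __init__(self, size):
        self.disabled = [False] * size  # one disable flag per instruction
        self.global_lock = False        # hypothetical global lock flag

    def set_disabled(self, index, value):
        # While the global lock is set, no instruction flag may be changed.
        if self.global_lock:
            raise PermissionError("flag table is globally locked")
        self.disabled[index] = bool(value)
```

A single flag of this kind trades granularity for simplicity: one write freezes or releases every instruction flag at once, instead of locking entries one by one.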

The dark past and a bright future

It is impossible not to recall the failure of the old Intel Upgrade Service, and the very fair criticism it received, when superficially analyzing the newly proposed Intel On-demand and the introduction of SDSi in Sapphire Rapids CPUs. The tech community's lack of confidence and its backlash are understandable, given Intel's shady money-grabbing attempts in the not-so-distant past.

However, the patent analysis once again allows us to go beyond the trivial discussion. When analyzing the fundamentals behind the proposed mechanism, it is absolutely clear that Intel's proposal significantly improves the cloud market by bringing a new and flexible way to acquire the resources that cloud providers really need. The new Intel SDSi allows the development of a new kind of locking mechanism that provides greater flexibility and control over the ISA, helping cloud providers deliver better services and more diversification, as well as bringing real benefits to security, debugging and tracing.

While Intel cannot remove the stains of past mistakes, this new mechanism gives it the opportunity to bring more innovation, diversification and security to the cloud market and to write a new and friendlier chapter in its history. But for that, it needs to significantly improve its marketing and public relations, which are still very much buried in the depths of the ancient past.

***

Some references and further reading:

  • US11243766 - Flexible instruction set disabling - Rodrigo Branco - Intel

  • Jang, Yeongjin, et al., Breaking Kernel Address Space Layout Randomization with Intel TSX, 23rd ACM Conference on Computer and Communications Security - 2016

Changelog

  • v1.0 - initial release;

Donations

  • Monero address: 83VkjZk6LEsTxMKGJyAPPaSLVEfHQcVuJQk65MSZ6WpJ5Adqc6zgDuBiHAnw4YdLaBHEX1P9Pn4SQ67bFhhrTykh1oFQwBQ

  • Ethereum address: 0x32ACeF70521C76f21A3A4dA3b396D34a232bC283
