[MUSIC PLAYING] Hello, and welcome to TI Precision Labs on Embedded Flash Memory. In this video, I'll discuss how flash memory is used in an embedded system, key terminology for flash memory, and important electrical parameters of the flash memory. Let's get started. 

This is a very simple example of a processor or microcontroller and its internal connections. While there may be many instances of the blocks you'll see here, all processors and microcontrollers will have at least one instance of the following. First is the CPU, or Central Processing Unit, responsible for the main operation of the device. Next are the peripherals that are controlled by the CPU. These can be anything from communication peripherals, such as SPI or UART, to embedded analog blocks, such as ADCs and DACs, or actuation peripherals, such as PWMs. 

Finally, there must be some type of memory that stores the data for the device to use. The basic role of the memory in an embedded system is to either contain program code that the CPU will execute or to store data variables for both the CPU and peripherals. Memory can be either internal to the device or externally accessed through the device pins. 

We can further classify the memory by how it behaves when the device does not have power. Memory is either volatile, meaning its contents are lost when the device has no power, or non-volatile, meaning its contents are preserved when the device has no power. Some examples of volatile memory are SRAM, DRAM, and caches. Some examples of non-volatile memory are flash, FRAM, and ROM. 

Today I will be going more in depth on flash memory. At this point, you may be asking why one type of memory is used versus another. Let's take a look at some typical trade-offs between volatile memory, in this case, I will use SRAM as the example, and non-volatile memory, in this case, I will use flash memory as the example. 

To reemphasize, the biggest difference is one we just talked about. An SRAM will lose its contents between power cycles while a flash memory will retain those values. Next are some not so obvious differences. Due to its physical construction, SRAM can usually operate at the full speed of the CPU on the device. 

This would mean that its access time matches the clock period of the device. On the other hand, flash memory typically cannot run at the full speed of the CPU on the device. This would mean that its access time is longer than the clock period of the device. These statements are not absolutes, so be sure to check the data sheet for your device to understand the behavior of all memories. 

Again due to the physical construction of the memories, SRAM and flash handle reads and writes very differently. SRAM behaves in a manner that I would say is the assumed normal for memory in that the CPU can read and write to the SRAM through basic indirect instructions. Flash, however, can only be directly read by the CPU. In order to write new data to a flash memory, special operations must be used by the CPU. I will go into more detail later on this aspect. 

Finally, flash memory introduces the concept of endurance or how many times the contents of the memory can be changed over the lifetime of a device. SRAM does not have this restriction, so it is something important to understand for your end application. This is an example of a typical memory allocation on a TI microcontroller, showing the different types of code and where they are stored. 

On a device that has flash memory, typically the main code will be written to flash, in addition to constants and tables that are needed for code execution. Basically, this is anything the device needs to run standalone after a power cycle. Variables that are set during runtime are loaded to RAM, since they are not static from run to run. While this is a typical example, it is worth noting that code can be allocated to RAM by copying it from the flash where it was loaded and then running from the RAM address. 

The next image is source code from a linker command file that the linker for your code generation tools will use to allocate those sections we just talked about. You can see that all memory is defined here, SRAM, flash, but also an external memory zone, as well as ROM that is on the device. The SECTIONS portion of the linker command file will then direct which areas go to which memory. Again, you can see that the main code is allocated to flash memory while more variable data, like the stack, is allocated to SRAM. 
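To connect that allocation to source code, here is a minimal C sketch of the kinds of objects that typically end up in each memory. The names gain_table, sample_count, and apply_gain are made up for illustration, and where each object actually lands depends on your compiler's section names and your linker command file.

```c
#include <stddef.h>
#include <stdint.h>

/* Read-only table: a toolchain would typically place this with the
 * constants in flash, so it is still there after a power cycle. */
static const uint16_t gain_table[4] = { 1u, 2u, 4u, 8u };

/* Runtime variable: lives in SRAM and is reinitialized at every startup. */
static uint32_t sample_count = 0u;

/* Program code: normally stored in flash along with the table above. */
uint32_t apply_gain(uint16_t sample, size_t gain_index)
{
    sample_count++;
    return (uint32_t)sample * gain_table[gain_index % 4u];
}

/* Reports how many samples have been processed since startup. */
uint32_t samples_processed(void)
{
    return sample_count;
}
```

After a power cycle, the table and the code are unchanged, while the sample count starts again from zero.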

Let's go over some basic memory terminology that will be necessary as I explain in more detail how the flash memory can be accessed. The smallest grouping in any memory is referred to as a bit and is simply either a binary value of 0 or 1. Bits are notated with a lowercase b, so if I want to show the value of 10 kilobits, that would be notated with the number 10 followed by an uppercase K and a lowercase b. 

Note that in this terminology, the letter K is capitalized as it refers to two to the power of 10, or 1,024 units, and not simply 1,000. The next representation is the byte, which is a grouping of 8 bits. Bytes are notated by an uppercase B. So 10 kilobytes will be written as the number 10 followed by an uppercase K and uppercase B. This is a very typical size notation since many embedded devices are 8-bit or byte addressable. 

Next is the Word notation. The length of a Word can vary depending on the CPU architecture in your embedded system. A Word may be 16 bits, 32 bits, or more. This is notated by an uppercase W. So 10 kilowords would be written as the number 10 followed by a capital K and capital W. 

Finally is the Long notation. This is simply the next largest integer allocation above a Word. Typically this is a 32-bit number, but can vary depending on your target device. 
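As a quick check of these notations, the conversions can be sketched in C. The function names are only illustrative, and the word size is passed in as a parameter because it depends on the CPU architecture.

```c
#include <stdint.h>

#define KILO 1024u  /* "K" in memory sizes means 2^10, not 1,000 */

/* 10 Kb -> number of bits */
uint32_t kilobits_to_bits(uint32_t kilobits)
{
    return kilobits * KILO;
}

/* 10 KB -> number of bits (8 bits per byte) */
uint32_t kilobytes_to_bits(uint32_t kilobytes)
{
    return kilobytes * KILO * 8u;
}

/* 10 KW -> number of bytes, for a given word size in bits */
uint32_t kilowords_to_bytes(uint32_t kilowords, uint32_t word_bits)
{
    return kilowords * KILO * (word_bits / 8u);
}
```

For example, 10 Kb is 10,240 bits, 10 KB is 81,920 bits, and 10 KW on a device with a 16-bit Word is 20,480 bytes.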

On to some specific terms for flash memory, namely the terminology related to how the flash memory is grouped. A grouping of several flash words, and in this case, we are showing a device with a 16-bit Word, is called a Sector. A sector is notable as the smallest group of flash bits that can be erased at one time. A grouping of many flash sectors is typically called a Bank for a TI embedded device. 

The sectors that are part of the same bank are significant because they cannot be read and written to at the same time. I will explain the concept of a bank in the next slide. 

As I mentioned previously, if we want to change the contents of flash memory, we have to use special commands or operations to do so. When a flash bit is 1, it is considered erased. This can apply to a bit, word, or a sector. Changing the value from a 0 to 1 is called erasing the memory. 

Recall that flash can only be erased at the sector level. When a flash bit is 0, it is considered in the programmed state. Changing a value from a 1 to a 0 is called writing or programming the memory. This operation can occur at the bit level, unlike erase, which again is at the sector level only. As a special case, when all the words in a sector are 0, the sector is considered cleared. 
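One way to picture these rules is a small C model of a single 16-bit flash word. This is only a sketch of the bit behavior described here, not a real flash API, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define ERASED_WORD 0xFFFFu  /* every bit at 1 */

/* Programming can only change bits from 1 to 0, so the word that ends up
 * in flash is the bitwise AND of the current and the requested value. */
uint16_t program_result(uint16_t current, uint16_t requested)
{
    return (uint16_t)(current & requested);
}

/* An erase is needed whenever the requested value has a 1 in a position
 * where the flash currently holds a 0. */
bool needs_erase(uint16_t current, uint16_t requested)
{
    return (uint16_t)(current & requested) != requested;
}
```

Programming 0x8888 into an erased word gives 0x8888. Programming it into a cleared word leaves all zeros, which is why the sector has to be erased first.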

Let's look at a real world example of the operations we would use to change the contents of a specific word of flash. In this case, I want to change the contents of memory address 0x2000 to a value of 0x8888. Since the current contents are all zeros, I must first erase the flash. I would call the flash erase API to do so. 

Keep in mind that in order to do this, I have to erase the entire sector. Once that operation is complete, I can then program to 0 only the bits needed to get a final value of 0x8888. I would call the flash program API on this address accordingly. I likely want to restore the other words that I had to erase originally, so I then call the flash program API again to restore those values. 
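The whole sequence can be sketched against a simulated sector in C. Everything here is an assumption for illustration: the 8-word sector at 0x2000, the sim_flash_ functions, and the RAM backup step, which is implied by wanting to restore the other words. A real device would use the flash API from its own documentation, and addresses passed in must lie inside the simulated sector.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_BASE  0x2000u  /* hypothetical sector start address */
#define SECTOR_WORDS 8u       /* tiny sector, for illustration only */

/* Simulated sector; starts as all zeros, the "cleared" state. */
static uint16_t sim_sector[SECTOR_WORDS];

/* Erase works on the whole sector and sets every bit to 1. */
static void sim_flash_erase_sector(void)
{
    for (uint32_t i = 0; i < SECTOR_WORDS; i++) {
        sim_sector[i] = 0xFFFFu;
    }
}

/* Programming can only change bits from 1 to 0. */
static void sim_flash_program_word(uint32_t addr, uint16_t value)
{
    sim_sector[addr - SECTOR_BASE] &= value;
}

uint16_t sim_flash_read(uint32_t addr)
{
    return sim_sector[addr - SECTOR_BASE];
}

/* Change one word while keeping the rest of the sector intact. */
void update_word(uint32_t addr, uint16_t value)
{
    uint16_t backup[SECTOR_WORDS];

    memcpy(backup, sim_sector, sizeof backup);  /* 1. save sector to RAM  */
    sim_flash_erase_sector();                   /* 2. erase entire sector */
    sim_flash_program_word(addr, value);        /* 3. program new value   */
    for (uint32_t i = 0; i < SECTOR_WORDS; i++) {
        if (SECTOR_BASE + i != addr) {          /* 4. restore the others  */
            sim_flash_program_word(SECTOR_BASE + i, backup[i]);
        }
    }
}
```

Calling update_word(0x2000, 0x8888) leaves 0x8888 at address 0x2000 and puts the original contents back into the other words of the sector.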

I spoke briefly about the bank grouping earlier and its importance in the flash memory. In this case, we are looking at a memory map of a device with two banks of flash. When programming or erasing a flash, the rule for TI devices is that the CPU cannot be executing code from the same bank as it is attempting to program or erase. 

For devices with only one bank of flash, this can be accomplished by moving CPU execution to RAM or ROM. For devices with multiple banks, the CPU can run its code from addresses in a different flash bank while programming the other. Also of note in this memory map are the sector sizes, which differ inside a bank. This is highly device dependent, so check your device data sheet for the configuration specific to your device. 
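The bank rule can be expressed as a simple address check. The sketch below assumes a made-up two-bank map; the addresses and function names are hypothetical, and the real ranges come from your device data sheet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t start;  /* first address in the bank */
    uint32_t end;    /* one past the last address in the bank */
} bank_range_t;

/* Hypothetical two-bank flash map, for illustration only. */
static const bank_range_t flash_banks[] = {
    { 0x080000u, 0x0A0000u },  /* bank 0 */
    { 0x0A0000u, 0x0C0000u },  /* bank 1 */
};

/* Returns the bank number holding addr, or -1 if addr is not in flash
 * (for example, when the code is running from RAM or ROM). */
int flash_bank_of(uint32_t addr)
{
    for (size_t i = 0; i < sizeof flash_banks / sizeof flash_banks[0]; i++) {
        if (addr >= flash_banks[i].start && addr < flash_banks[i].end) {
            return (int)i;
        }
    }
    return -1;
}

/* True if code executing at exec_addr may program or erase target_addr:
 * the target must be in flash, and execution must be in another bank
 * or outside flash entirely. */
bool can_modify_flash_from(uint32_t exec_addr, uint32_t target_addr)
{
    int target_bank = flash_bank_of(target_addr);
    if (target_bank < 0) {
        return false;
    }
    return flash_bank_of(exec_addr) != target_bank;
}
```

With this map, code running in bank 0 can program bank 1, and code running from RAM can program either bank, but code in bank 0 cannot program bank 0.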

The point in this example is that no matter the size of the sector, the erase operation must still occur at the sector boundaries defined by the device. In this case, the word size is 16 bits, so sizes are shown in kilowords. 
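Because an erase always covers a whole sector, software usually needs to find which sector contains a given address. Here is a rough C sketch that assumes a hypothetical bank with four 8 KW sectors followed by two 32 KW sectors, using word addresses; the base address, sizes, and function names are illustrative and not taken from any data sheet.

```c
#include <stddef.h>
#include <stdint.h>

#define FLASH_BASE  0x080000u  /* hypothetical start of the bank */
#define NOT_IN_FLASH (-1)

/* Sector sizes in 16-bit words, in address order (illustrative values). */
static const uint32_t sector_words[] = {
    0x2000u, 0x2000u, 0x2000u, 0x2000u,  /* four 8 KW sectors */
    0x8000u, 0x8000u                     /* two 32 KW sectors */
};

/* Returns the index of the sector containing addr, or NOT_IN_FLASH. */
int sector_index(uint32_t addr)
{
    uint32_t start = FLASH_BASE;
    for (size_t i = 0; i < sizeof sector_words / sizeof sector_words[0]; i++) {
        if (addr >= start && addr < start + sector_words[i]) {
            return (int)i;
        }
        start += sector_words[i];
    }
    return NOT_IN_FLASH;
}

/* Returns the first address of the sector containing addr, which is
 * where an erase of that sector begins. Returns 0 if addr is not in flash. */
uint32_t sector_start(uint32_t addr)
{
    int index = sector_index(addr);
    uint32_t start = FLASH_BASE;
    if (index < 0) {
        return 0u;
    }
    for (int i = 0; i < index; i++) {
        start += sector_words[i];
    }
    return start;
}
```

In this map, changing a word at 0x08ABCD means erasing the entire 32 KW sector that starts at 0x088000.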

Electrical specifications of a memory may not be something obvious to consider on a given device, but for flash memory, it is something that must be understood before implementing the system. As I mentioned before, the operation of a flash read and write are very different than the same functions on an SRAM. Because of this, both the read and write operations will have different timings. 

Let's take a closer look at specifications for flash reads. Recall that flash typically cannot operate at the full CPU frequency of a device. To accommodate different CPU speeds, the processor must introduce an intentional delay. In this table, we are looking at reading the contents of a flash whose access time may be slower than the max CPU clock rate. 

From the bottom row of this table, we can see this flash supports reads up to 50 megahertz clock rate. It is usually more helpful when talking about access times to refer to the CPU clock speed in terms of its period versus its frequency. In this case, a 50-megahertz clock rate would correspond to a 20-nanosecond time for each CPU cycle. We can derive that each access to a flash word takes 20 nanoseconds. 

If we were to run the CPU at a faster clock rate, we would need to insert delay as shown in the table. So if we want to run the CPU at 200 megahertz, we would need to insert three additional cycles of read time or three wait states. I have shown the wait state calculation that allows us to derive the needed wait states based on the access time of our flash and the CPU period of our device. 

This would result in a flash access time of four CPU cycles at the 200 megahertz CPU clock rate or four times the five nanosecond period equal to 20 nanoseconds. Wait states are configurable per your device. Please consult your device data sheet for exact information on how to do so. 
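The wait state calculation can be written as a short C function. This is a sketch that assumes the only constraint is the flash's maximum zero-wait-state read frequency, 50 megahertz in this example; the function name is made up, and the data sheet table for your device remains the authority.

```c
#include <stdint.h>

/* Number of wait states needed so that one flash access spans at least
 * the flash access time. Working in frequencies instead of periods:
 *   cycles per access = ceil(cpu_hz / flash_max_hz)
 *   wait states       = cycles per access - 1
 * Assumes cpu_hz + flash_max_hz fits in 32 bits. */
uint32_t flash_wait_states(uint32_t cpu_hz, uint32_t flash_max_hz)
{
    if (flash_max_hz == 0u || cpu_hz <= flash_max_hz) {
        return 0u;  /* flash keeps up with the CPU (or invalid input) */
    }
    uint32_t cycles_per_access = (cpu_hz + flash_max_hz - 1u) / flash_max_hz;
    return cycles_per_access - 1u;
}
```

At 200 megahertz with a 50 megahertz flash, this returns three wait states, or four CPU cycles per access. At 50 megahertz it returns zero.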

I will now explore the timings that are required to write data to the flash. Recall that I mentioned Special Operations are required to write to flash memory. These are called write and erase APIs. Due to the physical construction of the flash, these operations will take defined times that can be found in your device's data sheet. 

In the below table, you can see the timings for program and erase are quite different and complete programming of a sector varies depending on the sector size. For programming time, this is the time required to change a specific number of bits or words from the erased state to some combination of zeros and ones. Note that the time varies depending on the number of words that are being programmed. 

For erase time, this is the time to erase an entire sector of flash, setting all the bits to a value of 1. Recall that we can only erase at the sector level. 
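To budget for these operations, a rough estimate of the time to erase a sector and program it again can be sketched in C. This assumes programming time scales linearly with the number of words, which is a simplification, and the numbers in the usage note are placeholders rather than data sheet values.

```c
#include <stdint.h>

/* Estimated time, in microseconds, to erase one sector and then program
 * a number of words into it. Assumes a fixed time per programmed word. */
uint64_t sector_rewrite_time_us(uint32_t erase_time_us,
                                uint32_t program_time_per_word_us,
                                uint32_t words_to_program)
{
    return (uint64_t)erase_time_us
         + (uint64_t)program_time_per_word_us * words_to_program;
}
```

With placeholder values of 25,000 microseconds per erase and 40 microseconds per word, rewriting all 8,192 words of an 8 KW sector would take about 353 milliseconds.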

There are two other parameters in the below table. First is the number of times the flash can be written and erased. In this case, it is 20,000 cycles. After this number of cycles, the correct operation of the flash write and erase is not guaranteed. This is the concept of flash endurance I mentioned earlier in the presentation. 

Also of note is a specification for the ability of the flash to retain its data over time. For this device, it is specified at 20 years of operation. Please refer to your specific device's data sheet for the values of these parameters. 
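To see what the endurance number means for an application, here is a small C sketch that estimates how many days it takes to reach the rated cycle count. It assumes every update erases and reprograms the same sector once, and the function name is only illustrative.

```c
#include <stdint.h>

/* Days until a sector reaches its rated write/erase cycle count,
 * assuming each update costs one cycle on the same sector. */
uint32_t endurance_limit_days(uint32_t rated_cycles, uint32_t updates_per_day)
{
    if (updates_per_day == 0u) {
        return UINT32_MAX;  /* never updated: endurance is not the limit */
    }
    return rated_cycles / updates_per_day;
}
```

With the 20,000 cycles from this example, updating the same sector once per hour reaches the limit in about 833 days, a little over two years. Updating it once per day would take 20,000 days, far longer than the 20-year retention figure.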

Finally, I'd like to address some of the mechanisms we provide to program the flash on a device. I've broken these options down into the different stages of the production lifecycle. During development, the most typical method to program the flash memory is through a JTAG connection to the device, using the debugger IDE that supports the target device. During production, there are different options depending on the scale of devices that need programming. 

Devices can be programmed either before assembling or on the target PCB after assembly has been completed. A target PCB with a socket can be used for this step. There are also standalone programmers that do not require a PC or Mac host that can store the flash contents locally and program through the JTAG or serial interface. 

Finally, there's the aspect of updating the contents while the device is in the field. This is accomplished by embedding the flash API into the flash memory during production time and then calling it selectively when an update is required. For more information on the flash programming options for your device, please consult the device data sheet and specific technical documentation as needed. I hope you have found this introduction to embedded flash memory helpful to your designs and implementation of TI's embedded processors and microcontrollers. For extended information, please see the link above.