

Title:
CACHE SCANNING
Document Type and Number:
WIPO Patent Application WO/2024/063767
Kind Code:
A1
Abstract:
Methods, systems, and apparatus, for systems on-a-chip. One system includes a functional component having one or more embedded random-access memories (RAMs), the functional component including a scan memory state machine configured to generate signals for dumping the contents of the one or more embedded RAMs during a scan dump process.

Inventors:
JALASUTRAM MAHEEDHAR (US)
GARG SUNDER (US)
WONG VICTOR KAM KIN (US)
TUMMALA GOPI KRISHNA (US)
JAIN ANUPAM (US)
Application Number:
PCT/US2022/044249
Publication Date:
March 28, 2024
Filing Date:
September 21, 2022
Assignee:
GOOGLE LLC (US)
International Classes:
G06F11/07; G06F11/22; G06F11/36
Foreign References:
US20120072791A1, 2012-03-22
Other References:
HONG HAO ET AL: "STRUCTURED DESIGN-FOR-DEBUG - THE SUPERSPARC TM II METHODOLOGY AND IMPLEMENTATION", PROCEEDINGS OF THE INTERNATIONAL TEST CONFERENCE (ITC). WASHINGTON, OCT. 21 - 25, 1995; [PROCEEDINGS OF THE INTERNATIONAL TEST CONFERENCE (ITC)], NEW YORK, IEEE, US, 21 October 1995 (1995-10-21), pages 175 - 183, XP000552821, ISBN: 978-0-7803-2992-8, DOI: 10.1109/TEST.1995.529831
Attorney, Agent or Firm:
SHEPHERD, Michael P. et al. (US)
Claims:
CLAIMS

1. A cache of a system-on-a-chip having one or more embedded random-access memories (RAMs), the cache comprising: a scan memory state machine configured to generate signals for dumping the contents of the one or more embedded RAMs during a scan dump process.

2. The cache of claim 1, wherein the cache comprises a test controller that is configured to generate test signals that are separate from a functional path.

3. The cache of claim 2, wherein the cache is configured to select between a signal from the test controller and a signal from the scan memory state machine.

4. The cache of claim 3, wherein the cache further comprises a multiplexor configured to select between a functional path and a signal either from the test controller or from the scan memory state machine.

5. The cache of any one of claims 1-4, wherein the cache is configured to activate the scan memory state machine in response to software writing a value to a configuration register.

6. The cache of any one of claims 1-5, wherein the cache has cache lines stored in the one or more embedded RAMs.

7. The cache of any one of claims 1-6, wherein the cache is a translation lookaside buffer having translation entries stored in the one or more embedded RAMs.

8. The cache of any one of claims 1-7, wherein the one or more embedded RAMs do not have a scan input.

9. The cache of any one of claims 1-8, wherein the one or more embedded RAMs are static RAMs.

10. A method performed by a cache of a system-on-a-chip, the method comprising: generating, by a scan memory state machine of the cache, signals for dumping the contents of one or more embedded random-access memories of the cache; and providing, by the scan memory state machine, the generated signals to the one or more embedded RAMs during a scan dump process.

11. The method of claim 10, wherein the cache comprises a test controller that is configured to generate test signals that are separate from a functional path.

12. The method of claim 11, wherein the cache is configured to select between a signal from the test controller and a signal from the scan memory state machine.

13. The method of claim 12, wherein the cache further comprises a multiplexor configured to select between a functional path and a signal either from the test controller or from the scan memory state machine.

14. The method of any one of claims 10-13, wherein the cache is configured to activate the scan memory state machine in response to software writing a value to a configuration register.

15. The method of any one of claims 10-14, wherein the cache has cache lines stored in the one or more embedded RAMs.

16. The method of any one of claims 10-15, wherein the cache is a translation lookaside buffer having translation entries stored in the one or more embedded RAMs.

17. The method of any one of claims 10-16, wherein the one or more embedded RAMs do not have a scan input.

18. The method of any one of claims 10-17, wherein the one or more embedded RAMs are static RAMs.

19. A functional component of a system-on-a-chip having one or more embedded random-access memories (RAMs), the functional component comprising: a scan memory state machine configured to generate signals for dumping the contents of the one or more embedded RAMs during a scan dump process.

Description:
CACHE SCANNING

BACKGROUND

Random Access Memories (RAMs) and caches often carry functional debug information that can be useful in a case of a software crash or computer lockup. The code and conditions present in the RAMs can be helpful to find the root cause of the software crash. However, some RAMs and caches may not have direct software visibility for debugging and may not have an interface to examine the contents of the RAMs.

SUMMARY

This specification describes a system which can dump the contents of the one or more embedded RAMs during a scan dump process. This allows a debugging system or a user to examine the contents of the RAMs, which may not have direct software visibility. For example, the contents of multiple static RAMs (SRAMs) can be dumped to a logic scandump chain. The contents of the logic scandump chain can then be dumped during a scan dump, e.g., using existing logic scandump framework.

Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages.

The systems and methods described allow for visibility into the code and conditions stored within RAMs or SRAMs, which typically do not have visibility for debugging. Additionally, the systems and methods described in this specification can be integrated with an existing test path, e.g., a built-in self-test path. Integrating the described system with the existing test path can prevent or reduce timing impacts on the functional path of the system. Also, the systems and methods can advantageously create a continuous data stream to dump SRAM or cache data with increased efficiency, e.g., without any dummy or empty cycles of signals. Additionally, the systems and methods can utilize existing logic scandump framework to read cache data. For example, dumping SRAM data to an off-chip DRAM using existing scandump framework avoids additional timing and performance impact on the system during a debug process. Additionally, the systems and methods can be integrated with an existing reset flow, e.g., on a phone, which enables dumping of the SRAM data without connecting to external equipment.

The details of one or more embodiments of the subject matter of this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of an example system on a chip.

FIG. 2 is a diagram of an example static RAM integrated on a chip.

FIG. 3 is a diagram of an example dynamic RAM on a chip.

FIG. 4 is a diagram of an example system on a chip dumping the contents of embedded RAMs.

FIG. 5 is a flowchart of a method of dumping the contents of embedded RAMs.

Like reference numbers and designations in the various drawings indicate like elements.

DETAILED DESCRIPTION

FIG. 1 is a diagram of an example system 100 on a chip. The system 100 allows the content of embedded caches or RAMs, which usually do not have software visibility, to be dumped so that the contents can be examined. Example functional components that have embedded RAMs that do not support software visibility include caches that store cache lines in embedded RAMs as well as translation lookaside buffers that store translation entries in embedded RAMs. These functional components are often populated by logic circuitry of hardware devices, and therefore, the contents of these components are typically not accessible by software.

The system 100 includes multiple static RAMs (SRAMs) 102, 104, 106 and a logic scandump chain 108. For example, the logic scandump chain can be a logic framework which dumps logic content of the chip to an external storage device, e.g., an external DRAM. In the example system 100, the logic scandump chain is also used to dump the contents of the SRAMs 102, 104, 106. In many cases, dumping the code and conditions present in the SRAMs to an external location can be helpful to find the root cause of a software crash or computer lockup. A scan dump process can include dumping the code and conditions present in the SRAMs. In the system 100, a scan memory state machine can generate signals for dumping the contents of the SRAMs 102, 104, 106 to the logic scandump chain 108. The logic scandump chain 108 would then include the contents of the SRAMs when the contents of the logic scandump chain 108 are dumped. The contents of the SRAMs 102, 104, 106 can be dumped to the logic scandump chain 108 in a daisy chain, e.g., in sequential order. For example, when the contents of the SRAM 102 are completely dumped to the logic scandump chain 108, a signal can be generated to dump the contents of the SRAM 104 to the logic scandump chain 108. Similarly, when the contents of the SRAM 104 are completely dumped to the logic scandump chain 108, a signal can be generated to dump the contents of the SRAM 106 to the logic scandump chain 108. This process can be repeated for any number of SRAMs. Automatically triggering the daisy chain to dump the next SRAM can create a continuous data stream to dump SRAM data, e.g., without any dummy or empty cycles of signals. Once the contents of every SRAM are dumped to the logic scandump chain 108, the logic scandump chain 108 continues dumping the contents to an external storage device, e.g., an external DRAM.
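As an illustration only, the sequential dump of the SRAMs 102, 104, 106 can be sketched as a software model in Python. This is a behavioral sketch, not the circuit itself: the function names `dump_sram` and `daisy_chain`, the modeling of each SRAM as a list of integers, and the LSB-first bit order are all assumptions made for the example.

```python
from typing import Iterator, List


def dump_sram(words: List[int], width: int) -> Iterator[int]:
    """Yield the bits of one SRAM, address by address, one bit per cycle."""
    for word in words:
        for position in range(width):
            # LSB-first order is an assumption of this sketch.
            yield (word >> position) & 1


def daisy_chain(srams: List[List[int]], width: int) -> Iterator[int]:
    """Dump each SRAM in turn; the next dump starts when the previous ends."""
    for words in srams:
        # No idle bits are produced between consecutive SRAMs.
        yield from dump_sram(words, width)


# Hypothetical 4-bit-wide contents standing in for SRAMs 102, 104, 106.
sram_102 = [0b1010, 0b0001]
sram_104 = [0b1111]
sram_106 = [0b0000, 0b0110, 0b1001]
stream = list(daisy_chain([sram_102, sram_104, sram_106], width=4))
```

In this model the stream length equals the total number of words multiplied by the word width, which corresponds to a dump with no dummy or empty cycles.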

FIG. 2 illustrates a diagram of an example functional component 200 for dumping an SRAM. For example, the component 200 can be part of a cache. The functional component 200 includes a RAM wrapper 202. The RAM wrapper 202 includes a memory built in self-test (MBIST) interface 204 and an SRAM or cache 206. The functional component 200 can include a test controller that is configured to generate test signals for a test path 210, e.g., an MBIST path, and the functional component can have a functional path 208 which is separate from the test path 210. The MBIST interface 204 includes a scan memory state machine 212 which can be activated to dump the SRAM 206. For example, the scan memory state machine 212 is a component that can perform a sequence of operations according to a state machine. The scan memory state machine 212 can be activated in response to a signal 218, e.g., software writing a value to a configuration register. The MBIST interface 204 can also include a multiplexor 214 which can select between the test path 210 and signals from the scan memory state machine 212. Another multiplexor 216 can select between the functional path 208 and the output from the first multiplexor 214. Because the scan memory state machine is integrated with the test path 210, there is no additional timing impact on the functional path 208. In this fashion, the multiplexor 216 can select between the functional path 208 and either signals from the test controller or signals from the scan memory state machine 212.
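The two-level selection performed by the multiplexors 214 and 216 can be sketched as follows. This is a minimal model under stated assumptions: the select inputs `scan_active` and `test_mode`, the `RamInputs` container, and the rule that either select routes the test side to the SRAM are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


def mux(select: bool, when_false: int, when_true: int) -> int:
    """Two-input multiplexor."""
    return when_true if select else when_false


@dataclass
class RamInputs:
    functional: int  # value driven on functional path 208
    test: int        # value driven on test path 210 by the test controller
    scan: int        # value driven by scan memory state machine 212


def sram_input(inputs: RamInputs, scan_active: bool, test_mode: bool) -> int:
    """Return the value that reaches the SRAM for the given select signals."""
    # Multiplexor 214: test path 210 versus scan memory state machine 212.
    test_side = mux(scan_active, inputs.test, inputs.scan)
    # Multiplexor 216: functional path 208 versus the output of 214.
    # Assumption: either mode routes the test side to the SRAM.
    return mux(test_mode or scan_active, inputs.functional, test_side)
```

Because the scan source is merged on the test side of multiplexor 216, the functional path in this model still passes through only one multiplexor, which reflects the statement that no additional timing impact is placed on the functional path 208.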

When the signal 218 is generated, the scan memory state machine 212 can be activated. The scan memory state machine 212 can generate a signal to dump the SRAM 206. The multiplexor 214 can select the signal from the scan memory state machine 212, and the multiplexor 216 can select the signal from the scan memory state machine 212. The SRAM 206 can receive the signal to dump its contents to the scan memory state machine 212. The scan memory state machine 212 can contain an N-bit register 220, where N is the width of the SRAM databus. The scan memory state machine 212 can read the first address of the SRAM 206, which is captured into the N-bit register 220. The scan memory state machine 212 can perform a serial shift of the N-bit register 220 to unload the data, which will take N cycles because it is an N-bit register 220. After N serial shifts of the N-bit register 220, the scan memory state machine 212 increments the address of the SRAM 206 by one, performs a read on the corresponding address, and updates the content of the N-bit register 220 with the new data read from the SRAM 206. The N-bit register 220 can continue to unload the data on the output signal 226 on every clock cycle 222, and output signal 226 can continue to unload data through the logic scandump chain. When all of the addresses of the SRAM 206 are read out, the scan memory state machine 212 can output a done signal 224. If there are additional SRAMs, the done signal 224 can trigger the next scan memory state machine, e.g., in a daisy chain.
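The read, capture, shift, and increment sequence described above can be sketched as a cycle-by-cycle software model. The class below is an illustrative assumption rather than the disclosed hardware: the method names `activate` and `clock`, the zero-latency SRAM read, and the LSB-first shift direction are choices made for the example.

```python
from typing import List, Optional


class ScanMemoryStateMachine:
    """Behavioral model of a state machine with an N-bit shift register."""

    def __init__(self, sram: List[int], width: int) -> None:
        self.sram = sram          # SRAM contents, one integer per address
        self.width = width        # N, the width of the SRAM data bus
        self.address = 0
        self.register = 0         # models the N-bit register
        self.shifts_left = 0
        self.active = False
        self.done = False         # models the done signal

    def activate(self) -> None:
        """Model of the activation signal, e.g., a configuration register write."""
        self.address = 0
        self.shifts_left = 0
        self.active = bool(self.sram)
        self.done = not self.sram  # an empty SRAM is done immediately

    def clock(self) -> Optional[int]:
        """Advance one clock cycle; return the output bit, or None when idle."""
        if not self.active:
            return None
        if self.shifts_left == 0:
            # Capture the word at the current address (zero-latency read
            # is an assumption of this sketch).
            self.register = self.sram[self.address]
            self.shifts_left = self.width
        bit = self.register & 1   # LSB-first shift is an assumption
        self.register >>= 1
        self.shifts_left -= 1
        if self.shifts_left == 0:
            # After N shifts, move to the next address or finish.
            self.address += 1
            if self.address == len(self.sram):
                self.active = False
                self.done = True
        return bit


fsm = ScanMemoryStateMachine([0b101, 0b110], width=3)
fsm.activate()
bits = []
while not fsm.done:
    bits.append(fsm.clock())
```

In this model one bit is produced on every clock cycle while the machine is active, and `done` is raised in the same cycle that the last bit of the last address is shifted out, so a following state machine could start on the next cycle.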

FIG. 3 illustrates a diagram of a logic scandump chain 300. The logic scandump chain 300 can receive the contents of an SRAM, e.g., through an input signal 302. The logic scandump chain 300 can also output the contents of the logic scandump chain 300, e.g., through an output signal 304 to an external storage device, e.g., an external DRAM. When the logic scandump chain 300 receives a scandump signal 306, the contents 308 of the logic scandump chain 300 can be output, e.g., using a JTAG interface which can be accessible with an external debugger, or the contents of the logic scandump chain 300 can be output, e.g., through an on-chip DDR PHY (DFI) interface. The contents of the logic scandump chain 300 can then be examined by a debugging system or user. If the logic scandump chain 300 contains the contents of an SRAM due to the process described above, the contents of the SRAM will also be available to be examined by the debugging system or user. This allows for visibility into the code and conditions stored within the SRAM. As discussed above, SRAMs and caches typically do not have visibility for debugging, and the code and conditions stored within the SRAM can be helpful to find the root cause of a software crash.
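One way to picture the logic scandump chain 300 is as a shift register whose serial input is the input signal 302 and whose serial output is the output signal 304. The sketch below assumes this shift-register model, which the specification does not spell out; the class `ScanChain`, the function `scandump`, and the zero fill used while flushing are hypothetical.

```python
from collections import deque
from typing import Iterable, List


class ScanChain:
    """Shift-register model of a scan chain (an assumption of this sketch)."""

    def __init__(self, flops: Iterable[int]) -> None:
        # Index 0 is the flop nearest the serial output.
        self.flops = deque(flops)

    def shift(self, bit_in: int) -> int:
        """Shift one position: take a bit at the input, emit one at the output."""
        self.flops.append(bit_in)
        return self.flops.popleft()


def scandump(chain: ScanChain, sram_bits: Iterable[int]) -> List[int]:
    """Shift SRAM bits in behind the logic state and collect all output bits."""
    out = [chain.shift(bit) for bit in sram_bits]
    # Flush the bits still held in the chain; the fill value 0 is arbitrary.
    for _ in range(len(chain.flops)):
        out.append(chain.shift(0))
    return out


logic_state = [1, 0, 0]       # hypothetical captured logic content
sram_bits = [1, 1, 0, 1]      # hypothetical serial SRAM data
dumped = scandump(ScanChain(logic_state), sram_bits)
```

Under this model the dumped stream contains the captured logic state first, followed by the SRAM bits in the order in which they entered the chain.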

FIG. 4 illustrates a system 400 on a chip that allows the content of embedded caches or RAMs, which usually do not have software visibility, to be dumped so that the contents can be examined. The system 400 includes multiple functional components 200a, 200b, 200c, which can be, for example, implemented according to the example functional component 200 of FIG. 2. For example, each functional component contains a RAM wrapper, an MBIST interface, an SRAM, a scan memory state machine, etc. as described above. The system 400 also includes the logic scandump chain 300 as described above. When a signal 218 is generated to begin a scandump, e.g., by software writing a value to a configuration register, the signal 218 can be forwarded to the first functional component 200a. The scan memory state machine of the first functional component 200a can be activated and can generate a signal to dump the SRAM 206a of the first functional component 200a. The scan memory state machine can contain an N-bit register as described above, and can perform a serial shift of the N-bit register to unload the data from the SRAM using the output signal 226a, as described above. When all of the contents of the SRAM are read out, the scan memory state machine can output a done signal 224a to trigger the next functional component 200b, e.g., in a daisy chain.

The functional component 200b can receive the done signal 224a at the scan memory state machine of the functional component 200b. The scan memory state machine of the second functional component 200b can be activated and can generate a signal to dump the SRAM 206b of the second functional component 200b. The scan memory state machine can contain an N-bit register as described above, and can perform a serial shift of the N-bit register to unload the data from the SRAM using output signal 226b, as described above. When all of the contents of the SRAM are read out, the scan memory state machine can output a done signal 224b to trigger the next functional component 200c.

The functional component 200c can receive the done signal 224b at the scan memory state machine of the functional component 200c. The scan memory state machine of the third functional component 200c can be activated and can generate a signal to dump the SRAM 206c of the third functional component 200c. The scan memory state machine can contain an N-bit register as described above, and can perform a serial shift of the N-bit register to unload the data from the SRAM using output signal 226c, as described above. When all of the contents of the SRAM are read out, the scan memory state machine can output a done signal 224c, e.g., to trigger a next functional component. This process can be repeated for any number of functional components and SRAMs in a system.

The logic scandump chain 300 can receive the contents of each SRAM 206a, 206b, 206c in serial order, e.g., because each SRAM dump is triggered one after the other. When the system 400 receives a scandump signal 306 and a ramdump signal 218, the contents of each SRAM 206a, 206b, 206c are loaded into the logic scandump chain 300 and the logic scandump chain 300 continues to unload its data, e.g., using a JTAG interface which can be accessible with an external debugger, or, e.g., to an external DRAM using scan-to-DRAM access. The contents of the system 400 can then be examined by a debugging system or user. Because the system 400 contains the contents of each SRAM, e.g., due to the process described above, the contents of each SRAM will also be available to be examined by the debugging system or user. This allows for the content of the embedded RAMs, which usually do not have software visibility, to be dumped so that the contents can be examined.
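Because the SRAMs are dumped in a fixed serial order, a debugging system that knows the depth and width of each SRAM can split the unloaded bit stream back into per-SRAM words. The following sketch shows one possible way to do this; the function names, the `layout` tuple format, the LSB-first bit order, and the assumption that the stream holds only SRAM bits (with any logic state already stripped) are hypothetical and not part of the specification.

```python
from typing import Dict, List, Sequence, Tuple


def bits_to_word(bits: Sequence[int]) -> int:
    """Rebuild one word from its bits (LSB-first, an assumption)."""
    word = 0
    for position, bit in enumerate(bits):
        word |= bit << position
    return word


def split_dump(
    stream: Sequence[int], layout: List[Tuple[str, int, int]]
) -> Dict[str, List[int]]:
    """Split a serial dump into words per SRAM.

    `layout` lists (name, depth, width) for each SRAM in daisy-chain order.
    """
    expected = sum(depth * width for _, depth, width in layout)
    if len(stream) != expected:
        raise ValueError(f"expected {expected} bits, got {len(stream)}")
    result: Dict[str, List[int]] = {}
    offset = 0
    for name, depth, width in layout:
        words = []
        for _ in range(depth):
            words.append(bits_to_word(stream[offset:offset + width]))
            offset += width
        result[name] = words
    return result


# Hypothetical layout standing in for SRAMs 206a, 206b, 206c.
layout = [("206a", 1, 4), ("206b", 1, 4), ("206c", 1, 3)]
stream = [0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0]
contents = split_dump(stream, layout)
```

The length check relies on the dump being a continuous stream, so that a mismatch between the expected and received bit counts indicates a wrong layout or a truncated dump.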

FIG. 5 is a flowchart of an example process 500 for dumping the contents of one or more embedded caches or RAMs during a scan dump. The example process can be performed by one or more components of a system on a chip. The example process will be described as being performed by, e.g., the scan memory state machine 212 of FIG. 2, configured in accordance with this specification.

The scan memory state machine receives a signal to be activated (502). For example, the scan memory state machine can receive a signal similar to signal 218 of FIG. 2, which can be by software writing a value to a configuration register.

The scan memory state machine generates a signal to dump the SRAM (504). For example, the scan memory state machine can send a signal similar to the signal sent to the SRAM 206 of FIG. 2, as described above.

The scan memory state machine reads the first address of the SRAM (506). For example, the contents of the first address can be read by an N-bit register similar to register 220 of FIG. 2.

The scan memory state machine receives the SRAM contents (508) and outputs the contents to the logic scandump chain (510) in serial fashion. For example, the contents can be captured by the N-bit register and unloaded to the logic scandump chain in serial manner by serial shift of the N-bit register, as described above.

The scan memory state machine reads the next address of the SRAM (512). For example, the contents of the corresponding address of the SRAM can be read by the N-bit register and sent to a logic scandump chain, as described in FIG. 2. For example, the address is incremented once for every N shifts of the N-bit register, and a read is performed once for every N shifts of the N-bit register. The N-bit register sends the read data to the logic scandump chain on every clock cycle, so that data is not lost. The serial shift of the N-bit register is a continuous process until the entire content of the SRAM is read out. If there are additional addresses in the SRAM, the scan memory state machine continues to output the contents (branch to 508) of the SRAM (510). This loop can repeat until the scan memory state machine reads all of the addresses of the SRAM, e.g., a preconfigured maximum address is reached, and sends the contents to the logic scandump chain through the N-bit register.

When all of the addresses of the SRAM are read out, the scan memory state machine outputs a done signal (514). For example, if there are additional SRAMs, the done signal can trigger the next scan memory state machine, e.g., in a daisy chain.

Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.

Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory storage medium for execution by, or to control the operation of, data processing apparatus. The computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them. Alternatively or in addition, the program instructions can be encoded on an artificially-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus.

The term “data processing apparatus” refers to data processing hardware and encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can also be, or further include, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit). The apparatus can optionally include, in addition to hardware, code that creates an execution environment for computer programs, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.

Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.

In addition to the embodiments described above, the following embodiments are also innovative:

Embodiment 1 is a cache of a system-on-a-chip having one or more embedded random-access memories (RAMs), the cache comprising: a scan memory state machine configured to generate signals for dumping the contents of the one or more embedded RAMs during a scan dump process.

Embodiment 2 is the cache of embodiment 1, wherein the cache comprises a test controller that is configured to generate test signals that are separate from a functional path.

Embodiment 3 is the cache of embodiment 2, wherein the cache is configured to select between a signal from the test controller and a signal from the scan memory state machine.

Embodiment 4 is the cache of embodiment 3, wherein the cache further comprises a multiplexor configured to select between a functional path and a signal either from the test controller or from the scan memory state machine.

Embodiment 5 is the cache of any one of embodiments 1-4, wherein the cache is configured to activate the scan memory state machine in response to software writing a value to a configuration register.

Embodiment 6 is the cache of any one of embodiments 1-5, wherein the cache has cache lines stored in the one or more embedded RAMs.

Embodiment 7 is the cache of any one of embodiments 1-6, wherein the cache is a translation lookaside buffer having translation entries stored in the one or more embedded RAMs.

Embodiment 8 is the cache of any one of embodiments 1-7, wherein the one or more embedded RAMs do not have a scan input.

Embodiment 9 is the cache of any one of embodiments 1-8, wherein the one or more embedded RAMs are static RAMs.

Embodiment 10 is a method performed by a cache of a system-on-a-chip, the method comprising: generating, by a scan memory state machine of the cache, signals for dumping the contents of one or more embedded random-access memories of the cache; and providing, by the scan memory state machine, the generated signals to the one or more embedded RAMs during a scan dump process.

Embodiment 11 is the method of embodiment 10, wherein the cache comprises a test controller that is configured to generate test signals that are separate from a functional path.

Embodiment 12 is the method of embodiment 11, wherein the cache is configured to select between a signal from the test controller and a signal from the scan memory state machine.

Embodiment 13 is the method of embodiment 12, wherein the cache further comprises a multiplexor configured to select between a functional path and a signal either from the test controller or from the scan memory state machine.

Embodiment 14 is the method of any one of embodiments 10-13, wherein the cache is configured to activate the scan memory state machine in response to software writing a value to a configuration register.

Embodiment 15 is the method of any one of embodiments 10-14, wherein the cache has cache lines stored in the one or more embedded RAMs.

Embodiment 16 is the method of any one of embodiments 10-15, wherein the cache is a translation lookaside buffer having translation entries stored in the one or more embedded RAMs.

Embodiment 17 is the method of any one of embodiments 10-16, wherein the one or more embedded RAMs do not have a scan input.

Embodiment 18 is the method of any one of embodiments 10-17, wherein the one or more embedded RAMs are static RAMs.

Embodiment 19 is a functional component of a system-on-a-chip having one or more embedded random-access memories (RAMs), the functional component comprising: a scan memory state machine configured to generate signals for dumping the contents of the one or more embedded RAMs during a scan dump process.

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any invention or on the scope of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially be claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. For example, the actions recited in the claims can be performed in a different order and still achieve desirable results. As one example, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous.
