Layered design in software engineering

From the Von Neumann Bottleneck to Moore's Law to Concurrent Programming

Programming concurrently is actually very hard, so why do it?

Can we get a speedup without doing it? Clearly, if you do things in parallel, you could get a speedup. But do we need it? I would argue, and most people would argue, "Yes, we do right now."

In the past, maybe we didn't need it. But now, we really need it. So, one way to get a speedup without parallelism is to just speed up the processor.

So just make a new processor that runs faster than the old processor, and then you get speedups and you don't have to change the way you write your code.

Until recent years, that had been the way of things. The majority of the way code sped up was that the processors themselves were being built faster.

Every so often, the clock rate would get faster. As clock rates get faster, code generally executes faster, and so processors kept getting faster and faster.

Now, another limitation on speed is what's called the von Neumann bottleneck.

If you think about the way a processor executes code, the CPU is executing the instructions and there's memory, and the CPU has to go to memory to get the instructions, and also to get the data that you want to use.
 
Say you want to compute z = x + y: you have to go to memory, grab the data out, add the values together, and put the result back into z. You have to access memory.
 
So, the CPU is regularly reading from memory and writing back to memory, and memory is always slower than the CPU.

So even if you crank up the clock speed and make the CPU work a lot faster, the memory is still slow.

Memory does speed up over time, but a lot more slowly than clock rates used to speed up. So you get what's referred to as the von Neumann bottleneck: you could double your clock speed, but your code only runs a little bit faster, because even with the cranked-up clock you're still waiting on memory.

So, you'd waste a lot of time just waiting for memory access.

So what people have done about that in the past is build caches: fast memory on the chip, so you don't have to go all the way to main memory, which is too slow; you go to the fast cache instead.
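To make that bottleneck concrete, here is a minimal sketch in Go (the language, the matrix size, and the summing exercise are just illustration choices, not anything from the lecture). It sums the same matrix twice, once in cache-friendly row order and once in cache-hostile column order; on most machines the second pass is noticeably slower even though the arithmetic is identical, because it spends its time waiting on main memory.

```go
package main

import (
	"fmt"
	"time"
)

const n = 4096 // the whole matrix is far larger than a typical CPU cache

func main() {
	m := make([][]int64, n)
	for i := range m {
		m[i] = make([]int64, n)
	}

	// Row-major traversal: consecutive memory addresses, cache-friendly.
	start := time.Now()
	var sum int64
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			sum += m[i][j]
		}
	}
	fmt.Println("row-major:   ", time.Since(start))

	// Column-major traversal: jumps between rows, so most accesses miss the cache.
	start = time.Now()
	sum = 0
	for j := 0; j < n; j++ {
		for i := 0; i < n; i++ {
			sum += m[i][j]
		}
	}
	fmt.Println("column-major:", time.Since(start))
	_ = sum // the point here is the timing, not the result
}
```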

So, that's what had traditionally been done: clock rates would go up, memory and cache capacity would go up, and so speed would go up.

The performance of these processors would just go up and up and up, and as a programmer, you didn't really have to do anything. You could just write your code the same way you always wrote it and expect that it would speed up magically, because the processors themselves were getting improved. So, that's how it used to be.

That's not how it is now. It's changed for a couple of reasons. The first thing is that Moore's law, which I'm about to describe, has really died; I don't know what the right term is, but it doesn't really happen anymore.

Okay. So, Moore's law basically predicted that transistor density would double every two years.

Now, these processors are all packed with transistors, lots of transistors that are used to do the computation. So, if you can double the transistor density, the transistors are getting smaller and smaller, and they switch faster when they're smaller, meaning they can go from high to low faster. So as they get smaller, they get faster.
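Just to put numbers on "doubling every two years," here is a tiny sketch (the function name and the sample years are made up for illustration) of how quickly that trend compounds. After twenty years of doubling every two years, density is up by a factor of about a thousand.

```go
package main

import (
	"fmt"
	"math"
)

// mooreFactor projects how much denser chips would be after the given
// number of years, if density doubles every two years (the trend Moore's
// law describes; it is an observation, not a physical law).
func mooreFactor(years float64) float64 {
	return math.Pow(2, years/2)
}

func main() {
	for _, y := range []float64{2, 4, 10, 20} {
		fmt.Printf("after %2.0f years: %6.0fx the original density\n", y, mooreFactor(y))
	}
}
```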

This Moore's law is not really a law; "law" is sort of a bad term. It's not a physical law, it's just an observation.

So, Moore's law was sort of doing everything for us, or most of the things for us. It was just giving us the speedup, and programmers didn't have to worry about it. That was hard work on the part of hardware designers.

Figuring out how to make these transistors smaller while still manufacturing them accurately, and all of that, is very hard work, but they were consistently doing it. So, software people had an easy time of it, but that's not how it is anymore.

That type of thing has gone away. So software, in order to continue getting speedups, has to do something else to keep achieving those speedups over time.

Why should we take advantage of concurrent programming?

The speedup that you get from Moore's law, the density increase that leads to a speedup and performance improvement, can't continue. So you might say, "Well, why? Why can't that just go on forever?" The reason is that these transistors consume power.

So, sure, the density of transistors can go up and up and up on these processors, but these transistors consume a chunk of power, and power has now become a critical issue; they call it the power wall.

So, as you increase the number of transistors on a chip, increasing the density, that naturally leads to increased power consumption on the chip.

If something is running and consuming a lot of power, it's going to be physically hot. If you look at a motherboard, there's at least a bunch of cooling fans.

So, this is necessary because these chips are running at such high power that they're heating up, and you need the cooling. If you don't have the cooling, if you don't have the heat sink to dissipate the heat and the fan to blow the heat away, it'll hurt the chip.
So, even if you could put more transistors on there, you've got to be careful about power, and specifically about power and its impact on temperature.

Temperature is probably the biggest wall there, but power is also important on its own, because if you want portable devices, you have a battery, and you don't want to run your battery out instantly. So power all by itself is important, but temperature is probably the biggest limitation, because you will melt the chip if you don't cool it.

Say you are Intel and you want to sell chips, right? If you come up with a generation of chips that runs at four gigahertz and the next generation is also at four gigahertz, people may not buy the new chip, right? They need some reason to buy this thing; there has to be some improvement. So what do people do? They increase the number of cores on the chip. This is how you get multi-core systems.

A processor core basically executes a program, roughly speaking, and you can have multiple of them. So you might have four processor cores, or something like that; the number of cores varies.

They're just putting more of this replicated hardware on the chip. But they don't increase the frequency, they don't increase the clock frequency; they keep it roughly the same.

So, for instance, clock frequencies still go up, but much more slowly than they used to, while the number of cores continues to increase.
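If you're curious how many cores your own machine exposes, Go's standard library can report it (again, Go is just the example language here). This snippet prints the logical CPU count and how many of those the Go runtime will use by default.

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Number of logical CPUs visible to this process.
	fmt.Println("logical CPUs:", runtime.NumCPU())
	// GOMAXPROCS(0) queries, without changing, how many OS threads may
	// execute Go code simultaneously; by default it matches NumCPU.
	fmt.Println("GOMAXPROCS:  ", runtime.GOMAXPROCS(0))
}
```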

Now, the thing about having a multi-core system, having lots of cores, is that you have to have parallel execution, concurrent execution, to exploit it.

Otherwise you're using one core while the other three cores are sitting idle. So in order to exploit these multi-core systems and get a speedup, you have to be able to take your program, divide it amongst a bunch of cores, and execute different code concurrently on the different cores.

Okay, so this is where parallel execution becomes necessary. In order to keep achieving speedups in the presence of multi-core systems, you've got to be able to use, to exploit, this parallel hardware.

This is why concurrency is so important nowadays. The programmer has to tell the program how it's going to be divided amongst the cores, right? That's really what concurrent programming is doing. It's saying, "Look, here's a big piece of code. You can put this on one core, this on another core, this on another core. These things can all run together." That's what the programmer is doing when writing concurrent code, and you have to do that.
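As one concrete sketch of that division of work, still using Go purely for illustration, the example below splits a sum over a big slice into one chunk per core and runs the chunks as concurrent goroutines. The slice size and the chunking scheme are arbitrary choices made up for this sketch.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	nums := make([]int64, 50_000_000)
	for i := range nums {
		nums[i] = int64(i)
	}

	workers := runtime.NumCPU() // one chunk of work per core
	partial := make([]int64, workers)
	var wg sync.WaitGroup

	chunk := (len(nums) + workers - 1) / workers
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if lo > len(nums) {
			lo = len(nums)
		}
		if hi > len(nums) {
			hi = len(nums)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			var s int64
			for _, v := range nums[lo:hi] {
				s += v
			}
			partial[w] = s // each goroutine writes only its own slot
		}(w, lo, hi)
	}
	wg.Wait()

	var total int64
	for _, s := range partial {
		total += s
	}
	fmt.Println("sum:", total)
}
```

Each goroutine touches only its own slot of the partial-results slice, so there's no shared-state conflict, and the main goroutine combines the partial sums once they've all finished.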

