Multithreaded cores (Intel) behavior
Let’s say we have a single-socket processor with c cores and t physical threads per core.
How can I get the best response time for the software running?
- The best response time comes from using only the first physical thread of each multithreaded core. In other words, arrange for every software thread to run on the first physical thread of its own core (assuming your software is purely parallel and multithreaded). That said, for the best response time the utilization of your system is:
utilization = 100/t
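As a rough illustration of the answer above, here is a minimal Python sketch that picks one physical thread per core and computes the resulting utilization. The topology dictionary, the function names, and the choice of the lowest-numbered sibling as the "first" thread are all assumptions made for the example, not something taken from a real machine.

```python
def first_thread_per_core(siblings_by_cpu):
    """Return the lowest-numbered logical CPU of each core.

    siblings_by_cpu maps a logical CPU id to the set of logical CPUs
    that share its core (itself included). Treating the lowest id as
    the "first" thread is an assumption of this sketch.
    """
    return sorted({min(siblings) for siblings in siblings_by_cpu.values()})


def utilization_percent(threads_per_core):
    """System utilization when only one thread per core is busy (100/t)."""
    return 100 / threads_per_core


# Hypothetical topology: c = 4 cores, t = 2 physical threads per core.
topology = {
    0: {0, 4}, 1: {1, 5}, 2: {2, 6}, 3: {3, 7},
    4: {0, 4}, 5: {1, 5}, 6: {2, 6}, 7: {3, 7},
}

cpus = first_thread_per_core(topology)
print(cpus)                    # [0, 1, 2, 3]
print(utilization_percent(2))  # 50.0

# On Linux, the current process could then be restricted to those CPUs with
# os.sched_setaffinity(0, cpus) after "import os".
```

With t = 2 the system shows only 50% utilization even though response time is at its best.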
How can I get the best throughput?
- The best throughput comes from utilizing all the cores, including every physical thread of each core.
What happens if I run a process with more software threads than physical threads (oversubscription)?
- Throughput stays roughly constant, at or near its best, while total time grows almost linearly, by a factor K (K > 1) on top of the oversubscription ratio (O = N/T). In other words, if you have T total physical threads, a run with exactly T software threads takes a time T0 to finish, and you run your software with N threads (N > T), the time for each process is:
Time per process = T0 * K * O
K is a constant that depends on the processor environment: the processor itself, the software running, the memory, the bus, the operating system, and so on.
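The formula above can be sketched as a small Python function. The function name, the parameter names, and the numbers in the example call are hypothetical; K in particular has to be measured on your own system.

```python
def oversubscribed_time(base_time, n_threads, physical_threads, k):
    """Estimate time per process when N software threads share T physical threads.

    base_time is the time the run takes without oversubscription,
    k is the environment-dependent constant K (K > 1).
    """
    if n_threads <= physical_threads:
        raise ValueError("the formula only covers the oversubscribed case (N > T)")
    oversubscription = n_threads / physical_threads  # O = N / T
    return base_time * k * oversubscription


# Hypothetical values: 8 physical threads, 24 software threads,
# a 2.0 s baseline and K = 1.2 (an assumed value, not a measurement).
print(oversubscribed_time(2.0, 24, 8, 1.2))
```

With these assumed inputs the estimate is about 7.2 seconds, that is, 3 times the baseline from the oversubscription ratio alone and a further 20% from K.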
Here are some graphs from an Intel Core i5 to demonstrate; the x axis is the number of executions of the same multithreaded parallel process, run repeatedly:
1 thread: Response time 1.6 s | Throughput: 630
4 threads: Response time 2.2 s | Throughput: 907
8 threads: Response time 6.9 s | Throughput: 1147
12 threads: Response time 10.7 s | Throughput: 1113
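To make the trade-off in these numbers easier to see, here is a small Python sketch that expresses each run relative to the single-thread run. The figures are the ones listed above; the function name is just for illustration.

```python
# (threads, response time in seconds, throughput) as listed above
runs = [(1, 1.6, 630), (4, 2.2, 907), (8, 6.9, 1147), (12, 10.7, 1113)]


def relative_to_first(runs):
    """Return (threads, response-time ratio, throughput ratio) vs. the first run."""
    _, base_time, base_throughput = runs[0]
    return [
        (threads, time / base_time, throughput / base_throughput)
        for threads, time, throughput in runs
    ]


for threads, slowdown, gain in relative_to_first(runs):
    print(f"{threads:>2} threads: response time x{slowdown:.2f}, throughput x{gain:.2f}")
```

Going from 1 to 8 threads buys roughly 1.8 times the throughput at the cost of roughly 4.3 times the response time. At 12 threads the response time keeps growing while throughput drops slightly.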
Adding more CPU cores does not improve response time, ever
The best response time you can get is when you are the only user on the system (assuming the system is in a steady state and not in a transient moment).
If you want to speed up your app you have 3 main choices:
- improve the app
- run it on a faster processor
- split your system
In short: more CPU improves throughput and/or concurrent capacity, as long as your app is multithreaded and there are no bottlenecks elsewhere.
Oversubscription impact on response time
Be warned: oversubscription increases response time by a factor greater than the oversubscription ratio. For instance, an oversubscription of 3 increases response time up to 5 times. And that is without considering the overhead of a hypervisor scheduler!
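The example figures can be turned around to back out the constant K. This is only arithmetic on the numbers quoted above, and the function name is made up for the illustration.

```python
def implied_k(slowdown, oversubscription):
    """Return K such that slowdown = K * O."""
    return slowdown / oversubscription


# Oversubscription of 3 with response time up to 5 times longer.
print(round(implied_k(5, 3), 2))  # 1.67
```

So in that case K would be at most about 1.67. On another processor or workload the value may well be different.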
Context switches consume CPU cycles to move thread state off and on the CPU.
Software interrupts increase when more processes are competing for CPU.
More processes competing for CPU -> more context switches -> more software interrupts -> less chance for a unit of work to finish -> more latency -> bad response time -> poor online systems -> bad user experience
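If you want to watch the first link of that chain on your own machine, here is a minimal sketch that counts the context switches a process goes through while doing CPU-bound work. It assumes a Unix-like system, because Python's resource module is not available on Windows. The busy_work function and its iteration count are placeholders for a real workload.

```python
import resource  # Unix only


def context_switches():
    """Return (voluntary, involuntary) context switches of this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def busy_work(iterations):
    """Placeholder CPU-bound workload: sum of squares below `iterations`."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


before = context_switches()
busy_work(2_000_000)
after = context_switches()

print("voluntary context switches:  ", after[0] - before[0])
print("involuntary context switches:", after[1] - before[1])
```

Running several copies at once, more than you have physical threads, should push the involuntary count up, since the scheduler has to preempt processes more often. The exact numbers depend on the system and on whatever else is running.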
This study refers to pure workloads. In real life, I/O, contention, and other latencies make the problem more complicated.