Thoughts regarding performance


Concepts

Response time

Throughput

Performance

Multithreaded cores (Intel) behavior

Let’s say we have a single-socket processor with c cores and t physical (hardware) threads per core.

How can I get the best response time for the software running?

  • You get the best response time when you use only the first thread of each multithreaded core. In other words, you have to arrange for every software thread to run on the first physical thread of its own core (assuming your software is purely parallel and multithreaded). That said, for the best response time the utilization of your system, as a percentage, is:
utilization = 100/t
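One way to approximate "first thread of every core only" is to restrict the process to one hardware thread per core. The following is a minimal sketch, assuming Linux (the sysfs `thread_siblings_list` files and `os.sched_setaffinity`); the helper names are hypothetical and not part of the original measurements.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,4' or '0-1' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return sorted(cpus)


def first_thread_per_core(sibling_lists):
    """Given one siblings string per logical CPU, keep the lowest CPU id of each core."""
    return sorted({parse_cpu_list(s)[0] for s in sibling_lists if s.strip()})


def pin_to_first_threads():
    """Linux only (assumption): restrict this process to the first hardware thread of every core."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    cpus = first_thread_per_core(lists)
    if cpus:
        os.sched_setaffinity(0, cpus)  # 0 means "the calling process"
    return cpus
```

With t = 2, a four-thread machine whose sibling lists are "0,2" and "1,3" would be reduced to CPUs 0 and 1, which is the 100/t = 50% utilization described above.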

How can I get the best throughput?

  • You get the best throughput when you use all the cores, that is, every physical thread of every core

What happens if I run a process with more software threads than physical threads (oversubscribed)?

  • Throughput stays pretty much constant at its best level, while total time grows almost linearly, by a factor K (K > 1) on top of the oversubscription ratio (O = N/T). In other words, if a run that uses all T physical threads takes a time D to finish and you run your software with N threads (N > T), the time for each process is:
 Time per process = D * K * O

K is a constant that depends on the processor environment, that is, the processor itself, the software running, the memory, the bus, the operating system, and so on.
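The model above can be sketched as a small function. This is only an illustration: the function name, the sample values, and the choice to return the baseline time unchanged when there is no oversubscription are assumptions, not measurements.

```python
def time_per_process(base_time, k, n_threads, total_hw_threads):
    """Estimate time per process as base_time * K * O, with O = N / T.

    base_time is the time a run takes when it uses exactly T physical threads.
    """
    if n_threads <= total_hw_threads:
        # Assumption: the model only describes the oversubscribed case.
        return base_time
    oversubscription = n_threads / total_hw_threads
    return base_time * k * oversubscription


# Hypothetical values: 2.0 s baseline, K = 1.5, 16 software threads on 8 physical threads.
estimate = time_per_process(base_time=2.0, k=1.5, n_threads=16, total_hw_threads=8)
print(estimate)  # 6.0
```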

Here are some graphs from an Intel Core i5 to demonstrate this; the x axis is the number of executions of the same multithreaded parallel software process, executed repeatedly:

(averages)

1 thread: Response time 1.6 secs | Throughput: 630

4 threads: Response time 2.2 secs | Throughput: 907

8 threads: Response time 6.9 secs | Throughput: 1147

12 threads: Response time 10.7 secs | Throughput: 1113
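To read these averages side by side, it helps to express each row relative to the single-thread run. A small sketch using only the numbers quoted above (the function name is hypothetical): going from 1 to 8 threads costs roughly 4.3 times the response time for about 1.8 times the throughput, and 12 threads gives slightly less throughput than 8.

```python
# Averages quoted above: threads -> (response time in seconds, throughput)
averages = {1: (1.6, 630), 4: (2.2, 907), 8: (6.9, 1147), 12: (10.7, 1113)}


def relative_to_baseline(data, baseline=1):
    """Return, per thread count, (response time ratio, throughput ratio) vs the baseline row."""
    base_rt, base_tp = data[baseline]
    return {n: (rt / base_rt, tp / base_tp) for n, (rt, tp) in data.items()}


ratios = relative_to_baseline(averages)
for n, (rt_x, tp_x) in ratios.items():
    print(f"{n:>2} threads: response time x{rt_x:.2f}, throughput x{tp_x:.2f}")
```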

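[Graph: rtofirstthread (response time, first thread of each core only)]

[Graph: rtoallcores (response time, all cores)]

[Graph: rtooversubsx2 (response time, oversubscription x2)]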

Adding more CPU cores does not improve response time, ever

The best response time you can get is when you are the only user of the system (assuming the system is in a steady state and not in a transient moment).

If you want to speed up your app you have 3 main choices:

  • improve the app
  • run it on a faster processor
  • split your system

In short: more CPU improves throughput and/or concurrent capacity, as long as your app is multithreaded and there are no bottlenecks elsewhere

Oversubscription impact on response time

Be warned: oversubscription increases response time by a factor greater than the oversubscription ratio. For instance, an oversubscription of 3 increases response time up to 5 times. And I’m not even considering the overhead of a hypervisor scheduler!
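Read through the earlier model (slowdown = K * O), the example above implies a K of up to about 1.67. A minimal sketch of that arithmetic; the function name is hypothetical:

```python
def implied_k(slowdown, oversubscription):
    """Solve slowdown = K * O for K, given an observed slowdown and ratio O."""
    if oversubscription <= 1:
        raise ValueError("the model only applies when O > 1")
    return slowdown / oversubscription


k = implied_k(slowdown=5, oversubscription=3)
print(f"K is about {k:.2f}")  # K is about 1.67
```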

[Graph: oversubsoverhead (oversubscription overhead)]

Context switches

Context switches consume CPU cycles to move the state of one thread off the processor and the state of another one onto it.
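If you want to see this on your own workload, the operating system keeps per-process counters. A minimal sketch, assuming a Unix-like system where Python's `resource` module is available; the `measure` helper and the sample workload are hypothetical. Involuntary switches are the ones that tend to grow when more threads compete for the same CPU.

```python
import resource


def context_switches():
    """Return (voluntary, involuntary) context switches of this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def measure(work):
    """Run work() and return how many context switches happened in the meantime."""
    v0, i0 = context_switches()
    work()
    v1, i1 = context_switches()
    return v1 - v0, i1 - i0


# Sample CPU-bound workload (assumption: any callable works here).
vol, invol = measure(lambda: sum(i * i for i in range(1_000_000)))
print(f"voluntary: {vol}, involuntary: {invol}")
```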

Software interrupts

Software interrupts increase when more processes are competing for the CPU

Conclusions

More processes competing for the CPU -> more context switches -> more software interrupts -> less chance for a unit of work to finish -> more latency -> bad response time -> poor online systems -> bad user experience

Disclaimer

This study refers to pure CPU workloads; in real life, I/O, contention, and other latencies make the problem more complicated

Enjoy 😉

 

 

 
