I had an idea for integrating automatic thread pool usage into a programming language.
First, if you don’t understand thread pools and how they are best used, and how they can be superior to manual thread management, then you should read up on how and why thread pools work – but basically keep in mind that thread pools, if used optimally, can get you closer to linear scalability than any other model.
Also, it’s worth reminding yourself that your application spends almost no time on the processor. Unless you’re doing 3D rendering, or video encoding, the processor probably sees brief flashes of your application separated by what are relatively large stretches (a virtual eternity from a processor’s perspective) where every thread in your application is in a wait state. Most importantly, this is usually true even during the heaviest use of your application, because the processor is simply not the bottleneck these days (except in the rare cases already noted).
If you aren’t using a thread pool, or you are using a thread pool but doing so suboptimally, your application may be in a wait state when work actually needs to be done. This is what destroys scalability. In a thread pooled environment, this happens when all the threads in your pool are stuck in a wait state, waiting for one of the things threads wait for (disk access, network access, synchronization lock, etc). More advanced pool management can detect this situation and increase the number of available threads so that work can get done while those other threads sleep. This helps with scalability, but comes with its own expense thereby making the scalability nonlinear (i.e. the processing power you were able to tap didn’t go as far as the processing power you already had available).
This situation is avoidable. You can avoid it by breaking your application up into segments of code that never ever go into a kernel wait state. You do that by using the async requests available for those requests. Instead of calling Read() on a stream, you call BeginRead() at the end of one segment, and then EndRead() at the beginning of another. When your segment calls BeginRead() and quits, the thread in the thread pool can then move on to the next work item without delay, and when the read completes, the next segment can be queued to handle it.
The problem is that coding like this is difficult. It’s difficult because breaking up your code into linear functions is natural, but dividing a function into multiple functions along every blocking call is just not intuitive, and it becomes quite difficult to do things like loop. How do you start a for loop in one function that terminates in another? You can achieve this result by passing a state variable between calls, but this becomes extremely complicated.
With version 2.0 of C#, we’ve seen the compiler rig up some interesting code for us to support automatic enumerators. We can create a function that returns IEnumerable<string> and then yield return strings from our procedure as many times as we want. This is much easier than the alternative which is to create a new class that implements IEnumerable<string>, and another class that implements IEnumerator<string> which has our member variables, and responding to Next() requests instead of just calling yield return. That’s trouble enough, but try doing a for loop in the middle of that and you’ll see just what the compiler is doing for you automatically when you use yield return.
If the compiler can break up the developer’s enumerating function into a couple of class definitions and multiple functions, then the compiler should be able to do the same thing for our function’s blocking calls.
Here is an example function which takes an array of HTTP URLs, fetches them, and writes them all out to a file, “output.txt”.
void MyFunction(string[] requests)
{
using (Stream output = File.OpenWrite("output.txt"))
{
byte[] buffer = new byte[1024];
foreach (string request in requests)
{
HttpWebRequest hreq = (HttpWebRequest)WebRequest.Create(request);
HttpWebResponse hresp = (HttpWebResponse)hreq.GetResponse();
using (Stream input = hresp.GetResponseStream())
{
int len;
while (0 != (len = input.Read(buffer, 0, buffer.Length)))
{
output.Write(buffer, 0, len);
}
}
}
}
}
I’ve highlighted in red the functions which are blockers. Calls to these functions will put the thread to sleep for what could be a very long time in processor terms. For this reason, these functions have asynchronous, Begin/End versions which can be used instead. However, try to do that with the above code and you are left with the following abomination:
class MyFunctionState
{
internal Stream output;
internal Stream input;
internal byte[] buffer;
internal string[] requests;
internal IEnumerator request_enumerator;
internal HttpWebRequest hreq;
internal HttpWebResponse hresp;
internal string request;
internal int len;
}
void MyFunction(string[] requests)
{
MyFunctionState state = new MyFunctionState();
state.requests = requests;
state.output = File.OpenWrite("output.txt");
state.buffer = new byte[1024];
state.request_enumerator = requests.GetEnumerator(); // for loop
MyFunction_Segment1(state);
}
void MyFunction_Segment1(MyFunctionState state)
{
if (!state.request_enumerator.MoveNext()) // for loop
{
// after for loop
state.output.Dispose();
return;
}
state.request = (string)state.request_enumerator.Current;
state.hreq = (HttpWebRequest)WebRequest.Create(state.request);
state.hreq.BeginGetResponse(new System.AsyncCallback(MyFunction_Segment2), state);
}
void MyFunction_Segment2(IAsyncResult result)
{
MyFunctionState state = (MyFunctionState)result.AsyncState;
state.hresp = (HttpWebResponse)state.hreq.EndGetResponse(result);
state.input = state.hresp.GetResponseStream();
MyFunction_Segment3(state);
}
void MyFunction_Segment3(MyFunctionState state)
{
state.input.BeginRead(state.buffer, 0, state.buffer.Length, new AsyncCallback(MyFunction_Segment4), state);
}
void MyFunction_Segment4(IAsyncResult result)
{
MyFunctionState state = (MyFunctionState)result.AsyncState;
state.len = state.input.EndRead(result);
if (state.len == 0) // while condition
{
// after while loop
state.input.Dispose();
MyFunction_Segment3(state);
return;
}
state.output.BeginWrite(state.buffer, 0, state.len, new AsyncCallback(MyFunction_Segment5), state);
}
void MyFunction_Segment5(IAsyncResult result)
{
MyFunctionState state = (MyFunctionState)result.AsyncState;
state.output.EndWrite(result);
MyFunction_Segment3(state); // top of while
}
If this made you cringe, then you’re not alone. There are several things to notice here:
- The result is extremely ugly and hard to follow, but does achieve scalability.
- You have an extra state class and several extra functions which have no purpose except to implement what would conceptually be part of the original function.
- The using, while, and foreach all had to be manually implemented, and the using was actually not implemented as a try/catch as it should be (it probably could, but that would have been even more confusing).
I propose that the language provide a keyword by which you could have all this done for you; much like yield return does so for you with automatic enumerators. Just what the keyword should look like, and how this should be implemented, I’m not 100% sure yet, but I wanted to get this idea down before I forgot it.
In this example, by the way, the MyFunction() method returns immediately before it is complete. This may not be ideally how this would work (i.e. you might even want to return a value… I’ve left off this detail for now. Adding this should be fairly trivial).