Wednesday, October 25, 2006

Since the SoC ended I've been hard at work on both college and life, so coding time has been severely reduced. Nonetheless I have managed a few good things and a few bad things for the library ;)

For the good things: I now have memory usage in the range of 10-13 megabytes with 3 torrents running simultaneously. There are still one or two more optimisations that I have left to try. I'm at the level where calling DateTime.Now is responsible for 30% of my ongoing allocations! OK, so I am calling it ~400 times a second, but that has nothing to do with it ;) Bug fixes have been flying in and, most importantly (imo), I have received my first patches from other people! Yes, other people have sent me patches for MonoTorrent, amazing stuff.

The bad things: I've found some serious bugs in MS.NET which were the root cause behind the mysterious freezing I used to get in the client library. What happens is that sometimes an asynchronous call to socket.BeginReceive would actually be completed synchronously. "But that's good, it saves the computer from starting up another thread" you might think, but you'd be wrong!

My code was built on the assumption that an asynchronous call was always asynchronous. So, the problem is this. When dealing with multithreaded code you have to make sure that a different thread doesn't modify/delete data that your main thread is working on. To do this, you must put a "lock" on the item you're working on. Then when Thread2 sees the lock, it won't do anything until Thread1 is finished and releases its lock. Problems arise when Thread1 holds a lock on item A and wants a lock on B, while Thread2 already holds a lock on B and wants a lock on A. In this scenario, neither thread can go forward and so your application locks up.
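A minimal sketch of that A/B scenario, assuming two plain lock objects. `LockOrderDemo`, `TryTakeBoth` and `Run` are hypothetical names made up for illustration, not anything from MonoTorrent. The second lock is taken with a timeout via `Monitor.TryEnter`, so the demo reports the conflict instead of freezing the way a plain nested `lock` would.

```csharp
using System;
using System.Threading;

static class LockOrderDemo
{
    static readonly object A = new object();
    static readonly object B = new object();

    // Locks 'first', then tries 'second' with a timeout so the demo reports
    // the conflict instead of hanging like a plain nested lock would.
    static bool TryTakeBoth(object first, object second, Barrier barrier)
    {
        lock (first)
        {
            barrier.SignalAndWait();   // both threads now hold their first lock
            bool gotSecond = Monitor.TryEnter(second, TimeSpan.FromMilliseconds(200));
            if (gotSecond) Monitor.Exit(second);
            barrier.SignalAndWait();   // keep 'first' until both attempts have finished
            return gotSecond;
        }
    }

    // Returns { Thread1 got its second lock, Thread2 got its second lock }.
    public static bool[] Run()
    {
        bool[] results = new bool[2];
        using (Barrier barrier = new Barrier(2))
        {
            // Thread1 takes A then B; Thread2 takes B then A (opposite order).
            Thread t1 = new Thread(() => results[0] = TryTakeBoth(A, B, barrier));
            Thread t2 = new Thread(() => results[1] = TryTakeBoth(B, A, barrier));
            t1.Start(); t2.Start();
            t1.Join(); t2.Join();
        }
        return results;
    }
}
```

Calling `LockOrderDemo.Run()` should see both attempts time out, since each thread is holding the lock the other one wants. With plain `lock` statements in place of the timeout, both threads would wait forever. Taking the locks in the same order on every thread is the usual way to avoid this.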

Unfortunately this is exactly what happens when my socket call is completed synchronously. Thank Microsoft for such a nice undocumented feature.
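One possible way to guard against this, sketched under assumptions: check `IAsyncResult.CompletedSynchronously` in the callback and, if it is set, hand the work to the thread pool so it no longer runs on top of whatever locks the calling thread still holds. `PeerConnection`, `peerLock` and `onData` are hypothetical names for illustration. This is not MonoTorrent's actual code, nor the temporary fix mentioned in this post.

```csharp
using System;
using System.Net.Sockets;
using System.Threading;

// Hypothetical wrapper around a connected socket.
sealed class PeerConnection
{
    readonly Socket socket;
    readonly byte[] buffer = new byte[16 * 1024];
    readonly object peerLock = new object();
    readonly Action<int> onData;   // invoked with the number of bytes received

    public PeerConnection(Socket socket, Action<int> onData)
    {
        this.socket = socket;
        this.onData = onData;
    }

    public void StartReceive()
    {
        lock (peerLock)
        {
            // The callback may run before BeginReceive returns,
            // i.e. while this thread is still holding peerLock.
            socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceived, null);
        }
    }

    void OnReceived(IAsyncResult result)
    {
        if (result.CompletedSynchronously)
        {
            // Probably still on the thread that called BeginReceive, with its
            // locks held: move the real work onto a thread pool thread.
            ThreadPool.QueueUserWorkItem(_ => HandleReceive(result));
            return;
        }
        HandleReceive(result);
    }

    void HandleReceive(IAsyncResult result)
    {
        int count = socket.EndReceive(result);
        lock (peerLock)
        {
            onData(count);
        }
    }
}
```

The extra hop costs a thread pool dispatch on the synchronous path, but it means the handler always starts without locks inherited from the caller, so the lock ordering in the rest of the code can be reasoned about in one way only.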

Finally in bad news: download speeds have gone to crap. I'm going to have to remove my rate limiting code and get everything working hunky-dory before I reimplement rate limiting. My whole locking structure needs to be reviewed in light of the above-mentioned bug. I've put in a temporary fix, but a side effect of it is to make downloading go pretty damn slowly.

Great fun all round ;)

Alan.

6 comments:

Unknown said...

I'm not sure that's an MS.NET bug. .NET uses the ThreadPool for many asynchronous calls. Read this:
http://msdn2.microsoft.com/en-us/library/0ka9477y.aspx

Alan said...

Well, that's fine! I know that socket async calls will use ThreadPool threads (or IOCP threads). However, the problem is that the call is not being completed asynchronously, hence I had threading bugs. I know that FileStreams do complete small reads/writes synchronously even if you call the async method, but that isn't mentioned anywhere for sockets!

Viraptor said...

Did you try reporting it to MS? Can you give us a link to the opened ticket?

Unknown said...

As you can read in the documentation, a call to BeginReceive (and every BeginWhatEverOperation) returns an IAsyncResult object. It has a property called CompletedSynchronously (http://msdn2.microsoft.com/en-us/library/system.iasyncresult.completedsynchronously.aspx). This property returns true if the code was executed on the current thread and not on another thread. Why?
"The underlying operating system might determine that it can complete an operation directly more efficiently than spawning a thread and dealing with the associated task switching overhead. If that happens, your completion callback will be called on the main thread and the CompletedSynchronously flag will be set to True."
More information:
http://msdn2.microsoft.com/en-us/library/ms228963.aspx
It's definitely not a bug but a feature ;)

mdi said...

pascal: It's one thing to queue the operation, do it quickly and say "hey, I happen to have completed this operation", but it's another to have a multi-second hang because the operation was not actually async.

Alan said...

pascal: That's a very good point. I did know that IAsyncResult had a CompletedSynchronously property, but I never thought of checking it as I couldn't find any documentation to say that BeginXXX socket operations can be completed synchronously. After checking the property, it is being set to true. Still, if I had thought of checking that before, it would've saved quite a few weeks of head banging with me saying "but how can this be happening!" :p

@hunog: It's not so much a "multi-second hang" as a complete and utter freeze up :p But it's all fixed now, thank god!
