15 March, 2013

My experiences with thread-safe code - 1

In this post, let me share my views and experiences about things to be done while coding in a distributed environment with respect to thread safety.

Anyone who has taken a course on operating systems will know about thread-safe code. For those who don't know anything about it, let me give a gist as always.

What is a thread?
A thread is a lightweight process: an independent flow of control that shares memory with the other threads of the same process.

What is a process?
A process is a program in execution.

For example, consider Microsoft Word. When you open it, a process starts running, i.e. the program is in execution. The program has lots of features like spell check, auto-correct etc., which can run as separate threads within that process.

In a single-threaded process, only one flow of control exists, so the code it executes need not be thread-safe. In multithreaded programs, the same functions and the same resources may be accessed concurrently by several flows of control.

Let me explain thread safety through an example.

Consider a group of people who come together to perform a particular task.
Each person has a role, i.e. they have some boundaries/memory which they don't cross.
Ex: Hadoop, GFS (Chubby), OpenStack etc.

To illustrate further, consider a bathroom.
Only one person can use it at a time. Everyone else has to wait for their turn.

Thread-safe code protects shared resources from concurrent access, typically by using locks.
Thread safety concerns only the implementation of a function and does not affect its external interface.
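
To make the bathroom idea concrete, here is a minimal sketch of a lock guarding a shared resource, assuming a POSIX system with pthreads. The names shared_counter, counter_lock, increment_worker and run_counter_demo are made up for this illustration and do not come from any real library.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS_PER_THREAD 100000

/* Shared resource and the lock that guards it (illustrative names). */
static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker takes the lock before touching the counter, so only one
 * thread at a time is "inside the bathroom". */
static void *increment_worker(void *arg)
{
    (void)arg; /* unused */
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final counter
 * value, or -1 if not every thread could be created. */
long run_counter_demo(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    shared_counter = 0;
    while (started < NUM_THREADS &&
           pthread_create(&threads[started], NULL, increment_worker, NULL) == 0) {
        started++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    return started == NUM_THREADS ? shared_counter : -1;
}
```

Calling run_counter_demo() from main (compiled with -pthread) should give NUM_THREADS * INCREMENTS_PER_THREAD. If the lock/unlock pair is removed, the increments from different threads can interleave and updates may be lost, which is exactly the kind of corruption a lock is meant to prevent.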

For further info please read this.

So, having covered some basics of thread-safe functions, let me share my experience with them.

If you have read my previous post on design patterns, you will remember that I spoke about something called service and session. At my previous company, I was writing a separate library whose objective was to provide features such as calculating the current time and attaching it to a message, based on the customer-specified version.

This library would be loaded by the service using dlopen and accessed for each and every session. To calculate the current time, I was looking for library functions that would return the time/date, and I came across a function called localtime. I was happy, used it, finished my implementation, and waited for code review comments from my technical lead.

To my shock, I was told the implementation should not have been done the way I had done it.
My usage was proper. I had used a standard libc function, which is well-tested code.

But then why?

I had missed the note below in the documentation of the localtime function.

/***********************
The function accesses the object pointed by timer.


The function also accesses and modifies a shared internal object, which may introduce data races on concurrent calls to gmtime and localtime. Some libraries provide an alternative function that avoids this data race: localtime_r.
***********************/
I should have used the localtime_r function. Why?

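Because multiple sessions can run in parallel in the server, there is a real chance of several sessions calling into this library at the same time. Each localtime call writes its result into the same shared internal object, so one session can overwrite the result another session is still reading, leading to corruption.
I was thankful that it was caught at the review stage; otherwise it would have slipped through to deployment.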
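
For illustration, here is a rough sketch of what the thread-safe version could look like, assuming a POSIX system where localtime_r is available. The helper name format_local_timestamp and the timestamp format are hypothetical and are not the actual code from my library.

```c
#define _POSIX_C_SOURCE 200809L /* asks the headers to declare localtime_r */
#include <time.h>

/* Hypothetical helper: formats `when` as local time in the form
 * "YYYY-MM-DD HH:MM:SS" into the caller's buffer.
 * Returns 0 on success, -1 on failure (including a buffer that is
 * too small). */
int format_local_timestamp(time_t when, char *buf, size_t buf_size)
{
    /* Each call gets its own struct tm on the stack, so parallel
     * sessions never share the result object the way they would
     * with plain localtime. */
    struct tm broken_down;

    if (localtime_r(&when, &broken_down) == NULL) {
        return -1;
    }
    if (strftime(buf, buf_size, "%Y-%m-%d %H:%M:%S", &broken_down) == 0) {
        return -1;
    }
    return 0;
}
```

A session would call it as format_local_timestamp(time(NULL), buf, sizeof buf) with a buffer of at least 20 bytes. The point of the design is that both the struct tm and the output buffer belong to the caller, so nothing is shared between sessions.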

So the lesson learnt from this:

"If you are programming in a multithreaded environment and you want to use a function from libc or a similar library, confirm that it is thread-safe before using it."

However, there are many other lessons, which I will post gradually. I have only touched on a few things in this post.

Thanks for reading. Let's conclude here for now on my views/experiences on writing thread-safe programs.

Please post any comments/suggestions so that I can improve.
 
