This mainly occurs when a person receives a call or dials a virtual number for the first time. It happens because, when the response provided by a client's callback is executed, the media file must first be downloaded from its current location and saved on the Africa's Talking servers; only then is the file played.
To prevent this delay, you can upload the media file to our servers in advance using the upload media function. Here's a tutorial on how to do this:
http://docs.africastalking.com/voice/uploadmedia
The file will be uploaded directly to our servers, so it does not have to be downloaded while the call is being handled.
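As a rough illustration of uploading a file in advance, here is a minimal Python sketch. The endpoint URL, the apiKey header, and the username, phoneNumber and url field names are assumptions for illustration and are not confirmed by this article; please check them against the tutorial linked above before relying on them. The helper names build_upload_request and upload_media are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; confirm it against the upload media tutorial.
UPLOAD_URL = "https://voice.africastalking.com/mediaUpload"


def build_upload_request(username, api_key, phone_number, media_url):
    """Build (but do not send) a request asking the server to fetch a media file."""
    # The field names below are assumptions, not confirmed by this article.
    payload = {
        "username": username,
        "phoneNumber": phone_number,  # the virtual number that will play the file
        "url": media_url,             # where the media file currently lives
    }
    return urllib.request.Request(
        UPLOAD_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "apiKey": api_key,  # assumed authentication header name
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def upload_media(username, api_key, phone_number, media_url, timeout=30):
    """Send the upload request and return the raw response body as text."""
    request = build_upload_request(username, api_key, phone_number, media_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

You would call upload_media once for each media file, before the first call that needs it, passing your own username, API key, virtual number and the file's current URL.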
For further information, get in touch with us at support@africastalking.com.