Wednesday, 23 December 2009

Why Delphi BDE applications sometimes fail with shared memory errors

SHAREDMEMLOCATION BDE Configuration Parameter
Borland Database Engine (BDE) is a data access technology that Borland introduced long ago with Borland C++ and that Delphi later used to unify access to database sources. At the time it had features that gave it an edge over ODBC. For instance, if you followed some best practices, you could write applications that accessed heterogeneous databases with the same client code (fat clients were common then). Data access was also fast, as Delphi was and still is for Win32 applications. Over time other data access technologies surfaced, but many legacy applications kept accessing data through the BDE native drivers. Incompatibilities can start to appear with new operating systems. BDE is also a black box: documentation is scarce and it is no longer supported, so if you are using newer databases, chances are that you cannot access some of their new features, or that some things that worked on previous DB engines do not work with the new versions. So when you or a customer decides to upgrade a database, careful analysis and testing have to be performed.

After a long period analysing stability problems in custom applications at a customer, I found the reason for some random "$210D: Shared Memory Conflict" errors occurring on Windows Server 2008 (they did not occur on another deployment server running Windows 2003). The cause was traced to the address space layout randomization (ASLR) mechanism. On Windows, ASLR first appeared in Vista. It introduces some randomness into the load address of specially marked EXEs and DLLs (those with the ASLR bit on) within the 32-bit address space of a Win32 process. On these operating systems some of the OS DLLs are already marked with that flag; I am talking about DLLs like NTDLL.DLL, KERNEL32.DLL, etc.

To make a very long story short:
- BDE is configured by default to allocate the shared memory buffer it uses at 6BDE (0x6BDE0000, for Windows NT and above). This is inside the region used by the ASLR mechanism (above 0x50000000). The address is controlled by the SHAREDMEMLOCATION setting, which can be changed in the BDE Administrator Control Panel applet or in the registry (under the [HKEY_LOCAL_MACHINE\SOFTWARE\Borland\Database Engine\Settings\SYSTEM\INIT] key).
- Every time the server is rebooted a new randomization occurs, and from then on DLLs can load in a different zone than their preferred load address. If a DLL is loaded into the 6BDE zone (or at whatever address above 5000 is set in SHAREDMEMLOCATION), BDE tries to allocate the SHAREDMEMSIZE buffer in another region and typically succeeds. I have not fully understood the algorithm used for this relocation, since I was not given enough time to debug this BDE behaviour (in assembly).
- When you open a second BDE application on the same BDE client (i.e. in the same Windows session on a PC with BDE installed locally), it is never able to locate the shared memory block at the relocated address and fails with "$210D: Shared memory conflict". Again, I did not fully understand why this failure occurs (i.e. what kind of search is done and why it fails to locate a block of memory previously allocated by the same BDE code). A little more time spent debugging should give a full understanding of the issue, but the customer needed a quick fix, so I had to move on.
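
To make the collision concrete, here is a minimal Python sketch. It assumes, as the 6BDE / 0x6BDE0000 example suggests, that the setting supplies the upper 16 bits of a 32-bit address and that the block spans SHAREDMEMSIZE kilobytes from there. The function names and the DLL address in the usage lines are made up for illustration; this only models the overlap, not what BDE does internally.

```python
GRANULARITY = 0x10000  # assumption: the setting is the high 16 bits of the address


def shared_block(location, size_kb):
    """Return (start, end) for a setting such as "6BDE"; end is exclusive."""
    start = int(location, 16) * GRANULARITY
    return start, start + size_kb * 1024


def collides(block, module_base, module_size):
    """True if a loaded module overlaps the requested shared memory block."""
    start, end = block
    return module_base < end and start < module_base + module_size


# Default setting with the default 2048 KB buffer.
block = shared_block("6BDE", 2048)

# Hypothetical DLL that ASLR happened to place inside that range after a reboot.
print(collides(block, 0x6BF00000, 0x00100000))
```

The same two functions can be pointed at any candidate setting and any module list copied from a memory map.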

The solution:
First, list all the BDE applications that a specific user will use. Using a memory mapping tool like VMMap, analyse each of those applications. Find a region outside the ASLR region (below 0x50000000 minus the shared memory size configured in the BDE Administrator) that is consistently unused in every application. Notice that Delphi applications load at a preferred address (0x00400000 by default, AFAIR), which can be changed in the Delphi compiler options. Also notice that old DLLs like the BDE ones (IDAPI32.DLL) are not marked with the ASLR bit, so they load at the preferred address stated in the file (PE header) unless another DLL is already loaded there.
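
The search for a common free region can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the occupied regions are typed in by hand as (base, size) pairs copied from each application's memory map, the 0x50000000 ceiling and the 1000 lower bound are the ones used in this post, and the application names and regions are invented.

```python
ASLR_FLOOR = 0x50000000   # upper bound used in this post
LOWEST = 0x10000000       # setting 1000, the lowest value the post lists as valid
STEP = 0x10000            # one unit of the setting


def common_free_locations(apps, size_kb, low=LOWEST, high=ASLR_FLOOR):
    """Return settings whose whole block is free in every application.

    apps maps an application name to a list of (base, size) occupied regions.
    """
    size = size_kb * 1024
    free = []
    for start in range(low, high - size + 1, STEP):
        end = start + size
        busy = any(base < end and start < base + length
                   for regions in apps.values()
                   for base, length in regions)
        if not busy:
            setting = start >> 16
            free.append(f"{setting:04X}")
    return free


# Hypothetical memory maps for two BDE applications.
apps = {
    "orders.exe": [(0x00400000, 0x00200000), (0x3BDE0000, 0x00010000)],
    "billing.exe": [(0x4BDE0000, 0x00100000)],
}
candidates = common_free_locations(apps, size_kb=2048)
print(len(candidates), "candidate settings, for example", candidates[0])
```

A brute-force scan is enough here because there are only a few thousand possible settings below the ceiling.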

Once that analysis is done, choose a valid address for SHAREDMEMLOCATION, such as 3BDE. Valid values for Windows NT and above range from 1000 to 7F00 but, as you have seen, if you are using Vista, Windows Server 2008 or any other OS with the ASLR mechanism working, do NOT leave this setting at its default of 6BDE. It could be only a matter of time before users start reporting these errors. The worst part is that you will not be able to reproduce the issue easily on other test machines.
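
These rules are easy to turn into a quick sanity check before touching a customer machine. The Python sketch below is a hypothetical helper, not part of BDE: it only encodes the range and the 0x50000000 limit stated in this post, and it assumes the setting is the high 16 bits of the address.

```python
def check_location(value, size_kb=2048):
    """Return a list of problems found with a candidate SHAREDMEMLOCATION."""
    try:
        loc = int(value, 16)
    except ValueError:
        return [f"{value!r} is not a hexadecimal value"]

    problems = []
    if not 0x1000 <= loc <= 0x7F00:
        problems.append("outside the range 1000-7F00")
    if loc == 0x6BDE:
        problems.append("this is the default, which sits in the ASLR region")
    # End of the block, assuming it starts at <value>0000 and spans size_kb.
    end = (loc << 16) + size_kb * 1024
    if end > 0x50000000:
        problems.append("block reaches the region at or above 0x50000000")
    return problems


for candidate in ("3BDE", "6BDE", "4FF0"):
    print(candidate, check_location(candidate) or "looks fine")
```

An empty list only means the value passes these static rules; the memory map of each application still has to confirm that the region is actually free.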

To locate the BDE shared memory block in VMMap, look for a block of memory with the configured SHAREDMEMSIZE (if needed, close all BDE apps and change SHAREDMEMSIZE to a unique value so that you can tell it apart from other blocks).

I was able to reproduce this error on XP (SP2) by doing the reverse procedure, which was important for concluding that the cause of the error had been found. To reproduce the error, follow these steps: analyse the memory map of a BDE app, locate a DLL with the ASLR bit off (like IDAPI32.DLL) and take note of its load address. If it is in the valid range for SHAREDMEMLOCATION values (described above), set the parameter to the same value. If not, choose another DLL in the same conditions. Close all BDE applications. Open the first one and notice that the shared memory block is not allocated at the configured address, because it collides with that DLL (see above for how to locate the block). Open the same BDE application again and the $210D error will surface, even on XP.
Note: Process Explorer will let you detect whether a process or DLL has ASLR enabled (add the ASLR column to the DLLs list, for instance).
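
If you would rather check the file itself, both the preferred load address and the ASLR bit live in the PE header. The Python sketch below reads them from raw bytes; the offsets follow the published PE format (ImageBase in the optional header, DllCharacteristics flag 0x0040 for a dynamic base), while the function name and the synthetic image built in the example are my own illustration rather than a real DLL.

```python
import struct

DYNAMIC_BASE = 0x0040  # IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE (the "ASLR bit")


def pe_load_info(data):
    """Return (preferred_image_base, aslr_enabled) for the bytes of an EXE or DLL."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    opt = pe_offset + 4 + 20  # skip the signature and the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:        # PE32 (32-bit image)
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:      # PE32+ (64-bit image)
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    (dll_chars,) = struct.unpack_from("<H", data, opt + 70)
    return image_base, bool(dll_chars & DYNAMIC_BASE)


# Synthetic 32-bit image header, built only to exercise the parser.
fake = bytearray(0x200)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)
fake[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", fake, 0x80 + 24, 0x10B)
struct.pack_into("<I", fake, 0x80 + 24 + 28, 0x4BDE0000)
base, aslr = pe_load_info(bytes(fake))
print(hex(base), aslr)
```

With a real file you would pass the result of open(path, "rb").read(), where path is whichever EXE or DLL you want to inspect.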

SHAREDMEMSIZE 
Oh, and BTW, if the error "$2501: Insufficient memory for this operation" is appearing for some other users, increase SHAREDMEMSIZE to 8192 or an even greater value (the maximum is 60000K; the minimum and default is 2048). Each BDE app consumes a fixed part of this shared memory buffer, so a bigger buffer allows more applications to be open at the same time. This increases the footprint of each BDE EXE once loaded into memory, but that should not be a problem if you still have enough virtual memory available. Another interesting thing I discovered in the process is that with the maximum value for SHAREDMEMSIZE I managed to open the maximum number of simultaneous BDE clients in a Windows session: 48. The limit on the number of BDE apps that can be open at the same time is described in the "BDE Limits" help topic of BDEADMIN.HLP.

I leave you with a tip: migrate these BDE apps to newer technologies ASAP.
You'll be doing yourself and your customer a big favor. Problems are around the corner, and if you have to track them down by yourself you are alone in the dark (no support, no technical documentation, no source code, etc.).

For details on these and other BDE parameters, see the BDEADMIN.HLP help file.

10 comments:

  1. THANKS! Thanks a LOT! You helped me A LOT! Thanks for sharing!

  2. Fabiano, I am very glad to have helped someone ("on the other side of the Atlantic", surely, did I guess right?).

    This situation was very complex to track down and I thought it was extremely important to share.

    Always at your service!

  3. Thanks for the article! It was the first article I read about the problem that explained exactly what the problem was and gave a definitive way to fix it (instead of just saying to try changing the SHAREDMEMLOCATION to some random value until it all just happens to work). We’ve been looking for the solution to this problem for a long time and now, finally, we’ve managed to solve it (as our software vendor just kept saying they don’t support the version we’re using anymore and that we need to upgrade)!

    I also want to add a contribution: our ERP relies on BDE to access the database, and we wanted to be able to open any of the applications at any time. So we created a small Python script that automatically analyzes the output of the vmmap tool and tells us the free memory addresses that are common to all of the applications. I want to share the script, as it might help other people who are facing the same problem we were. The script is available here: http://bit.ly/11q3bTp. All you have to do is run the vmmap tool, select the application you want to investigate and save the output as an mmp file (which is actually an XML file). Repeat this step for all the applications that use BDE to access data, saving all the “.mmp” files in one folder. Once done, put the script inside the same folder and run it. Before running it, you might want to edit the value of the variable SHAREDMEMSIZE, at the beginning of the script, so that it matches the value in your configuration file. The script was created in a hurry, so I agree the code could be better optimized. But as it has already helped us find the value for SHAREDMEMLOCATION in our applications, we’ve decided to leave it as is (our goal was already met).

    Another gotcha we’ve met along the way: watch out for the value of “Save for use with”, on the “Object > Options” menu. It must be set to “Windows 95/NT only” and not to “Windows 3.1 and Windows 95/NT”. The former saves the settings to the Windows registry, while the latter saves them to the registry AND to a file. The issue is that when you select the “Windows 3.1 and Windows 95/NT” option, even though the settings should be saved both to the registry and to the configuration file, they are not:

    - When you update any value, the changes are applied only to the file (the registry keeps the old data);
    - When you open an application, the library loads the settings from the registry (ignoring the file and, thus, the modifications you’ve just made).

    We’ve discovered this awkward behavior after a long process of changing the values and not noticing any change to the vmmap output (we’re using BDE version 5.01).

    Again, thanks a lot for sharing your findings in this post; I also hope our script can help others who are facing the same problem we were.

    Replies
    1. Luis, thanks for your feedback and for sharing some additional insight. If you don't mind, I'll post your info as an individual article as soon as I find the time for it. Giving you credit, naturally.

    2. Luis, thanks for posting the script. I was wondering if I could have a little technical help on it. In the function retorna_posicoes, I get the following exception thrown:

      AttributeError: Document instance has no attribute '__exit__'

      Please note that this is after I fix the call to os.listdir in processa_archivos().

  4. Thank you very much!
    Your suggestion to use a SHAREDMEMLOCATION below 0x5000(0000) was the key to solving our problem with error $2501.
    From now on we use "0x3BDE" for SHAREDMEMLOCATION, considering that 0x4000 is used by our Delphi applications.

    We did have some trouble with the BDE, though, because it seems that it sometimes ignores the configuration file's settings and uses the registry settings instead. A test with VMMap showed that the registry settings are always applied.

    Therefore, as a precaution, our product updates change the values both in the configuration file (via the BDE driver) AND in the registry.
    Our customers who experienced problems told us that these seem to be solved after the recent update.

    Replies
    1. Hermann, thanks for your feedback. It made me feel that it was worth the time and effort I put into writing this down and sharing it with the world.

  5. Hi, first of all thank you for what is actually the only proper article about this issue, even now! I have been dealing with this issue for quite some time, but finally I see the light at the end of the tunnel.
    I am aware that many years have passed since your original post, but hopefully you can still help me.
    If I were to set SHAREDMEMSIZE to 65535 (the maximum, I believe), setting SHAREDMEMLOCATION to 3BDE is not a good option, right? Because adding 65535 KB to 3BDE goes beyond the ASLR-free range. Am I missing something obvious? These memory addresses/locations are way above my understanding.

    Replies
    1. Hi K! I almost never check Gmail, but today I did and saw the notification for your comment. It's been a long time since I published this "research" (2009) and I've almost forgotten it, along with its contents. But I'd say you're safe with those values: when you set 0x3BDE you're placing the block at 0x3BDE0000 in the 32-bit virtual address space, which certainly respects my suggestion (use a SHAREDMEMLOCATION below 0x5000(0000)). From 0x3BDE0000 to 0x50000000 there is roughly 322 MB of address space, far more than the 64 MB that 65535 KB amounts to :). Did you test it? Also bear in mind the tips from the comments above (set the value in the registry, because for some reason the BDE settings file sometimes does not override the registry setting). Try it for a while. If the problem disappears (nobody reports it after 1-2 weeks), you'll feel as happy as I did 12 years ago :). Hope this helps. Peace.

    2. Hi! Thank you for your response! So I have tried 3BDE and it has worked for certain VMs and not for others. I will try changing the registry settings as well. Overall, thank you very much, it certainly helped me. I just wonder why this is the SINGLE place on the entire internet that provides any reasonable answer. That is not common nowadays :)
