vrogier / ocilib

OCILIB (C and C++ Drivers for Oracle) - Open source C and C++ library for accessing Oracle databases

Home Page:http://www.ocilib.net

OCI_DequeueGet() returns messages with OCI_Object payloads that can have NULL properties while being NOT NULL when queued

prospero-team opened this issue · comments

Hello,

We have a problem with OCILIB when working with clients that send their requests through plain Oracle SQL.
Under certain conditions, the data is not read from the Oracle message.

In more detail: in JmsHeader (QUEUEUE_TYPE "SYS.AQ$_JMS_BYTES_MESSAGE"), payloads shorter than 2000 bytes are written to BYTES_RAW (through OCI_ObjectSetRaw); payloads longer than that are written to BYTES_LOB as a BLOB (through OCI_LobWrite).

Accordingly, the simplified code that reads these messages is as follows:

  len = OCI_ObjectGetDouble(ctx.obj, "BYTES_LEN");
  if (len < 2000)
  {
    size = OCI_ObjectGetRaw(ctx.obj, "BYTES_RAW", buffer, 10 * 1024);
    return "raw";
  }
  lob = OCI_ObjectGetLob(ctx.obj, "BYTES_LOB");
  isok = OCI_LobRead2(lob, buffer, &char_count, &byte_count);
  return "blob";

The behavior of this code depends on which kind of message arrives first.

If the first message was written to "BYTES_RAW", then later messages written to "BYTES_LOB" are not read. Conversely, if the first message was written to "BYTES_LOB", then later messages written to "BYTES_RAW" are not read.

You can see an example of this behavior here https://github.com/prospero-team/OciLibExample1/blob/main/README.md

Hi,

I reproduced the issue using the provided code.
In the first workflow (init, start reader, send raw, send blob), when the reader tries to retrieve the "BYTES_LOB" member of the object "AQ$_JMS_BYTES_MESSAGE", OCILIB returns a NULL OCI_Lob object.
If I kill the reader, restart it and send a blob again, OCILIB returns a valid OCI_Lob object.
This matches what you are describing.

I debugged the code and it seems to be an Oracle Client issue.
The Oracle OCI client exposes Oracle object types to C/C++ as dynamic C structures.
Oracle provides the Object Type Translator (OTT) tool, which generates C representations of these database object types.
OCILIB does not require users to use this tool. Instead, it handles all the complexity itself by getting and setting member values dynamically, based on the object member names.
As there is no way to express the concept of NULL for a value type in C, Oracle also provides, for each object type, another structure called an "indicator structure", which holds an indicator value (null / not null) for each member of the object structure instance.
Thus, when OCILIB retrieves an instance of a dynamic object structure from the Oracle client, it also retrieves an instance of its companion indicator structure.
Every time you call OCI_ObjectGetXXX(), OCILIB starts by checking the indicator value in the indicator structure.
If it is not null, OCILIB retrieves the value from the object structure.
If it is null, OCILIB returns immediately with the default value that represents null for the given type.
For an OCI_Lob instance, it therefore returns NULL.
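
The indicator check described above can be sketched in a few lines of plain C. This is only an illustration, not OCILIB's actual code: `demo_obj`, `demo_obj_ind` and `demo_get_lob()` are hypothetical names, and `OCIInd` with its two constants is redefined locally (matching the values in Oracle's headers) so that the sketch compiles without `oci.h`.

```c
#include <stddef.h>

/* Local stand-ins so the sketch builds without the Oracle headers.
   In oci.h, OCIInd is a signed 16-bit integer, OCI_IND_NOTNULL is 0
   and OCI_IND_NULL is -1. */
typedef short OCIInd;
#define OCI_IND_NOTNULL ((OCIInd)0)
#define OCI_IND_NULL    ((OCIInd)-1)

/* Hypothetical two-member object type and its indicator companion. */
typedef struct { int id; const char *lob; } demo_obj;
typedef struct { OCIInd _atomic; OCIInd id; OCIInd lob; } demo_obj_ind;

/* Return the lob member only when the indicators say it is NOT NULL.
   The member value itself is never inspected when the indicator is null. */
const char *demo_get_lob(const demo_obj *obj, const demo_obj_ind *ind)
{
    if (ind->_atomic == OCI_IND_NULL || ind->lob == OCI_IND_NULL)
        return NULL; /* default value standing for SQL NULL */
    return obj->lob;
}
```

With this pattern, a member that holds a perfectly valid value is still reported as NULL whenever the indicator structure being consulted is out of date.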

When I reproduced the issue, the value in the indicator structure for the field "BYTES_LOB" indicated that the property is NULL. This is why OCILIB returns a NULL OCI_Lob value.
But looking at the content of the field in the object structure, the BLOB handle is not null and is valid!
If I remove the check on the indicator value, then the BLOB content is read as expected.

So, in my opinion, this is an issue with the Oracle client function OCIObjectGetInd(), which returns an indicator structure with a wrong value in your workflow.

To make sure that the issue was not in OCILIB's internal computation of member offsets in the dynamic object and indicator structures, I used OTT to generate a C/C++ representation of "AQ$_JMS_BYTES_MESSAGE":

typedef OCIRef AQ__JMS_BYTES_MESSAGE_ref;
typedef OCIRef aq__jms_header_ref;
typedef OCIRef aq__agent_ref;
typedef OCIArray aq__jms_userproparray;

struct aq__agent
{
   OCIString * name;
   OCIString * address;
   OCINumber protocol;
};
typedef struct aq__agent aq__agent;

struct aq__agent_ind
{
   OCIInd _atomic;
   OCIInd name;
   OCIInd address;
   OCIInd protocol;
};
typedef struct aq__agent_ind aq__agent_ind;

struct aq__jms_header
{
   struct aq__agent replyto;
   OCIString * type;
   OCIString * userid;
   OCIString * appid;
   OCIString * groupid;
   OCINumber groupseq;
   aq__jms_userproparray * properties;
};
typedef struct aq__jms_header aq__jms_header;

struct aq__jms_header_ind
{
   OCIInd _atomic;
   struct aq__agent_ind replyto;
   OCIInd type;
   OCIInd userid;
   OCIInd appid;
   OCIInd groupid;
   OCIInd groupseq;
   OCIInd properties;
};
typedef struct aq__jms_header_ind aq__jms_header_ind;

struct AQ__JMS_BYTES_MESSAGE
{
   struct aq__jms_header header;
   OCINumber bytes_len;
   OCIRaw * bytes_raw;
   OCIBlobLocator * bytes_lob;
};
typedef struct AQ__JMS_BYTES_MESSAGE AQ__JMS_BYTES_MESSAGE;

struct AQ__JMS_BYTES_MESSAGE_ind
{
   OCIInd _atomic;
   struct aq__jms_header_ind header;
   OCIInd bytes_len;
   OCIInd bytes_raw;
   OCIInd bytes_lob;
};
typedef struct AQ__JMS_BYTES_MESSAGE_ind AQ__JMS_BYTES_MESSAGE_ind;

OCILIB handles these object types as opaque buffers and their indicator structures as dynamic arrays of OCIInd values.
In debug mode, I found that OCILIB computed the indicator for bytes_lob to be the 15th element.
That matches the OTT definition.

For example, the following C++ code:

    AQ__JMS_BYTES_MESSAGE_ind ind{};

    ((OCIInd*)&ind)[14] = -1;

sets the member ind.bytes_lob to -1 (OCI_IND_NULL).

This is why I think there is an issue in the Oracle Client code.
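
The position of bytes_lob in the flattened indicator array can also be checked with `offsetof`. The sketch below copies the three OTT indicator structures shown above; `OCIInd` is a local stand-in for the 16-bit type from `oci.h`, and the two helper functions are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>

/* Local stand-in for the signed 16-bit OCIInd type from oci.h. */
typedef short OCIInd;

/* Indicator structures as generated by OTT (copied from above). */
struct aq__agent_ind {
    OCIInd _atomic, name, address, protocol;
};
struct aq__jms_header_ind {
    OCIInd _atomic;
    struct aq__agent_ind replyto;
    OCIInd type, userid, appid, groupid, groupseq, properties;
};
struct AQ__JMS_BYTES_MESSAGE_ind {
    OCIInd _atomic;
    struct aq__jms_header_ind header;
    OCIInd bytes_len, bytes_raw, bytes_lob;
};

/* Zero-based index of bytes_lob when the whole indicator structure
   is viewed as a flat array of OCIInd values. */
size_t bytes_lob_ind_index(void)
{
    return offsetof(struct AQ__JMS_BYTES_MESSAGE_ind, bytes_lob) / sizeof(OCIInd);
}

/* Total number of OCIInd slots in the structure. */
size_t bytes_message_ind_count(void)
{
    return sizeof(struct AQ__JMS_BYTES_MESSAGE_ind) / sizeof(OCIInd);
}
```

Because every member is an OCIInd, the structures contain no padding: bytes_lob sits at index 14, the 15th and last slot, which agrees with the offset OCILIB computed.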

Regards,

Vincent

I could of course skip the indicator check for pointer-based properties and use the value whenever the pointer is not null, but if the Oracle client forgets to initialize some members, we might run into segfaults.

I ran more tests and found a way to solve the issue.
The Oracle OCI API is inconsistent when handling indicator structures of objects retrieved from AQ.
I will report it.
I will commit a fix in the v4.7.5 branch today.

@vrogier please do send us a testcase.

@cjbj I first wanted to do so and started an email to you and Anthony, but then I realized that the issue was between the desk and the computer!
I made a fix for this issue.

But I will send you an email to ask for a documentation improvement for OCIObjectGetInd() and OCIAQDeq().

I naively thought that I did not need to pass an indicator structure to OCIAQDeq(), since it returns an out object handle on which we can call OCIObjectGetInd() to retrieve the indicator structure (which is the same one that OCIAQDeq() fills when it returns the first time).
The problem is that on a second call to OCIAQDeq(), the out object handle is the same but the out indicator structure is different, while OCIObjectGetInd() still returns the same indicator structure as after the first call.
That is where the issue came from, and this behavior is very misleading.
I fixed the issue by always passing my local object instance indicator structure to OCIAQDeq().
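
The difference between the two approaches can be illustrated with a toy model in plain C. It is not the Oracle API and makes no OCI calls: `toy_dequeue()` and `toy_get_ind()` are hypothetical mocks that only mimic the behavior described above (the indicator attached to the object is filled on the first dequeue only, while the caller-supplied indicator is refreshed on every dequeue), and `OCIInd` with its constants is redefined locally.

```c
#include <stddef.h>

/* Local stand-ins for the Oracle definitions (values match oci.h). */
typedef short OCIInd;
#define OCI_IND_NOTNULL ((OCIInd)0)
#define OCI_IND_NULL    ((OCIInd)-1)

/* Indicator layout reduced to the two members of interest. */
typedef struct { OCIInd bytes_raw; OCIInd bytes_lob; } toy_ind;

/* Toy client state: the indicator block attached to the reused object. */
typedef struct { toy_ind attached; int filled; } toy_client;

/* Mock dequeue (assumption: models the behavior reported in this thread).
   The attached indicator is written on the first call only; the indicator
   supplied by the caller, if any, is written on every call. */
void toy_dequeue(toy_client *c, int is_lob_message, toy_ind *out_ind)
{
    toy_ind fresh;
    fresh.bytes_raw = is_lob_message ? OCI_IND_NULL : OCI_IND_NOTNULL;
    fresh.bytes_lob = is_lob_message ? OCI_IND_NOTNULL : OCI_IND_NULL;

    if (!c->filled) {
        c->attached = fresh;
        c->filled = 1;
    }
    if (out_ind != NULL)
        *out_ind = fresh;
}

/* Mock of querying the object for its indicator: returns the attached
   block, which reflects the first dequeue only. */
const toy_ind *toy_get_ind(const toy_client *c)
{
    return &c->attached;
}
```

In this model, dequeuing a raw message and then a blob message leaves `toy_get_ind()` reporting bytes_lob as OCI_IND_NULL, while an indicator passed to each `toy_dequeue()` call reports OCI_IND_NOTNULL for the second message, which is the symptom and the fix described above.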

@vrogier looking forward to hearing suggestions. And thanks for all the work you do on OCILIB.