CFE_TBL_FileDef Size Error

JimKaidyNASA opened this issue 5 years ago · comments

Before we go back and start editing our Simulink models to reduce name sizes, is there something that makes the accommodation of table names greater than 16 characters not possible in the code?

We’ve set the CFE_TBL_MAX_NAME_LENGTH to 28 and got through the initial issue with lengths. The problem shows up with the CFE_TBL_FileDef length not properly defined. I’ve looked to see how this happens and found eci_tbl_if.h to contain logic for defining this parameter. Is there another setting here or elsewhere to allow CFE_TBL_FileDef to be under a limit?

Here is the error if this helps:

JimKaidyNASA commented 5 years ago

Will do.

SpaceSteve121 · Answer 1 · Wed Sep 04 2019 23:44:16 GMT+0800 (China Standard Time)

@JimKaidyNASA Is it possible for you to post ctrl_srm_gains.c so that we can take a look at the arguments to the CFE_TBL_FileDef call?

JimKaidyNASA · Answer 2 · Wed Sep 04 2019 23:48:35 GMT+0800 (China Standard Time)

JimKaidyNASA commented 5 years ago

JimKaidyNASA · Answer 3 · Wed Sep 04 2019 23:50:28 GMT+0800 (China Standard Time)

Sorry I saw close and comment and thought that was closing the comment, not closing the issue.

SpaceSteve121 · Answer 4 · Wed Sep 04 2019 23:54:15 GMT+0800 (China Standard Time)

@JimKaidyNASA In the future, it's very helpful if you can use the code tags when posting code, because that makes it much easier to read and copy/paste than a screenshot. What you have is fine for now though.

Based on that snippet, it looks like the size of your table should be 48 bytes (6 doubles * 8 bytes each), which is definitely at odds with the size that the macro sees (says 172 in the error message). Let me take a look at that macro a bit closer and see if we can't figure out what's going on.

SpaceSteve121 · Answer 5 · Thu Sep 05 2019 00:23:01 GMT+0800 (China Standard Time)

Note, the location that error is being generated from is https://github.com/nasa/elf2cfetbl/blob/09102cca146dab4106009452bc8ac8b5fc5a0fa4/elf2cfetbl.c#L1571

Need some time to try to understand what's going on there.

JimKaidyNASA · Answer 6 · Thu Sep 05 2019 01:36:45 GMT+0800 (China Standard Time)

Ok thanks for keeping me posted. Hoping it is just a matter of increasing the size limit on another component of the full name or the formula for generating it.

JimKaidyNASA · Answer 7 · Fri Sep 06 2019 05:04:16 GMT+0800 (China Standard Time)

Steve, I have been looking at this code and tracing the logic. As best I can tell, TblDefSymbolIndex (equal to SymbolIndex) is possibly not defined. It depends on a value coming from the index loop where GetSymbol is called to set SymbolIndex, and that loop needs a non-zero value for NumSymbols. I don't see anywhere that NumSymbols is set to anything but zero. But the AllocateSymbols routine does a check for NumSymbols == 0 and should error out.

Thus I don't see any means for the logic to get down to GetTblDefInfo where it is failing.

SpaceSteve121 · Answer 8 · Fri Sep 06 2019 22:03:31 GMT+0800 (China Standard Time)

@skliper Are you the POC for the elf2cfetbl tool? Otherwise, do you know of anyone who might be able to help look into this?

Jake Hageman · Answer 9 · Fri Sep 06 2019 22:45:47 GMT+0800 (China Standard Time)

The error is on matching size for CFE_TBL_FileDef_t between the object built and what elf2cfetbl is expecting:

https://github.com/nasa/cFE/blob/b2765d9f905aa11f0c3b419b62233597366a7db9/cfe/fsw/cfe-core/src/inc/cfe_tbl_filedef.h#L85-L92

Did you rebuild elf2cfetbl after changing the size of CFE_TBL_FileDef_t (by changing the table name length)?

JimKaidyNASA · Answer 10 · Fri Sep 06 2019 22:50:17 GMT+0800 (China Standard Time)

I was not aware of the need to rebuild elf2cfetbl as a separate operation. Can you give guidance on that step?

JimKaidyNASA · Answer 11 · Fri Sep 06 2019 23:08:56 GMT+0800 (China Standard Time)

I followed the README in the directory, copied it from for_build, and ran make clean and make. I got this error that osconfig.h is not found.

JimKaidyNASA · Answer 12 · Sat Sep 07 2019 00:54:52 GMT+0800 (China Standard Time)

After copying three missing files into the elf2cfetbl directory including osconfig.h, cfe_platform_cfg.h and cfe_msgids.h, I was able to build elf2cfetbl.c into an executable.

I then went back to the ~/cfs/build/cpu1 directory, did a make clean and attempted to build again. Still getting the same error as before.

Jake Hageman · Answer 13 · Sat Sep 07 2019 02:20:46 GMT+0800 (China Standard Time)

Right. I'm not as experienced with classic build, but I think what you did was rebuild elf2cfetbl in the elf2cfetbl directory. Then, when you reran the classic build to build tables, it likely used the elf2cfetbl executable that was built as part of the standard build process (and make install, if that step is in classic build). I'd recommend instead doing a fresh rebuild of everything. If you can do a make distclean in classic build and start over, it should rebuild elf2cfetbl and put it in the right spot for the table-building portion to use.

I am guessing a bit here since 6.5 and classic build is before my time so I may not be referencing the right keywords for classic build... but hopefully the concept makes sense?

Jake Hageman · Answer 14 · Sat Sep 07 2019 02:22:38 GMT+0800 (China Standard Time)

Note that if you redo the build from scratch, you shouldn't need to copy any includes (and you'll likely want to remove those you copied, since they could be overriding what the build from scratch expects).

JimKaidyNASA · Answer 15 · Sat Sep 07 2019 02:43:51 GMT+0800 (China Standard Time)

Just to confirm, I did do a rebuild of elf2cfetbl in the elf2cfetbl directory. Given that, I need to understand the specific steps in "doing a fresh rebuild of everything". I am not sure what it means to do a distclean in classic build.
0) Remove the three files I copied into the elf2cfetbl directory.
1) Where do I do the distclean?
2) Do I rebuild elf2cfetbl in the elf2cfetbl directory? Are those include files that I deleted going to be found elsewhere?
3) Where is the right spot to copy the executable for table building use?

Jake Hageman · Answer 16 · Sat Sep 07 2019 04:13:13 GMT+0800 (China Standard Time)

I recommend doing step 0 now/first.

When you first built the entire system, did you run a make all and make install from some base directory somewhere (or maybe the build directory?) and it all worked? If so, from that same directory, run make clean, make all, and make install again. This should rebuild everything including elf2cfetbl and put it in the right location for you, as well as build your table assuming things are configured correctly.

If this or something similar doesn't work, what steps did you follow to build the very first time? I think classic build typically has a setvars.sh step, the build and install steps. Basically you just want to clean and repeat these initial steps over again.

Jake Hageman · Answer 17 · Sat Sep 07 2019 04:15:32 GMT+0800 (China Standard Time)

I poked around a bit in the old code. From the build/cpu1 directory, try make distclean and then make. Don't forget to run setvars.sh first, per the standard directions.

JimKaidyNASA · Answer 18 · Sat Sep 07 2019 04:57:13 GMT+0800 (China Standard Time)

Did a make clean, make config (once) then make. Did not do a make all or make install.

JimKaidyNASA · Answer 19 · Sat Sep 07 2019 05:01:56 GMT+0800 (China Standard Time)

I did step 0 (removed the three files as well as the elf2cfetbl executable). I went to the top level directory ~/cfs and reran setvars.sh, then went to ~/cfs/build/cpu1 and did a make clean and then a make all. Still getting the error.

Jake Hageman · Answer 20 · Sat Sep 07 2019 05:41:01 GMT+0800 (China Standard Time)

Check where elf2cfetbl was built (find ./ -name "elf2cfetbl") and confirm it just got rebuilt (check the timestamp). A plain make clean may not remove it; you probably need to do a make distclean. If that doesn't work, just delete the executable and try again, which should hopefully make it rebuild. If it was just rebuilt, then I recommend using a debugger or adding printf calls to check the size and contents of CFE_TBL_FileDef_t.

JimKaidyNASA · Answer 21 · Mon Sep 09 2019 22:06:38 GMT+0800 (China Standard Time)

I have verified that the executable elf2cfetbl is deleted after the make distclean.

When I tried the make, I got the missing file error again: osconfig.h.

Jake Hageman · Answer 22 · Tue Sep 10 2019 21:19:58 GMT+0800 (China Standard Time)

How did you get it to build the first time? Sounds like you may have bigger issues in your distribution. I'd recommend starting from a working version that builds from scratch, then apply your changes. I don't have enough information to suggest anything more helpful.

JimKaidyNASA · Answer 23 · Tue Sep 10 2019 21:28:15 GMT+0800 (China Standard Time)

I followed the README.md in the top level directory from the repository. I was able to build with no problem.

JimKaidyNASA · Answer 24 · Tue Sep 10 2019 21:30:22 GMT+0800 (China Standard Time)

I have been compiling the steps we took to document them. Attached in the word document are the procedures we've used so far.
Template for GN&C Integration cFS.docx

Jake Hageman · Answer 25 · Tue Sep 10 2019 21:35:24 GMT+0800 (China Standard Time)

If it built with a clean clone, and it has errors now with a missing header file, that points to an issue with one of the changes you made.

SpaceSteve121 · Answer 26 · Tue Sep 10 2019 21:35:48 GMT+0800 (China Standard Time)

@JimKaidyNASA Perhaps start that process again from the beginning (so that you have a clean repo), but make your modification to the table name size before you start the second section (ie, setvars) which includes the compilation?

Also note that we don't necessarily recommend that you use our CI scripts in your development workflow... they could be used to initialize a repo for the first time, but typically you'd maintain your own repo, which you'd set up once, and just integrate changes to your code with each new delivery you do.

JimKaidyNASA · Answer 27 · Tue Sep 10 2019 21:40:59 GMT+0800 (China Standard Time)

What scripts are you referring to as CI? (command input?)

SpaceSteve121 · Answer 28 · Tue Sep 10 2019 21:46:46 GMT+0800 (China Standard Time)

CI in this context is Continuous Integration... sorry for the confusion. CI refers to the tests that we run every time we update this code. Those scripts are intended to set up the SIL/ECI on a fresh system so that we can test changes to this code automatically.

Thus, they can be used to set up a development environment, however typically you'd want to configuration manage your environment once you have it working so that you can track changes you're making to it. Was just providing a heads up that we're not necessarily shipping those scripts with the intent that you use them as an active part of a workflow and we reserve the right to change them however we need for our testing, which may break your workflow.

JimKaidyNASA · Answer 29 · Tue Sep 10 2019 21:50:01 GMT+0800 (China Standard Time)

So which one is it? Is it the make config that we should not use?

SpaceSteve121 · Answer 30 · Tue Sep 10 2019 21:53:21 GMT+0800 (China Standard Time)

All I was saying is that I don't necessarily recommend using our fetchCFE.sh script in your workflow, which you're calling from the ci directory of the ECI repo. That script was developed and is used for SIL/ECI testing. I was just warning that we make no guarantees about the stability of that script and might update it to get CFE 6.6 in the future, which might break your workflow if you're expecting that it gets CFE 6.5 for you. I was not commenting about how to build the CFS or run make.

Sorry for the confusion.

JimKaidyNASA · Answer 31 · Tue Sep 10 2019 21:59:37 GMT+0800 (China Standard Time)

That's fine. I understand now. I had my suspicions about the fetch. It does actually retrieve the 6.5 as desired though, but I will attempt another fresh checkout, manually switch to 6.5 and then make the table name max length change just before the setvars.sh.

JimKaidyNASA · Answer 32 · Thu Sep 12 2019 21:25:55 GMT+0800 (China Standard Time)

I made sure that the cfe version is 6.5 and have attempted to test the modification to the procedure by setting the table max (CFE_TBL_MAX_NAME_LENGTH) to 28 from 16 and then running the setvars. The error is still the same.

Jake Hageman · Answer 33 · Thu Sep 12 2019 22:44:11 GMT+0800 (China Standard Time)

That doesn't look like the same failure to me. You are missing the cntl_srm_gains.c file.

JimKaidyNASA · Answer 34 · Thu Sep 12 2019 22:52:08 GMT+0800 (China Standard Time)

Yes you are correct. I am looking for what is missing in my procedure to get the error back.

JimKaidyNASA · Answer 35 · Fri Sep 13 2019 21:57:54 GMT+0800 (China Standard Time)

Template for GN&C Integration cFS.docx
I have a procedure for the nominal cfs build and for adding cnt. This gets us to where we were with the table size issue. I tried a modified version that changes CFE_TBL_MAX_NAME_LENGTH to 28 before the setvars.sh, and it doesn't make any difference when building the cfs first without cnt. Not sure if this was the intended test of whether changing it first makes a difference in the cfs build, which it does not.

Jake Hageman · Answer 36 · Fri Sep 13 2019 22:34:36 GMT+0800 (China Standard Time)

I'd suggest trying to move "Edit cfe_mission_cfg.h" before the initial build (that is likely building your elf2cfetbl, and may not be getting rebuilt the second time you build).

JimKaidyNASA · Answer 37 · Sat Sep 14 2019 00:10:40 GMT+0800 (China Standard Time)

I actually went ahead and deleted the elf2cfetbl executable and did a make clean and make. So I know it is rebuilding elf2cfetbl.

Jake Hageman · Answer 38 · Sat Sep 14 2019 01:52:31 GMT+0800 (China Standard Time)

Before we go back and start editing our Simulink models to reduce name sizes, is there something that makes the accommodation of table names greater than 16 characters not possible in the code?

We’ve set the CFE_TBL_MAX_NAME_LENGTH to 28 and got through the initial issue with lengths. The problem shows up with the CFE_TBL_FileDef length not properly defined. I’ve looked to see how this happens and found eci_tbl_if.h to contain logic for defining this parameter. Is there another setting here or elsewhere to allow CFE_TBL_FileDef to be under a limit?

Here is the error if this helps:

I'm now thoroughly confused about which error we are trying to fix here. Is it the one listed in the original post on the thread or the one you highlight above? They don't look the same to me. The one you just highlighted is the table name being too long.

Jake Hageman · Answer 39 · Sat Sep 14 2019 01:54:58 GMT+0800 (China Standard Time)

To fix the original issue, you might need to change the table name size in two locations: once for cFE and once for elf2cfetbl. There is a ticket currently open against cFE 6.7 to clean up the table name size macros and make them consistent, but there is no plan to update 6.5.

JimKaidyNASA · Answer 40 · Sat Sep 14 2019 01:58:23 GMT+0800 (China Standard Time)

Yes I have no problem with making the change to both for now until cFE 6.7 comes out. In the meantime I am searching for where ECI_PARAM_TBL_MAX_NAME_LEN is being set. I cannot find a default in the *.h or *.c files.

Jake Hageman · Answer 41 · Sat Sep 14 2019 03:24:15 GMT+0800 (China Standard Time)

I'm actually surprised ECI would define its own TBL_MAX_NAME_LEN. @SpaceSteve121 any ideas?

Jake Hageman · Answer 42 · Sat Sep 14 2019 03:25:54 GMT+0800 (China Standard Time)

I actually went ahead and deleted the elf2cfetbl executable and did a make clean and make. So I know it is rebuilding elf2cfetbl.

Just so I can keep things straight in my head, the error here is during your build of the table prior to the elf2cfetbl step, so it's not clear to me yet that you've hit the original error again.

JimKaidyNASA · Answer 43 · Sat Sep 14 2019 03:41:58 GMT+0800 (China Standard Time)

Yes, I believe you may be correct. The original error violated table-rules.mak:21 and this one is table-rules.mak:27.
That said, the original error seems to be related to elf2cfetbl, but this may be because we weren't rebuilding the elf2cfetbl executable (or it may not be related). Something tells me we are closing in, because ECI_PARAM_TBL_MAX_NAME_LEN seems to be the key: there is a limit violation here, which gives us a bit more to work with. Do you know where this is set to its default? My search finds a #define in the eci_tbl_if.h file and nothing set in *.c that I can find:

SpaceSteve121 · Answer 44 · Sat Sep 14 2019 03:43:33 GMT+0800 (China Standard Time)

Look here for the definition of ECI_PARAM_TBL_MAX_NAME_LEN. It defaults to CFE_TBL_MAX_NAME_LENGTH, so as long as you've set that and recompiled I'm not sure why you're running into that.

JimKaidyNASA · Answer 45 · Sat Sep 14 2019 03:48:10 GMT+0800 (China Standard Time)

Yes I found this. So does the hashtag define on line 22 set them equal (set by CFE_TBL_MAX_NAME_LENGTH)? Does this mean that I just need to increase the CFE_TBL_MAX... to something that the combined name and other stuff concatenated can fit under when it compares to the ECI_PARAM...?

SpaceSteve121 · Answer 46 · Sat Sep 14 2019 03:52:01 GMT+0800 (China Standard Time)

The syntax

#define ECI_PARAM_TBL_MAX_NAME_LEN    CFE_TBL_MAX_NAME_LENGTH

sets ECI_PARAM_TBL_MAX_NAME_LEN equal to CFE_TBL_MAX_NAME_LENGTH, but that's all enclosed in a conditional, so that only happens if ECI_PARAM_TBL_MAX_NAME_LEN isn't already defined.

If you set CFE_TBL_MAX_NAME_LENGTH to the length you need it and do not set ECI_PARAM_TBL_MAX_NAME_LEN then I think you should be fine.

JimKaidyNASA · Answer 47 · Sat Sep 14 2019 03:56:48 GMT+0800 (China Standard Time)

Understood, but as you can see the cntl_srm_gains_Tbl is violating the ECI_PARAM.. limit. I have not been able to locate where ECI_PARAM.. is set to a default, so I am supposing that it's being set equal to the CFE_TBL_MAX... I just tried setting it from 28 (originally 16) to 58. I deleted the elf2cfetbl executable and did a make clean and a make and still getting the same error.

SpaceSteve121 · Answer 48 · Sat Sep 14 2019 04:02:58 GMT+0800 (China Standard Time)

That snippet we've been discussing is the setting of the default value for ECI_PARAM_TBL_MAX_NAME_LEN. If you're not sure that's happening properly you could print the value using a diagnostic pragma right before that table name check to verify what the value is.

Jake Hageman · Answer 49 · Sat Sep 14 2019 04:03:37 GMT+0800 (China Standard Time)

Is there a point at which it just becomes easier to have a shorter table name? Changing low level defines in cFS is not trivial, as 9 days on this issue reflect. You need to trace the various sizes passed around and make sure they are consistent everywhere to get it to work right. Just changing CFE_TBL_MAX won't work, just changing ECI_PARAM won't work, etc.

In attempting to simplify an issue in cFE 6.7, Joe tracked down the following behavior (may be different in 6.5):

CFE_TBL_MAX_NAME_LENGTH - this one is already marked as deprecated, so I’ll just ignore it
CFE_MISSION_TBL_MAX_NAME_LENGTH , this replaces CFE_TBL_MAX_NAME_LENGTH, OK…
CFE_TBL_MAX_FULL_NAME_LEN_COMP, intended for “AppName.TblName” string storage, but based on OS_MAX_API_NAME, so not consistent across CPUs. This is not directly used in apps or anywhere else, it is just an intermediate for the next one.
CFE_TBL_MAX_FULL_NAME_LEN, which is just CFE_TBL_MAX_FULL_NAME_LEN_COMP rounded up to a 32 bit multiple. This one IS used in CFS apps, both for internal storage and telemetry. It is also used by CFE TBL for its internal storage. But not used by CFE TBL telemetry.
CFE_MISSION_TBL_MAX_FULL_NAME_LEN, for “AppName.TblName” strings, but based on CFE_MISSION_MAX_API_LEN rather than OS_MAX_API_NAME, so it’s better. This is used by CFE_TBL telemetry packets. However, it is NOT rounded up to a 32 bit multiple like the CFE_TBL_MAX_FULL_NAME_LEN is.
CFE_TBL_FILDEF_MAX_NAME_LEN, only used in the CFE_TBL_FileDef_t struct definition, which is used by elf2cfetbl only. This is always the same as CFE_TBL_MAX_FULL_NAME_LEN. The companion CFE_TBL_File_Hdr_t, which actually gets put on the files themselves, uses CFE_TBL_MAX_FULL_NAME_LEN

Not exactly related, but illustrates the complexity involved.

JimKaidyNASA · Answer 50 · Sat Sep 14 2019 04:15:45 GMT+0800 (China Standard Time)

It doesn't like my pragma:

SpaceSteve121 · Answer 51 · Mon Sep 16 2019 20:21:15 GMT+0800 (China Standard Time)

@JimKaidyNASA What's the error message?

SpaceSteve121 · Answer 52 · Mon Sep 16 2019 20:27:31 GMT+0800 (China Standard Time)

I believe the syntax would be something like:

#define DISPLAY_VALUE2(x) #x
#define DISPLAY_VALUE(x) DISPLAY_VALUE2(x)
#pragma message( "ECI_PARAM_TBL_MAX_NAME_LEN  = " DISPLAY_VALUE(ECI_PARAM_TBL_MAX_NAME_LEN) )

which is explained in the gcc docs.

JimKaidyNASA · Answer 53 · Mon Sep 16 2019 20:58:47 GMT+0800 (China Standard Time)

Here is the pragma syntax error along with the remaining table length error.

JimKaidyNASA · Answer 54 · Mon Sep 16 2019 21:02:38 GMT+0800 (China Standard Time)

Ok that worked and ECI_PARAM_TBL_MAX_NAME_LEN = 16. If it is being set to CFE_MISSION_TBL_MAX_NAME_LENGTH it should be set to 28. I will check on that with a pragma as well.

JimKaidyNASA · Answer 55 · Mon Sep 16 2019 21:06:18 GMT+0800 (China Standard Time)

Ok so CFE_MISSION_TBL_MAX_NAME_LENGTH is being reset back to 16 somewhere. I am setting in cfe_mission_cfg.h and somewhere downstream it is being overwritten.

JimKaidyNASA · Answer 56 · Mon Sep 16 2019 21:15:50 GMT+0800 (China Standard Time)

Looks like there are two locations (that matter) where the CFE_TBL_MAX_NAME_LENGTH is set. The last one to execute must be the one in build/mission_inc.

JimKaidyNASA · Answer 57 · Mon Sep 16 2019 21:28:58 GMT+0800 (China Standard Time)

Good news we are no longer in the eci_tbl_if.h as the TBL MAX Names are both large (I set to 58).
We are in the cntl_srm_gains.o where the CFE_TBL_FileDef is causing a seg fault.

JimKaidyNASA · Answer 58 · Mon Sep 16 2019 21:33:17 GMT+0800 (China Standard Time)

Sorry about the syntax error. Not CFE_MISSION_TBL...it is CFE_TBL_MAX_NAME_LENGTH we are talking about.

SpaceSteve121 · Answer 59 · Mon Sep 16 2019 21:42:35 GMT+0800 (China Standard Time)

Aren't we back to where we started, but with an incorrect table size of 200 rather than 172 this time? If so, it seems like your compiled elf2cfetbl tool is still out of sync with the size of the table structure in your source code?

Are you fully rebuilding all binaries (including the elf2cfetbl tool) in between each of these tests (by running make distclean), to ensure that you're not mixing various settings compiled at various stages in this debugging process?

JimKaidyNASA · Answer 60 · Mon Sep 16 2019 21:49:21 GMT+0800 (China Standard Time)

So I did the make distclean, then make config, then make, and I got past the cntl_srm_gains.c!

Now at the cntl_de_thresh.c and problems with type with the cntl_de_thresholds_Tbl. This is progress...

SpaceSteve121 · Answer 61 · Mon Sep 16 2019 22:08:55 GMT+0800 (China Standard Time)

@JimKaidyNASA If the original problem/question has been resolved, please close this issue. If there are further problems/questions please open an issue specific to each one so that questions/debugging/answers stay organized. Thanks!