espressif / arduino-esp32

Arduino core for the ESP32

i2c Write Issue

PhilColbert opened this issue

Attaching an MS5611 (GY-86) to the ESP32 over I2C works perfectly for about 5-10 minutes. Some strange values do come in, but it seems to work pretty well.

Then loads of

[E][esp32-hal-i2c.c:161] i2cWrite(): Busy Timeout! Addr: 77

Board needs to be reset to start again

Any ideas?

Thanks

seems similar to #811 or #741.

Yes, seems so :( Just wondered if anyone had a solution yet !? :)

Seems strange this has only come up now, surely there's a lot of people that use I2C on the ESP32!

@PhilColbert No solution as of yet. @stickbreaker and myself have been working it. The latest code that I submitted increased the timeout, but there appear to be hardware timing issues that need to be adapted in software. At the same time, @stickbreaker is working to add interrupt based handling to better meet the timing needs. So far it is slow going as we are dealing with a chip having a number of hardware defects and a poor interrupt handling design.

I also added code for resetting the I2C subsystem. I have been running this code for weeks with it being able to recover; it is just avoiding the errors in the first place that is more problematic.

Is it possible to try this code that allows reset ? :)

Thanks

I am getting an issue with my MCP7940 (RTC) that constantly is saying this annoying message:

[E][esp32-hal-i2c.c:161] i2cWrite(): Busy Timeout! Addr: 6f.

The only "fix" is to reset the I2C bus every time before I access my device. I am only accessing this device about once a second, but in my product there will be another I2C device using the bus far more often.

Pullup resistors for SDA and SCL lines are 4.7K. Tried with several frequencies up to 400KHz which is the maximum supported by the MCP7940. All behave the same so I am not sure what is the problem.

Any ideas what is going on? Btw I am using multiple core tasks (one for WiFi/web server, another for logic), no Bluetooth

Look through the code library that you are using. If it uses Wire.endTransmission(false); or Wire.requestFrom(id,length,false); you are probably running into this timeout issue because of a task switch. @lonerzzz and I are attempting to create a solution.

The current Wire() library implements the ReSTART I2C operation by just pausing the I2C operation. If the I2C operation is not resumed within a small fixed window, the I2C hardware explodes with a TimeOut Error. The method I am currently working on queues Wire() operations until a STOP is encountered. Then it executes all of the I2C operations in one continuous sequence. I am hopeful that I will have an ALPHA version available for testing in the next week or two.
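The queue-until-STOP idea described above can be sketched on a host machine. This is only a model under stated assumptions: `I2cOp`, `QueuedBus`, `submit()` and the text trace are hypothetical names invented for illustration, not the fork's actual code, and no real I2C hardware is touched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one queued I2C operation (not the fork's real struct).
struct I2cOp {
    uint8_t addr;               // 7-bit device address
    bool isRead;                // true = read, false = write
    std::vector<uint8_t> data;  // bytes to write (empty for reads)
    bool sendStop;              // true = finish with STOP, false = ReSTART follows
};

class QueuedBus {
public:
    // Queue the operation; the modelled bus is only driven once a STOP is requested.
    // Returns the number of operations executed by this call.
    std::size_t submit(const I2cOp& op) {
        pending_.push_back(op);
        if (!op.sendStop) return 0;  // ReSTART requested: hold the op, bus stays idle
        return flush();
    }
    const std::string& trace() const { return trace_; }

private:
    // Run every pending operation back to back as one continuous sequence.
    std::size_t flush() {
        std::size_t n = pending_.size();
        for (std::size_t i = 0; i < n; ++i) {
            trace_ += (i == 0) ? "START " : "RESTART ";
            trace_ += pending_[i].isRead ? "R " : "W ";
        }
        trace_ += "STOP";
        pending_.clear();
        return n;
    }
    std::vector<I2cOp> pending_;
    std::string trace_;
};
```

With this model, a register-pointer write submitted without STOP does nothing on its own; the following read with STOP runs both operations together, so a task switch between the two calls cannot leave the bus paused mid-transaction.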

It is different enough from the standard Wire() library that I suspect adoption by espressif/Arduino-ESP32 will be difficult. I don't have all I2C devices to test it against, so my limited subset of compatibility testing may be judged inadequate.
But, I'm having fun building it! 😀

Chuck.

@stickbreaker, when you have it ready I can test on a few I2C devices as well (LCDs and OLEDs are what I have now)

Thanks for the work! I have a few different I2C devices so can test as well.

@atanisoft @PhilColbert @josmunpav try my fork at stickBreaker:arduino-esp32
It should solve your problems, Or your money Back! 😀

Chuck.

I am very appreciative of your efforts on the I2C front, but having tried your library I may be requesting a partial refund. The ESP32 problem I have encountered is simultaneously using I2C and WiFi . In particular, I am using an ADS1115 and BMP280 while accessing the data over the internet. Initially, everything is fine, but after 1-4 hours the I2C stops working correctly--when one I2C device fails the other does too. I tried all kinds of power supply remedies without success, and eventually found other users reporting I2C problems. I tried a couple different 'reset' routines, and they might work a couple times, but eventually that recovery failed, too. The problem remains unresolved.

Anyway, since I had a GY521 on a breadboard I tried your modified library on that. The code I am using is basically from Section 4 ('readings that make sense') of https://olivertechnologydevelopment.wordpress.com/2017/08/17/esp8266-sensor-series-gy-521-imu-part-1 . When I compiled and ran the code I got consistent readings, but they were incorrectly factored. The easiest to understand are the AcX , AcY , AcZ which should add to 1 g (gravitation unit) (like AcX = 0.98 g| AcY = 0.02 g| AcZ = -0.01 g), but mine added up to 0.25 g like (AcX = 0.24 g| AcY = 0.01 g| AcZ = -0.01 g). Using the standard ESP32 core and libraries, I got the correct scaling so I didn't make any effort to trouble-shoot the problem.

I'm not asking you to solve this particular problem. I've made this comment to let you know there are people out here who are appreciative of your efforts and hoping they will solve our problems. This is just one data point that 'might' indicate some issue with writing to the device.

Thanks, Jeff.

@jlhavens are you running under Arduino?
This code is designed to give debug output when it fails. To see this output, you have to select the board "ESP32 Dev Module" and set "Core Debug Level:" to "Error".

If you could do this and post a capture of the debug output it would be helpful.

thanks,

Chuck.

@jlhavens Did you notice any difference between the mainBranch Arduino and my code? Did they both fail about the same amount or at the same time?
Reports of any differences would be valuable.
Chuck.

@jlhavens on the timeout: Are you getting a "Gross Timeout Dead" on the console? I may have been too optimistic on my ISR timeout; I only allow a calculated (bit rate based) time +50ms for the I2C transactions to be completed. If this is your error message it is easy to fix.
But if the timeout is the I2C bit timeout, that will be harder. The I2C hardware will only allow up to 13ms before it triggers a hard timeout. Currently I have it configured to the max.

I'm sorry, I should have realized that you are a lot more interested in a report on the results of your efforts to improve stability than what is probably a lesser glitch in the implementation of a particular sensor.

I didn't really allow this to run until the I2C issue occurred. When I didn't get the correct results, I let it run a couple minutes and went on to something else. Sorry. Taking a closer look at the data I see that the raw data (sensor reading) is the same with both your library and the standard one. The program both sends the scaling factors, and reads them back later, every loop through. The received scaling_factor values are used to scale the raw data and those are different in the two environments. Since the raw data is the same in both environments but the read-back scale factors are different, I would guess that the reading back of the scale factors is amiss. This is the code:

Set in loop() -------- setMPU6050scales(MPU_addr, 0b00000000, 0b00010000);

void setMPU6050scales(byte addr, uint8_t Gyro, uint8_t Accl) {
Wire.beginTransmission(addr);
Wire.write(0x1B); // write to register starting at 0x1B
Wire.write(Gyro); // Self Tests Off and set Gyro FS to 250
Wire.write(Accl); // Self Tests Off and set Accl FS to 8g
Wire.endTransmission(true);
}

void getMPU6050scales(byte addr, uint8_t &Gyro, uint8_t &Accl) {
Wire.beginTransmission(addr);
Wire.write(0x1B); // start reading at register 0x1B (GYRO_CONFIG)
Wire.endTransmission(false);
Wire.requestFrom(addr, 2, true); // request 2 registers: GYRO_CONFIG (0x1B) and ACCEL_CONFIG (0x1C)
Gyro = (Wire.read()&(bit(3) | bit(4))) >> 3;
Accl = (Wire.read()&(bit(3) | bit(4))) >> 3;
}

Sending Gyro = 0x00 Accl = 0x10
Reading back Gyro = 0x10 Accl = 0x00 in your environment (incorrect scaling)
Reading back Gyro = 0x00 Accl = 0x10 in normal environment (correct results)
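A short worked check shows why swapped read-back values would produce the 0.25 g reading reported earlier. The sensitivities are the MPU-6050 datasheet values (16384, 8192, 4096 and 2048 LSB/g for ±2, ±4, ±8 and ±16 g). The helper names `fsSel` and `rawToG` are hypothetical, and the sketch assumes the program scales the raw counts with whatever ACCEL_CONFIG value it reads back.

```cpp
#include <cassert>
#include <cstdint>

// LSB-per-g sensitivity for each MPU-6050 AFS_SEL value (0..3 = +-2/4/8/16 g).
constexpr int kAccelLsbPerG[4] = {16384, 8192, 4096, 2048};

// Extract the 2-bit full-scale field (bits 4:3) from a config register value.
constexpr int fsSel(uint8_t reg) { return (reg & 0x18) >> 3; }

// Convert a raw accelerometer count to g using the ACCEL_CONFIG register value.
constexpr double rawToG(int16_t raw, uint8_t accelConfig) {
    return static_cast<double>(raw) / kAccelLsbPerG[fsSel(accelConfig)];
}
```

The sensor really is configured for ±8 g (0x10), so 1 g is about 4096 counts. If the read-back swaps the two bytes and ACCEL_CONFIG appears to be 0x00 (±2 g), the same 4096 counts are divided by 16384 and come out as 0.25 g, which matches the AcX readings seen with the modified library.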

Sorry, none of this answers any of your questions.

I will set the system up again with the "Core Debug Level:" to "Error" and see if I can capture an error message. I have output going to the serial monitor and assume that the error output will go there?

I'll let you know how it goes.
Jeff

I didn't mention: I got some compiler warnings and have cut and pasted them into
CompileWarnings.txt

In order to test on a target I knew had a history of failing, I compiled the software known to run for a while (usually a couple hours) on my real target using your library. It crashed after a few seconds. The setup has an HM10 on HardwareSerial(2), BMP280, ADS1115, and a DS18B20. The DS18B20 and (BMP280 and ADS1115) are each running in separate tasks pinned to core 1. The DS18B20 runs alone just fine with WiFi. Each task takes a new reading at about 1 second intervals. It is reading the sensors and sending results over WiFi on a page that calls for an update every second, so the WiFi is regularly getting hit. The data that I got seemed to be fairly correct, but with a few fliers. The software sends the data to the serial port only when a WiFi update is requested. It sends a '.' to serial every time the data updates, whether it is sent out on WiFi or not. I typically got 1-10 readings before an error was thrown. If the WiFi isn't calling for data, it seemed to go without error for quite a few data updates (a long row of dots). I didn't run this to failure with the WiFi not calling for data, but it went about 20 minutes without error while I was writing this message.

I've sent a chunk of output in
Error_output.txt.

It shows the output and Gross Timeout Error is seen frequently.

Every once in a while it has some data (1-10 readings at 1 second intervals) that looks like this
.T==74.37 F T_DS==70.70 F P=72.17 mm V=9063 V
.T==74.37 F T_DS==70.70 F P=72.17 mm V=9072 V
.T==74.35 F T_DS==70.70 F P=72.17 mm V=9048 V
If I stop the WiFi data request, the output looks like this (with 1 second (reading) for each '.')
.............................................
like at the very end of the file.

I included so much data in case there is something to see when it first threw the error compared to the subsequent error cycle.

Hope this is helpful.

Jeff

@jlhavens
Looking at that error output, i2cProcQueue() is timing out waiting for the ISR EventGroup flags to change.

[E][esp32-hal-i2c.c:1643] i2cProcQueue():  Gross Timeout Dead st=2492, ed=2542, =50, max=50 error=1
[E][esp32-hal-i2c.c:1103] dumpI2c(): i2c=0x3ffc1040
[E][esp32-hal-i2c.c:1104] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:1105] dumpI2c(): lock=0x3ffd343c
[E][esp32-hal-i2c.c:1106] dumpI2c(): num=0
[E][esp32-hal-i2c.c:1107] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:1108] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:1109] dumpI2c(): error=1
[E][esp32-hal-i2c.c:1110] dumpI2c(): event=0x3ffd34c0 bits=0
[E][esp32-hal-i2c.c:1111] dumpI2c(): intr_handle=0x3ffd08d4
[E][esp32-hal-i2c.c:1112] dumpI2c(): dq=0x3ffe1734
[E][esp32-hal-i2c.c:1113] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:1114] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:1115] dumpI2c(): byteCnt=4
[E][esp32-hal-i2c.c:1120] dumpI2c(): [0] 90 W STOP buf@=0x3ffc3542, len=3, pos=3, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:1468] i2cDumpInts(): row  count   INTR    TX     RX
[E][esp32-hal-i2c.c:1471] i2cDumpInts(): [01] 0x0001 0x0002 0x0004 0x0000
[E][esp32-hal-i2c.c:1471] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000
[E][esp32-hal-i2c.c:1471] i2cDumpInts(): [03] 0x0004 0x0040 0x0000 0x0000
[E][esp32-hal-i2c.c:1471] i2cDumpInts(): [04] 0x0001 0x0080 0x0000 0x0000  

that top line:

[E][esp32-hal-i2c.c:1643] i2cProcQueue(): Gross Timeout Dead st=2492, ed=2542, =50, max=50 error=1

is saying that the xTaskTick was 2492 when the ISR was started, and xEventGroupGetBits() returned at
2542, a difference of 50, and the maxTimeOut was set to 50.
Based on other values in this dump, it looks like the ISR got preempted while it was exiting, as it was setting the values for event (line :1110), and the 50ms timeout for the call expired.
byteCnt = 4 shows that 4 bytes moved through the I2C state machine, and the DumpInts() output shows a successful transfer.

As a fix, increase the timeout for the ISR at line 1592 in esp32-hal-i2c.c.
That calculation is probably too tight. It allows 10 bits of clock for each byte (each byte is actually only 9 clocks, each START is one clock, and each STOP is one clock), then adds 50ms.
Change the 50 to? Well how about 500. One half second should be long enough.
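As a rough illustration of the calculation described above, here is a reconstruction from the prose only. The actual expression at line 1592 is not shown in this thread, so the function name `grossTimeoutMs`, its parameters and the integer rounding are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Estimated wait for a transfer, in milliseconds: 10 clocks per byte at the
// given bus frequency (integer division, rounds down), plus a fixed margin
// that has to absorb any task-switch delay.
constexpr uint32_t grossTimeoutMs(uint32_t byteCount, uint32_t busHz, uint32_t marginMs) {
    return static_cast<uint32_t>(
               static_cast<uint64_t>(byteCount) * 10u * 1000u / busHz) + marginMs;
}
```

For the 4-byte transfer in the dump, the bit-rate part rounds down to 0 ms at an assumed 100 kHz, so the whole budget is the 50 ms margin (consistent with max=50 in the log). Raising the margin to 500 changes the budget to 500 ms without touching the bit-rate term.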

Chuck.

@jlhavens Those compiler warnings are all my fault. This code is dirty. Before we release it they will all be fixed.

Chuck.

@jlhavens
At line 773 of the debug output it shows that the I2C hardware was locked up busy at boot. Usually this means that one of the I2C slave devices was confused by the ESP crash and is holding SDA low. There is code in the Wire library to recover, but it looks like the recovery failed.

[E][esp32-hal-i2c.c:947] i2cInitFix(): Busy at initialization!
[E][esp32-hal-i2c.c:1103] dumpI2c(): i2c=0x3ffc1040
[E][esp32-hal-i2c.c:1104] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:1105] dumpI2c(): lock=0x3ffd343c
[E][esp32-hal-i2c.c:1106] dumpI2c(): num=0
[E][esp32-hal-i2c.c:1107] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:1108] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:1109] dumpI2c(): error=6
[E][esp32-hal-i2c.c:1110] dumpI2c(): event=0x3ffd6aa0 bits=92
[E][esp32-hal-i2c.c:1111] dumpI2c(): intr_handle=0x3ffd08d4
[E][esp32-hal-i2c.c:1112] dumpI2c(): dq=0x3ffd6aec
[E][esp32-hal-i2c.c:1113] dumpI2c(): queueCount=2
[E][esp32-hal-i2c.c:1114] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:1115] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:1120] dumpI2c(): [0] ec W STOP buf@=0x3ffc3542, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:1120] dumpI2c(): [1] 48 W STOP buf@=0x3ffc3542, len=3, pos=0, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:1468] i2cDumpInts(): row  count   INTR    TX     RX
[E][esp32-hal-i2c.c:1471] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000
[E][esp32-hal-i2c.c:1471] i2cDumpInts(): [02] 0x0001 0x0100 0x0000 0x0000
[E][esp32-hal-i2c.c:1103] dumpI2c(): i2c=0x3ffc1040
[E][esp32-hal-i2c.c:1104] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:1105] dumpI2c(): lock=0x3ffd343c
[E][esp32-hal-i2c.c:1106] dumpI2c(): num=0
[E][esp32-hal-i2c.c:1107] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:1108] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:1109] dumpI2c(): error=6
[E][esp32-hal-i2c.c:1110] dumpI2c(): event=0x3ffd6aa0 bits=92
[E][esp32-hal-i2c.c:1111] dumpI2c(): intr_handle=0x3ffd08d4
[E][esp32-hal-i2c.c:1112] dumpI2c(): dq=0x3ffd6a7c
[E][esp32-hal-i2c.c:1113] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:1114] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:1115] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:1120] dumpI2c(): [0] 76 R STOP buf@=0x3ffc34bc, len=1, pos=0, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:1468] i2cDumpInts(): row  count   INTR    TX     RX
[E][esp32-hal-i2c.c:1471] i2cDumpInts(): [01] 0x0001 0x0002 0x0004 0x0000
[E][esp32-hal-i2c.c:1471] i2cDumpInts(): [02] 0x0001 0x0100 0x0000 0x0000

error = 6 is timeout
the event bits 0x92 are saying timeOut_error, Done, Error
byteCnt = 0, no bytes moved through the I2C hardware

DumpInts [01] shows the txEmpty IRQ (0x0002) fired once (0x0001) and moved two bytes (0x0002) into the FIFO.
DumpInts [02] shows the timeOut IRQ (0x0100) fired once (0x0001), and that was all she wrote!
Basically the ISR worked, but the hardware refused to touch the I2C bus.
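The bits=92 reading above can be decoded mechanically. The sketch below is a guess at the flag layout: 0x92 has exactly three bits set (0x80, 0x10, 0x02), and the mapping of those bits to "timeout", "done" and "error" is an assumption inferred from the order in which they are named here, not taken from the fork's source. All identifiers are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumed flag values, inferred only from the 0x92 example in this thread.
constexpr uint32_t EVT_ERROR   = 0x02;
constexpr uint32_t EVT_DONE    = 0x10;
constexpr uint32_t EVT_TIMEOUT = 0x80;

// Turn an event-bits value from the dump into a readable flag list.
std::string decodeEventBits(uint32_t bits) {
    std::string out;
    auto add = [&](uint32_t flag, const char* name) {
        if (bits & flag) {
            if (!out.empty()) out += '|';
            out += name;
        }
    };
    add(EVT_TIMEOUT, "TIMEOUT");
    add(EVT_DONE, "DONE");
    add(EVT_ERROR, "ERROR");
    return out;
}
```

Under that assumed layout, `decodeEventBits(0x92)` yields `TIMEOUT|DONE|ERROR`, and the 0x10 seen in the later "EventGroup Failed:0x0!=0x10" lines would be the DONE flag alone.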

I think increasing the Gross Timeout would eliminate this cascade.

Chuck.

@jlhavens Looking at the code, the ISR was in its exit stage but yielded when it called
xEventGroupSetBits() at line 1273 of esp32-hal-i2c.c, and that yield exceeded 50ms.

Definitely an OS task switch issue. Increasing that Gross TimeOut should resolve it.

in a reply to: #844 I said this:

I just made a new branch NoYield, It just comments out the portYIELD_FROM_ISR() calls in the ISR. Try commenting them out in
/cores/esp32/esp32-hal-i2c.c
lines:
1269,1275,1329,1334,1375,1384,1436

But, based on your error Report, the Yield happened during the xEventGroupGetBits(). Not after it. So commenting out the following yields would not have any effect.

I'll have to look at the xEventGroup timeout, maybe loop again? I don't know how to release/return to the ISR. I would expect the ISR to have higher priority than the main loop (i2cProcQueue)?

Chuck.

@jlhavens @lonerzzz @me-no-dev More research, changed conclusion.
With more exploration, I have come to another conclusion.
Changing the timeout would not solve this problem.
Reading the FreeRTOS documentation on xEventGroupSetBitsFromISR(), I think the call to it failed in the ISR.
The way I understand it, xEventGroupSetBitsFromISR() tries to create a message for the RTOS daemon task, which actually does the bit setting. If the RTOS daemon task's timer command queue was full, this message was never created.

Page 311 of FreeRTOS_Reference_Manual_V9.0.0.pdf

So, the Bits were Never set.

So, the TimeOut is guaranteed.

I'm going to recode the exit to test xEventGroupSetBits()'s return code. And try to catch this error.
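The failure mode described above can be modelled on a host machine. In FreeRTOS, xEventGroupSetBitsFromISR() returns pdPASS when the request was posted to the daemon task and pdFAIL when it was not. The class below is only a stand-in for that behaviour: `DeferredEventGroup`, `setBitsFromIsr()` and `runDaemon()` are hypothetical names, and the queue length is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Host-side model: "set bits from ISR" only posts a request to a bounded queue;
// the bits change later, when the daemon task drains that queue.
class DeferredEventGroup {
public:
    explicit DeferredEventGroup(std::size_t queueLen) : capacity_(queueLen) {}

    // Returns false when the queue is full, i.e. the request was dropped.
    bool setBitsFromIsr(uint32_t bits) {
        if (queue_.size() >= capacity_) return false;
        queue_.push_back(bits);
        return true;
    }

    // Called by the (modelled) daemon task when it finally gets CPU time.
    void runDaemon() {
        while (!queue_.empty()) {
            bits_ |= queue_.front();
            queue_.pop_front();
        }
    }

    uint32_t bits() const { return bits_; }

private:
    std::size_t capacity_;
    std::deque<uint32_t> queue_;
    uint32_t bits_ = 0;
};
```

Two things follow from the model: a dropped request means the waiting code never sees the bits and its timeout is guaranteed, and even an accepted request stays invisible until the daemon runs. Checking the return value in the ISR exit path is what makes the first case detectable.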

Chuck.

@jlhavens attached is a modified esp32-hal-i2c.c that may work around this issue. Can you please try it? It looks like your code explodes without any delay. Recreating failures is always the hardest part.

esp32-hal-i2c.zip

Chuck.

I did the esp32-hal-i2c.c substitution and this is the resulting output: Error_output2.txt

I also modified the program to strip out the DS18B20 and BMP280, so that only the ADS1115 was being read and reported by WiFi. At the start I got errors similar to the ones from the previous program. I recompiled with the new hal and have attached the output. Error_output3.txt. Once it got past the first error, it ran fine for an hour with the WiFI asking for updates every second.

I initially had a 25 ms delay before the first ADS1115 read but tried to remove it. When I did that, I got continuous reboots. It turned out that I needed at least a 2 ms delay before the very first read, but after that none was required. While writing this it occurred to me to have a longer delay might prevent the error at start-up. With a 5 second delay before the first ADS1115 read I don't get the startup error in the ADS1115 only code. Output looked like this.
Error_output4.txt

Jeff

@jlhavens
Output2.txt shows what I was expecting: the FreeRTOS dispatch daemon is being overrun. Some task is being piggy and not allowing the dispatch daemon to do its work. I can work around it, I think, but it is a symptom of bigger problems.
From the start of output2.txt:

 Dallas sensors found.
..[E][esp32-hal-i2c.c:1629] i2cProcQueue(): EventGroup Failed:0x0!=0x10
[E][esp32-hal-i2c.c:1629] i2cProcQueue(): EventGroup Failed:0x0!=0x10
[E][esp32-hal-i2c.c:1629] i2cProcQueue(): EventGroup Failed:0x0!=0x10
[E][esp32-hal-i2c.c:1275] i2cIsrExit(): Clear Event Running Failed
[E][esp32-hal-i2c.c:1286] i2cIsrExit(): set ExitCode Failed, ec=0x10
[E][esp32-hal-i2c.c:1629] i2cProcQueue(): EventGroup Failed:0x0!=0x10

The first 3 lines show timeOut's but they don't show the ISR got a fail when it tried to set the progress bits.

the Last 3 lines are what I was expecting, Those are all on the exit path. 1&2 in ISR, 3 the timeout back in i2cProcessQueue()

How are you communicating with the OneWire Devices? I Know that protocol is very time sensitive. Are you disabling interrupts to get accurate timing?

I think the WiFi is being piggy. While it is initing, it is not allowing other processes any time. I do not know how to prove this 'theory' but it feels correct.

If you allow the 5 second delay at boot, then start your sensor scans (including the OneWire devices), how does it respond?

How fast are you polling the ads1115?

All I can see from # 3 is 12 cycles that EventGroups failed. between the two period(.) markers

It does not show how many completed without errors.

Even those 12 should have correctly return/written data from/to the ads1115.

is output4 what you expect? Halfway down # 25 has two periods?

I chose to use EventGroups to synchronize the hand off of queue data to the ISR and accept it back when the ISR had completed. The timeouts were caused when the ISR was not able to hand the baton back to i2cProcQueue(). With the modified code, the timeout causes a +50ms delay in the app. As long as the ISR actually completes its work before the +50ms expires, the effect is minor.

If EventGroup are not stable, I'll have to think of something else.

Chuck.

I left the system that is only reading the ADS1115 and sending to WiFi running overnight and it ran without error (about 10 hours). That is a good result. Using the standard library I wouldn't have expected it to go that long error free, but I didn't previously have this exact program running, so no absolute conclusion can be drawn.

Replies:
How are you communicating with the OneWire Devices? I Know that protocol is very time sensitive. Are you disabling interrupts to get accurate timing? -->Here is the basic code One_Wire.txt, but I am not the author. I don't see any indication that interrupts are disabled, but I don't know what is happening in the OneWire Library. I found this code in a forum where someone else also had a problem using the standard Dallas library with either the esp32 or the esp8266.

If you allow the 5 second delay at boot, then start your sensor scans (including the OneWire devices), how does it respond? --> Have not yet done that, but I can.

How fast are you polling the ads1115? --> About once every second I am taking 20 consecutive readings and then averaging them. There is a built-in 8 ms delay in the Adafruit library that waits for the conversion to complete. I set a timer and the read cycle is taking about 180ms.

All I can see from # 3 is 12 cycles that EventGroups failed. between the two period(.) markers
It does not show how many completed without errors .--> The two periods indicate that two readings were completed. A period is printed at the end of a read cycle. The actual data is only sent to the serial output when the wifi calls for an update.

is output4 what you expect? Halfway down # 25 has two periods? --> yes, output is as expected. Two periods together just indicate that two readings were taken between WiFi updates.

I will go back to the code that read all the sensors and see what the 50ms delay does. I think I will then try the bmp280/WiFi combination to see how well that works.

Jeff

Chuck,

I added a 5 sec delay before the first reading in the system with I2C and OneWire (ADS1115,BMP280,DS18B20). It ran for somewhat longer, but failed after about 20 readings. The output is here.
Error_output_AllSensors_50ms_Delay.txt

I then took the same code and stripped out the BMP280 calls. That gave error free startup and it ran for about 15 minutes without errors. I stopped it to continue other tests. So the ADS1115 and DS18B20 were both running error free.

I realized that in my code I was starting the tasks to read the sensors before I started the WiFi, Like this:

void setup(void){
  ...
  xTaskCreatePinnedToCore(
    A2DTask,    // Function to implement the task
    "A2DTask",  // Name of the task
    4000,       // Stack size in bytes (the ESP32 port counts bytes, not words)
    NULL,       // Task input parameter
    5,          // Priority of the task
    NULL,       // Task handle
    1);         // Core where the task should run

  // Connect to WPA/WPA2 network. Change this line if using open or WEP network:
  WiFi.begin(ssid, password);

  // Wait until the WiFi connection is established:
  while(WiFi.status() != WL_CONNECTED) {
    delay(500);
    mySerial.print(".");
  }
  ...
}

I considered that the WiFi startup was the resource hog causing problems at the beginning, so I moved the xTaskCreatePinnedToCore call to after the WiFi.begin and wait for the WiFi.status change. After that edit, the program started error free without the 5 second delay (or any delay) before the first A2D conversion. It has run for 20 minutes without any errors.

Jeff

@jlhavens Sounds like you have found the correct sequence. There is a BackTrace decode that you could use to see where the Guru failure occurred.

I hope it was not in my code 😬 😨
Chuck.

Chuck,

I moved on to the BMP280 sensor. I set up a program like that ADS1115 that reads the data in a separate task and then when the WiFi calls for an update it sends the data both to WiFi and serial.
That setup seemed to work okay for 5-10 minutes, so I made a change.

I set it to read the bmp280 in both a separate task and in-line. In the task, it reads the device and assigns the result to a variable that's used when data is called for. In-line it simply reads the device while it is generating the output and sends it -- like Serial.print( (bmp.readTemperature()*(9.0/5.0))+32);
That ran for 10 minutes then threw an error. It rebooted and then after a couple minutes threw another error. The output contains both the task-read value and the in-line value. The failure seemed to take place when the in-line reading was taking place. The data output looks like
12:12 T=74.61 *C T=74.59 *C P=73.60 mm P=73.60 mm Calc=0ms
where the first T= is Task-read and the second T= is in-line, The same for the two P=.
The failed output looks like this:
12:14 T=74.59 *C T=74.59 *C P=73.60 mm P=Guru Meditation Error of type LoadProhibited occurred on core 1. Exception was unhandled.
Which looks to me like it fails at the second P= which is the in-line reading. I've attached the output here. Error_output_BMP280_3.txt

I used the BackTrace on the output (two different times) and got Error_output_BMP280_3_decode.txt. It is pointing to esp32-hal-i2c.c

I have continued to pursue this and have found that inserting a 25ms delay before the data is read in the task gave me 20 minutes run time without any errors. When I only read the data in-line it also ran for 20 minutes without error. If I only read the data in the task (without the 25ms delay) it ran for 20 minutes without error. In a retest, I read both in-line and in-task without the 25ms delay and in 10 minutes I got an error.

I'll try to setup something for a longer test tonight.

Jeff

@jlhavens the line numbers are of course different because of the pre-compiler actions,
but using just the function names, on the second result:

0 In Print::printFloat()
  1 which called _xt_lowInt1
     2 got Interrupted, then dispatched to I2C_ISR_handler_default()
        3 which called _xt_lowInt1
           4 got Interrupted,  then dispatched I2C_ISR_handler_default() 
Then Death?

first example

0 in xEventGroupWaitbits()  -- main Thread hanging for ISR completion
   1 _xt_lowint1
     2 interrupted -> ISR handler
       3 ??

I'll get with @me-no-dev and figure out what is going on.

@jlhavens
Do I understand you correctly? You are starting multiple tasks that each use Wire?

The Wire library is not task safe. The global Wire has one instance, and it does not handle simultaneous calls from multiple tasks. The only thread-safe part is the dispatcher, and it is very rudimentary. I did not write any thread-safe mutex semaphores into the queue handling. I am surprised it worked at all.

If you are creating separate tasks, only one can use Wire.

If you want a separate task structure, put all of the I2C (Wire) usage into one task.

Have it process each sensor in line, one after the other.

Chuck.

Chuck,
Yes, I created a separate task, initially, for the ADS1115. I wouldn't have thought to do that except that the DS18B20 worked when run in a separate task (in the standard ESP32 environment) after it had failed when I used the standard Dallas library running in loop(). So the system that is failing almost immediately has the code for reading the ads1115 in a separate task from the main loop() while the reading of the BMP280 is in the main loop() task. I can change that.

The bmp280 results I reported previously had some readings of the bmp280 in the main task and some in a separate task. In some cases that worked, depending on where I had inserted delays. I have had a program running for a couple hours where the BMP280 is only being accessed from the separate task and not the main. That seems to work with the WIFI running, too.

The program that ran error free overnight only read the ads1115, and only from a single task.

Jeff

@jlhavens Yeah, all of the calls that use Wire.h must be in the same task, either in the main task or in one separate task. Well, if you go to the trouble of creating an exclusive semaphore such that only one task at a time can issue commands to the Wire library, you could have them in as many tasks as you wanted.
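The exclusive-semaphore idea can be sketched as a small wrapper. This is a host-side illustration using std::mutex; `GuardedBus` and `transact()` are hypothetical names, and the callable stands in for a complete sequence of Wire calls. On the ESP32 itself the same pattern would more likely use a FreeRTOS mutex (xSemaphoreCreateMutex(), xSemaphoreTake(), xSemaphoreGive()).

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical shared-bus wrapper: every task must go through transact(),
// so only one I2C transaction is in flight at a time.
class GuardedBus {
public:
    // Runs one complete transaction (all writes and reads up to the STOP)
    // while holding the lock; 'op' stands in for the Wire calls.
    template <typename Op>
    auto transact(Op op) -> decltype(op()) {
        std::lock_guard<std::mutex> hold(lock_);
        ++transactions_;
        return op();
    }
    unsigned transactions() const { return transactions_; }

private:
    std::mutex lock_;
    unsigned transactions_ = 0;
};
```

The important design point is that the lock covers the whole transaction, from beginTransmission() through the final read, not each Wire call separately. Locking individual calls would still let another task slip in between a register-pointer write and the read that follows it.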

One of my 'next' goals is to develop a 'Driver' model I2C subsystem. Multi-task operation is one of the benefits/requirements. But that will wait until this simple single-process version is validated.

"The bmp280 results I reported previously had some readings of the bmp280 in the main task and some in a separate task. In some cases that worked, depending on where I had inserted delays."

Have you ever seen that Elmer Fudd cartoon of him being a Helicopter rear Gunner? Every time the blade comes around he ducks his head. That's what came to my mind when I read this. 😆

Chuck.

Chuck,

I don't recall the cartoon, but can visualize it. I sort of felt that way myself, but additionally, it seemed that every time I ducked I hit my forehead on a low beam.

Last night I set up a system that ran the WiFi, ADS1115 and BMP280. All the I2C calls were from a single task. It serviced a JSON request in the main task every second and took one BMP280 and 20 ADS1115 readings every second in a secondary task. It ran for 800 minutes without an error. With the standard library, that would have certainly failed.

The XML that defines the web page is about 10K characters. When I hit my internet browser refresh button 8-10 times in a row I got one of these on the output: .[E][WiFiClient.cpp:223] write(): 104
Nothing stopped, just that error message.

Chuck, I can't tell you how much I appreciate your effort on this I2C thing. After last night's test I feel like I might be able to use the ESP32 in a project that relies on it. I'm sure the multi-thread issue was not the root of my problems with the standard library because I only turned to that when I couldn't get the system to work in a standard linear fashion. I had set aside the ESP32 (which I initially had high hopes for) and returned to an ATmega328 device for which in a few hours I was able to put together code that ran for days without issue. For me you have solved a real deal breaker. I only hope that my "help" in testing didn't cause you more grief than it was worth. Thanks, Chuck.

Jeff

Chuck,

I've returned to the GY521. I have still not resolved the scaling factor. The source of the code is referenced in an earlier comment.

The code sets the scale with
setMPU6050scales(MPU_addr, 0b00000000, 0b00010000);
before each reading and then before it interprets the data it runs
getMPU6050scales(MPU_addr, Gyro, Accl);

The code is shown below (I added the printout of the values and changed the code to read the register values into variables before the or-ing and shifting. Those changes had no effect on the results as compared to the original code. In this particular case it looks like the reads are reversed so I temporarily reversed the order of reading to get the correct results (commented out to run the comparison))

`void setMPU6050scales(byte addr, uint8_t Gyro, uint8_t Accl) {
Wire.beginTransmission(addr);
Wire.write(0x1B); // write to register starting at 0x1B
Wire.write(Gyro); // Self Tests Off and set Gyro FS to 250
Wire.write(Accl); // Self Tests Off and set Accl FS to 8g
Wire.endTransmission(true);
Serial.print("\nSending Gyro=0x"); Serial.print(Gyro,HEX);
Serial.print(" Accl=0x"); Serial.println(Accl,HEX);
}

void getMPU6050scales(byte addr, uint8_t &Gyro, uint8_t &Accl) {
int a, g;
Wire.beginTransmission(addr);
Wire.write(0x1B); // start reading at register 0x1B (GYRO_CONFIG)
Wire.endTransmission(false);
Wire.requestFrom(addr, 2, true); // request 2 registers: GYRO_CONFIG (0x1B) and ACCEL_CONFIG (0x1C)
g = Wire.read(); a = Wire.read();
// a = Wire.read(); g = Wire.read(); --fix for ESP32 alternate I2C
Gyro = (g & (bit(3) | bit(4))) >> 3;
Accl = (a & (bit(3) | bit(4))) >> 3;
/* change code to read registers before manipulations;
Gyro = (Wire.read()&(bit(3) | bit(4))) >> 3;
Accl = (Wire.read()&(bit(3) | bit(4))) >> 3;
*/
Serial.print("\nReading Gyro=0x"); Serial.print(g,HEX);
Serial.print(" Accl=0x"); Serial.println(a,HEX);
}`

The output when using the standard ESP32 environment (all as expected) is
Sending Gyro=0x0 Accl=0x10
Reading Gyro=0x0 Accl=0x10
.WiFI Calling-->1:22 AcX=1.00 T= *C AcZ=-0.01 mm P= mm Calc=0ms

The exact same code generates this when using your modified I2C library
Sending Gyro=0x0 Accl=0x10
Reading Gyro=0x10 Accl=0x0
.WiFI Calling-->0:16 AcX=0.25 T= *C AcZ=-0.00 mm P= mm Calc=0ms
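
The `AcX=0.25` in the second run looks consistent with the two scale bytes simply coming back swapped. Below is a minimal sketch of the arithmetic, assuming the FS_SEL/AFS_SEL field sits in bits 4:3 and the accelerometer sensitivities are the ones from the MPU-6050 register map. The function names are made up for illustration and are not part of the sketch being tested.

```cpp
#include <cstdint>

// Extract the 2-bit full-scale field (bits 4:3) from a raw GYRO_CONFIG (0x1B)
// or ACCEL_CONFIG (0x1C) register value.
constexpr uint8_t fsField(uint8_t rawRegister) {
    return static_cast<uint8_t>((rawRegister & 0x18) >> 3);
}

// Accelerometer sensitivity in LSB per g for AFS_SEL = 0..3
// (+/-2g, +/-4g, +/-8g, +/-16g per the datasheet; treat as an assumption).
constexpr double accelLsbPerG(uint8_t afsSel) {
    return 16384.0 / static_cast<double>(1u << (afsSel & 3));
}

// Convert a raw accelerometer count to g for the given AFS_SEL.
constexpr double rawToG(int16_t raw, uint8_t afsSel) {
    return static_cast<double>(raw) / accelLsbPerG(afsSel);
}
```

With the device really set to ±8g (raw 0x10, AFS_SEL 2), 1 g is 4096 counts. `rawToG(4096, fsField(0x10))` gives 1.00, while `rawToG(4096, fsField(0x00))`, which is what the swapped read-back reports for Accl, gives 0.25.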

I ran the program with the alternate I2C with some different scale settings and the results are shown below.
setMPU6050scales(MPU_addr, 0b00010000, 0b00110000);
Sending Gyro=0x10 Accl=0x30
Reading Gyro=0x30 Accl=0x0

setMPU6050scales(MPU_addr, 0b00110000, 0b00110000);
Sending Gyro=0x30 Accl=0x30
Reading Gyro=0x30 Accl=0x0

setMPU6050scales(MPU_addr, 0b00100000, 0b00110000);
Sending Gyro=0x20 Accl=0x30
Reading Gyro=0x30 Accl=0x0

No errors are thrown when the programs are run. The compiler output (which in the standard environment complains about a line in getMPU6050scales) is somewhat different between the two I2C environments so I've attached it here. GY521_I2c_compile_output.txt

Any thoughts?

Jeff

@jlhavens Your Success makes me happy!

It ran for 800 minutes without an error. With the standard library, that would have certainly failed.

The WiFiClient Error?

The XML that defines the web page is about 10K characters. When I hit my internet browser refresh button 8-10 times in a row I got one of these on the output: .[E][WiFiClient.cpp:223] write(): 104
Nothing stopped, just that error message.

That error points to /libraries/WiFi/src/WiFiClient.cpp line 223
The 104 error from:
\tools\xtensa-esp32-elf\xtensa-esp32-elf\include\include\sys\errno.h
Says:
#define ECONNRESET 104 /* Connection reset by peer */

The web browser said shut up.

When you display a code block, put three back quotes on a separate line before and after your code. It will make GitHub display it better. If you use ```c++ as the opening marker, it will color highlight as C++.

void getMPU6050scales(byte addr, uint8_t &Gyro, uint8_t &Accl) {
int a, g;
Wire.beginTransmission(addr);
Wire.write(0x1B); // starting with register 0x3B (ACCEL_XOUT_H)
Wire.endTransmission(false);
Wire.requestFrom(addr, 2, true); // request a total of 14 registers
g = Wire.read(); a = Wire.read();
// a = Wire.read(); g = Wire.read(); --fix for ESP32 alternate I2C
Gyro = (g & (bit(3) | bit(4))) >> 3;
Accl = (a & (bit(3) | bit(4))) >> 3;
/* change code to read registers before manipulations;
Gyro = (Wire.read()&(bit(3) | bit(4))) >> 3;
Accl = (Wire.read()&(bit(3) | bit(4))) >> 3;
*/
Serial.print("\nReading Gyro=0x"); Serial.print(g,HEX);
Serial.print(" Accl=0x"); Serial.println(a,HEX);
}

Are you sure this code is correct? The `Wire.write(0x1B);` does not do what the comment says: "starting with register 0x3B (ACCEL_XOUT_H)".
This is how I would write this function to work with my I2C:

void getMPU6050scales(byte addr, uint8_t &Gyro, uint8_t &Accl) {
int a, g;
Wire.beginTransmission(addr);
Wire.write(0x3B); // starting with register 0x3B (ACCEL_XOUT_H)
uint16_t count=Wire.transact(14); // request the next 14 bytes from register 0x3b
if(count==14){// happy got everything
  uint16_t xAccel = Wire.read();
  xAccel = (xAccel << 8) + Wire.read();
  uint16_t yAccel = Wire.read();
  yAccel = (yAccel << 8) + Wire.read();
  uint16_t zAccel = Wire.read();
  zAccel = (zAccel << 8) + Wire.read();
  uint16_t temp = Wire.read();
  temp = (temp<<8)+ Wire.read();
  uint16_t xGyro = Wire.read();
  xGyro = (xGyro << 8) + Wire.read();
  uint16_t yGyro = Wire.read();
  yGyro = (yGyro << 8) + Wire.read();
  uint16_t zGyro = Wire.read();
  zGyro = (zGyro << 8) + Wire.read();
  }
else {
  Serial.printf("Wire.Transaction failed=%d only %d bytes read.",Wire.lastError(),count);
  }
/* here is an interpretation from your code as written
from your code g = Register 0x1B ? FS_SEL?
Are you wanting to get the FS_SEL and AFS_SEL?
If that is what you want then how about this
*/
Wire.beginTransmission(addr);
Wire.write(0x1b);
count =Wire.transact(2);
if(count==2){ // success
  //Gyro = (g & (bit(3) | bit(4))) >> 3;
   Gyro = (Wire.read() & 0x18)>>3;
  //Accl = (a & (bit(3) | bit(4))) >> 3;
  Accl = (Wire.read() & 0x18) >>3;
  Serial.printf("\nReading Gyro=0x%02x Accl=0x%02x\n",Gyro,Accl);
  }
else{
  Serial.printf("Wire.Transaction failed=%d only %d bytes read.",Wire.lastError(),count);
  }
}

@jlhavens try this code for setting the configs

void setConfig( uint8_t addr, uint8_t gyro, uint8_t accel){
uint8_t g = 0, a = 0; // declared here so a is still in scope for the write below
Wire.beginTransmission(addr);
Wire.write(0x1b);
uint16_t count = Wire.transact(2);
if(count==2){
  g = Wire.read();
  a = Wire.read();
  Serial.printf("before Setting raw Gyro=0x%x, raw Accel=0x%x",g,a);
  }
else {
  Serial.printf("Transact Failed =%d",Wire.lastError());
  }
Wire.beginTransmission(addr);
Wire.write(0x1b);
Wire.write( (gyro&3)<<3);
Wire.write( (a & 0xE0) | ((accel&3)<<3)); // the extra is because I don't want to mess with
// ACCEL_CONFIG: XA_ST, YA_ST, ZA_ST
uint8_t err =Wire.endTransmission();
if(err!=0){
  Serial.printf("endTransmission Failed=%d",err);
  }
Wire.beginTransmission(addr);
Wire.write(0x1b);
count = Wire.transact(2); // reuse count; declaring it a second time would not compile
if(count==2){
  g = Wire.read();
  a = Wire.read();
  Serial.printf("after Setting raw Gyro=0x%x, raw Accel=0x%x",g,a);
  }
}
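
A note on the arguments: in the function above, `gyro` and `accel` are the 2-bit field values (0..3), not raw register bytes. The packing can be tried on a PC with the sketch below. It only restates the two expressions from the post as standalone functions; the function names are hypothetical.

```cpp
#include <cstdint>

// Build the byte to write to ACCEL_CONFIG (0x1C): keep the three self-test
// bits (7:5) from the current register value and place the 2-bit range
// selector in bits 4:3. `afsSel` is the field value 0..3, not a raw byte.
constexpr uint8_t packAccelConfig(uint8_t currentRegister, uint8_t afsSel) {
    return static_cast<uint8_t>((currentRegister & 0xE0) | ((afsSel & 3) << 3));
}

// Same idea for GYRO_CONFIG (0x1B), mirroring (gyro & 3) << 3 from the post;
// here the self-test bits are written as zero.
constexpr uint8_t packGyroConfig(uint8_t fsSel) {
    return static_cast<uint8_t>((fsSel & 3) << 3);
}
```

Passing a raw byte such as 0b00010000 as `afsSel` yields 0x00, because only the low two bits of the argument are used. To get raw 0x10 in the register, pass 2.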

chuck

This is not my original code, but it does seem to work when run correctly in the standard ESP32 environment. I noticed that a few of the comments don't match with the code, but didn't change it since it worked as received. You can look at the article from which the code came in https://olivertechnologydevelopment.wordpress.com/2017/08/17/esp8266-sensor-series-gy-521-imu-part-1/

I made a couple of minor edits to your code. I adjusted it to use the actual desired values of the registers as the arguments, and to read the register values three times in succession after writing them. I placed two consecutive calls like this:

setConfig(MPU_addr, 0b00000000, 0b00010000);
setConfig(MPU_addr, 0b00000000, 0b00010000);

I substituted the modified register read code into getMPU6050scales which reads the registers after the data conversion.

void setConfig(uint8_t addr, uint8_t gyro, uint8_t accel) {
  uint8_t a,g;
  Wire.beginTransmission(addr);
  Wire.write(0x1b);
  uint16_t count = Wire.transact(2);
  if (count == 2) {
    //uint8_t g = Wire.read();
    //uint8_t a = Wire.read();
    g = Wire.read();
    a = Wire.read();
    Serial.printf("\nbefore Setting raw Gyro=0x%x, raw Accel=0x%x", g, a);
    Serial.printf("\nPlanned Settings for Gyro=0x%x, raw Accel=0x%x", gyro, accel);
  }
  else {
    Serial.printf("Transact Failed =%d", Wire.lastError());
  }
  Wire.beginTransmission(addr);
  Wire.write(0x1b);
  Wire.write(gyro); // Self Tests Off and set Gyro FS to 250
  Wire.write(accel); // Self Tests Off and set Accl FS to 8g

  //Wire.write((gyro & 3) << 3);
  //Wire.write((a & 0xE0) | ((accel & 3) << 3)); // the extra is because I don't want to mess with
  // ACCEL_CONFIG: XA_ST, YA_ST, ZA_ST
  uint8_t err = Wire.endTransmission();
  if(err != 0) {
    Serial.printf("endTransmission Failed=%d", err);
  }

  Wire.beginTransmission(addr);
  Wire.write(0x1b);
  //uint16_t count = Wire.transact(2);
  count = Wire.transact(2);
  if (count == 2) {
    uint8_t g = Wire.read();
    uint8_t a = Wire.read();
    Serial.printf("\n Read Back Setting raw Gyro=0x%x, raw Accel=0x%x", g, a);
  }
  Wire.beginTransmission(addr);
  Wire.write(0x1b);
  //uint16_t count = Wire.transact(2);
  count = Wire.transact(2);
  if (count == 2) {
    uint8_t g = Wire.read();
    uint8_t a = Wire.read();
    Serial.printf("\n Read Back Setting raw Gyro=0x%x, raw Accel=0x%x", g, a);
  }
  Wire.beginTransmission(addr);
  Wire.write(0x1b);
  //uint16_t count = Wire.transact(2);
  count = Wire.transact(2);
  if (count == 2) {
    uint8_t g = Wire.read();
    uint8_t a = Wire.read();
    Serial.printf("\n Read Back Setting raw Gyro=0x%x, raw Accel=0x%x", g, a);
  }
}

void getMPU6050scales(byte addr, uint8_t &Gyro, uint8_t &Accl) {
  int a, g;
  Wire.beginTransmission(addr);
  Wire.write(0x1b);
  //uint16_t count = Wire.transact(2);
  uint16_t count = Wire.transact(2);
  if (count == 2) {
    uint8_t g = Wire.read();
    uint8_t a = Wire.read();
  }
  // a = Wire.read(); g = Wire.read();  --fix for ESP32 alternate I2C
  Gyro = (g & (bit(3) | bit(4))) >> 3;
  Accl = (a & (bit(3) | bit(4))) >> 3;
  Serial.print("\nAfter conversion read Gyro=0x"); Serial.print(g,HEX);
  Serial.print("  Accl=0x"); Serial.println(a,HEX);
}

The output was :
before Setting raw Gyro=0x0, raw Accel=0x0
Planned Settings for Gyro=0x0, raw Accel=0x10
Read Back Setting raw Gyro=0x0, raw Accel=0x10
Read Back Setting raw Gyro=0x10, raw Accel=0x0
Read Back Setting raw Gyro=0x0, raw Accel=0x0

before Setting raw Gyro=0x0, raw Accel=0x0
Planned Settings for Gyro=0x0, raw Accel=0x10
Read Back Setting raw Gyro=0x0, raw Accel=0x10
Read Back Setting raw Gyro=0x10, raw Accel=0x0
Read Back Setting raw Gyro=0x0, raw Accel=0x10

After conversion read Gyro=0x0 Accl=0x0

So I got different results from successive readings of the two registers.
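
One thing that may explain the last line of that output (`After conversion read Gyro=0x0 Accl=0x0`): in the modified getMPU6050scales, the `uint8_t g` and `uint8_t a` declared inside the `if` block are new variables that hide the outer `int a, g`, so the outer pair is never assigned. Here is a small host-side sketch of the effect. `fakeRead` is a made-up stand-in for Wire.read() and is not part of any library.

```cpp
#include <cstdint>

// Made-up stand-in for Wire.read(): returns 0x00, then 0x10, then repeats.
inline int fakeRead() {
    static const int bytes[2] = {0x00, 0x10};
    static int next = 0;
    int value = bytes[next];
    next = (next + 1) % 2;
    return value;
}

// Shadowing version: the inner declarations hide the outer variables,
// so the outer accl keeps its initial value.
inline int acclWithShadowing() {
    int gyro = -1, accl = -1;       // initialized only to make the effect visible
    {
        uint8_t gyro = fakeRead();  // new variable, hides the outer gyro
        uint8_t accl = fakeRead();  // new variable, hides the outer accl
        (void)gyro; (void)accl;
    }
    (void)gyro;
    return accl;                    // still -1
}

// Fixed version: assign to the outer variables instead of redeclaring them.
inline int acclWithoutShadowing() {
    int gyro = -1, accl = -1;
    {
        gyro = fakeRead();
        accl = fakeRead();
    }
    (void)gyro;
    return accl;                    // 0x10
}
```

In the posted function the outer variables are not initialized at all, so what gets printed is whatever happened to be in them.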

I tried to compile the code in the standard ESP32 environment, but got an error on Wire.transact().
But I did read the two registers several times in a row using the previous code and got:
Sending Gyro=0x0 Accl=0x10
After conversion read Gyro=0x0 Accl=0x10
After conversion read Gyro=0x0 Accl=0x10
After conversion read Gyro=0x0 Accl=0x10
After conversion read Gyro=0x0 Accl=0x10
.WiFI Calling-->0:54 AcX=1.00 T= *C AcZ=-0.02 mm P= mm Calc=0ms

The first three reads are before conversion data was really read and the last one after.

So, I didn't see any difference using your code than the original.

Jeff

I have not been able to generate an incorrect read.

But I am using my current code, which has had a few fixes applied since I originally released it.

How old is the esp32-hal-i2c.c file you are using?

Try grabbing the current ones from my repo and let's see if you can recreate that error.

Both esp32-hal-i2c.c and esp32-hal-i2c.h

I am not distrusting you, I just don't know how to figure out what went wrong. None of my error outputs (log_e) triggered, so unless it is messing up the rxBuffer inside Wire?

There was a problem in the original code when I released this repo. When did you download your copy?

add:

Serial.printf("there are %d bytes in the rxBuffer.",Wire.available());

after every Wire.transact();
All of them should say '2'

Chuck.

I agree that this apparent misreading of a particular register is baffling, especially with it reading so much other data from different registers and devices. For me, this isn't even a device that I have a current need for. I'm just giving your software a go on the several devices that I have in my drawer.

I have been using the hal files from two days ago provided in your post with esp32-hal-i2c.zip and the Wire library that was in your repository on 11/16.

I replaced the esp32-hal-i2c.c, esp32-hal-i2c.h and Wire library with copies that I just downloaded from your git repository. The following output was generated on compile:

Compiling 'GY521' for 'ESP32 Dev Module'
Wire.cpp: In member function size_t TwoWire::transact(size_t)

Wire.cpp: 305:16: error: 'I2C_ERROR_MISSING_WRITE' was not declared in this scope
last_error = I2C_ERROR_MISSING_WRITE
Wire.cpp: In member function size_t TwoWire::transact(uint8_t*, size_t)

Wire.cpp: 323:16: error: 'I2C_ERROR_MISSING_WRITE' was not declared in this scope
last_error = I2C_ERROR_MISSING_WRITE
Error compiling libraries
Build failed for project 'GY521'

Sorry, I forgot I did some cleanup in Wire.cpp. You'll need to get that one also!

The days are rolling together. I did these changes and forgot about them.

Both Wire.h and Wire.cpp

I'll have to start a check list.

Chuck.

Chuck,
So, after I had the compile errors above I went back to the previous esp32-hal-i2c and Wire lib. I got to looking at the actual data that resulted after setting the configuration with the setConfig() that you provided (not just the gyro and accel registers). It turned out not to be correct: not just improperly scaled, but really off the mark. So looking at your code I noticed that there weren't many Wire.endTransmission statements. I didn't know whether some other function did that, but thought I would go ahead and throw one in for every Wire.beginTransmission, and the resulting data was at least the same as when the gyro and accel registers were set by the original setMPU6050scales function.

As I said, I didn't write this code and don't have any experience using the Wire library directly. I took a look at the original code and it didn't have a Wire.endTransmission at the end of all the read and write sequences either. I noticed that when a read sequence initiated by Wire.requestFrom(addr, count, true) had taken 'count' reads, there wasn't a Wire.endTransmission. I don't know if there is supposed to be one, but as it turns out, when I put Wire.endTransmission at the end of those transactions the program gave the correct output. The same data as the standard ESP32 environment.
This is what I am talking about--I inserted the last line of code.

void getMPU6050scales(byte addr, uint8_t &Gyro, uint8_t &Accl) {
	Wire.beginTransmission(addr);
	Wire.write(0x1B); // start at register 0x1B (GYRO_CONFIG)
	Wire.endTransmission(false);
	Wire.requestFrom(addr, 2, true); // request 2 registers (0x1B and 0x1C)
	Gyro = (Wire.read()&(bit(3) | bit(4))) >> 3;
	Accl = (Wire.read()&(bit(3) | bit(4))) >> 3;
	Wire.endTransmission(true);    //---------inserted by jlh
}

I added a similar line after the read sequence that corresponds to the
uint16_t count=Wire.transact(14)
from your code. I have attached the program code GY521_ino.txt. Look for "//------" to see what I changed from the original.

As far as I can see, this solves the problem with this device. You might understand what difference this is making. It seems like the endTransmission is acting to ensure the I2C buffers get emptied before the next beginTransmission is executed.

Jeff

Chuck,

When I got the compile error I also copied the more recent Wire library and got the same error.

Jeff

I just updated my repo.
The correct sequence for Wire() is:

//Write data to device
Wire.beginTransmission(id);
Wire.write(byte); // up to 128 bytes between Begin/end
Wire.endTransmission(stop);

//Read data from device
Wire.requestFrom(id,len,stop);
while(Wire.available()){ // available() returns number of bytes in rxBuffer
   Wire.read();  // read returns an int any value 0..255 is valid, -1 means no more data available
    }

//********************
// I added transact() because it fits better with the new ESP32 hardware.
//
// using transact() combines this structure
// ******
// existing Arduino AVR style
Wire.beginTransmission(id);
Wire.write(); // usually the register number to read from
Wire.endTransmission(false);

Wire.requestFrom(id,len,true);
//
// new style
Wire.beginTransmission(id);
Wire.write(); // usually the register number to read from
Wire.transact(len); // implied read of len bytes after START,WriteMode ID,writeDATA, ReSTART,ReadMode ID,readData,STOP
//

Don't throw in the extra endTransmission() calls.
That is a symptom of some other problem. Let's find it and Squash IT!

Chuck.
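
For anyone who wants to step through that call order without hardware, here is a rough host-only sketch. `FakeWire` is a made-up stand-in that mimics a register-based device and the Wire-style calls named above. It is not the real TwoWire class and says nothing about how the ESP32 driver behaves; it only shows the write-register-then-read sequence and the -1 that read() returns once the buffer is empty.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Made-up, host-only stand-in for a register-based I2C device plus the
// Wire-style calls used in the post. NOT the real TwoWire class.
class FakeWire {
public:
    explicit FakeWire(std::vector<uint8_t> registers) : regs_(std::move(registers)) {}

    void beginTransmission(uint8_t /*id*/) { tx_.clear(); }
    size_t write(uint8_t b) { tx_.push_back(b); return 1; }

    // First queued byte selects the register pointer, the rest are stored.
    uint8_t endTransmission(bool /*stop*/ = true) {
        if (tx_.empty()) return 0;
        pointer_ = tx_[0];
        for (size_t i = 1; i < tx_.size(); ++i) {
            if (pointer_ < regs_.size()) regs_[pointer_] = tx_[i];
            ++pointer_;
        }
        tx_.clear();
        return 0;  // 0 means success, as with Wire.endTransmission()
    }

    // Fill the receive buffer with `len` bytes starting at the register pointer.
    size_t requestFrom(uint8_t /*id*/, size_t len, bool /*stop*/ = true) {
        rx_.clear();
        for (size_t i = 0; i < len && pointer_ < regs_.size(); ++i) {
            rx_.push_back(regs_[pointer_++]);
        }
        return rx_.size();
    }

    int available() const { return static_cast<int>(rx_.size()); }

    // Returns 0..255, or -1 when the receive buffer is empty.
    int read() {
        if (rx_.empty()) return -1;
        int value = rx_.front();
        rx_.pop_front();
        return value;
    }

private:
    std::vector<uint8_t> regs_;
    std::vector<uint8_t> tx_;
    std::deque<uint8_t> rx_;
    size_t pointer_ = 0;
};

// Write the register number, then read `len` bytes back, with no extra
// endTransmission() after the read.
inline std::vector<int> readRegisters(FakeWire& wire, uint8_t id, uint8_t reg, size_t len) {
    wire.beginTransmission(id);
    wire.write(reg);
    wire.endTransmission(false);
    wire.requestFrom(id, len, true);
    std::vector<int> out;
    while (wire.available()) out.push_back(wire.read());
    return out;
}
```

Reading two bytes from register 0x1B of a fake device whose 0x1B/0x1C hold 0x00/0x10 returns exactly those two values, and one more read() afterwards returns -1.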

Chuck,
The code with the extra endTransmissions() ran overnight in a WiFi enabled program without issue.

I downloaded the latest version from your repository and copied the I2C-hal files as well as the Wire library. I commented out the extra endTransmissions() and the program still gave the correct readings and scale factors. Using your setConfig the written and reread values were the same, even with repeated reads. I went back to the original routines for setting and reading the scale factors and everything worked as expected. I've set it going with a WiFi enabled program and will see if there are any stability issues.

Whatever changes you made between the most recent and earlier versions seem to have fixed the problem with the GY521.

Jeff

@jlhavens That's good to hear.

I've had a test running that does block reads of 24LCxx eeproms using increasing buffer sizes, and decreasing starting addresses then doing a byte by byte comparison.

So far today it has moved 60,505,500 bytes without any errors.

It is currently taking 1.001 seconds to read 10,833 bytes and do the compare.

Chuck.

Just shut down my data read test; it was up to 49,134 bytes at a time. Transferring 49,134 bytes from a 24LC32 at a 100 kHz I2C clock took 4550 ms, and the compare operation 4 ms. So, about 4.5 seconds per transaction.

By my calculation it has moved 1.2 billion bytes in the last 28 hours 43 minutes 7 seconds! Without any errors!
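
As a sanity check on those timings, here is a back-of-the-envelope sketch assuming only standard I2C framing (9 clock cycles per data byte: 8 bits plus ACK). START/STOP, address bytes and any clock stretching are ignored, so it is a lower bound and not a model of the driver. The function names are illustrative.

```cpp
#include <cstdint>

// Lower bound on the time to clock `byteCount` data bytes over I2C, in
// milliseconds. Each byte costs 9 SCL cycles (8 data bits plus the ACK bit).
constexpr double minTransferMs(uint32_t byteCount, uint32_t clockHz) {
    return static_cast<double>(byteCount) * 9.0 * 1000.0 / static_cast<double>(clockHz);
}

// Upper bound on payload throughput in bytes per second at a given clock.
constexpr double maxBytesPerSecond(uint32_t clockHz) {
    return static_cast<double>(clockHz) / 9.0;
}
```

For the 49,134-byte block at 100 kHz this gives about 4422 ms, so the measured 4550 ms is within roughly 3% of what the bus clock allows. Likewise 10,833 bytes in 1.001 seconds is close to the ceiling of about 11,111 bytes per second.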

Chuck.

I have been working with Phil, the OP, on the same code base. He had been getting the I2C timeouts and I hadn't. Consequently, I have been working on overcoming issues with having BLE and WiFi enabled. During this process I had to download the latest ESP32 core today in order to overcome heap issues. BLE and WiFi started partially working, but then I got the I2C timeout issue. I spent today looking into the issue and reading all the posts. Having the issue with the latest ESP32 core, I then used your forked build. Same issue. Moving back to my core ESP32 build from Oct 19th, no issue. But of course I have heap issues, which prevent BLE/WiFi working.

So, it seems this new I2C library works for some devices and not for others. I know the I2C bus on the MPU6050 is problematic from a timing perspective, and me-no-dev had been working on the I2C issues in the past to correct them. Obviously it was working in the past but not in the latest or your build. Unfortunately I don't use Git, so I can't tell which build I have, only timestamps of when I downloaded. I just saw a new commit to core to fix BLE issues, but unfortunately the commit breaks the build. When it is fixed I will download, compile, and see if it fixes the timing issue. If not, I will copy the 6 files you changed from my working build to the latest core and see what happens.

@ifrew when you try it, change your selected device to "ESP32 Dev Module"; that will allow you to set the "Core Debug Level" to "ERROR". If the I2C fails it will send debug messages out the serial port. So far, I have been able to successfully fix all problems encountered.

Later on today, I will commit an updated version: more cleanup, one fix of an edge condition that would cause bus timeouts, and 10-bit master read support.

With these cleanups, it is down to 4 files:
libraries\Wire\src\Wire.h
libraries\Wire\src\Wire.cpp
cores\esp32\esp32-hal-i2c.h
cores\esp32\esp32-hal-i2c.c

Chuck.

OK. I am using the Node32s board because, for some reason, even when I change the file upload size to the size I need for the ESP32 dev board, it still stays at the same value. Strange. I am using Visual Studio and the Visual Micro add-in for Arduino, so my build configuration is different from Arduino and IDF. However, I just defined ARDUHAL_LOG_LEVEL to 5 in the esp32-hal-log.h file to overcome this issue. So after downloading the latest core ESP32 I still got the error and warning timeout messages spitting out. I then downloaded your forked build and copied the above 4 files to the ESP32 core folder I was building from, overwriting the existing ones. Cleaned the build and recompiled. No timeout messages!

Thanks for unblocking me! I can now get back to seeing if I can get the ble/wifi working together. They are not happy campers together!

Cheers

Iain

@ifrew That's What I like Hearing!

Chuck.

@stickbreaker - awesome work! It is working really well for me now. I had one little hiccup in an hour's running, but it is infinitely better. Thanks so much!

Hardware is esp32 - generic board.

GY86 board ( ms5611, MPU6050, MPU9250 )

Error keeps repeating itself as

[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3ffd6eb4
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00070116
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=459157, ed=459207, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff6490
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6510 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3ffd6ef0
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3ffd6eb4
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00070195
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=459283, ed=459333, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff6490
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6510 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3ffd6ef0
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3ffd6eb4
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00070213
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=459410, ed=459460, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff6490
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6510 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3ffd6ef0
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3ffd6eb4
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00070292
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=459537, ed=459587, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff6490
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6510 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3ffd6ef0
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3ffd6eb4
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00070311
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=459664, ed=459714, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff6490
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6510 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3ffd6ef0
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3ffd6eb4
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00070390
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=459791, ed=459841, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff6490
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6510 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3ffd6ef0
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3ffd6eb4
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInt

@PhilColbert
This dump is showing that the ISR is not receiving the correct interrupts.
It shows that it serviced a 0x2 (txFifoEmpty) interrupt, loaded 2 bytes into the TxFifo,
and that is all.

[E][esp32-hal-i2c.c:902] i2cDumpInts(): row  count    INTR        TX       RX        TICK
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00070390

byteCnt=0 says no bytes moved out or in through the I2C statemachine.
You issued a Write to Device 0x77 of a Single byte, and SendStop=true

[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0

So, it looks like the I2C bus was in a busy state, but I don't record that in the dump.
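
For reading these dumps: the `ee` at the start of the queue line looks like the first byte put on the wire, that is, the 7-bit device address shifted left by one with the R/W bit in bit 0. That framing is standard I2C; that the dump prints exactly this byte is a reading of the output above and not something the post spells out. A tiny sketch with illustrative names:

```cpp
#include <cstdint>

struct AddressByte {
    uint8_t address7;  // 7-bit device address
    bool isRead;       // true for a read (R/W bit = 1), false for a write
};

// Split the first byte of an I2C transfer into device address and direction.
constexpr AddressByte decodeAddressByte(uint8_t wireByte) {
    return AddressByte{static_cast<uint8_t>(wireByte >> 1), (wireByte & 1) != 0};
}
```

`decodeAddressByte(0xEE)` gives address 0x77 as a write, which matches "a Write to Device 0x77"; 0xEF would be the same device as a read.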

Do you have any additional info?
Can you regenerate this error condition?
If you can regenerate it, I can add more debug to actually show the bus state and to get an idea of what is happening.

Can you describe your I2C environment: what devices, what physical connections, a 3.3 V or 5.0 V bus?

This looks like a Bus Busy at initialization error.
The tick value is about 7 minutes 40 seconds since boot.
Had it been erring that whole time?
My code does not have any built in auto recovery for this.
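
The "about 7 minutes 40 seconds" figure can be reproduced from the TICK column, assuming one tick is one millisecond. A 1000 Hz RTOS tick is an assumption about this build, and `ticksToUptime` is only an illustrative helper.

```cpp
#include <cstdint>

struct Uptime {
    uint32_t minutes;
    uint32_t seconds;
    uint32_t millis;
};

// Convert a tick count to minutes/seconds/milliseconds.
// ASSUMPTION: one tick is one millisecond (a 1000 Hz RTOS tick).
constexpr Uptime ticksToUptime(uint32_t ticks) {
    return Uptime{ticks / 60000u, (ticks / 1000u) % 60u, ticks % 1000u};
}
```

The last complete row in the dump has TICK 0x00070390, which is 459664 decimal, so `ticksToUptime(0x00070390)` comes out as 7 minutes, 39 seconds and 664 ms.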

Chuck.

@PhilColbert
In Wire.cpp
change TwoWire::reset() to this

void TwoWire::reset(void)
{
    i2cFreeQueue(i2c);
    i2cReleaseISR(i2c);
    i2cReset( i2c );
    i2c = NULL;
    begin( sda, scl );
}

And in your code, where Wire.endTransmission(true) is failing, check for err==3:

uint8_t err =Wire.endTransmission(true);
if(err==3){
  Wire.reset();
  } 

You'll have to change your code to retry that last operation.
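
One possible shape for that retry, as a sketch only. `retryAfterReset` is a hypothetical helper, not part of Wire. It takes two callables that you supply: one that repeats the failed operation and returns the endTransmission-style error code, and one that resets the bus (for example by calling Wire.reset()). The value 3 is the error code checked above.

```cpp
#include <cstdint>

// Error code the post checks for after endTransmission(true).
constexpr uint8_t kBusError = 3;

// Run `operation` (returns an endTransmission-style error code, 0 = success).
// If it reports kBusError, call `resetBus` and try again, up to `maxAttempts`
// attempts in total. Returns the error code of the last attempt, or
// kBusError if maxAttempts is zero or negative.
template <typename Operation, typename Reset>
uint8_t retryAfterReset(Operation operation, Reset resetBus, int maxAttempts) {
    uint8_t err = kBusError;
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        err = operation();
        if (err != kBusError) {
            return err;  // success, or an error a reset is not meant to fix
        }
        resetBus();
    }
    return err;
}
```

Capping the number of attempts keeps a permanently stuck bus (for example SCL held low) from turning into an endless reset loop; the caller still sees error 3 and can decide what to do.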

Chuck.

The only way I have been able to generate this condition is to ground SCL during a transmission.

Chuck.

Well it should be possible to recreate but it does happen pretty randomly after a few minutes running - only had it twice in an hour recently but it does go again so if you would like to send me any code I can run and can debug this better ?

It ran perfectly until it starts erroring.

Hardware is

3.3v bus

connected to a GY86 board ( ms5611, MPU6050, MPU9250 )

I have other peripherals connected as well (GPS and an SPI LoRa board, running in a separate task), and I am also outputting to a piezo on 2 channels.

Do you want to send any debug code so I can add or shall I just go ahead and add the reset code ?

Thanks for all the help - amazing work !

2 hours up - still happy - looking good .... :)

@PhilColbert I've been thinking: the bus busy error is a confusion issue. What I mean is that the I2C bus is convinced that another master has started a transaction without completing it. The ESP32 is "really" polite. It will never interrupt someone else's conversation, so it just sits there patiently waiting for its turn, which never comes. The Wire.reset() just starts over, clearing out this 'busy' state.

I am interested in why the SM (state machine) thinks another master is communicating. From your description, you do not have any other masters on the I2C bus?

I was reading the MPU6050's data sheet, on page 34 of PS-MPU-6000A-00 REV 3.4 Release Date 08/19/2013, it says that it can stretch the SCL clock to pause the 'MASTER', Depending on how this is used, this could be the cause of the confusion. The DataSheet only has that one mention of the SCL clock stretching.

Chuck.

@PhilColbert if you want to try a circuit change, put a signal diode, like a 1N914, inline on the SCL line between the ESP32 and the MPU6050, with the cathode facing the ESP32. You will also need pullup resistors on both sides. This diode would prohibit the MPU6050 from stretching SCL. The ESP32 would never see the stretched pulse, so there might be some garbled communications, but the ESP32 would not get into that 'confused' state (if this clock stretching is the cause 🤒).

Chuck.

@PhilColbert next time you get an error, the First error is the most important. I've had more ideas (speculations) on what could be going wrong, but I need more data to crystallize my thinking. My latest musings are that the MPU6050 is creating a glitch on the SCL, or that the clock stretching is creating a timeout, or that the clock stretching is interfering with a STOP generation. Need more Info.

Chuck.

OK, I will take the reset code out and log what's going on. Do you want me to add any more debugging lines or anything?

This is the first error that comes out 👎

[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=350502, ed=350552, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff64d4
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6568 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3fff4310
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3fff6cc0
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=2
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=1
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=1
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [1] ef R STOP buf@=0x3ffc9dd8, len=3, pos=0, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0003 0x0000 0x00055926
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x00055926
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [03] 0x0003 0x0040 0x0000 0x0000 0x00055926
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [04] 0x0001 0x0002 0x0000 0x0000 0x00055926
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=350659, ed=350709, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff64d4
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6568 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3fff4310
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3fff7748
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000559c3
Elapsed 866120us
bAlt = 1009345 kfAlt = 19965 kfVario = 13
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=350791, ed=350841, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff64d4
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6568 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3fff4310
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3fff7748
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00055a47
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=350918, ed=350968, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff64d4
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6568 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3fff4310
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3fff7748
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00055ac6
[E][esp32-hal-i2c.c:1090] i2cProcQueue(): Gross Timeout Dead st=351045, ed=351095, =50, max=50 error=1
[E][esp32-hal-i2c.c:557] dumpI2c(): i2c=0x3ffc1df4
[E][esp32-hal-i2c.c:558] dumpI2c(): dev=0x60013000
[E][esp32-hal-i2c.c:559] dumpI2c(): lock=0x3fff64d4
[E][esp32-hal-i2c.c:560] dumpI2c(): num=0
[E][esp32-hal-i2c.c:561] dumpI2c(): mode=1
[E][esp32-hal-i2c.c:562] dumpI2c(): stage=3
[E][esp32-hal-i2c.c:563] dumpI2c(): error=1
[E][esp32-hal-i2c.c:564] dumpI2c(): event=0x3fff6568 bits=0
[E][esp32-hal-i2c.c:565] dumpI2c(): intr_handle=0x3fff4310
[E][esp32-hal-i2c.c:566] dumpI2c(): dq=0x3fff7748
[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W STOP buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0

@PhilColbert That first error shows clock stretching.

[E][esp32-hal-i2c.c:567] dumpI2c(): queueCount=2
[E][esp32-hal-i2c.c:568] dumpI2c(): queuePos=1
[E][esp32-hal-i2c.c:569] dumpI2c(): byteCnt=1
[E][esp32-hal-i2c.c:574] dumpI2c(): [0] ee W buf@=0x3ffc9e5e, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:574] dumpI2c(): [1] ef R STOP buf@=0x3ffc9dd8, len=3, pos=0, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:902] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [01] 0x0001 0x0002 0x0003 0x0000 0x00055926
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x00055926
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [03] 0x0003 0x0040 0x0000 0x0000 0x00055926
[E][esp32-hal-i2c.c:905] i2cDumpInts(): [04] 0x0001 0x0002 0x0000 0x0000 0x00055926

These are the interesting lines.
The DumpInts[] rows show that the first command went through the hardware.
[01] the txFifo was loaded with 3 bytes: the beginTransmission(0x77) address, the write(byte) data, and then the requestFrom(0x77) address.
[02] says the SM started processing the command queue.
[03] says 3 bytes were moved: the beginTransmission(0x77), the data byte write, then the ReSTART with the requestFrom(0x77).
[04] is just the txEmpty request. The SM wants the txFifo to have at least 4 bytes but, alas, you didn't want to send that many.

No more interrupts were generated, therefore the bus was busy; probably SCL was being stretched.
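
For anyone reading these dumps by hand, here is a minimal host-side sketch that splits one i2cDumpInts() row into its fields. The column layout (row, count, INTR, TX, RX, then a tick timestamp) is assumed from the header line and the explanation above. `DumpRow` and `parseDumpRow` are hypothetical names, not part of the core.

```cpp
#include <cstdio>

// Hypothetical type for illustration only; field names follow the
// "row count INTR TX RX" header printed by i2cDumpInts().
struct DumpRow {
    unsigned row;    // "[01]" index
    unsigned count;  // count column from the header
    unsigned intr;   // interrupt status bits
    unsigned tx;     // bytes loaded into the txFifo
    unsigned rx;     // bytes taken from the rxFifo
    unsigned tick;   // last column, assumed to be a tick timestamp
};

// Parses the part of a log line that follows "i2cDumpInts(): ".
// Returns true when all six fields were read.
bool parseDumpRow(const char* text, DumpRow& out) {
    int n = std::sscanf(text, "[%u] %x %x %x %x %x",
                        &out.row, &out.count, &out.intr,
                        &out.tx, &out.rx, &out.tick);
    return n == 6;
}
```

Fed the row `[01] 0x0001 0x0002 0x0003 0x0000 0x00055926`, the last field is 0x55926, which is 350502 in decimal and matches `st=350502` in the Gross Timeout line. All four rows carry that same value, so they were all logged in the tick the transaction started.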

I hard coded a 50ms timeout in i2cProcQueue(), and it looks like this device can exceed it. Do you have any idea of the maximum delay this device can generate?

I don't see any failure, just the 0x77 device hanging. What was the command you sent? What did it ask from the device?

Try increasing the 50ms timeout in esp32-hal-i2c.c:1033

//hang until it completes.

// how many ticks should it take to transfer totalBytes thru the I2C hardware,
// add 50ms just for kicks
 
portTickType ticksTimeOut = ((totalBytes /(i2cGetFrequency(i2c)/(10*1000)))+50)/portTICK_PERIOD_MS;
portTickType tBefore=xTaskGetTickCount();  

change that +50 to maybe +500 (from 1/20 sec to 1/2 sec)
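
To see what that expression evaluates to, here is a small host-side sketch of the same integer arithmetic. It assumes one FreeRTOS tick is 1 ms (so `portTICK_PERIOD_MS` is 1) and takes the bus clock as a parameter, since the thread does not state it. `timeoutTicks` and `kTickPeriodMs` are hypothetical names.

```cpp
#include <cstdint>

// Assumption: 1000 Hz tick rate, so one tick is 1 ms.
constexpr uint32_t kTickPeriodMs = 1;  // stands in for portTICK_PERIOD_MS

// Mirrors the integer arithmetic of the ticksTimeOut line quoted above.
// extraMs is the padding that is hard coded as +50 in that line.
// busHz must be at least 10000, otherwise the inner division is by zero.
uint32_t timeoutTicks(uint32_t totalBytes, uint32_t busHz, uint32_t extraMs) {
    // Bytes per millisecond when each byte is counted as 10 bit times
    // (my reading of the 10*1000 divisor in the original).
    uint32_t bytesPerMs = busHz / (10 * 1000);
    return ((totalBytes / bytesPerMs) + extraMs) / kTickPeriodMs;
}
```

For a 3-byte transaction on a 100 kHz bus the transfer term rounds down to 0, so the whole budget is the padding: 50 ticks, which is the `max=50` in the log. With the padding raised to 500 the same call gives 500 ticks.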

Chuck.

@PhilColbert I just committed a new Wire.h function, Wire.setTimeOut(millis). It defaults to 50ms, but you can set it wherever you want, and you can change it for different devices.

Chuck.

Thanks - just off out for a day or so - will check when I return- good idea about different devices !

@PhilColbert I updated my repo. Have you had any success?

Chuck.

@stickbreaker I am diving in your I2C code today ;) What is the best way to keep in touch?

@me-no-dev Probably email ctodd at cableone dot net GMT-6 7am:9pm

Chuck.

All seems good here now :) thanks !

@PhilColbert Since you are satisfied, close the issue.

Stumbled on this and since switching to your fork my i2c bus, which used to lock up after 6-7 hours, has now survived for 2 days.
(Wroom-32 with just a BME280 on i2c)
Thanks

@rlerdorf That's what I like hearing. @me-no-dev is reviewing the code for inclusion into the main branch. Hopefully in the near future, I2C will be a non issue.

Chuck.

Hi Chuck,

I'm working with Phil Colbert on the same project. I took your latest Wire.cpp and .h, and esp32-hal-i2c.c and .h today and put them in my project, as I was getting the same gross timeout errors. Reading through this thread and talking to Phil, I see that you had suggested using a reset if endTransmission() returned err == 3:

uint8_t err = Wire.endTransmission(true);
if (err == 3) { // 3 is I2C_ERROR_TIMEOUT
  Wire.reset();
}

But later you suggest changing the timeout to be greater than 50ms. I have set that to 500ms now, and my code is running without any reset. But I am seeing these debug statements regarding timing recovery. Do I still need to do a reset if err == 3?

[E][esp32-hal-i2c.c:1113] i2cProcQueue(): TimeoutRecovery: expected=0ms, actual=2ms
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc12bc
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff4a80
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff4b18 bits=10
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff4b48
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8b6c
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=2
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] d0 W STOP buf@=0x3ffc6f76, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: ; 3b
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00045c54
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x00045c54
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [03] 0x0002 0x0040 0x0000 0x0000 0x00045c54
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [04] 0x0001 0x0080 0x0000 0x0000 0x00045c54

@ifrew This debug message can be commented out. It is there to see whether the StateMachine could recover from an INTR 0x100 timeout. The dump you posted is kind of confusing: it says 'TimeoutRecovery' but there was NO timeout? The expected=0ms, actual=2ms just shows that the time between procQueue() marking its start and the ISR returning was longer than calculated. I would have you place a block comment around that warning:

Starting at line 1109 thru 1118 of esp32-hal-i2c.c:

if(eBits&EVENT_DONE){ // no gross timeout
/* add this block comment @ifrew
#if ARDUHAL_LOG_LEVEL >= ARDUHAL_LOG_LEVEL_ERROR
  uint32_t expected =(totalBytes*10*1000)/i2cGetFrequency(i2c);
  if((tAfter-tBefore)>(expected+1)) { //used some of the timeout Period
    // expected can be zero due to small packets
    log_e("TimeoutRecovery: expected=%ums, actual=%ums",expected,(tAfter-tBefore));
    i2cDumpI2c(i2c);
    i2cDumpInts();
    }
#endif
Clear to there @ifrew */ 
  switch(i2c->error){
    case I2C_OK :
      reason = I2C_ERROR_OK;
      break;
    case I2C_ERROR :
      reason = I2C_ERROR_DEV;
      break;
    case I2C_ADDR_NAK:
      reason = I2C_ERROR_ACK;

It looks like you have higher priority processes that are delaying procQueue(). This delay is not a problem. So, the error message is just a distraction.
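
To show why the message fires with expected=0ms, here is a host-side sketch of the two expressions from the excerpt above. The bus clock is a parameter because the thread does not state it, and the function names are hypothetical.

```cpp
#include <cstdint>

// Same integer expression as the `expected` line in the excerpt.
uint32_t expectedMs(uint32_t totalBytes, uint32_t busHz) {
    return (totalBytes * 10 * 1000) / busHz;
}

// Same comparison the excerpt makes before logging "TimeoutRecovery".
// elapsedTicks corresponds to (tAfter - tBefore).
bool wouldLogRecovery(uint32_t elapsedTicks, uint32_t totalBytes, uint32_t busHz) {
    return elapsedTicks > expectedMs(totalBytes, busHz) + 1;
}
```

With the 2 bytes shown in the dump (`byteCnt=2`) and an assumed 100 kHz clock, the expected time truncates to 0 ms, so any transaction that takes 2 ticks or more trips the warning even though nothing timed out.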

On the I2C_ERROR_TIMEOUT (3), a valid TimeOut should only exist if:

  • Your I2C bus has multiple masters
  • One of your I2C Devices does SCL stretching (clock Stretching) for delay or sync purposes

If neither of these is true, then a TimeOut is indicative of a bus problem, and Wire.reset() is the only solution. @lonerzzz has a device that does clock stretching. When he initiates a READ with Wire.requestFrom(), it can hang the bus for up to 2 seconds. The problem is that the ESP32 does not recover correctly: the TimeOut aborts the current transaction instead of just delaying it, so he has to reissue the READ command to acquire the data. With his device, Wire.reset() does not recover the bus because the SLAVE is still holding SCL. So he just re-issues the READ with a long timeout until it is successful. Once the SLAVE releases SCL, the StateMachine can accept new transactions without a Wire.reset().
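
The re-issue approach can be sketched as a small retry loop. This is only an illustration, written so it runs on a host: `retryWhileTimeout` is a hypothetical helper, and the `attempt` callback stands in for whatever code performs the READ and reports its error code.

```cpp
#include <functional>

// Error code 3 is I2C_ERROR_TIMEOUT according to the post above.
constexpr int kI2cErrorTimeout = 3;

// Hypothetical helper: calls `attempt` until it returns something other
// than a timeout, or until maxAttempts is used up.
// Returns the last error code seen.
int retryWhileTimeout(const std::function<int()>& attempt, int maxAttempts) {
    int err = kI2cErrorTimeout;
    for (int i = 0; i < maxAttempts && err == kI2cErrorTimeout; ++i) {
        err = attempt();
    }
    return err;
}
```

Capping the number of attempts matters here: if the slave never releases SCL, an unbounded loop would hang the task, whereas a bounded one lets the caller fall back to a hardware reset.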

Chuck.

Awesome info! many thanks. I have read about clock issues and the i2c bus on this device we are using not fully adhering to the i2c spec so appreciate all the work you guys are doing.

Well Chuck, after putting in the reset and increasing the timeout it runs for a while and I was really happy, but after about 15-20 min it starts doing this repeatedly:
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd2f5b, end=0xd2f8d, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d2f5b
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd2fe7, end=0xd3019, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d2fe7
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd3073, end=0xd30a5, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d3073
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$GPRMC,,V,,,,,,,,,,N53
$GPVTG,,,,,,,,,N
30
$GPGGA,,,,,,0,00,99.99,,,,,,48
$GPGSA,A,1,,,,,,,,,,,,,99.99,99.9[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd30fe, end=0xd3130, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
9,99.99
30
$GPGSV,1,1,0079
$GPGLL,,,,,,V,N
64
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d30fe
Elapsed 480384us
bAlt = 584820 kfAlt = -2261 kfVario = -8
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd31a4, end=0xd31d6, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d31a4
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd3230, end=0xd3262, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d3230
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd32bc, end=0xd32ee, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d32bc
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd3348, end=0xd337a, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d3348
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd33d4, end=0xd3406, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d33d4
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd345f, end=0xd3491, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d345f
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$LK8EX1,101596,-22,-8,25,03A
$LK8EX1,101596,-22,-8,25,0
3A
$LK8EX1,101596,-22,-8,25,03A
$GPRMC,,V,,,,,,,,,,N
53
$GPVTG,,,,,,,,,N30
$GPGGA,,,,,,0,00,99.99,,,,,,48
$GPGSA,A,1,,,,,,,,,,,,,99.99,99.9[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd34ea, end=0xd351c, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
9,99.99
30
$PFLAU,6,1,2,1,0,144,0,235,446
55
$LK8EX1,101596,-22,-8,25,03A
$GPGSV,1,1,00
79
$GPGLL,,,,,,V,N*64
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d34ea
Elapsed 476283us
bAlt = 584820 kfAlt = -2261 kfVario = -8
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd358e, end=0xd35c0, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d358e
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd361a, end=0xd364c, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d361a
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd36a5, end=0xd36d7, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d36a5
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd3730, end=0xd3762, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d3730
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd37bc, end=0xd37ee, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5194
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff522c bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff525c
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3ffd8a98
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x000d37bc
[E][esp32-hal-i2c.c:1

@ifrew Can you state what the SDA and SCL lines are doing when the failure happens? The Wire.reset() call just resets the state machine on the ESP32, but how you are able to proceed depends on the state of the peripheral as well. @stickbreaker and I have been throwing around ideas on how to handle this. Currently, I have the ability to toggle the enable pin on my I2C slaves and have been using that. You could also toggle power. The other option is to toggle the SCL line manually to get the slave out of the funny state, if it is in one.

Hi @lonerzzz,

It's early morning here. Unfortunately I don't have any scopes to check out the lines. However, after reading Chuck's excellent doc https://github.com/stickbreaker/arduino-esp32/blob/master/libraries/Wire/docs/README.md just now, I don't think I'm reading the i2c bus correctly to take advantage of his queuing. I'm using Jeff Rowberg's I2Cdevlib library. When I get up I will modify that code to read the bus as per the new method recommended by Chuck and see what happens. I hadn't fully understood that there were some modifications to make to how the bus should be read to avoid the timeouts. Sometimes it runs for a while, other times it doesn't, so it looks like I have been lucky as to when the FreeRTOS scheduler switches tasks while I am reading the bus. With my sensor I'm reading the bus every 10ms via interrupts.

@ifrew

With my sensor I'm reading the bus every 10ms via interrupts.

Are you doing a Wire() call from an interrupt?
This library (Wire()) is not re-entrant. It assumes only one thread is accessing it. It does not serialize access.

Chuck.

@ifrew This debug dump shows the I2C bus is HUNG. Some type of reset is necessary. Wire.reset() will clear the I2C hardware of the ESP32, but it may not reset any slave devices. As @lonerzzz commented, an enable/reset/power cycle will be required. If the bus is hung because SDA is low, I have also used another pin to stimulate SCL. Connect a signal diode (1N914, 1N4148) from another digital pin to SCL, cathode on SCL. Do a sequence of 9 digitalWrite(pin,LOW); digitalWrite(pin,HIGH); this should cause any slave to release SDA. If SCL is LOW (problem), some type of hardware reset is required. The ESP32 can only release its drive of the bus, it cannot force another device to release it.

Chuck.
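
The nine-pulse recovery described above can be sketched roughly as follows. This is only an illustration, not code from the library: `pulseSclUntilSdaHigh` is a hypothetical helper, and the three callables stand in for `digitalWrite()` on the diode pin, `digitalRead()` on SDA, and a short `delayMicroseconds()` call. It stops early once SDA reads high, on the assumption that the slave has then let go of the line.

```cpp
#include <functional>

// Hypothetical helper (not part of Wire): pulse SCL up to maxPulses times and
// stop as soon as SDA reads high.
// Returns the number of pulses issued, 0 if SDA was already high,
// or -1 if SDA was still low after maxPulses.
int pulseSclUntilSdaHigh(const std::function<void(bool)>& writeScl,
                         const std::function<bool()>& readSda,
                         const std::function<void()>& halfPeriodDelay,
                         int maxPulses = 9) {
    if (readSda()) return 0;          // nothing is holding SDA low
    for (int i = 1; i <= maxPulses; ++i) {
        writeScl(false);              // drive SCL low through the diode
        halfPeriodDelay();
        writeScl(true);               // let the pull-up raise SCL again
        halfPeriodDelay();
        if (readSda()) return i;      // slave released SDA
    }
    return -1;                        // still stuck: hardware reset needed
}
```

On real hardware the callables would be small lambdas around the Arduino pin functions; a return value of -1 corresponds to the case where only a power cycle or reset pin will help.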

commented

Great info Chuck. In my ISR I only set a Boolean to signify data can then be read and return. Main loop checks the Boolean. Should be good there. I have a different mpu/ms5611 board I can try but I think that the underlying problem lies with the mpu6050 not adhering to the I2C protocol. The other mpu board I have with ms5611 on board is a 9250. This may be a better board to use but its just a 6050 with a magnetometer. For reference, this discussion seems to match what you found here. Looks like SDA is latched low and only a reset like you discussed is a fix. jrowberg/i2cdevlib#252
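
For anyone following along, the flag pattern described above (the ISR only sets a Boolean, the main loop does the bus read) might look something like this minimal sketch. The names `dataReady`, `onDataReadyIsr` and `pollAndRead` are made up for illustration, and `readSensor` stands in for whatever Wire transaction the application performs.

```cpp
#include <atomic>

// Set by the interrupt handler, consumed by the main loop.
std::atomic<bool> dataReady{false};

// Interrupt handler: no bus access here, it only raises the flag.
void onDataReadyIsr() {
    dataReady.store(true);
}

// Called from loop(). Returns true if a read was performed on this pass.
template <typename ReadFn>
bool pollAndRead(ReadFn readSensor) {
    // exchange() clears the flag and reports whether it was set, so an
    // interrupt that fires during the read is picked up on the next pass.
    if (!dataReady.exchange(false)) return false;
    readSensor();   // the only place the bus is touched
    return true;
}
```

On the ESP32 the handler would normally also carry the `IRAM_ATTR` attribute and be registered with `attachInterrupt()`; those details are left out here so the sketch stays hardware independent.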

commented

An update for you guys. Before I changed any code, I swapped out the mpu6050/ms5611 board for a mpu9250/ms5611 board and got the same issues as above. So that at least was good to know. Swapped back to the mpu6050/ms5611 board and verified that I still got the same errors.

I then noticed from Chuck's analysis that the device getting the bus busy errors was at address 0x77. That is the MS5611 pressure sensor. So I thought I would at least clean up the code that reads the sample so that it follows the way Chuck said it should be read. Before, it used the traditional method. So now it is as follows:

uint32_t MS5611::ReadSample(void) {
  //Serial.println("RS");
  Wire.beginTransmission(MS5611_I2C_ADDRESS); // Initialize the Tx buffer
  Wire.write(0x00); // Put ADC read command in Tx buffer
  uint8_t err = Wire.endTransmission(false); // keep connection alive
  if (err == 7) {
    uint8_t count = Wire.requestFrom(MS5611_I2C_ADDRESS, 3);
    if (Wire.lastError() != 0) {
      Serial.printf("Bad Stuff!!\nRead of (%d) bytes read %d bytes\nFailed"
                    " lastError=%d, text=%s\n", 3, count, Wire.lastError(),
                    Wire.getErrorText(Wire.lastError()));
    }
    int inx = 0;
    uint8_t buf[3];
    while (Wire.available()) {
      buf[inx++] = Wire.read();
    }
    uint32_t w = (((uint32_t)buf[0]) << 16) | (((uint32_t)buf[1]) << 8) | (uint32_t)buf[2];
    return w;
  }
  else return 0;
}

Compiled and it's now running with no errors yet, as opposed to the errors I got last time. Not getting any 0 values returned yet either. I am expecting that since I am now waiting for the queued read to be available, timing-wise the bus becomes free at some point and all is working. Will go out for lunch and drive around with my device running for a while and see what I get. Will report back later.

commented

Ok.. spoke too soon! Bus hang again. Ok.. So now confirmed that the mpu6050 is keeping the bus busy. Will need to do the hardware reset trick you had thought of above.

@ifrew Sounds like you have identified the problem. Now to construct a reliable recovery method. Good Luck.

Chuck.

commented

@stickbreaker I had a thought over the last day as to why Phil's code is running fine while I was getting errors, so I got him to send me his 4 files from your forked build, i.e. the Wire and esp32-hal-i2c files. I put them in and voila, mine is working fine now. I did a comparison and I can see quite a few changes. The Wire.cpp file seems overall to be the same with the exception of some field size changes. However, there are quite a few changes in the esp32-hal-i2c file that I think are causing issues, i.e. a regression from what Phil and I have working now. Looking through your code, and I am not an expert by any means in what you have written, I see that the gross timeout is handled differently. In your latest code I am getting those, then busy timeouts, then bus hangs. In the older code it works fine. I am attaching the four files so you can compare and see if you can think why the earlier version works fine and the latest doesn't. Many thanks for all of this BTW. Phil and I would be stuck otherwise.

esp32-hal-i2c.c.zip

@ifrew I'll study the differences, Can you explain the differences between what you are seeing as compared to Phil?

Chuck.

commented

With the code from your earlier fork, it is running fine now for both Phil and me. Phil hasn't tried the latest fork from you, so he isn't seeing any errors now. Both of us are using the exact same hardware and configuration. I would expect that if Phil used the latest 4 files he would get the same issue. In the last dump I sent above, which you said was a bus hang, the only part of the trace missing, I noticed now, was the initial gross timeout error before all the busy timeouts occurred. I can repro and send you the initial gross timeout message if that would help you?

@ifrew yea, send me the complete debug output. here or in my mail box ctodd at cableone dot net.

chuck.

commented

Dang.. I think I may have found the issue! I left the code that was working yesterday running overnight and it was still running this morning after 18 hours. I then ran your code that I said failed and it was still running after 40 minutes, longer than it had been. Strange, I thought, so I picked up my device, which is on a breadboard, and for some reason I decided to check the wires.. bang.. the error occurred as attached in the debug output. So I thought that my SDA or SCL line may not have had a good connection. So I reseated them and it's running now. I then pulled out the SCL line and all ran, albeit with incorrect results for my app, but then put it back and it started working again. I then pulled off the SDA line and bang.. the timeout errors appeared.

So it looks like I may have had a flaky SDA connection. When I look at the code difference from your latest versus what you had before, I see that you now let through the gross timeout errors whereas before you didn't.

So now I am going to leave your latest code running and see what happens! How frustrating and bad of me to waste your time if this is the issue. Dang it!

I will post an update later if all is still running.

putty.txt

commented

Still running after 2:45 now...I think that was the issue. Sorry guys.

@ifrew Well, keep testing it. I do need to clean out some of the debug code anyway. I tend to contemplate a problem before I start chasing it, so I had not written any new code yet. I'll look through your debug capture and think about how to use the events to detect/report SDA issues more succinctly. Right now no one trusts the I2C bus so they don't trust the Error messages to be accurate. I am hoping this revision will gain the trust back.

Chuck.

commented

Well @stickbreaker it bombed out again. I don't know how long it had been running, but less than 12 hours with your latest code. So the old code seems to work better. I looked at it thoroughly today and there are really few changes that I could see, but there are a couple of places in the hal code that may make a difference. I hadn't touched my setup at all after I made sure the connections were right, but another gross timeout occurred. I should state I'm using your default of 50ms. However, since Phil and I had it running longer with your old version than with the latest version, there is something in there that seems strange. I have a 201MB log file output, but it is still the same gross timeout that occurred first.

[E][esp32-hal-i2c.c:1159] i2cProcQueue(): Gross Timeout Dead start=0xd67ece, end=0xd67f00, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5180
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff5218 bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff5248
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3fff57e8
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=2
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=1
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=1
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: . 00
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [1] ef R STOP buf@=0x3ffc6e90, len=3, pos=0, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: ... 01 18 ff
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0003 0x0000 0x00d67ece
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x00d67ece
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [03] 0x0003 0x0040 0x0000 0x0000 0x00d67ece
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [04] 0x0001 0x0002 0x0000 0x0000 0x00d67ece
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd67f81, end=0xd67fb3, =50, max=50 error=1
[E][esp32-hal-i2c.c:609] i2cDumpI2c(): i2c=0x3ffc125c
[E][esp32-hal-i2c.c:610] i2cDumpI2c(): dev=0x60013000 date=0x16042000
[E][esp32-hal-i2c.c:611] i2cDumpI2c(): lock=0x3fff5180
[E][esp32-hal-i2c.c:612] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:613] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:614] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:615] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:616] i2cDumpI2c(): event=0x3fff5218 bits=0
[E][esp32-hal-i2c.c:617] i2cDumpI2c(): intr_handle=0x3fff5248
[E][esp32-hal-i2c.c:618] i2cDumpI2c(): dq=0x3fff5a4c
[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W STOP buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: X 58
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00d67f81
[E][esp32-hal-i2c.c:1149] i2cProcQueue(): Busy Timeout start=0xd6800d, end=0xd6803f, =50, max=50 error=1

@ifrew

[E][esp32-hal-i2c.c:619] i2cDumpI2c(): queueCount=2
[E][esp32-hal-i2c.c:620] i2cDumpI2c(): queuePos=1
[E][esp32-hal-i2c.c:621] i2cDumpI2c(): byteCnt=1
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [0] ee W buf@=0x3ffc6f16, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: . 00
[E][esp32-hal-i2c.c:582] i2cDumpDqData(): [1] ef R STOP buf@=0x3ffc6e90, len=3, pos=0, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:598] i2cDumpDqData(): 0x0000: ... 01 18 ff
[E][esp32-hal-i2c.c:942] i2cDumpInts(): row count INTR TX RX
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [01] 0x0001 0x0002 0x0003 0x0000 0x00d67ece
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x00d67ece
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [03] 0x0003 0x0040 0x0000 0x0000 0x00d67ece
[E][esp32-hal-i2c.c:945] i2cDumpInts(): [04] 0x0001 0x0002 0x0000 0x0000 0x00d67ece

Based on this debug, the Wire.requestFrom(0x77,3) is where the problem happened. The StateMachine (SM) sent out the

Wire.beginTransmission(0x77);
Wire.write(0);
Wire.endTransmission(false);

When it started the requestFrom() the slave acknowledged the read command, then started stretching SCL. My library worked correctly. The slave device / I2C bus is causing the problem. There is nothing I can do from software to make a slave release the SCL signal.

Other people have had I2C issues that were resolved by using the correct pullup resistance. One person went from 10k pullups to 2.4k and their problems disappeared. If you are using 3.3V I2C the pullups should be between 1k and 3.3k; for 5V, 1.8k to 4.7k. If you have the bus speed at 100kHz, use the higher values. As the bus speed increases more power is needed, so the pullup resistor values need to be reduced.
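
As a rough cross-check of those ranges, the usual bounds from the I2C-bus specification can be computed directly. This is a sketch under stated assumptions: a 0.4V maximum low level, a 3mA sink current, and rise-time limits of 1000ns at 100kHz and 300ns at 400kHz. The bus capacitance is something you have to estimate for your own wiring; the function names are made up for illustration.

```cpp
// Lower bound: the pull-up must not force a device to sink more than
// iolAmps while it holds the line at or below volMax.
double pullupMinOhms(double vdd, double volMax = 0.4, double iolAmps = 0.003) {
    return (vdd - volMax) / iolAmps;
}

// Upper bound: an RC rise from 30% to 70% of VDD takes ln(0.7/0.3) * R * C,
// about 0.8473 * R * C, and that must fit within the allowed rise time.
double pullupMaxOhms(double riseTimeSec, double busCapFarads) {
    return riseTimeSec / (0.8473 * busCapFarads);
}
```

With these assumptions, `pullupMinOhms(3.3)` comes out just under 1k, and for an assumed 100pF bus `pullupMaxOhms(300e-9, 100e-12)` is about 3.5k, which is in line with the 1k to 3.3k rule of thumb for 3.3V. A longer or more heavily loaded bus has more capacitance and pushes the upper bound down further.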

The default timeout for the library is 50ms you can increase it with Wire.setTimeout(uint16_t millisec);
The library can handle this SCL stretching, but the timeout value needs to be longer than the SCL stretching period. @lonerzzz has to use a 2000ms timeout because one of his devices can stretch SCL that long. If my library issues an I2C_ERROR_TIMEOUT your application will have to do some error recovery. My library will just shut down the I2C hardware and return.

The SM may not handle the timeout caused by an SCL stretching event correctly (I haven't been able to test it). @lonerzzz has a device that does this SCL stretching. After an SCL stretching event, his requestFrom() fails to transfer the data, but the SM closes the transaction correctly with a STOP, so his next call succeeds. The library returns I2C_ERROR_TIMEOUT, and he uses this value to re-issue the requestFrom(). If he receives I2C_ERROR_BUSY he issues Wire.reset(); to recover the bus.

If the specified timeout expires, the bus will be left in an ERROR state. This shutdown may be the start point of the error cascade. The SM is the 'current' master and it is in a transaction (START has been issued), but the specified timeout (50ms) has expired. This expired timeout causes the abort (shutting down the SM). Since the bus is currently busy (SCL held low), the SM cannot close the transaction with a STOP (SDA going HIGH while SCL is HIGH). The SM knows the 'current' bus state is busy, because it saw a START (the one it sent out) but has not seen a STOP (the abort canceled the transaction, so it will NEVER SEND a STOP). From this point forward the SM will be hung waiting for someone to issue a STOP. It will wait forever.

This busy condition will inhibit the SM from using the I2C bus until it sees the bus clear (STOP detected, SCL HIGH, SDA HIGH).

The application needs to issue a Wire.reset() if warranted.
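
The recovery policy described above (re-issue the read after a timeout, reset the bus after a busy error) could be wrapped roughly like this. It is a sketch only: `BusResult` is a local stand-in for the library's error codes and does not use their real numeric values, `readWithRecovery` is a hypothetical name, and the two callables stand in for the `requestFrom()` sequence and `Wire.reset()`.

```cpp
#include <functional>

// Local stand-in for the library's error codes (assumption: only the
// distinction between these cases matters for the policy).
enum class BusResult { Ok, Timeout, Busy, Other };

// Retry a read up to maxAttempts times. A Timeout is simply retried,
// a Busy result triggers a bus reset before the retry, and any other
// failure is returned to the caller immediately.
BusResult readWithRecovery(const std::function<BusResult()>& doRead,
                           const std::function<void()>& resetBus,
                           int maxAttempts = 3) {
    BusResult r = BusResult::Other;
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        r = doRead();
        if (r == BusResult::Ok) return r;
        if (r == BusResult::Busy) {
            resetBus();               // Wire.reset() on the real bus
        } else if (r != BusResult::Timeout) {
            return r;                 // unknown failure: let the caller decide
        }
    }
    return r;                         // last failure after all attempts
}
```

Bounding the number of attempts matters here: if a slave is physically holding a line low, no amount of retrying or resetting the ESP32 side will clear it, and the caller needs to find out.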

Bus Busy is a valid condition in a multimaster environment. If another master is using the bus, the SM will wait up to the timeoutPeriod to execute its commands. In your case, it sounds like the bus is hung by something else, either a slave device stretching SCL, or a noise spike on SDA that is interpreted as a START. If the SDA line goes low while SCL is high, the SM will interpret that event as a START by another master. It will listen on the bus for a STOP (SDA going HIGH while SCL is already HIGH). Until this STOP is heard the SM will not attempt to use the bus.

What is the device at 0x77? I would like to read its datasheet to try to understand what it is doing.
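
To make the START/STOP rule concrete, here is a small model of how a listener decides that the bus is busy. It is not the ESP32 hardware logic, just an illustration with a made-up class name (`BusBusyTracker`) that is fed successive samples of the two lines, where true means high.

```cpp
// Tracks whether an I2C bus looks busy from successive (SCL, SDA) samples.
class BusBusyTracker {
public:
    // Feed one sample; returns the busy state after processing it.
    bool sample(bool scl, bool sda) {
        if (scl && prevScl_) {                    // SCL high across both samples
            if (prevSda_ && !sda) busy_ = true;   // SDA fell: START
            if (!prevSda_ && sda) busy_ = false;  // SDA rose: STOP
        }
        prevScl_ = scl;
        prevSda_ = sda;
        return busy_;
    }
    bool busy() const { return busy_; }

private:
    bool prevScl_ = true;   // idle bus: both lines pulled high
    bool prevSda_ = true;
    bool busy_ = false;
};
```

A single glitch that pulls SDA low while SCL is high is enough to flip the state to busy, and nothing but a matching STOP pattern flips it back, which is the hang described above.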

Chuck.

commented

Always great to read your detailed responses, as I learn something new all the time! Many thanks. Currently I do not have pullups on the SDA/SCL lines. I haven't checked, but I am assuming the Wire library implementation enables the internal pullups using the I2C hal code? I was going to check that today and add external pullups, so your response comes at the right time. The device we are using is a GY-86 breakout board. This contains an MS5611 pressure sensor at 0x77 and an MPU6050 at 0x68 on the same bus. Based on the info you gave and other threads I have read, the MPU6050 seems not to strictly adhere to the I2C protocol and can latch SDA low. I expect this is what is happening, and when I then try to access the MS5611 I get the bus busy errors.

So sounds like the earlier code with it running for 18hrs once may be just a fluke and at some point it may fail?

@ifrew The internal pullups are in the range of 30k to 50k. Not enough.

The newer code implements bus busy return codes. Older code had a longer configured timeout.

Chuck.

@ifrew Reading through the MS5611 spec sheet it does NOT say to use a ReSTART when you select the register for read

Wire.beginTransmission(0x77);
Wire.write(0x48); // initiate pressure conversion
Wire.endTransmission();
// conversion can take up to 9ms
uint32_t timeout=millis(); // start time of the ready polling
bool ready=false;
uint32_t reading=0;
while(!ready && (millis()-timeout<10)){ // poll for at most 10ms
  Wire.beginTransmission(0x77);
  Wire.write(0);
  ready = (Wire.endTransmission()==0); // ACK means the device is listening
  }
if(ready){
  Wire.requestFrom(0x77,3);
  if(Wire.lastError()==0){
    while(Wire.available()){
      reading = reading << 8;
      reading = reading + Wire.read();
      }
    }
  else {
    Serial.printf("Read of pressure sensor failed (%u)=%s\n",Wire.lastError(),Wire.getErrorText(Wire.lastError()));
    }
  }
else {
  Serial.printf("Sensor did not finish conversion within 10ms\n");
}

Chuck.

@ifrew Are you actually using a gy-86? all of the schematics I can find show a module designed to interface to a 5v CPU.

They have level shifters and a 3.3v regulator onboard? A 4.7k pullup to 5v and a 4.7k pullup to 3.3v, equivalent to 2.8k to 5v?

Chuck.

commented

Yes using the gy-86. Voltage input can be from 3.3v to 5v. It has an onboard regulator and pull up resistors seemingly but i don't know the values of those. I can't find schematics either. Do i need to disable internal pullups before adding externals in range 1k to 3.3k? I have been using this board with the same drivers running on esp8266 and no pullups without issues for many months. On the esp32, due to multiple cores and task switching as you pointed out we can get timeouts.

The code I am using to read the sensor is

uint32_t MS5611::ReadSample(void) {
  Wire.beginTransmission(MS5611_I2C_ADDRESS);  // Initialize the Tx buffer
  Wire.write(0x00);                            // Put ADC read command in Tx buffer
  Wire.endTransmission(false);                 // Send the Tx buffer, but send a restart to keep connection alive
  Wire.requestFrom(MS5611_I2C_ADDRESS, 3);     // Read three bytes from slave PROM address
  int inx = 0;
  uint8_t buf[3];
  while (Wire.available()) {
    buf[inx++] = Wire.read();
  }
  uint32_t w = (((uint32_t)buf[0])<<16) | (((uint32_t)buf[1])<<8) | (uint32_t)buf[2];
  return w;
}

I got this from another project on GitHub.

@ifrew If it is similar to the schematic I found, it already has a low enough resistance, don't add more. The built in pullups are inconsequential.

That code I posted was based on my reading of the datasheet. In the SPI section it shows an 8.02ms delay while the conversion is in progress; the SDO line is used as a 'READY' signal. On page 3 is a table of conversion times. Different resolutions have different times.
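
The relationship between resolution and conversion time can be captured in a small lookup. The command bytes and worst-case times below are written from memory of the MS5611-01BA03 datasheet and should be checked against the table in your copy before relying on them; `Ms5611Conversion` and `ms5611Conversion` are hypothetical names used only for this sketch.

```cpp
#include <cstdint>

struct Ms5611Conversion {
    uint8_t command;      // byte to write to start the conversion
    uint16_t maxTimeUs;   // worst-case conversion time in microseconds
};

// osr must be one of 256, 512, 1024, 2048, 4096. temperature selects the
// D2 (temperature) conversion instead of D1 (pressure).
// Returns {0, 0} for an unsupported osr.
Ms5611Conversion ms5611Conversion(uint16_t osr, bool temperature) {
    static const uint16_t osrValues[] = {256, 512, 1024, 2048, 4096};
    static const uint16_t maxUs[]     = {600, 1170, 2280, 4540, 9040};
    for (int i = 0; i < 5; ++i) {
        if (osrValues[i] == osr) {
            const uint8_t base = temperature ? 0x50 : 0x40;
            return { static_cast<uint8_t>(base + 2 * i), maxUs[i] };
        }
    }
    return {0, 0};
}
```

Under these assumed values the 0x48 command used in this thread is a pressure conversion at the highest resolution, which is why the wait is on the order of 9 to 10ms; a lower resolution would allow a much shorter wait.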

The code you just posted is only reading the last conversion. it is not initiating a sample sequence.

Try building a simple sketch using this code:

uint32_t startTime=millis();
Serial.printf("Starting conversion Cycle at %u\n",startTime);
Wire.beginTransmission(0x77);
Wire.write(0x48); // initiate pressure conversion
Wire.endTransmission();
if(Wire.lastError()!=0){
  Serial.printf("Conversion command Failed =%d\n",Wire.lastError());
  }
else {
  // conversion can take up to 9ms
  // test conversion complete using NAK polling, may not work with this device?
  uint32_t timeout=millis(); // start time of the ready polling
  bool ready=false;
  uint32_t reading=0;
  while(!ready && (millis()-timeout<10)){ // poll for at most 10ms
    Wire.beginTransmission(0x77);
    Wire.write(0);
    ready = (Wire.endTransmission()==0);
    if(!ready) Serial.print('.');
    else Serial.println('+');
    }
/* // if NAK polling doesn't work, then use a fixed timeout
  while(millis()-startTime<10); // run in circles shouting 'Please Work'
  ready=true;
*/
  // if device ACK'd then it is listening. So ask for data.
  if(ready){
    Wire.requestFrom(0x77,3);
    if(Wire.lastError()==0){
      while(Wire.available()){
        reading = reading << 8;
        reading = reading + Wire.read();
        }
      uint32_t endTime=millis();
      Serial.printf(" completed conversion cycle at %u, duration =%u, value =%u\n",endTime,(endTime-startTime),reading);
      }
    else {
      Serial.printf("Read of pressure sensor failed (%u)=%s\n",Wire.lastError(),Wire.getErrorText(Wire.lastError()));
      }
    }
  else {
    Serial.printf("Sensor did not finish conversion within 10ms\n");
  }
}

this should not generate any timeouts.

The datasheet does not explicitly state that it will honor NAK busy polling. Most I2C devices will.
The standard is to do a Write sequence to test the NAK (ready) state. I have never used READ(requestFrom) to test a ready state.

Chuck.

commented

Thanks for that Chuck. I had only shown the reading part of the sensor code to show the endTransmission(false) part. The full code does trigger pressure and temperature samples alternately with 10ms delays in between so it can read the data properly. However, it doesn't do any ready checking like you have above. I get spurious readings every so often and it may be due to that. I will look to modify accordingly. Thanks for all your input on this. Much appreciated!

Hi,
I use a BME280E connected with I2C on a Wemos Lolin 32 and I am regularly losing the connection with the BME module.
Currently the only solution I have is to reboot the Wemos. Is there a final solution for this issue or should I better connect it via SPI.

Thanks
Robert

@rrobinet are you using my fork stickbreaker/arduino-esp32 or the official fork (this one)?

If you are not using my fork, try it. My fork is a revision to implement I2C reliably.

Chuck.

Thanks Chuck,
I am loading it now and will see if it is more stable
Robert

@stickbreaker
Hi Chuck,
Any idea when this version will be incorporated in the official set?
Robert