Forums Niryo One Troubleshooting Burst CAN connection problem with motor

Viewing 12 posts - 1 through 12 (of 12 total)
  • Author
    Posts
  • FindingNiryo
    Participant
    Post count: 97

    Every few hours V2 rpi_setup.launch logs [ERROR] CAN connection problem with motor 1,2, or 3 (plus 4 on V1).

    A string of these errors occur within a few seconds for a single motor triggering CanComm : Stop hardware control loop followed by DXL : Start hardware control loop, Set calibration flag, CAN : start hardware control loop.

    This occurs while robot is powered on but otherwise idle and requires arm to be recalibrated.

    Edouard Renard
    Keymaster
    Post count: 230

    Yes, this can happen sometimes if the CAN bus starts to be too busy, but we have managed to run several robots (idle or not, it does not matter for this test) without this kind of issue for several hours.

    This scenario should be really rare though. Chances are increasing if the cable connection is not perfect. Maybe check if the connectors are correctly fixed, I guess if you reinforce some parts of the connector that will reduce the frequency at which you get this error (and maybe you’ll never get the error again).

    FindingNiryo
    Participant
    Post count: 97

    Isn’t all traffic on the CAN bus operating at fixed update frequencies?

    I would not expect Idle robots without torque engaged to have much variation in CAN bus traffic.

    Since each NiryoStepper motor has failed on both V1 & V2 robots it does not seem likely to be a connector issue.

    Is NiryoStepper implementing random back-off on bus collision detect?

    Edouard Renard
    Keymaster
    Post count: 230

    Here are more info to let you understand how things work on Niryo One:

    – The problem with the CAN bus is that it’s a multi-master bus. So, the RPi will send commands to each motor, but also each motor will send some data (at a pre-defined frequency, and each node of the CAN bus has a unique ID). Thus, we can’t really control when exactly data is sent/received, and it’s possible that suddenly the CAN bus is saturated and thus will preempt the frames from lower priority nodes (the ones with bigger IDs).

    – Second problem is that the MCP2515 we’re using has only 2 buffers for incoming messages. So the RPi will receive data from 3 motors (4 for V1), and the MCP2515 will have to handle all that with 2 buffers, which is quite limited compared to other components that can allow more than 6 buffers.

    Conclusion: CAN is a great communication bus, but maybe not the best type of bus when working on a master/slave system (RPi master + motor slaves). Why we chose to use it ? Well because it was a much better solution than what we tried before (I2C or SPI daisy-chain), and still affordable to setup + not too complicated with the components that are available for us. (same reasoning for MCP2515)

    Hope this gives you a better understanding. And as you can see, using or not using torque is not a variable is this problem, it’s mostly about the bus itself. That’s why I emphasized the physical connector quality, which is the only thing that you can play with to get better results. (and from the tests that we do here, we almost never have a problem with the CAN bus for new robots).

    FindingNiryo
    Participant
    Post count: 97

    I continue to see idle robot arms spontaneously restart the hardware control loop (requiring recalibration).
    I understand the factors you described in the working of the CAN system, but this failure mode is quite disruptive and there are many much more demanding applications which make use of the CAN bus without such faults.
    Could you please look into the bus traffic and message handling queues of Niryo_One & Niryo_Stepper to confirm interrupts and bus collision protocols are properly implemented?

    Edouard Renard
    Keymaster
    Post count: 230

    Finally I guess that it’s more of a cable issue than a saturated CAN bus. We have continuously improved the cables for Niryo One, and the tested robots we ship now have zero connection problems when being idle. (even if we still see some CAN frames dropped, but not enough to trigger the connection problem)

    If this problem persists, please contact our technical support and we’ll see what we can do for you!

    FindingNiryo
    Participant
    Post count: 97

    There are no cable issues. The robot at rest spontaneously triggers a NiryoStepper bus reset and requires re-calibration. I originally asked if the NiryoStepper firmware incorporated random back-off to accommodate occasional bus collisions. Is this an area of research for your team?

    Edouard Renard
    Keymaster
    Post count: 230

    With all the tests we’ve made on recent robots, I can almost guarantee you that the issue comes from cables, or more probably the connectors on motor boards.

    Even if the robot is idle, a bad quality connector can mess up with communication. This is the case in some of the first robots we shipped. As for now, this problem doesn’t happen anymore (same software, same electronic boards, but improved connectors and cables).

    Please contact our technical support and we’ll help you solve this! 🙂

    FindingNiryo
    Participant
    Post count: 97

    I have reached out to technical support but not received a response.

    thomas
    Keymaster
    Post count: 3

    Hi ,

    we’ve just replied to your request by email .

    Sincerelly

    FindingNiryo
    Participant
    Post count: 97

    Support, thank you for your assistance in trying to resolve this problem.
    After receiving the reconditioned arm and upgrading the firmware on stepper 1 to version 2.0 I booted the robot and established wifi connectivity (switched from hotspot mode). I performed an automatic calibration and let the robot sit. Within 1 hour there was one CAN bus error. within seven hours the CAN bus errors triggered a “calibration required” condition.
    Here is the log:
    [2019-05-16 14:59:21] Writing calibration offsets to file : 3:558 2:987 1:596
    [2019-05-16 14:59:21] Resume can hw control
    [2019-05-16 14:59:21] CAN : Start hardware control loop
    [2019-05-16 15:51:50] CAN connection problem with motor 1, hw fail counter : 0
    [2019-05-16 15:51:50] CAN connection problem with motor 1, hw fail counter : 1
    [2019-05-16 15:51:50] CAN connection problem with motor 2, hw fail counter : 0
    [2019-05-16 15:51:51] CAN connection problem with motor 2, hw fail counter : 1
    [2019-05-16 15:51:51] CAN connection problem with motor 2, hw fail counter : 0
    [2019-05-16 21:39:11] CAN connection problem with motor 1, hw fail counter : 0
    [2019-05-16 21:39:11] CAN connection problem with motor 3, hw fail counter : 0
    [2019-05-16 21:39:12] CAN connection problem with motor 3, hw fail counter : 1
    [2019-05-16 21:39:12] CAN connection problem with motor 3, hw fail counter : 2
    [2019-05-16 21:39:12] CAN connection problem with motor 3, hw fail counter : 3
    [2019-05-16 21:39:12] DXL : Start hardware control loop
    [2019-05-16 21:39:13] Stop Can hw control
    [2019-05-16 21:39:13] CanComm : Stop hardware control loop
    [2019-05-16 21:39:13] Set calibration flag
    [2019-05-16 21:39:13] CAN : Start hardware control loop
    [2019-05-16 21:39:13] DXL : Start hardware control loop

    FindingNiryo
    Participant
    Post count: 97

    Looking at NiryoStepper code shows Initialize MCP2515 running at 16MHz with a baudrate of 1000kb/s
    each stepper is transmitting 800 bits per second position data, 32 bits per second diagnostic data, and 12.8 bits per second firmware version.

    niryo_one_driver/src/hw_comm/can_communication.cpp
    hw_write_frequency; // 200 Hz (reset to 50 Hz by ROS param) sends for each motor:
    synchronizeSteppers() 1600 bits/sec
    sendPositionCommand() 3200 bits/sec
    sendMicroStepsCommand() 1600 bits/sec
    sendMaxEffortCommand() 1600 bits/sec

    Total bus traffic 8844.8 bits/second per (3) stepper in NiryoOne V2 = 26.5344 Kbits (2.6% of bus bandwidth)
    There should be no significant collisions on the CAN bus with this little amount of traffic.

    If “CAN connection problem” is due to hardware errors from connectors perhaps lower the CAN bus speed to CAN_200KBPS
    I also suggest reading and reporting MCP_CAN::errorCountRX() and MCP_CAN::errorCountTX()

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic.