Thingy91: pynrfjprog deadlocks at 100% cpu usage

I'm using the script from https://devzone.nordicsemi.com/f/nordic-q-a/53208/updating-nrf9160-modem-firmware-through-the-command-line/215357#215357

Under strace, it never forks any other process and doesn't create the file it's looking for, so I'm really not sure what it's expecting to happen.

Full strace at http://triffid-hunter.no-ip.info/pynrf-flash.strace since this forum thing keeps throwing errors when I try to attach it.

https://devzone.nordicsemi.com/f/nordic-q-a/62397/nrfjprog-bug-gets-stuck-due-to-temporary-directory-permissions seems related. However, my permissions are fine, and the directory tree in /tmp is recreated identically if I wipe it first.

Since this is apparently the only way to update the NRF91 modem firmware from a terminal, and all the examples seem to be complaining about out-of-date modem firmware, I'm once again stuck.

  • Hi,

     

    I have been discussing your issues with a couple of colleagues.

    To sum up the behavior (there might be more, but these are the two we have been able to spot):

    * Cannot access /tmp/nrfjprogdll fully

    * nRF Connect seems to find an older JLink driver (v6.30h). This driver supports Cortex-M33, but not the "nrf9160_xxaa" device profile.

     

    We do have an open issue with the /tmp/nrfjprogdll path that occurs when two users run nrfjprog on the same machine. The second user then runs into problems (unless that user is root, or the users share access rights) because of the permissions on the folder and the files within it.

    A typical symptom is that nrfjprog (and pynrfjprog) shows lines like these in the strace:

    stat("/tmp/nrfjprogdll/highlevel/file12YHdh", 0x7ffd8bf8a230) = -1 ENOENT (No such file or directory)

    You can try to run "sudo chown $MY_USER:$MY_USER /tmp/nrfjprogdll -R" in that case, and see if the issue disappears. We are addressing this specific issue in the upcoming nrfjprog v10.10.0.
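Before running chown, it can help to confirm whether ownership is actually the problem. The following is a minimal sketch, not part of nrfjprog or pynrfjprog; the function name `paths_not_owned_by` is made up for illustration. It lists every path at or under a directory whose owner differs from a given user ID.

```python
import os
from pathlib import Path


def paths_not_owned_by(root, uid):
    """Return every path at or under `root` whose owner is not `uid`.

    Hypothetical helper for illustration; returns an empty list when
    `root` does not exist.
    """
    root = Path(root)
    if not root.exists():
        return []
    mismatched = []
    for path in [root, *root.rglob("*")]:
        # lstat checks a symlink itself rather than the file it points to.
        if path.lstat().st_uid != uid:
            mismatched.append(path)
    return mismatched


if __name__ == "__main__":
    for path in paths_not_owned_by("/tmp/nrfjprogdll", os.getuid()):
        print(f"not owned by current user: {path}")
```

An empty result means every entry already belongs to the current user, in which case chown would not be expected to change anything.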

     

    As for the JLink driver installation, from the trace it looks like you have an older installation in your home folder. Is this correct? If yes, could you try to remove that?

    The default location it looks at is "/opt/SEGGER/JLink" (which is a symlink to /opt/SEGGER/JLink_$(VERSION), on my end: /opt/SEGGER/JLink_V680a/).

    Could you try to install it as described above, i.e. in /opt/SEGGER/JLink?
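To see which installation the /opt/SEGGER/JLink symlink actually points to, something like the sketch below could be used. It assumes the target directory follows the JLink_<version> naming shown above (for example JLink_V680a); the function name `resolve_jlink_install` is hypothetical and this is not how nrfjprog itself resolves the path.

```python
import os
import re


def resolve_jlink_install(link="/opt/SEGGER/JLink"):
    """Resolve the symlink and guess the version from the directory name.

    Returns (real_path, version). The version is None when the resolved
    directory name does not look like JLink_V... (an assumed convention).
    """
    real = os.path.realpath(link)
    match = re.fullmatch(r"JLink_(V\w+)", os.path.basename(real))
    return real, (match.group(1) if match else None)


if __name__ == "__main__":
    print(resolve_jlink_install())
```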

     

    Could you share which specific debugger you are using?

     

    Kind regards,

    Håkon

  • We do have an open issue related to /tmp/nrfjprogdll path, which is related to if two users try to use nrfjprog. This will then be problematic for the second user (unless its root, or the users share rights), due to permissions to the folder and files within the folder.

    A typical issue of this matter is that nrfjprog (and pynrfjprog) shows lines like these in the strace:

    stat("/tmp/nrfjprogdll/highlevel/file12YHdh", 0x7ffd8bf8a230) = -1 ENOENT (No such file or directory)

    I only have one user on this machine, and changing the owner had no effect since I already own everything in that path.

    According to strace, pynrfjprog never attempts to create that file/dir, which is why it doesn't exist:

    13529 openat(AT_FDCWD, "/tmp/nrfjprogdll/highlevel", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
    13529 lstat("/tmp/nrfjprogdll/highlevel", 0x7ffd6c987050) = -1 ENOENT (No such file or directory)
    13529 mkdir("/tmp/nrfjprogdll/highlevel", 0777) = 0
    13529 stat("/tmp/nrfjprogdll/highlevel/file4bYtba", 0x7ffd6c987230) = -1 ENOENT (No such file or directory)
    13529 stat("/tmp/nrfjprogdll/highlevel/file4bYtba.lock", 0x7ffd6c987120) = -1 ENOENT (No such file or directory)
    13529 stat("/tmp/nrfjprogdll/highlevel/file4bYtba.lock", 0x7ffd6c987070) = -1 ENOENT (No such file or directory)
    13529 openat(AT_FDCWD, "/tmp/nrfjprogdll/highlevel/file4bYtba.lock", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
    13529 write(2, "DEBUG:pynrfjprog.HighLevel:HighLevel_dll_open:\tCopy \"/\"->\"/tmp/nrfjprogdll/highlevel/file4bYtba\"\n", 97) = 97
    13529 stat("/tmp/nrfjprogdll/highlevel/file4bYtba", 0x7ffd6c986e90) = -1 ENOENT (No such file or directory)
    13529 stat("/tmp/nrfjprogdll/highlevel/file4bYtba", 0x7ffd6c987230) = -1 ENOENT (No such file or directory)
    13529 stat("/tmp/nrfjprogdll/highlevel/file4bYtba", 0x7ffd6c987230) = -1 ENOENT (No such file or directory)
    13529 stat("/tmp/nrfjprogdll/highlevel/file4bYtba", 0x7ffd6c987230) = -1 ENOENT (No such file or directory)

    This is after I deleted /tmp/nrfjprogdll. As you can see, it recreates /tmp/nrfjprogdll/highlevel and creates file4bYtba.lock, but never tries to create file4bYtba itself. It then spins in an endless loop waiting for that file to exist.

    Note the lack of any -1 EACCES (Permission denied) responses, which would show up if there was a permission issue.
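The same check can be done mechanically on a saved trace. This is a rough sketch that only understands the simple line shape shown above (a syscall whose first quoted argument is a path, ending in "= -1 ERRNO (...)"); the function name `summarize_failures` is made up for illustration.

```python
import re
from collections import Counter

# Errno name in lines ending like: ... = -1 ENOENT (No such file or directory)
ERRNO_RE = re.compile(r"=\s+-1\s+([A-Z0-9]+)\s")
# First quoted path argument of a syscall, with an optional AT_FDCWD before it.
PATH_RE = re.compile(r'\w+\((?:AT_FDCWD, )?"([^"]+)"')


def summarize_failures(lines):
    """Count failed syscalls per errno, and ENOENT failures per path."""
    errnos = Counter()
    enoent_paths = Counter()
    for line in lines:
        err = ERRNO_RE.search(line)
        if not err:
            continue  # the call succeeded or the line is not a syscall
        errnos[err.group(1)] += 1
        path = PATH_RE.search(line)
        if err.group(1) == "ENOENT" and path:
            enoent_paths[path.group(1)] += 1
    return errnos, enoent_paths
```

A path with a very high ENOENT count and no matching EACCES entries points to a polling loop on a file that is never created, rather than to a permissions problem.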

    What exactly is supposed to create that file? It seems that that thing is failing, but pynrfjprog fails to capture any error.
    It says it's trying to copy something ("/" apparently..?), but strace can't see it actually try to achieve this.

    For your multi-user permissions issue, you could use mktemp -d or a per-user path such as /tmp/nrfjprogdll-<username>, but that won't help in my case since I don't (seem to) have a permissions problem.
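For the Python side, the equivalent of mktemp -d is tempfile.mkdtemp. The sketch below shows the idea only; the prefix and the function name `make_private_workdir` are assumptions for illustration, and nothing here reflects how nrfjprog actually chooses its directory.

```python
import os
import tempfile


def make_private_workdir(prefix="nrfjprogdll-"):
    """Create a fresh per-user working directory and return its path.

    mkdtemp appends a random suffix and creates the directory with mode
    0o700, so two users can never collide on the same path.
    """
    return tempfile.mkdtemp(prefix=f"{prefix}{os.getuid()}-")
```

The caller is responsible for removing the directory afterwards, for example with shutil.rmtree.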

  • Hi,

     

    This is the strace (here for reference: pynrfjprog.7z) from my side when doing a modem fw upgrade via pynrfjprog:

    stat("/opt/SEGGER/JLink/libjlinkarm.so.6.80.1", {st_mode=S_IFREG|0755, st_size=18584272, ...}) = 0
    openat(AT_FDCWD, "/tmp/nrfjprogdll/highlevel", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3

     

    For comparison, this is the same procedure in your strace:

    stat("/", {st_mode=S_IFDIR|0755, st_size=178, ...}) = 0
    openat(AT_FDCWD, "/tmp/nrfjprogdll/highlevel", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3

     

    This looks like a weakness in our libraries (or the search algorithm).

    I believe that pynrfjprog isn't able to successfully find your segger driver path. Could you please place it in /opt/SEGGER/JLink to see if it is able to detect it properly then?

     

    What distro are you running?

    Which debugger are you currently using?

     

    Kind regards,

    Håkon

  • I made /opt/SEGGER/JLink a symlink to my JLink_V680a folder, and NRF Connect programmer is rather happier about it (I'm not even sure how it was finding the older version before, it never asked me where to look), but pynrfjprog doesn't seem to have improved.

    I've tried with both a regular JLink (possibly a knockoff) as well as an official Nordic NRF9160-DK delivered by your FAE today.

    I'm using Gentoo because it's the only distribution I've tried that doesn't fight me when I tell it what I want.

    It certainly does seem like your software needs improvements in its error handling, a simple "The JLink library at <path> is too old, please update" or "The JLink library can't be found in <searchpath>" would have saved a ton of drama - with the specific path and the issue with them presented in this manner, it would take but a moment to drop a symlink there.

    PS: why aren't you using environ[LD_LIBRARY_PATH] for this? It exists specifically for this exact purpose, and I already have my JLink library path listed in there so nrfjprog can find it.
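To make the suggestion concrete, here is a minimal sketch of a lookup that honours LD_LIBRARY_PATH first, falls back to the default location mentioned in this thread, and fails with a message naming every directory it tried. It is an illustration only, not pynrfjprog's actual search logic; the function name `find_jlink_library` and the glob pattern are assumptions.

```python
import os
from pathlib import Path

# Default location named earlier in this thread.
DEFAULT_DIRS = ("/opt/SEGGER/JLink",)


def find_jlink_library(environ=None, default_dirs=DEFAULT_DIRS):
    """Return the first libjlinkarm.so* found, or raise a descriptive error."""
    environ = os.environ if environ is None else environ
    ld_path = environ.get("LD_LIBRARY_PATH", "")
    # Entries from LD_LIBRARY_PATH come first, then the defaults.
    search = [d for d in ld_path.split(os.pathsep) if d] + list(default_dirs)
    for directory in search:
        if not Path(directory).is_dir():
            continue
        matches = sorted(Path(directory).glob("libjlinkarm.so*"))
        if matches:
            return matches[0]
    raise FileNotFoundError(
        "The JLink library can't be found in: " + ", ".join(search)
    )
```

Calling find_jlink_library() with no arguments uses the real environment; passing a dict makes the search order easy to test.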

  • I reinstalled pynrfjprog and it seems to be working now.

    It seems like my problems were caused by your software totally ignoring LD_LIBRARY_PATH and instead trying to silently use a hard-coded path outside my home dir, coupled with a complete lack of usable error checking and reporting.

    Now I'm finding that the modem won't talk to anything but I'll make a separate post about that.

  • Hi,

     

    I have to apologize for the bumpy ride wrt. running our tools in Linux. We are continuously trying to improve, and you have very good feedback for us. It is highly appreciated!

    Triffid_Hunter said:
    It certainly does seem like your software needs improvements in its error handling, a simple "The JLink library at <path> is too old, please update" or "The JLink library can't be found in <searchpath>" would have saved a ton of drama - with the specific path and the issue with them presented in this manner, it would take but a moment to drop a symlink there.

    Yes, unfortunately these are limitations of the currently built binary. I will bring them up with the team as future improvements.

     

    Triffid_Hunter said:
    PS: why aren't you using environ[LD_LIBRARY_PATH] for this? It exists specifically for this exact purpose, and I already have my JLink library path listed in there so nrfjprog can find it.

     I'll also discuss this with them.

     

    Triffid_Hunter said:
    but pynrfjprog doesn't seem to have improved.

    Is the strace the same as before, i.e. loads of "stat("/tmp/nrfjprogdll/highlevel/file12YHdh", 0x7ffd8bf8a230) = -1 ENOENT (No such file or directory)"?

    The file<rand> is a copy of the local /path/to/JLink/libjlinkarm.so, as shown here:

    hkn /tmp/nrfjprogdll/highlevel 
    $ file /opt/SEGGER/JLink/libjlinkarm.so.6.80.1 
    /opt/SEGGER/JLink/libjlinkarm.so.6.80.1: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=329721bc307868771dcf0d4ee24af5790fc4b16d, stripped
    
    hkn /tmp/nrfjprogdll/highlevel 
    $ file filezT91qO
    filezT91qO: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=329721bc307868771dcf0d4ee24af5790fc4b16d, stripped
    

    If that is still not created, I suspect that pynrfjprog still isn't able to successfully find the segger installation.
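The comparison above uses the BuildID reported by file. A stricter check, sketched below, is to hash both files and compare the digests; this is a different technique from the one shown above, and the function names are made up for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_same_content(original, copy):
    """True when both files have byte-for-byte identical content."""
    return sha256_of(original) == sha256_of(copy)
```

If the temporary file exists but the digests differ, the copy was probably interrupted or taken from a different library version.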

    As a quick workaround, can you try copying libjlinkarm.so to /tmp/nrfjprogdll/highlevel/file<rand> and see how it behaves?

    I see our posts crossed while I was updating the thread, as re-installing pynrfjprog seems to have resolved this scenario.

    Kind regards,

    Håkon
