Random data aborts

Postby matejdro » Fri Jun 08, 2012 2:06 pm

Hello, me again.

I'm getting random data aborts in my program. They seem to occur at a random spot in the program and at a random time. Sometimes it runs for minutes without any problems, and sometimes it throws an error after a few seconds. Here are some of them:

Code:
PC 0010547A
AASR 00650072
ASR 00020001
OPCODE ???
DEBUG1 00000000
DEBUG2 00000000

PC 0010547A
AASR 00560041
ASR 00020001
OPCODE ???
DEBUG1 00000000
DEBUG2 00000000

PC 00107A3A
AASR 00380036
ASR 00020202
OPCODE ???
DEBUG1 00000000
DEBUG2 00000000

PC 00107AB2
AASR 005A006A
ASR 00020001
OPCODE ???
DEBUG1 00000000
DEBUG2 00000000
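
One pattern in these dumps may be worth noting, as an observation rather than a diagnosis: each AASR value splits into two 16-bit halves that are printable ASCII codes. Assuming AASR holds the address that triggered the abort, that would be consistent with character data ending up where a pointer was expected. The sketch below only does the splitting; the class name AbortDump and its helper are made up for illustration.

```java
/** Hypothetical helper for eyeballing abort dumps; not part of leJOS. */
final class AbortDump {
    /** Splits a 32-bit hex value into two 16-bit halves and renders each as a character. */
    static String halvesAsChars(String hex) {
        long value = Long.parseLong(hex, 16);
        char high = (char) ((value >>> 16) & 0xFFFF);
        char low = (char) (value & 0xFFFF);
        return render(high) + render(low);
    }

    /** Printable ASCII is shown as-is; anything else becomes a dot. */
    private static String render(char c) {
        return (c >= 0x20 && c < 0x7F) ? String.valueOf(c) : ".";
    }

    public static void main(String[] args) {
        // The four AASR values from the dumps above.
        String[] aasr = { "00650072", "00560041", "00380036", "005A006A" };
        for (String hex : aasr) {
            System.out.println(hex + " -> \"" + halvesAsChars(hex) + "\"");
        }
    }
}
```

The four values come out as "er", "VA", "86" and "Zj".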


The whole program is quite large and complex, so I don't think I can upload all of it. But I suspect the culprit is RS485, which I use to communicate with an AVR slave (the hsRead and hsWrite methods).

I have commented out most of the code in my program, and this is what it currently does in an endless loop:
- Start a new thread that runs RS485. Everything from here on happens in that thread. The main thread then sleeps forever (it does something in the real program, but I commented that out for testing).
- Send a byte via RS485
- Receive 15 bytes via RS485
- Clear the RS485 buffer (read until there is nothing left to read)
- Calculate the CRC; if the result is not correct, repeat the 3 steps above
- Print the result on the screen
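
The send, receive, clear and check steps above can be sketched in plain Java without an NXT. This is only an illustration under stated assumptions: SerialLink is a hypothetical stand-in for the RS485 read and write calls, FrameCheck stands in for the CRC test, ChunkedFakeLink is an in-memory fake, and the maxAttempts limit is an addition (the program described here retries indefinitely).

```java
/** Hypothetical stand-in for the two RS485 calls used in this thread. */
interface SerialLink {
    int write(byte[] buf, int offset, int len);
    /** Copies at most len bytes into buf at offset; returns how many were copied (0 if none). */
    int read(byte[] buf, int offset, int len);
}

/** Hypothetical hook for the CRC check. */
interface FrameCheck {
    boolean isValid(byte[] frame, int frameLength);
}

final class Transaction {
    /** Fills buf[0..count) or gives up once timeoutMs has elapsed. */
    static boolean readFully(SerialLink link, byte[] buf, int count, long timeoutMs) {
        long start = System.currentTimeMillis();
        int received = 0;
        while (received < count) {
            // Request only the bytes still missing, so a read never runs past count.
            received += link.read(buf, received, count - received);
            if (received < count) {
                if (System.currentTimeMillis() - start > timeoutMs) return false;
                try {
                    Thread.sleep(1);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return true;
    }

    /** Discards whatever is still waiting in the receive buffer. */
    static void drain(SerialLink link) {
        byte[] scratch = new byte[8];
        while (link.read(scratch, 0, scratch.length) > 0) {
            // keep reading until the link reports nothing left
        }
    }

    /** Repeats send, receive, drain and check until a frame passes or attempts run out. */
    static boolean request(SerialLink link, int requestByte, byte[] frame, int frameLength,
                           FrameCheck check, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            link.write(new byte[] { (byte) requestByte }, 0, 1);
            boolean complete = readFully(link, frame, frameLength, 200);
            drain(link);
            if (complete && check.isValid(frame, frameLength)) return true;
        }
        return false;
    }
}

/** In-memory fake: every write queues a canned response, delivered in small chunks. */
final class ChunkedFakeLink implements SerialLink {
    private final byte[] response;
    private final int chunk;
    private int pos;

    ChunkedFakeLink(byte[] response, int chunk) {
        this.response = response.clone();
        this.chunk = chunk;
        this.pos = response.length; // nothing to read until a request is written
    }

    public int write(byte[] buf, int offset, int len) {
        pos = 0; // any request restarts the canned response
        return len;
    }

    public int read(byte[] buf, int offset, int len) {
        int n = Math.min(Math.min(len, chunk), response.length - pos);
        System.arraycopy(response, pos, buf, offset, n);
        pos += n;
        return n;
    }
}
```

The point of readFully is that the length passed to each read shrinks as bytes arrive, so the write position plus the requested length never exceeds the frame size.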

Any idea what could be causing these data aborts?
matejdro
Novice
 
Posts: 54
Joined: Wed Mar 14, 2012 9:10 am

Re: Random data aborts

Postby skoehler » Fri Jun 08, 2012 2:34 pm

Usually, data aborts are a result of mixing the linker and firmware of different leJOS versions or a leJOS snapshot. But data aborts caused by that tend to be deterministic rather than random.
Can you make sure that there's only one leJOS version on your computer? Can you re-flash the NXT brick?
Can you test with another NXT?
Are you drawing heavy current from the NXT, so that its voltage may drop?

And of course, there might be a bug inside the leJOS firmware, resulting in memory corruption which then might lead to a data abort.
skoehler
leJOS Team Member
 
Posts: 1458
Joined: Thu Oct 30, 2008 4:54 pm

Re: Random data aborts

Postby gloomyandy » Fri Jun 08, 2012 2:50 pm

What version of leJOS are you running? Is there any way you can reproduce this using standard NXT hardware (say, with two NXTs)? It will be hard for us to reproduce this as you are using your own hardware. Perhaps you could post the RS485 part of your code for us to take a look at.

Andy
gloomyandy
leJOS Team Member
 
Posts: 4239
Joined: Fri Sep 28, 2007 2:06 pm
Location: UK

Re: Random data aborts

Postby matejdro » Fri Jun 08, 2012 4:00 pm

I have 0.9.1 beta. Here are some parts of the code:

This executes first:
Code:
RConsole.open();
dataFetcher.start();
while (true) Delay.msDelay(10000);  //Forever loop because everything else in main thread is not used in this test


dataFetcher thread:
Code:
public void run()
   {
      while (!stop)
      {
         RConsole.println("clearing screen...");
         LCD.clear();
         RConsole.println("displaying data...");
         LCD.drawString("Gx" + AVR.gSenzorX.getResult() + " Gz" + AVR.gSenzorZ.getResult(), 0, 5);
         LCD.refresh();
                  
         RConsole.println("Fetching data from AVR...");
         
         AVR.updateData();
         RConsole.println("sleeping...");
         Delay.msDelay(100);
      }
   }


AVR part:
Code:
public static int[] blizine = new int[5];
public static Averager gSenzorX = new Averager(5);
public static Averager gSenzorZ = new Averager(5);

public static void updateData()
   {
         RConsole.println("Updating data...");
         byte[] data = new byte[16];
         while (!request(18, data, 14)) //Loop until valid data is received
            Delay.msDelay(1);
         RConsole.println("Data transfer complete!");
         
         //Store data into variables
         for (int i = 0; i < 5; i++)
         {   
            blizine[i] = EndianTools.decodeUShortLE(data, i * 2);
         }
         
         gSenzorX.addValue(EndianTools.decodeShortLE(data, 11));
         gSenzorZ.addValue(EndianTools.decodeShortLE(data, 13));
         
         RConsole.println("Data updated!");
   }

private static synchronized boolean request(int requestByte, byte[] data, int length)
   {   
      //rs485Busy prevents methods on multiple threads from accessing RS485 at the same time.
      //Currently only one method on one thread accesses RS485, so this part is irrelevant
      while (rs485Busy) Delay.msDelay(1);
      rs485Busy = true;
      
      RConsole.println("Writing data...");
      RS485.hsWrite(new byte[] { (byte) requestByte },  0, 1);
      
      int received =0;
      
      long start = System.currentTimeMillis();
      
      //While loop to read all needed bytes + one byte for CRC
      while (received < length + 1)
      {
         RConsole.println("Reading...");
         received += RS485.hsRead(data, received, length + 1 - received); //Ask only for the bytes still missing, so a read cannot run past the expected frame
         Delay.msDelay(1); //I read that this is better than Thread.yield()?
                     
         //Timeout if AVR slave does not send enough data for some reason
         if (System.currentTimeMillis() - start > 200)
         {
            rs485Busy = false;
            return false;
         }
         
      }
               
      RConsole.println("Emptying buffer...");
      while (RS485.hsRead(new byte[1], 0, 1) > 0); //Just in case
      
      rs485Busy = false;
      if (CRC.calcCRC(data, 0, length + 1) == 0)
      {
         return true;
      }
      else
      {
         return false;
      }
   }
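
For readers without leJOS at hand, the little-endian decoding that updateData relies on can be sketched in plain Java. The helper names below are hypothetical stand-ins chosen for illustration and are not the EndianTools implementation; the sketch assumes the low byte comes first.

```java
/** Hypothetical plain-Java helpers for 16-bit little-endian fields. */
final class LittleEndian {
    /** Reads an unsigned 16-bit value: low byte at off, high byte at off + 1. */
    static int decodeUShort(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    /** Reads the same two bytes as a signed 16-bit value. */
    static short decodeShort(byte[] b, int off) {
        return (short) decodeUShort(b, off);
    }

    public static void main(String[] args) {
        byte[] data = { 0x34, 0x12, (byte) 0xFE, (byte) 0xFF };
        System.out.println(decodeUShort(data, 0)); // 0x1234
        System.out.println(decodeUShort(data, 2)); // 0xFFFE read as unsigned
        System.out.println(decodeShort(data, 2));  // 0xFFFE read as signed
    }
}
```

Masking each byte with 0xFF matters because Java bytes are signed; without it, a high bit in the low byte would sign-extend into the upper half of the result.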


The CRC class calculates the CRC:
Code:
 public static byte calcCRC(byte[] bytes, int offset, int length)
    {
       byte crc = 0;
       for (int i = offset; i < offset + length; i++)
       {
          crc = (byte) (crc ^ bytes[i]);
          int uCrc = crc & 0xFF;
          crc = crcTable[uCrc];
       }
       
       return crc;
    }
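
The reason request() compares the CRC of the data plus its trailing CRC byte against zero can be shown with a small self-contained sketch. The contents of crcTable are not shown in this thread, so the polynomial 0x07 with a zero initial value is purely an assumption here, and the class name Crc8 is made up. With a table-driven CRC of this shape, running the check over a message followed by its own CRC byte yields 0.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Table-driven CRC-8 sketch; polynomial 0x07 is an assumption, not taken from the thread. */
final class Crc8 {
    private static final int POLY = 0x07;
    private static final byte[] TABLE = buildTable();

    private static byte[] buildTable() {
        byte[] table = new byte[256];
        for (int value = 0; value < 256; value++) {
            int crc = value;
            for (int bit = 0; bit < 8; bit++) {
                // Shift left; fold in the polynomial whenever the top bit falls out.
                crc = ((crc & 0x80) != 0) ? ((crc << 1) ^ POLY) : (crc << 1);
                crc &= 0xFF;
            }
            table[value] = (byte) crc;
        }
        return table;
    }

    /** CRC over bytes[offset .. offset + length). */
    static byte calc(byte[] bytes, int offset, int length) {
        byte crc = 0;
        for (int i = offset; i < offset + length; i++) {
            crc = TABLE[(crc ^ bytes[i]) & 0xFF];
        }
        return crc;
    }

    public static void main(String[] args) {
        byte[] payload = "123456789".getBytes(StandardCharsets.US_ASCII);
        byte crc = calc(payload, 0, payload.length);

        // Append the CRC byte, as the sender would, and check the whole frame.
        byte[] frame = Arrays.copyOf(payload, payload.length + 1);
        frame[payload.length] = crc;
        System.out.println("frame check = " + calc(frame, 0, frame.length)); // 0 for an intact frame
    }
}
```

After the payload has been processed the running value equals the CRC byte, so XOR-ing that byte in gives index 0, and entry 0 of the table is 0.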


Averager is a simple class that averages readings from the sensors:
Code:
   private float[] values;
   private final int size;
   private Float cache = null;

   public Averager(int size)
   {
      this.size = size;
      values = new float[size];
   }
   
   public void addValue(float value)
   {
      System.arraycopy(values, 0, values, 1, size - 1);
      values[0] = value;
      
      cache = null;
   }
   
   public float getResult()
   {
      if (cache != null) return cache;
      float result = 0;
      for (int i = 0; i < size; i++)
         result += values[i];
      
      result /= (float) size;
      
      cache = result;
      return result;
   }
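
For comparison, a moving average can also be kept in a ring buffer, which avoids shifting the array on every sample and needs no boxed Float cache. This is a hypothetical alternative, not the class used in the program: the name RingAverager is made up, it averages only the samples received so far (the Averager above always divides by size, so early results are pulled toward zero), and the synchronized methods are an assumption in case the value is read from another thread.

```java
/** Hypothetical ring-buffer moving average; behaviour differs slightly from Averager. */
final class RingAverager {
    private final float[] values;
    private int next;  // slot that the next sample overwrites
    private int count; // samples stored so far, capped at the buffer size

    RingAverager(int size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        values = new float[size];
    }

    synchronized void addValue(float value) {
        values[next] = value;
        next = (next + 1) % values.length;
        if (count < values.length) count++;
    }

    /** Average of the stored samples, or 0 if nothing has been added yet. */
    synchronized float getResult() {
        if (count == 0) return 0f;
        float sum = 0f;
        // Until the buffer wraps, the filled slots are exactly 0 .. count - 1.
        for (int i = 0; i < count; i++) sum += values[i];
        return sum / count;
    }

    public static void main(String[] args) {
        RingAverager avg = new RingAverager(3);
        avg.addValue(2f);
        avg.addValue(4f);
        System.out.println(avg.getResult()); // average of 2 and 4
        avg.addValue(6f);
        avg.addValue(8f); // overwrites the oldest sample (2)
        System.out.println(avg.getResult()); // average of 4, 6 and 8
    }
}
```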


I hope the code is not too long and is readable. This is basically all that happens, with the exception of several other variables being printed to the screen, but I don't think that could cause it.

There are so many RConsole statements because I wanted to figure out where it fails. But the last message was different each time, so the location is random.

Re: Random data aborts

Postby gloomyandy » Fri Jun 08, 2012 6:20 pm

Hi,
Please make sure you are using the very latest version of leJOS, 0.9.1-3, as this contains a data abort fix.

Andy

Re: Random data aborts

Postby matejdro » Fri Jun 08, 2012 6:27 pm

Oh yes, I have 0.9.1-2. I will test it tomorrow and report.

Thanks.

Re: Random data aborts

Postby matejdro » Mon Jun 11, 2012 10:30 am

Yup, it appears to be fixed.

Thanks!

