Navigator

This is where you talk about the EV3 software itself, installation issues, and programming talk.

Moderators: roger, gloomyandy, skoehler

Re: Navigator

Postby epascual » Mon Feb 24, 2014 1:33 pm

malulam wrote:The stacktrace shows a DeadLock.
Good work. I think you've located the culprit ;) I am impatient to be back home to experiment with this option.

My configuration is exactly the same as yours in terms of pilot and navigator.
Eric PASCUAL - POBOT association VP & co-founder - http://www.pobot.org
epascual
Active User
 
Posts: 123
Joined: Sun Jan 17, 2010 12:15 am
Location: Sophia-Antipolis (France)

Re: Navigator

Postby epascual » Mon Feb 24, 2014 10:40 pm

Hi,

Have done some investigations, thanks to Ctrl-\

I could notice that there are a lot of synchronized code calling synchronized code, calling in turn... guess what ?... synchronized code. Since this happens while running concurrent threads, it can easily create deadlocks. After commenting out all synchronized modifiers I could find in the path, it seems that the thing is working.

But it's clear that some synchronization must be put back to secure the situation. My guess is that it should only be put at the lowest level, i.e. only when accessing ibuf in EV3MotorRegulatorKernelModule class, and not in any of the calling methods. I'll try this strategy tomorrow to see if it works as expected.
Eric PASCUAL - POBOT association VP & co-founder - http://www.pobot.org
epascual
Active User
 
Posts: 123
Joined: Sun Jan 17, 2010 12:15 am
Location: Sophia-Antipolis (France)

Re: Navigator

Postby gloomyandy » Mon Feb 24, 2014 10:54 pm

Hi,
I don't think things are as bad as you think. Have you tried simply making the change I described (removing synchronized on the getTachoCount method)? Most of the synchronized tags are required (but this one is not) and have been present for a long time, I added a few new ones as part of the new motor code and this is what is causing the problems. The real issue here is the use of listeners and the fact that they hold locks. I have plans to resolve this in the longer term but for now I'd just like to get things working again.

Andy
User avatar
gloomyandy
leJOS Team Member
 
Posts: 3881
Joined: Fri Sep 28, 2007 2:06 pm
Location: UK

Re: Navigator

Postby epascual » Mon Feb 24, 2014 10:58 pm

Hi Andy,
Have you tried simply making the change I described (removing synchronized on the getTachoCount method)?
Yes I did it as the first attempt, and it didn't fix the problem. I then tried to "unroll" the wire, removing others one by one until it worked.

The real issue here is the use of listeners and the fact that they hold locks.
About the listeners, I noticed too that there are some synchronization "stacks" such as EV3MotorPort.checkComplete being synchronized and calling the listeners with EV3MotorPort.getTachoCount() (which is synchronized too on the same instance) as one of their args.

I can try investigating this track on next session.

Eric
Eric PASCUAL - POBOT association VP & co-founder - http://www.pobot.org
epascual
Active User
 
Posts: 123
Joined: Sun Jan 17, 2010 12:15 am
Location: Sophia-Antipolis (France)

Re: Navigator

Postby gloomyandy » Mon Feb 24, 2014 11:20 pm

Please remove the synchronized from getTachoCount and post a stack trace of the deadlocked code. It's not a good idea to just remove the tags.
User avatar
gloomyandy
leJOS Team Member
 
Posts: 3881
Joined: Fri Sep 28, 2007 2:06 pm
Location: UK

Re: Navigator

Postby epascual » Tue Feb 25, 2014 9:27 am

gloomyandy wrote:Please remove the synchronized from getTachoCount and post a stack trace of the deadlocked code. It's not a good idea to just remove the tags.
I agree with you. I'll do it tonight.

However I have noticed the the version of the sources I use is not the same as the one you pointed me in a previous message. I'm working with 0.6.0 tag, as suggested in the public documentation. As far as I remember (I'm not using my private machine at this time), the version of the sources I'm working with does not use the same type of synchronization (synchronized method versus synchronized statements) as the one you have pointed to. It could be wiser maybe to try modifying first the synchronization type before removing it. What's your opinion ?
Eric PASCUAL - POBOT association VP & co-founder - http://www.pobot.org
epascual
Active User
 
Posts: 123
Joined: Sun Jan 17, 2010 12:15 am
Location: Sophia-Antipolis (France)

Re: Navigator

Postby gloomyandy » Tue Feb 25, 2014 8:01 pm

I'm not aware of any significant difference between the master version and 0.6.0-alpha in this respect, both versions use a mixture of synchronization mechanisms.
User avatar
gloomyandy
leJOS Team Member
 
Posts: 3881
Joined: Fri Sep 28, 2007 2:06 pm
Location: UK

Re: Navigator

Postby epascual » Tue Feb 25, 2014 11:24 pm

gloomyandy wrote:I'm not aware of any significant difference between the master version and 0.6.0-alpha in this respect, both versions use a mixture of synchronization mechanisms.
Sorry, I've been misled by getTachoCount existing in two classes in the same source file.

I've applied the suggested fix and it seems to behave normally now. No more lock for the moment.

Since the robot is navigating now, I found something weird elsewhere, causing the computed heading to be off 90 degrees after a couple of moves. I have to discuss with Porgy and investigate a bit more to clarify what happens.
Eric PASCUAL - POBOT association VP & co-founder - http://www.pobot.org
epascual
Active User
 
Posts: 123
Joined: Sun Jan 17, 2010 12:15 am
Location: Sophia-Antipolis (France)

Re: Navigator

Postby gloomyandy » Tue Feb 25, 2014 11:31 pm

Hi,
thanks for the update. Let me know how the two of you get on!
User avatar
gloomyandy
leJOS Team Member
 
Posts: 3881
Joined: Fri Sep 28, 2007 2:06 pm
Location: UK

Re: Navigator

Postby epascual » Wed Feb 26, 2014 10:18 pm

Let me know how the two of you get on!
We have lot of fun together :)

We've not yet found the misbehaviour mentioned before, but noticed that when Navigator.clearPath() is used to abort a path in progress, the moveStopped() listener is called twice in a row.

Here is a part of the program trace, the stack trace being dumped from the moveStopped listener :
Code: Select all
[DEBUG ] moveStarted event=TRAVEL 300.0 at 100.0 provider=lejos.robotics.navigation.DifferentialPilot@bfd66a thread=13
[DEBUG ] seen something (color=BLACK)
[DEBUG ] ask navigator to abort current path
[DEBUG ] moveStopped event=TRAVEL 269.54865 at 100.0 provider=lejos.robotics.navigation.DifferentialPilot@bfd66a thread=1
java.lang.Exception: Stack trace
   at java.lang.Thread.dumpStack(Thread.java:1344)
   at org.pobot.ev3.demos.PobotGripper.moveStopped(Unknown Source)
   at lejos.robotics.navigation.DifferentialPilot.movementStop(DifferentialPilot.java:1016)
   at lejos.robotics.navigation.DifferentialPilot.rotationStopped(DifferentialPilot.java:976)
   at lejos.internal.ev3.EV3MotorPort$EV3MotorRegulatorKernelModule.checkComplete(EV3MotorPort.java:182)
   at lejos.internal.ev3.EV3MotorPort$EV3MotorRegulatorKernelModule.waitComplete(EV3MotorPort.java:403)
   at lejos.hardware.motor.BaseRegulatedMotor.waitComplete(BaseRegulatedMotor.java:218)
   at lejos.robotics.navigation.DifferentialPilot.waitComplete(DifferentialPilot.java:1037)
   at lejos.robotics.navigation.DifferentialPilot.stop(DifferentialPilot.java:451)
   at lejos.robotics.navigation.Navigator.stop(Navigator.java:299)
   at lejos.robotics.navigation.Navigator.clearPath(Navigator.java:125)
   at org.pobot.ev3.demos.PobotGripper.run(Unknown Source)
   at org.pobot.ev3.demos.PobotGripper.main(Unknown Source)
[DEBUG ] moveStopped event=TRAVEL 269.54865 at 100.0 provider=lejos.robotics.navigation.DifferentialPilot@bfd66a thread=1
java.lang.Exception: Stack trace
   at java.lang.Thread.dumpStack(Thread.java:1344)
   at org.pobot.ev3.demos.PobotGripper.moveStopped(Unknown Source)
   at lejos.robotics.navigation.DifferentialPilot.movementStop(DifferentialPilot.java:1016)
   at lejos.robotics.navigation.DifferentialPilot.stop(DifferentialPilot.java:452)
   at lejos.robotics.navigation.Navigator.stop(Navigator.java:299)
   at lejos.robotics.navigation.Navigator.clearPath(Navigator.java:125)
   at org.pobot.ev3.demos.PobotGripper.run(Unknown Source)
   at org.pobot.ev3.demos.PobotGripper.main(Unknown Source)
[DEBUG ] pathInterrupted wp=Point2D.Float[0.0, 300.0] pose=X:0.188687 Y:269.54575 H:88.53333 seq=0 thread=1
[DEBUG ] path cleared

Some details about the context, so that messages are a bit more understandable : the robot is a coloured object chaser, which detect objects in its path, and depending on their colour bring them to distinct locations. The first part of the action is to travel until either it reaches the limit (300 mm) or finds something in its path.

My understanding of the situation is that there is some ping-pong game between the pilot and the motors when stopping the move, which leads to listeners being called a first time in response to the motors telling the pilot that they are stopped by invoking the its rotationStopped callback from checkComplete, which triggers in turn the listeners because calling movementStop, and after that the pilot calling again movementStop near the end of its stop method.

I tried this quick and dirty hacks in DifferentialPilot class :
Code: Select all
   public void stop()
   {
       _suspend_listeners = true;
       try {
          _left.stop(true);
          _right.stop(true);
          waitComplete();
       } finally {
            _suspend_listeners = false;
       }
      movementStop();
      setMotorAccel(_acceleration);
   }


Code: Select all
   private synchronized void movementStop()
   {
       if (_suspend_listeners) return ;
      for (MoveListener ml : _listeners)
         ml.moveStopped(new Move(_type, getMovementIncrement(),
               getAngleIncrement(), _robotTravelSpeed, _robotRotateSpeed,
               isMoving()), this);
   }

It eliminated the extraneous calls, without breaking anything as far as I could test.
Eric PASCUAL - POBOT association VP & co-founder - http://www.pobot.org
epascual
Active User
 
Posts: 123
Joined: Sun Jan 17, 2010 12:15 am
Location: Sophia-Antipolis (France)

Re: Navigator

Postby gloomyandy » Wed Feb 26, 2014 10:47 pm

Thanks for the report. Not really my area of expertise to be honest. But another reason why I really don't like listeners very much. They are very easy to get wrong, and fixes tend to be very ugly because you have to supply all of the context via sort of global state. Anyway enough of my moaning. I'll try and get one of the other devs that has a better understanding of this stuff to take a look.
User avatar
gloomyandy
leJOS Team Member
 
Posts: 3881
Joined: Fri Sep 28, 2007 2:06 pm
Location: UK

Re: Navigator

Postby epascual » Thu Feb 27, 2014 9:22 am

About listeners, you should read this paper : http://www.osgi.org/wiki/uploads/Links/whiteboard.pdf. I think you'll appreciate :)

It happens that I've worked with OSGi for professional activities for several years, and I 100% agree about what the authors says about preferring other patterns such as the white board for instance. The involved project is an embedded open platform for ambient instrumentation of buildings, for research work at the beginning, but which is also used nowadays for operational systems (photo-voltaic fields monitoring for instance). We recently moved from OSGi to another approach for relaxing the "all Java" constraint (we need our platform be implementation language agnostic and allow multi-language implementation of its components, just like in a Linux working environment you can find parts written in C/C++, others in bash, others in Python,...), but we kept the white board principle, and replaced the OSGi integrated communication bus by D-Bus. I'm not a ROS specialist, but I have understood that it works in a similar way.

WRT to the Navigator module, there are BTW a couple of things which surprised me a bit at implementation level, and some parts of the code look a bit convoluted too (from my humble point of view). I rewrote them trying to clarify the logic while working on understanding how it works. I can send you my version if ever you are interested in. Let's be clear : I'm not telling that it was not good work, but just that I would have done things differently. I can be wrong of course.
Eric PASCUAL - POBOT association VP & co-founder - http://www.pobot.org
epascual
Active User
 
Posts: 123
Joined: Sun Jan 17, 2010 12:15 am
Location: Sophia-Antipolis (France)

Re: Navigator

Postby roger » Fri Feb 28, 2014 2:20 am

I agree with Andy; I do not think the Event Generator - Listener pattern is a good one to use. I think we could eliminate all the listeners and perhaps it would be thread safe. I have looked at the existing code a bit, and I think the major change would be in the PoseProvider. Ir could have a method void UpdatePose(Pose aPose, Move aMove) As small modification of the Navigator would be necessary also. I have not tried anything out, and would to look at any alternative Navigator designs.
But before spending much time, there should be some agreement about this change to the Navigation classes. This is a good time to do it, and the change would not be such a big departure from NXT as the SeensorFramework.
Roger
roger
Moderator
 
Posts: 357
Joined: Fri Jun 01, 2007 4:31 am
Location: Berkeley, CA

Re: Navigator

Postby epascual » Fri Feb 28, 2014 8:47 am

As said above, I agree with both of you wrt listeners. They are ok for simple collaboration schemes but as soon as things get more complicated, they are a source of brain damages, especially when cascades or loops of listeners are created, most of the time by side effect.

Roger, which kind of architecture would you favor so that the notification connections are not hardwired ? I' mentioned stuff like the white board pattern since IMHO, although it uses a form of listener inside, it cures the listeners spaghetti plate syndrome by being more or less based on a hub communication topology.
Eric PASCUAL - POBOT association VP & co-founder - http://www.pobot.org
epascual
Active User
 
Posts: 123
Joined: Sun Jan 17, 2010 12:15 am
Location: Sophia-Antipolis (France)

Re: Navigator

Postby roger » Fri Feb 28, 2014 11:36 pm

Hi Eric,
I recommend a very simple structure for navigation object collaboration: There should be only one object that controls the drive motors ( a Pilot) and only one object that controls the Pilot, a Navigator for example. The Navigator is responsible for updating its pose, and uses a PoseProvider to do the calculations. The PoseProvider can be greatly simplified; it needs only the method: void UpdatePose (Pose p, Move m). Pilot already has a method Move getMovement(); Whenever the Navigator needs an updated pose, it calls poseProvider. update (myPose, myPilot.getMove). But it must update its pose whenever the Pilot completes a movement so it contains an inner class that tests pilot.isMoving() in its main loop and updates the pose every time the a movement is completed. With this restricted object collaboration, there is no need for listeners. Of course, it is not the most general structure imaginable, but it seems to me that requiring a single object to control the pilot and that is the single object to control the motors - constitutes about the simplest scheme to do the job.
Roger
roger
Moderator
 
Posts: 357
Joined: Fri Jun 01, 2007 4:31 am
Location: Berkeley, CA

PreviousNext

Return to EV3 Software

Who is online

Users browsing this forum: Yahoo [Bot] and 1 guest

more stuff