Concurrency Bug in OdometryPoseProvider?

This is where you talk about the NXJ software itself, installation issues, and programming talk.

Moderators: 99jonathan, roger, imaqine

Concurrency Bug in OdometryPoseProvider?

Postby Orzeszek » Tue Sep 20, 2011 1:14 pm

We're building a 'minesweeping' robot for a university course, and we've run into what appears to be a concurrency bug in the OdometryPoseProvider in leJOS NXJ 0.9.0. As best as I can tell, it occurs when getPose() is called at about the same time as moveStopped(...).

Here's some code which should allow you to reproduce the bug:

Code: Select all
import java.util.*;

import lejos.nxt.*;
import lejos.robotics.localization.*;
import lejos.robotics.navigation.*;

/**
 * Main class and entry point for the Minestorm Robot.
 *
 * @author Krzysztof Dziemborowicz
 * @version 2011.0920
 */
public class MinestormRobot {
   
    // A blank line that is LCD.DISPLAY_CHAR_WIDTH long
    private static final String BLANK_LINE = "                ";

    /**
     * <code>MinestormRobot</code> entry point.
     *
     * @param args command line arguments (ignored).
     */
    public static void main(String[] args) {
       
        // Make controllers
        DifferentialPilot pilot = new DifferentialPilot(5.6, 11.75, Motor.B, Motor.C);
        OdometryPoseProvider poseProvider = new OdometryPoseProvider(pilot);
        NavPathController pathController = new NavPathController(pilot, poseProvider);
       
        // Set speed
        pilot.setTravelSpeed(10.0);
        pilot.setRotateSpeed(45.0);

        // Make a long zig-zag path
        ArrayList<WayPoint> waypoints = new ArrayList<WayPoint>(4);
        int x = 0, y = 0;
        for (int i = 0; i < 100; i++) {
            if (i % 4 == 0) {
                x += 20;
            } else if (i % 4 == 2) {
                x -= 20;
            } else {
                y += 20;
            }
            waypoints.add(new WayPoint(x, y));
        }
       
        // Trace the path
        pathController.followRoute(waypoints, true);
        long lastScreenRefresh = Long.MIN_VALUE;
        while (pathController.isGoing()) {
            // Hammer getPose()
            Pose pose = pathController.getPoseProvider().getPose();
            // Show pose on the screen every 100 ms
            if (lastScreenRefresh + 100 < System.currentTimeMillis()) {
                drawCenteredLine(Integer.toString((int) pose.getX()), 0);
                drawCenteredLine(Integer.toString((int) pose.getY()), 1);
                drawCenteredLine(Integer.toString((int) pose.getHeading()), 2);
                lastScreenRefresh = System.currentTimeMillis();
            }
        }
        Pose pose = pathController.getPoseProvider().getPose();
        drawCenteredLine(Integer.toString((int) pose.getX()), 0);
        drawCenteredLine(Integer.toString((int) pose.getY()), 1);
        drawCenteredLine(Integer.toString((int) pose.getHeading()), 2);
       
        Sound.twoBeeps();
        Button.waitForPress();
    }

    /**
     * Draws a centred string on the LCD on the specified line, clearing the line before doing so.
     *
     * @param s the string to draw.
     * @param line the line number on which to draw the string.
     */
    protected static void drawCenteredLine(String s, int line) {
        LCD.drawString(BLANK_LINE, 0, line);
        if (s != null) {
            if (s.length() > LCD.DISPLAY_CHAR_WIDTH) {
                s = s.substring(0, LCD.DISPLAY_CHAR_WIDTH);
            }
            LCD.drawString(s, (LCD.DISPLAY_CHAR_WIDTH - s.length()) / 2, line);
        }
    }
}


I've made an attempt at my own implementation of OdometryPoseProvider (based heavily on the one that came with leJOS NXJ 0.9.0) that appears to solve the problem for us:

Code: Select all
import lejos.geom.*;
import lejos.robotics.localization.*;
import lejos.robotics.navigation.*;

/**
 * Keeps track of the robot <code>Pose</code> using odometry (dead reckoning) data contained in a
 * Move, which is supplied by a MoveProvider. This class is based on the OdometryPoseProvider
 * supplied with leJOS, except that it works.
 *
 * @author Krzysztof Dziemborowicz
 * @version 2011.0920
 */
public class OdometryPoseProvider implements PoseProvider, MoveListener {

    // Current pose
    private float x = 0, y = 0;
    private float heading = 0;

    // Current move
    private MoveProvider mp;
    private float baseAngle, baseDistance;

    /**
     * Creates a new <code>MinestormPoseProvider</code>.
     *
     * @param mp a <code>MoveProvider</code> to listen to.
     */
    public OdometryPoseProvider(MoveProvider mp) {
        mp.addMoveListener(this);
    }

    /**
     * Returns the current estimate of the robot's <code>Pose</code>.
     *
     * @return the robot's <code>Pose</code>.
     */
    @Override
    public synchronized Pose getPose() {
        if (mp != null) {
            updatePose(mp.getMovement());
        }
        return new Pose(x, y, heading);
    }

    /**
     * Sets the current <code>Pose</code>.
     *
     * @param pose the current <code>Pose</code>.
     */
    @Override
    public synchronized void setPose(Pose pose) {
        Point loc = pose.getLocation();
        x = loc.x;
        y = loc.y;
        heading = pose.getHeading();
    }

    /**
     * Invoked whenever a movement starts.
     *
     * @param move the move that just started.
     * @param mp the MoveProvider that invoked this method.
     */
    @Override
    public synchronized void moveStarted(Move move, MoveProvider mp) {
        this.baseAngle = 0;
        this.baseDistance = 0;
        this.mp = mp;
    }

    /**
     * Invoked whenever a movement stops.
     *
     * @param move the move that just started.
     * @param mp the MoveProvider that invoked this method.
     */
    @Override
    public synchronized void moveStopped(Move move, MoveProvider mp) {
        updatePose(move);
        this.baseAngle = 0;
        this.baseDistance = 0;
        this.mp = null;
    }

    /**
     * Updates the current pose with whatever incremental movement has occurred since the last move
     * started.
     *
     * @param move the current move.
     */
    private synchronized void updatePose(Move move) {

        float angle = move.getAngleTurned() - baseAngle;
        float distance = move.getDistanceTraveled() - baseDistance;

        if (move.getMoveType() == Move.MoveType.TRAVEL || Math.abs(angle) < 0.2f) {
            double headingRad = Math.toRadians(heading);
            x += distance * Math.cos(headingRad);
            y += distance * Math.sin(headingRad);
        } else if (move.getMoveType() == Move.MoveType.ARC) {
            double headingRad = Math.toRadians(heading);
            double angleRad = Math.toRadians(angle);
            double radius = distance / angleRad;
            x += radius * (Math.sin(headingRad + angleRad) - Math.sin(headingRad));
            y += radius * (Math.cos(headingRad) - Math.cos(headingRad + angleRad));
        }
        heading = normalise(heading + angle);

        baseAngle = move.getAngleTurned();
        baseDistance = move.getDistanceTraveled();
    }

    /**
     * Returns the angle normalised, such that -180 < angle <= 180.
     *
     * @param angle an angle.
     * @return the normalised angle.
     */
    private static float normalise(float angle) {
        while (angle > 180) {
            angle -= 360;
        }
        while (angle <= -180) {
            angle += 360;
        }
        return angle;
    }
}


Can anyone else reproduce the problem? And can anyone confirm whether my analysis of what is going on here is correct? Or am I way off?
Orzeszek
New User
 
Posts: 4
Joined: Tue Sep 20, 2011 1:02 pm

Re: Concurrency Bug in OdometryPoseProvider?

Postby roger » Fri Sep 23, 2011 8:02 pm

Hi Orzeszek,
I have not been able to reproduce a concurrency problem. I am using classes from the latest snapshot from the repository. At the end of your 100 waypoint route and at the end, the calculated pose as displayed in the LCD as integers is 0, 999, 90. As a float: 0.1396, 999.95, 90.075
At the other 99 waypoints, x and y errors in the calculated pose are less tan 0.6 cm , and usually much smaller.
However, in attempting to reproduce your problem, I did discover a bug in the Navigator class. Now repaired.
Good luck with your course project.
Roger
roger
Moderator
 
Posts: 359
Joined: Fri Jun 01, 2007 4:31 am
Location: Berkeley, CA

Re: Concurrency Bug in OdometryPoseProvider?

Postby Orzeszek » Sat Sep 24, 2011 3:11 am

Hi Roger,

Thanks for looking into this. I think the problem still occurs. It's most apparent in the heading. For example, while turning from 0 to 90 degrees, it will go 0, 10, ... 80, 90, 0, 10 ... That is, it will (sometimes) snap back to 0 degrees when the original rotation stops. It will then do the rotation a second time.

When the DifferentialPilot stops a movement, it calls movementStop(), which in turn calls moveStopped(...) on each registered MoveListener. The OdometryPoseProvider should then call updatePose(...), update the pose, and set current = true (because the movement has stopped). Any subsequent call to getPose() on the OdometryPoseProvider should do nothing. Since the robot is not moving, the pose shouldn't be changing.

When the DifferentialPilot starts a new movement, it calls movementStart(...), which in turn calls reset() to reset the tachometers and then moveStarted(...) on each registered MoveListener. The OdometryPoseProvider then resets angle0 and distance0 and sets current = false. Each subsequent call to getPose() on the OdometryPoseProvider should call updatePose(...) to get an updated position based on the change in the tachometer readings in the DifferentialPilot since the last time the pose was updated. That's how it should work.

Imagine what happens, though, if getPose(...) is called on the OdometryPoseProvider and the thread is pre-empted after it gets passed the if (!current) test but before it executes updatePose(...). Suppose during the thread's pre-emption, the DifferentialPilot stops one move and begins starting the next move. Suppose further that that thread is itself pre-empted while the DifferentialPilot is executing movementStart(...), after it has called reset() but before it has called moveStarted(...) on each registered MoveListener. The system is now in an inconsistent state, because the movement increment and the angle increment in the DifferentialPilot are now 0, but distance0 and angle0 in the OdometryPoseProvider are not (because they get reset only after moveStarted(...) is called).

So, suppose that the first thread now resumes and updates the pose by executing the updatePose(...) call. When it calls event.getAngleTurned() (for example) the DifferentialPilot returns 0 (because reset() had already been called). However, angle0 and distance0 have not yet been reset to 0 because moveStarted(...) hasn't been called yet, so when it calculates the change in, say, the heading by taking the difference between event.getAngleTurned() and angle0 it gets a meaningless result.

In the rotation example, in the last few calls to updatePose(...), the first line evaluating the change in the heading would evaluate as follows (assuming it was updated approximately every 10 degrees):

1. float angle = 70 - 60;
2. float angle = 80 - 70;
3. float angle = 90 - 80;
4. float angle = 0 - 90; // reset() called on DifferentialPilot so event.getAngleTurned() returns 0, but angle0 is still set to 90 because moveStarted(...) hasn't been called, so the change in angle is -90 degrees, and the robot's pose appears to snap back.

This of course doesn't happen on every turn, which is why you need 100 waypoints to demonstrate it. The end position (on the screen) will always be correct, because the NavPathController always steers the robot to the waypoint from where it thinks the robot is. But if you watch the screen for the entire 100-waypoint trip, you'll see nonsense values come up every now and then (well, we see them...) :?

It's odd that you can't reproduce the problem. (What happens if you actually let the robot roam around? Does it go where expected?) I guess it's not important. It's working for us now, and hopefully we won't fail our project. 8)
Orzeszek
New User
 
Posts: 4
Joined: Tue Sep 20, 2011 1:02 pm

Re: Concurrency Bug in OdometryPoseProvider?

Postby roger » Sat Sep 24, 2011 7:07 am

Hi Orzeszek,
Now that I look very closely at the heading values and I do see some strange numbers . So I synchronized the moveStopped, moveStarted, updatePose, getPose and setPose methods in a test version of the pose provider, and the values look OK. The maximum error at any waypoint is less than 2 degrees and none of the intermediate values look unreasonable. I do not notice any snap back of headings to 0. However, I printed the results to RConsole (using the USB) so I could study them at leisure, so my experiment was not exactly the same as yours.
When I have access to a large smooth area, I will see if the robot actually behaves properly, not merely reporting good numbers.
Thanks for discovering the problem.
I notice you made some additional changes in your OdometryPoseProvider beyond synchronize the critical methods. Are these necessary, or is synchronizing enough?
Roger
roger
Moderator
 
Posts: 359
Joined: Fri Jun 01, 2007 4:31 am
Location: Berkeley, CA

Re: Concurrency Bug in OdometryPoseProvider?

Postby Orzeszek » Sat Sep 24, 2011 3:42 pm

Hi Roger,

As far as I know, the only change that *needs* to be made is the synchronization of the critical methods. Because the DifferentialPilot calls moveStopped(...) before it ever calls reset(), the bug cannot occur if the critical methods are synchronized. moveStopped(...) puts the OdometryPoseProvider in a state where it won't update the pose until after the DifferentialPilot has called moveStarted(...), and the DifferentialPilot won't do that until after it has called reset. So it's all good...

I made one other change to the OdometryPoseProvider, which was to use the MoveProvider mp as the flag for whether the pose provider should update the pose on getPose() instead of the boolean current flag. This was just to ensure (well, not *ensure* but ...) that the OdometryPoseProvider would die if the synchronization was stuffed up by accident in the future (rather than failing silently when using a boolean flag).

Other than that, I just made some changes to comply with the coding conventions that we happened to be using for our project. Nothing of substance... Good luck! :wink:
Orzeszek
New User
 
Posts: 4
Joined: Tue Sep 20, 2011 1:02 pm

Re: Concurrency Bug in OdometryPoseProvider?

Postby roger » Sat Sep 24, 2011 4:39 pm

Hi Orzeszek,
I have confirmed your analysis by experiment. Using the pose provider with the critical methods synchronized, my robot completed the mine sweeping route with no problems. The test was even more difficult than your example: getPose() was called every 50 ms instead of every 100 ms.
Thanks for discovering the problem and providing the solution.
Roger
roger
Moderator
 
Posts: 359
Joined: Fri Jun 01, 2007 4:31 am
Location: Berkeley, CA

Re: Concurrency Bug in OdometryPoseProvider?

Postby Orzeszek » Sat Sep 24, 2011 5:34 pm

Hi Roger,

No worries! That's all good news... :D

(My code actually hammers the getPose() as often as possible. It's only the LCD updates that are restricted to once every 100 ms.)
Orzeszek
New User
 
Posts: 4
Joined: Tue Sep 20, 2011 1:02 pm

Re: Concurrency Bug in OdometryPoseProvider?

Postby siggy » Fri Feb 24, 2012 3:39 am

hi roger,
i was looking into this since i am working on a swarm project using a compass etc. i tried to use the code mentioned in this thread just to test out the OdometryPoseProvider and came up with a problem right out of the box. i am using lejos .91 and eclipse. it seems that the NavPathController and the WayPoint classes are missing from the plugin (or at least my version). am i missing something here? i also have a question about using CompassSensor and poseProvider as discussed in the api: are there any examples for reference?
thanks
siggy
siggy
New User
 
Posts: 9
Joined: Wed Jul 21, 2010 11:04 pm

Re: Concurrency Bug in OdometryPoseProvider?

Postby roger » Fri Feb 24, 2012 4:23 am

Hi siggy,
NavPathController was replaced by Navigator in 0.91, and WayPoint is in the navigation package.
There is a CompassTest in samples. I don't know of any samples using Navigator, but you might find the tutorial on controlling wheeled vehicles helpful.
Good luck,
Roger
roger
Moderator
 
Posts: 359
Joined: Fri Jun 01, 2007 4:31 am
Location: Berkeley, CA

Re: Concurrency Bug in OdometryPoseProvider?

Postby siggy » Mon Feb 27, 2012 6:41 pm

Hi Roger,
found the new classes and some things are working, but i still have a question:
the tutorial on controlling wheeled vehicles uses CompassPilot which will be deprecated soon. from the docs it seems that
we are supposed to use a poseProvider but i don't see how to bind the compass to the poseProvider (as was done in the CompassPilot).
i am sure it's something ssinmple, but what am i missing?
thanks
siggy
siggy
New User
 
Posts: 9
Joined: Wed Jul 21, 2010 11:04 pm

Re: Concurrency Bug in OdometryPoseProvider?

Postby roger » Mon Feb 27, 2012 8:16 pm

Hi siggy,
The CompassPilot was deprecated some months ago because it did not exactly fit with a proposed architecture for real time motion control. The hope was that classes could be developed to use any type of motion sensor for feedback control of the robot path. But this hope has not been realized by classes in the library. Not much progress has been made in this front recently, so I suspect that CompassPilot will be around for a while.
(IMHO, the CompassPilot should not have been deprecated before working alternative existed.)
Since CompasPilot is a sub-class of DifferentialPilot, it should work with the OdometryPoseController.
Roger
roger
Moderator
 
Posts: 359
Joined: Fri Jun 01, 2007 4:31 am
Location: Berkeley, CA


Return to NXJ Software

Who is online

Users browsing this forum: No registered users and 3 guests

more stuff