
Intersection Representation Enhancement by Sensorial Data and Digital Map Alignment

Sergiu Nedevschi, Voichita Popescu, Tiberiu Marita, Radu Danescu

Computer Science Department, Technical University of Cluj-Napoca

Cluj-Napoca, Romania

Marc-Michael Meinecke, Marian-Andrzej Obojski, Joern Knaup

Driving Assistance Systems, Volkswagen AG

Wolfsburg, Germany

Abstract— Alignment of the stereo perception outcomes with the GPS-based digital map information is the main goal of this paper. This alignment relies on discriminative sensorial landmarks, such as stop lines or lane markings, and on augmented GPS maps. The alignment is needed to provide: a more accurate global localization of the detected static and dynamic entities; fusion of the perceived sensorial information with the map information for filling in the gaps or for increasing the accuracy of the perceived information, generating an enhanced representation of the intersection.

Keywords- stereovision; environment sensorial representation; augmented digital map; alignment; data fusion

I. INTRODUCTION

The intersection scenario is the most complex, demanding and dangerous of all driving situations. Depending on the region and country, from 30 to 60% of all injury accidents and up to one third of the fatalities occur at intersections. For this reason, the INTERSAFE-2 project aims to develop and demonstrate new systems, algorithms and technologies that are able to significantly reduce injury and fatal accidents at intersections.

The analysis of the user needs for Intersection Safety Assistance Systems [1] identified the following driving assistance functions: left turn assistance (LTA), intersection crossing assistance (ICA), right turn assistance (RTA), right of way and stop line assistance (SLA).

The main roles of the stereovision sensor in an intersection driving assistance system are related to sensing and perception [2] in front of the ego vehicle, in a region with a large horizontal field of view.

The usual static road environment perception functions are: 3D model estimation of the current and side lanes based on lane delimiters (lane markings, curbs); detection, 3D localization and classification of static obstacles, including parked vehicles, poles and trees.

The dynamic road and intersection environment perception functions are: detection, tracking and classification of preceding, oncoming and crossing vehicles; detection, tracking and classification of preceding, oncoming and crossing vulnerable road users.

Tracking of the dynamic objects provides relative position, speed and acceleration information. Using this information, the movement history of the ego and tracked vehicles can be inferred and used for risk assessment.

Alignment of the stereo perception outcomes with the GPS-based digital map information is the main goal of this paper. This alignment relies on discriminative sensorial landmarks, such as stop lines or lane markings, and on augmented GPS maps. The alignment is needed to provide a more accurate global localization of the detected static and dynamic entities, and to fuse the perceived sensorial information with the map information, in order to fill in the gaps or to increase the accuracy of the perceived information.

II. PROBLEM STATEMENT

For a Driving Assistance System dedicated to intersection safety, knowledge about the static components and dynamic population of the intersection is essential.

The perception should be focused on extracting the most information out of disparate, independent cues. The structure and position of the intersection can be inferred from cues like the stop lines, the pedestrian and bicycle crossings, the zebra crossing signs, the lane markings, the painted arrows, the traffic light poles, the curbs. In order to be useful these cues have to be detected, localized in the ego vehicle coordinate system and classified. The missing information regarding the static components of the intersection can be inferred from GPS digital maps.

For that, the sensorial representation of the perceived static components of the intersection should be aligned with the similar map information, using as initial data the GPS position and GPS orientation of the ego vehicle. The result of this alignment should be an improved localization of the ego vehicle coordinate system in the global coordinate system and the possibility to fuse the perceived information with the map information. Unfortunately, this alignment cannot always be achieved.


GPS-based digital maps, with static or even dynamic content, have been the solution adopted worldwide for navigation in both urban and non-urban environments. Unfortunately, the GPS signal is affected by various factors that introduce significant positioning errors (between 1 and 30 meters), and thus a more accurate localization of the ego vehicle on the digital map is necessary. It can be achieved by aligning the corresponding landmark elements provided by the sensorial perception and by the digital map.

Assemblies of painted road markings, such as stop lines or pedestrian crossings and the adjacent lane markings, in correlation with lane marking types, painted arrows, curbs or side lane information, can be successfully used as triggers.

The specific static road and intersection perception functions in charge of extracting the discriminative landmarks are: lane marking detection, 3D localization and classification; curb detection and 3D localization; stop line, pedestrian and bicycle crossing detection and 3D localization; painted sign (turn right, turn left, go ahead) detection and 3D localization.

Consequently, an augmented GPS digital map, including the landmarks and the supplementary information required by the driving assistance applications, is necessary.

III. SENSORIAL PERCEPTION

A. Sensorial Representation of the Environment

The sensorial representation of the environment is provided by the Onboard Sensorial System, which in this case relies on stereo vision. The stereo vision system under development at the Technical University of Cluj-Napoca provides an extensive 3D description of the intersection's visible static and dynamic components, together with the landmarks required for the alignment with the global map.

The elements of this description are 3D representations of the following entities: current lane, side lanes [3], curbs, traffic isles [4], pillars, vehicles and pedestrians [5]. The landmarks used are the stop lines (for longitudinal positioning) and the lane markings and painted arrows (for lateral positioning).

In Figure 1 (a) the perspective view of the 3D representation is shown over the intensity image. In Figure 1 (b) the 3D representation of the perceived entities is shown.

Figure 1. The 3D representation of the detected intersection environment entities.

B. Extraction and 3D Localization of the Lane Markings and Painted Traffic Signs

The lane markings are detected in two ways: as feature points for the lane tracking algorithm, and as standalone painted road objects. Both approaches rely on the classical Dark-Light-Dark (DLD) transition principle.

For the lane tracking algorithm, we seek edge points having similar gradient magnitude but opposing sign. A simple search for gradient pairs suffers from several drawbacks: the level of detail of the road surface decreases with the distance, and the distance between opposing gradients is also variable, both problems being caused by the perspective effect. Our solution takes this effect into account, both in gradient computation and in pair selection.

A perspective-aware differentiation filter is employed to cope with the variable level of detail. The width of the filter and the width of the search interval are computed from the 3D width of the lane markings (a possible range of widths) and the camera parameters, which enables us to compute the effect of the perspective beforehand.

The value of the horizontal gradient of a point of coordinates (x, y) is given by equation (1):

$$G_N(x,y)=\frac{1}{2D}\left(\sum_{i=x+1}^{x+D} I(i,y)-\sum_{i=x-D}^{x-1} I(i,y)\right),\qquad D=\mathrm{KernelSize}(y) \tag{1}$$

Applying the above formula directly is computationally expensive, as D's value may go beyond 20 for the lowest image lines. However, we can observe that the formula for the gradient of a point differs very little from the formula for the gradient of its previous neighbor. In order to take advantage of that, we have to defer the division by 2D to the end of the image line. Let us denote the un-normalized gradient by G_U:

$$G_U(x,y)=\sum_{i=x+1}^{x+D} I(i,y)-\sum_{i=x-D}^{x-1} I(i,y) \tag{2}$$

G_U can be computed using a recurrence relation:

$$G_U(x,y)=G_U(x-1,y)+I(x+D,y)-I(x,y)+I(x-D-1,y)-I(x-1,y) \tag{3}$$

Normalization takes place at the end of each line:

$$G_N(x,y)=\frac{G_U(x,y)}{2D} \tag{4}$$
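As an illustration of equations (1)-(4), a minimal NumPy sketch of the recurrent computation follows; the image layout and the kernel_size callback are our assumptions for the example, not details from the paper:

import numpy as np

def adaptive_horizontal_gradient(img, kernel_size):
    # img: 2D grayscale array; kernel_size(y) gives the half-width D for row y.
    h, w = img.shape
    rows = img.astype(np.int32)
    g = np.zeros((h, w), dtype=np.float32)
    for y in range(h):
        d = kernel_size(y)
        if 2 * d + 1 >= w:
            continue
        row = rows[y]
        x0 = d
        # Direct evaluation of the un-normalized gradient, eq. (2).
        gu = row[x0 + 1:x0 + d + 1].sum() - row[x0 - d:x0].sum()
        g[y, x0] = gu
        # O(1) sliding update along the row, eq. (3).
        for x in range(x0 + 1, w - d):
            gu += row[x + d] - row[x] + row[x - d - 1] - row[x - 1]
            g[y, x] = gu
        # Division by 2D deferred to the end of the line, eq. (4).
        g[y] /= 2.0 * d
    return g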

Figure 2 shows the effect of horizontal gradient computation, compared to the original grayscale image.


Figure 2. Original grayscale image (left) and the result of applying the adaptive horizontal gradient filter (right)

The next step is to eliminate the intermediate values, leaving only the points of minimum and of maximum gradient (Figure 3, left).

Figure 3. Non-maxima/minima suppression (left) and pairing (right)

A point of maximum and a point of minimum together form a DLD pair if the distance between them falls inside an acceptable range and the absolute values of the gradients are similar (Figure 3, right). Non-paired features are discarded.

The final step is to eliminate the DLD pair edges that do not belong to the road surface, using the 3D information provided by stereovision. The lane marking features are shown in Figure 4.

Figure 4. Extracted lane marking edges
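A minimal sketch of the pairing test described above, assuming the gradient extrema of one image row are given as (column, gradient) lists sorted by column; the width interval and the magnitude-similarity ratio are illustrative placeholders:

def pair_dld_edges(maxima, minima, min_w, max_w, mag_ratio=0.6):
    # maxima: rising (dark-to-light) edges; minima: falling (light-to-dark)
    # edges of one row, each a list of (column, gradient) sorted by column.
    pairs = []
    for x_up, g_up in maxima:
        for x_down, g_down in minima:
            width = x_down - x_up
            if width < min_w:
                continue
            if width > max_w:
                break  # minima are sorted, no valid partner further on
            # Similar absolute gradient magnitudes on both marking edges.
            if min(g_up, -g_down) / max(g_up, -g_down) > mag_ratio:
                pairs.append((x_up, x_down))
    return pairs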

Detecting the road markings as features for a lane detection process means that errors in marking detection can be compensated by the performance of the tracking algorithm. False results can be filtered out, and missed features can be interpolated. However, the painted markings do not necessarily define a lane; their position, size and type can nevertheless help the intersection assistance system to position us accurately in space.

In order to detect and classify the markings as standalone objects, a more elaborate method is required. First, in order to identify the painted road objects, we have to search for dark-light-dark transition patterns. Instead of pairs of gradients of opposing signs and similar magnitude, we use the already existing road edges. The 3D road points are selected after the pitch of the vehicle is estimated from the stereo 3D information.

The edges have the role of dividing each horizontal image line into regions, as shown in Figure 5. A region is the part of an image line that covers the horizontal distance between two edge points.

For each region, the average intensity of the corresponding pixels in the original grayscale image is computed. The average intensity of each region is compared to the average intensity of the neighboring regions. A valid region is one that has an average intensity higher than the intensity of its neighbors, while the intensities of the neighbors are similar. Another validation condition for a line region is imposed on its width: the width must be lower than the perspective projection of the widest acceptable object, for the specific image line. The widest acceptable object is the pedestrian crossing element.

Figure 5. Image lines are divided into regions by the road edges.
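The region validation can be sketched as follows for a single image row; the brightness margin and the per-row width limit are illustrative assumptions:

import numpy as np

def valid_regions(row, edge_cols, max_width_px, margin=15.0):
    # Split one grayscale row at the road-edge columns; a region is the
    # span of the row between two consecutive edge points.
    cols = [0] + sorted(edge_cols) + [len(row)]
    regions = [(a, b, float(np.mean(row[a:b])))
               for a, b in zip(cols[:-1], cols[1:]) if b > a]
    keep = []
    for i in range(1, len(regions) - 1):
        a, b, mean = regions[i]
        left, right = regions[i - 1][2], regions[i + 1][2]
        if (b - a <= max_width_px            # narrower than the widest object
                and mean > left + margin     # brighter than both neighbours...
                and mean > right + margin
                and abs(left - right) < margin):  # ...which are similar
            keep.append((a, b))
    return keep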

The initial region-based segmentation is used to extract a Gaussian mixture of intensity distributions for the road marking pixels. The parameters of the mixture are used for classifying the pixels that neighbor the initial marking candidates. In this way, the objects are considerably less fragmented, as shown in Figure 6.

Figure 6. Initial region-based segmentation (top), and refined segmentation using intensity distribution (bottom).
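A possible sketch of this refinement step, using scikit-learn's GaussianMixture as a stand-in for the paper's mixture estimation; the log-likelihood threshold and the one-pixel growth step are our assumptions:

import numpy as np
from scipy.ndimage import binary_dilation
from sklearn.mixture import GaussianMixture

def refine_marking_pixels(gray, seed_mask, thresh=-6.0):
    # Fit a mixture of intensity distributions to the initial marking pixels.
    samples = gray[seed_mask].reshape(-1, 1).astype(float)
    gmm = GaussianMixture(n_components=2).fit(samples)
    # Log-likelihood of every pixel intensity under the fitted mixture.
    loglik = gmm.score_samples(gray.reshape(-1, 1).astype(float))
    candidate = loglik.reshape(gray.shape) > thresh
    # Grow the initial segmentation only into adjacent plausible pixels.
    return seed_mask | (binary_dilation(seed_mask) & candidate)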

The individual objects are identified by labeling, and the 3D bounding cuboid is extracted using the constraints of the camera's perspective. The objects are then classified by a decision tree based on a simple set of six features, all of them derived from the relation between the object's left and right borders and the best linear RANSAC fit of each side. For each side S ∈ {Left, Right} we extract the following features:

R_S – the ratio between the number of points that do not fit on the line and the total number of points on a border (left or right).

A_S – the average error (deviation from the line) of the non-compliant points, for each side.


P_S – the position of the maximum number of error points for each side. The height of the object is divided into three regions (top, middle and bottom), and each region gets a position value from 0 to 2.
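An illustrative computation of the three feature families for one border, assuming the border pixels and the RANSAC line fit are available; the inlier tolerance is a placeholder:

import numpy as np

def border_features(points, line, tol=1.5):
    # points: (N, 2) array of (row, col) pixels on one border (left or right);
    # line: (a, b, c) of the best RANSAC fit a*col + b*row + c = 0.
    a, b, c = line
    dist = np.abs(a * points[:, 1] + b * points[:, 0] + c) / np.hypot(a, b)
    bad = dist > tol
    r_s = bad.mean()                              # ratio of non-fitting points
    a_s = dist[bad].mean() if bad.any() else 0.0  # average error of those points
    if bad.any():
        top, bottom = points[:, 0].min(), points[:, 0].max()
        third = max((bottom - top) / 3.0, 1.0)
        bins = np.clip(((points[bad, 0] - top) / third).astype(int), 0, 2)
        p_s = int(np.bincount(bins, minlength=3).argmax())  # 0=top, 1=middle, 2=bottom
    else:
        p_s = 0
    return r_s, a_s, p_s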

After the object's features are run through the decision tree and each object receives a class, a final validation based on the 3D size is applied. This ensures that the object's size is consistent with its class. Figure 7 shows the final product of our system: a set of 3D road objects with their associated classes.

Figure 7. Painted road objects, detected and classified.

C. Stop-line Detection and 3D Localization

Most of the horizontal road markings (stop-lines, pedestrian or bicyclist crossings) exhibit few 3D reconstructed points when a dense stereo vision engine with the two cameras displaced horizontally is used. This can be explained by the impossibility of correlating areas with uniform intensity/color (i.e. no texture) along the horizontal direction (Figure 8 (a)). The 3D point density increases only at the side ends of the horizontal road markings, where vertical, edge-like intensity variations are better handled by most stereo reconstruction engines. Accordingly, the accuracy of the 3D points along horizontal features is questionable due to the correlation uncertainties (Figure 8 (b)).

Figure 8. (a) Reconstructed 3D road points, shown in magenta. (b) Top view of the 3D road points reconstructed on the stop-line; ground truth is about 7750 mm for the depth (Z) and 500 mm for the thickness of the stop-line (area highlighted with the red box).

Therefore a detection approach relying mostly on 3D points would have limited success. On the other hand, a detection based on 2D image analysis alone would lack the positioning information, which is essential for the usage of the method in driving assistance applications.

The proposed solution uses a hybrid approach by combining the 2D detection and 3D validations in order to increase the robustness of the detection. Model based reasoning is also used in order to eliminate false positive situations (e.g. the scenario from Figure 9) and to detect exactly the objects of interest (stop lines, pedestrian or bicyclist crossings).

Figure 9. False positives can be generated by other horizontal structures (e.g. curb of a transversal road).

The main steps of the horizontal road marking detection algorithm are described below: 1. Selection of the ROI (Region of Interest) used for detection. Two types of ROIs are considered:

- a 3D ROI used for the 3D validation is specified (depth and lateral offsets). If a current lane exists, the lateral offsets are adjusted according to the lane limits.

- a 2D ROI used for the 2D detection is computed automatically based on the camera setup (pitch angle, nearest blind spot, maximum detection range).

The two ROIs are merged (intersected) in the detection/validation process.

2. Detection of horizontal line structures in 2D: based on horizontal edges, the Hough transform is applied to detect horizontal lines. The Hough lines are limited to near-horizontal angles to reduce the processing time. Closely detected Hough lines (belonging to the same horizontal structure/marking) are grouped based on vicinity and orientation criteria.
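As an illustration of this step, a sketch using OpenCV's Hough transform restricted to near-horizontal angles follows; the edge extraction, ROI handling and thresholds are our assumptions, and the grouping of neighboring lines is omitted:

import cv2
import numpy as np

def near_horizontal_lines(gray, roi, ang_tol_deg=5.0, votes=80):
    # roi: (x0, y0, x1, y1) of the merged 2D detection region.
    x0, y0, x1, y1 = roi
    edges = cv2.Canny(gray[y0:y1, x0:x1], 50, 150)
    tol = np.deg2rad(ang_tol_deg)
    # In OpenCV, theta = pi/2 is the normal of a horizontal line, so the
    # angle range is limited to pi/2 +/- tol to save processing time.
    lines = cv2.HoughLines(edges, 1, np.pi / 360, votes, None, 0, 0,
                           np.pi / 2 - tol, np.pi / 2 + tol)
    return [] if lines is None else [tuple(l[0]) for l in lines]  # (rho, theta)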

3. 3D validation and grouping. This step generates stop-line like hypotheses. A 2D Hough line is validated if it contains 3D road points, and a corresponding 3D line is generated. Neighboring 3D Hough lines are grouped (clustered) based on their vicinity in the 3D space using the MSABS algorithm [6]. Each cluster will be a stop-line hypothesis.

4. 2D hypothesis validation. This step is performed in the 2D image based on the models derived from Table 1.

- local 2D ROI generation: the 3D hypothesis is projected on the image.

- local 2D ROI analysis: valid stop-line (SL) cells are searched for. A valid SL-cell has a black-to-white transition followed by a white-to-black transition (from top to bottom) (Figure 10 (a)). A feature vector containing the average 2D thickness and the binary pattern of the SL-cells ('1' for a valid SL-cell and '0' for an invalid one) is constructed for each hypothesis, and its 2D limits are refined (Figure 10 (b)). A sketch of this test is given after Figure 10.


- model-based 2D classification: by analyzing the feature vector against country-specific models, the hypotheses are classified as: stop lines, crossings, continuous lines (other than stop lines) and unknown structures.

Figure 10. (a) The SL-cell search process. (b) Refined local 2D ROIs after the 2D validation process.
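The SL-cell test can be sketched as follows, assuming the local 2D ROI has been cut out as a grayscale array; the cell count and the transition threshold are illustrative:

import numpy as np

def sl_cell_pattern(roi, n_cells=16, step=40.0):
    # roi: grayscale array of the local 2D ROI; a cell is valid when a
    # black-to-white transition is followed by a white-to-black one.
    h, w = roi.shape
    pattern, thickness = [], []
    for c in range(n_cells):
        cell = roi[:, c * w // n_cells:(c + 1) * w // n_cells]
        profile = cell.mean(axis=1).astype(np.float32)
        dv = np.diff(profile)                 # top-to-bottom intensity steps
        rises = np.flatnonzero(dv > step)     # black-to-white
        falls = np.flatnonzero(dv < -step)    # white-to-black
        ok = rises.size > 0 and falls.size > 0 and falls[-1] > rises[0]
        pattern.append(1 if ok else 0)
        if ok:
            thickness.append(int(falls[-1] - rises[0]))  # 2D thickness in rows
    avg = float(np.mean(thickness)) if thickness else 0.0
    return pattern, avg  # binary SL-cell pattern plus average 2D thickness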

5. Refinement of the 3D position. Based on the 3D data associated with each SL-cell, the 3D position and size/shape of the horizontal line marking are recomputed. The estimation can be performed either by averaging (to obtain un-oriented objects) or by RANSAC fitting [7] (to obtain oriented objects).

6. 3D object generation. Only the lines classified as stop-lines and crossings (pedestrian and bicycle) are reported, and the corresponding 3D objects are generated (Figure 11). Only the nearest stop-line and bike-crossing objects are reported. The position and limits of the objects are computed based on the 3D data associated with each horizontal marking (step 5).

7. 3D object tracking. The position of the detected 3D objects is filtered over time using the odometry data of the ego car.

Figure 11. Detected stop-line and lane markings represented as 3D objects: (a) perspective view; (b) top view.

IV. AUGMENTED DIGITAL MAP

A. Requirements

The purpose of this solution is to align the vehicle's perceived information about the surrounding environment with the elements in the digital map. The standard digital map provides only limited information about the streets and the intersection, like the centerline of the road and the number of lanes, but this information is not enough for the alignment algorithm. This gives rise to the need to represent more features of the road geometry at the intersection level, which will allow the alignment of the information from the two sources: the sensorial representation and the digital map. Important elements of the street geometry are: the right of way per lane (one way or two ways), the number of lanes per way, the lane widths, the stop-line and the painted arrows. In this paper we propose to bring these additional elements into the digital map that we are using, without altering the existing data of the map, thus creating what we will refer to from now on as the Augmented Digital Map (ADM). The new data with which we propose to enrich the current digital map is also provided in GPS coordinates (latitude, longitude, height) and represents the information required for alignment.

Using the alignment mechanism, we achieve the accurate 3D localization of the vehicle with respect to the approaching intersection. Thus, properly aligning the detected landmarks in the sensorial representation with the corresponding elements in the ADM will enrich the knowledge about the surrounding environment with additional information useful for the Driving Assistance System, like: sections, roads, ways, lanes, possible directions for the intersection in question, as well as curbs details that define the drivable area through the intersection.

As a starting point for the digital map, we use the open source project OpenStreetMap [8]. OpenStreetMap is a free editable map of the whole world. It allows the viewing, editing and use of geographical data in a collaborative way from anywhere on Earth. This map provides the basic information about the streets and intersections. Our proposal is to augment this GPS data with the detailed geometry of the roads.

As a test case, a real intersection from Cluj-Napoca, Romania, was chosen. Figure 12 illustrates this intersection with the basic information provided by the digital map. We can notice how shallow this information is, and the need for a much more detailed description of the roads when approaching the intersection.

Figure 12. Satellite image overlapped with basic digital map information

Figure 13 (a) illustrates the geometry of the proposed intersection, with the lanes per way, lane widths, lane markings, stop-lines and pedestrian crossings. Figure 13 (b) underlines the elements that we propose to bring into the digital map. The green segments represent the centerline of each lane, which we will refer to as the lane axis. The lane axis is the middle of a lane; it has one of its end points on the painted stop-line and the other at a distance of 30 m. We are interested only in the lanes entering the intersection. Using this information, and knowing each lane's width, we can determine the points denoted C1, C2, C3 and C4 in Figure 13. The points C1, C2, C3 and C4 define the stop-line and, together with their corresponding pairs (C1', C2', C3', C4'), determine the lane marking positions. Figure 13 (c) exemplifies the overlapping of the existing basic information (the existing GPS data structure, in yellow) with the proposed extra information regarding the road geometry (in green). All the information in the digital map is in GPS coordinates.

Having this additional information in the digital map, together with the information provided by the sensorial representation, we have all the necessary data for the alignment process.

Figure 13. Road geometry of the test intersection: (a) simple road geometry; (b) elements of interest introduced in the Augmented Digital Map; (c) road geometry overlapped with basic digital map information.

B. Proposed OpenStreetMap Extension

The first issue is how to augment the OpenStreetMap data structure to support the above-mentioned information. We do that by adding some proposed new features [9] to the existing OSM data structure.

The current data primitives in OSM are: nodes, ways and relations. A node is the basic element of the OSM scheme consisting of latitude and longitude; it is a single geospatial point. A way is an ordered interconnection of at least 2 nodes that describes a linear feature such as a street, footpath, railway line, area, building outline. A relation is a group of zero or more primitives with associated roles. It is used for specifying relationships between objects, and may also model an abstract object.

Our first objective is to create on the map the lanes with specified characteristics, such as lane axis, lane width, lane number and possible directions, for segments of streets with perfectly parallel lanes near an intersection. The second objective is to join the lanes into a street segment and then join the street segments into a unified structure called the intersection. Therefore we suggest the use, in the XML schema, of the following proposed features, all of which are based on the lane concept:

1. Lane way – a way that represents a single lane. It is tagged with the lane key, associated with one of the following values: vehicle_lane, cycle_lane, footway, bus_lane. An example of a lane way is the following:

<way id='1' visible='true'>
  <nd ref='11' />
  <nd ref='12' />
  <tag k='width' v='3.2' />
  <tag k='oneway' v='yes' />
  <tag k='lane' v='vehicle_lane' />
</way>

We will use this concept to represent the lane with the following information: the lane axis given by the two nodes, the lane width and the lane direction: if oneway=yes the lane is entering the intersection (it is a right lane), and if oneway=-1 the lane is leaving the intersection (it is a left lane).
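A minimal sketch of reading such a lane way with Python's standard XML parser; the surrounding node table and the returned dictionary layout are our assumptions:

import xml.etree.ElementTree as ET

def parse_lane_way(xml_text, nodes):
    # nodes: dictionary mapping a node id to its (lat, lon) position.
    way = ET.fromstring(xml_text)
    tags = {t.get('k'): t.get('v') for t in way.findall('tag')}
    axis = [nodes[nd.get('ref')] for nd in way.findall('nd')]
    return {
        'axis': axis,                             # end points of the lane axis
        'width': float(tags.get('width', 0)),     # lane width in meters
        'kind': tags.get('lane'),                 # e.g. vehicle_lane
        'entering': tags.get('oneway') == 'yes',  # entering vs. leaving lane
    }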

2. Lane_group relation – a relation that groups all the lane ways into a single structure. Its specific tagging is: type=lane_group. The following is an example of a lane_group:

<relation id='5' visible='true'>
  <member type='way' ref='1' role='1'/>
  <member type='way' ref='2' role='2'/>
  <member type='way' ref='3' role='-1'/>
  <tag k='type' v='lane_group'/>
</relation>

The member way with reference 1 has role=1; this means that it is the first lane after the middle of the directions, and since way 1 specifies oneway=yes, it is the first lane to the right of the middle of the directions. Similarly, the way with reference 3 is the first lane to the left of the middle of the directions. We will use this concept to represent a segment of the road adjoining the intersection.

In our approach, the lane_group represents a section of the intersection. If we join together several such sections we obtain the structure called intersection. Therefore, we further propose the use in OSM of another new feature:

3. Intersection relation – a relation that has all the lane_groups as members. Example of an intersection:

<relation id='8' visible='true'>
  <member type='relation' ref='3' role='S1'/>
  <member type='relation' ref='4' role='S2'/>
  <tag k='type' v='intersection'/>
</relation>

In the example above, role=S1 means that the lane_group with reference 3 represents section 1 of the intersection.

For the future, we are interested in specifying the possible travelling paths through the intersection. We achieve this using the proposed feature:

4. Lane_directions tag – specifies all possible directions for each lane at the intersection. Example: lane_directions = 1:L, 2:S, 3:R. Adding this feature to a lane_group, on the last segment approaching the intersection, specifies the allowed directions of travel through the intersection for each lane.


The numbers represent the lane number, with the convention that lane counting starts from the middle of the directions and that only the lanes entering the intersection have associated directions. The letters represent directions: L – left turn, R – right turn, S – straight. If a lane has multiple possible directions, all of them can be specified, separated by commas: lane_directions = 1:R,S; 2:S; 3:S.
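A small sketch of parsing the semicolon-separated form of this tag (the function name is illustrative):

def parse_lane_directions(value):
    # '1:R,S; 2:S; 3:S' -> {1: ['R', 'S'], 2: ['S'], 3: ['S']}
    result = {}
    for entry in value.split(';'):
        lane, _, dirs = entry.strip().partition(':')
        if lane:
            result[int(lane)] = [d.strip() for d in dirs.split(',') if d.strip()]
    return result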

V. ALIGNMENT ALGORITHM AT INTERSECTION LEVEL

The Alignment Algorithm is a fusion process consisting of properly aligning the reference elements, i.e. the stop line, the lane number and the lane markings, from the two representational systems: the Sensorial Representation and the ADM. For this, the two different coordinate systems must be taken into consideration, the vehicle coordinate system and the map coordinate system, as well as their transformation into a common Cartesian coordinate system, which will be the local East-North-Up (ENU) coordinate system, also called the Navigation Coordinate System, generally used in targeting and tracking applications [10], [11].
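For illustration, a standard WGS-84 geodetic-to-ENU conversion of the kind described in [10] can be sketched as follows (a textbook formula, not code from the paper):

import numpy as np

A = 6378137.0             # WGS-84 semi-major axis [m]
E2 = 6.69437999014e-3     # WGS-84 first eccentricity squared

def geodetic_to_ecef(lat, lon, h):
    lat, lon = np.deg2rad(lat), np.deg2rad(lon)
    n = A / np.sqrt(1.0 - E2 * np.sin(lat) ** 2)   # prime vertical radius
    return np.array([(n + h) * np.cos(lat) * np.cos(lon),
                     (n + h) * np.cos(lat) * np.sin(lon),
                     (n * (1.0 - E2) + h) * np.sin(lat)])

def geodetic_to_enu(lat, lon, h, lat0, lon0, h0):
    # ENU coordinates of (lat, lon, h) relative to the reference point.
    d = geodetic_to_ecef(lat, lon, h) - geodetic_to_ecef(lat0, lon0, h0)
    sl, cl = np.sin(np.deg2rad(lat0)), np.cos(np.deg2rad(lat0))
    so, co = np.sin(np.deg2rad(lon0)), np.cos(np.deg2rad(lon0))
    rot = np.array([[-so,       co,      0.0],   # east axis
                    [-sl * co, -sl * so, cl],    # north axis
                    [ cl * co,  cl * so, sl]])   # up axis
    return rot @ d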

Figure 14 (a) illustrates the basic flow of the algorithm: the state of the vehicle is perceived using the GPS receiver for global positioning and using the Sensorial Representation for alignment trigger detection. If the vehicle is not approaching an intersection, the car continues its path; otherwise the algorithm continues with the interrogation of the Augmented Intersection Digital Map. If the ADM contains the information required to perform the alignment, the Alignment Module is called to perform this task. The output of this black box is the accurate global positioning of the ego-vehicle.

The Alignment Module (Figure 14 (b)) performs the alignment by overlapping the corresponding elements in the two representations: the stop-line and the lateral lane boundaries, which constitute the alignment trigger. This module consists of a minimum of three geometric transformations: a rotation, followed by two translations (one for the lateral and one for the longitudinal positioning). Another rotation is required in the case when the painted stop-line is not perpendicular to the lane direction. The initial configuration of the data from the two representation systems, the ADM and the Sensorial Perception, brought into the same coordinate system (the ENU coordinate system), is pictured in Figure 15. The rotation of equation (5) is used to make the two stop-lines parallel. The rotation angle θ is the angle between the lines containing the two stop-line segments from the two representations:

$$\begin{bmatrix} x' \\ z' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ z \end{bmatrix} \tag{5}$$

A translation, equation (6), is required in order to overlap the now parallel stop-lines. The translation vector t1(t1X, t1Z) is used to finalize the longitudinal alignment by translating the vehicle stop-line and the vehicle origin.

Figure 14. (a) General Algorithm; (b) Alignment Module.

$$\begin{bmatrix} x' \\ z' \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_{1X} \\ 0 & 1 & t_{1Z} \end{bmatrix} \begin{bmatrix} x \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x+t_{1X} \\ z+t_{1Z} \end{bmatrix} \tag{6}$$

A second translation is used to laterally align the stop-lines. The ADM and the Sensorial Representation together provide the information required to determine the second translation vector t2(t2X, t2Z). In the case in which the map stop-line is not perpendicular to the road axis, the alignment process is not complete; a final rotation is required to properly overlap the stop-lines from the two representations.
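The whole alignment can be sketched as follows; for brevity, this illustration collapses the separate longitudinal and lateral translations and the final rotation into a single translation that overlaps the stop-line midpoints, whereas the paper derives them step by step:

import numpy as np

def align_vehicle(veh_stop, map_stop, ego=np.zeros(2)):
    # veh_stop, map_stop: 2x2 arrays of stop-line end points (x, z) from the
    # two representations, both already expressed in the ENU plane.
    ang = lambda s: np.arctan2(s[1, 1] - s[0, 1], s[1, 0] - s[0, 0])
    theta = ang(map_stop) - ang(veh_stop)       # angle between the stop-lines
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])           # rotation of eq. (5)
    veh_stop_r = (rot @ veh_stop.T).T           # stop-lines now parallel
    t = map_stop.mean(axis=0) - veh_stop_r.mean(axis=0)  # translation, eq. (6)
    return rot @ ego + t                        # corrected ego-vehicle origin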

Figure 15. (a) Initial configuration of the elements of the Sensorial and ADM representations in the same coordinate system; (b) longitudinal alignment; (c) lateral alignment.


For the Lane Matching Module, a simple logic was used to infer the correct lane on which the car is situated, using the previously mentioned pieces of information from the two different data sources.

VI. EXPERIMENTAL RESULTS

For the evaluation of the proposed algorithm we used the manually created ADM of the test intersection. The geometry of the roads composing the intersection was measured using satellite images and on-the-spot measurements. The test scenario consisted of driving through the intersection coming from different roads. The initial ego-vehicle GPS position is obtained with a standard GPS receiver, with a positioning precision of about 5 m and an update rate of 1 Hz. The stereo vision system is used to detect the stop-line and lane boundaries. The stop-line is detected between the blind spot and up to 20 m for the current camera setup (8.5 mm focal length, 190 mm baseline, 42 deg horizontal field of view). Above 20 m the stop-line projection is not visible in the image (the stop-line thickness is below 1 pixel, and sub-pixel positioning of the black-to-white and white-to-black transitions is not possible). Using this approach, the vehicle positioning error is moved from the GPS reading to the error of the stereo camera sensorial system.

Figure 16 illustrates the results of the alignment system in the study case intersection, when coming from one of the road sections. The alignment algorithm positions the vehicle correctly on the 2nd lane, at a distance of 14 m from the real painted stop-line. All four transformations are required, the last rotation because the stop-line is not perpendicular to the road axis. Figure 16 (a) illustrates the detected sensorial representation alignment elements. Figure 16 (b) pictures the results of the alignment algorithm. In red we can see the initial positioning of the vehicle, the stop-line and the first rotation. In magenta we can see the longitudinal alignment of the vehicle and the vehicle stop-line, and in green the lateral alignment. The final position of the vehicle, after the last rotation, is pictured in black. The error of the proposed alignment algorithm is 3% of the distance between the vehicle and the detected stop-line.

Figure 16. (a) Detected Sensorial Representation alignment elements; (b) experimental results.

VII. CONCLUSIONS

In this paper a method for the alignment of sensorial data with an augmented digital map (ADM) for intersection scenarios was proposed. The main contributions are related to:

- The detection and 3D representation of specific landmarks characteristic of intersection scenarios, such as stop-lines and lane markings.

- The augmented digital map, which provides additional features including the map representation of the landmarks and supplementary information regarding the road geometry and the intersection's configuration.

- An algorithm for the alignment of the sensorial data with the augmented digital map.

- An achieved alignment accuracy at the level of the stereo sensor accuracy.

The method was successfully implemented and tested in specific intersection scenarios.

The proposed approach provides increased accuracy in the global localization of the static and dynamic entities of an intersection, and a mechanism for fusing the perceived sensorial information with the map information in order to increase the completeness of the intersection representation.

ACKNOWLEDGMENTS

This work was conducted within the research project INTERSAFE-2. It is part of the 7th Framework Programme, funded by the European Commission. The partners of INTERSAFE-2 thank the European Commission for supporting the work of this project.

Partial support comes from the project "Doctoral studies in engineering for developing knowledge-based society (SIDOC)", contract POSDRU/88/1.5/S/60078.

REFERENCES

[1] INTERSAFE-2 Consortium, "Specification and Architecture documentation", deliverable D4.1, available via http://www.intersafe-2.eu/public/, cited 28 April 2009.

[2] S. Nedevschi, R. Danescu, T. Marita, F. Oniga, C. Pocol, S. Bota, M.-M. Meinecke, M. A. Obojski, "Stereovision-Based Sensor for Intersection Assistance", in Advanced Microsystems for Automotive Applications 2009: Smart Systems for Safety, Sustainability, Springer, pp. 129-164.

[3] R. Danescu, S. Nedevschi, "Probabilistic Lane Tracking in Difficult Road Scenarios Using Stereovision", IEEE Transactions on Intelligent Transportation Systems, vol. 10, no. 2, 2009, pp. 272-282.

[4] F. Oniga, S. Nedevschi, "Processing Dense Stereo Data Using Elevation Maps: Road Surface, Traffic Isle and Obstacle Detection", IEEE Transactions on Vehicular Technology, vol. 59, no. 3, 2010, pp. 1172-1182.

[5] S. Nedevschi, S. Bota, C. Tomiuc, "Stereo-Based Pedestrian Detection for Collision-Avoidance Applications", IEEE Transactions on Intelligent Transportation Systems, vol. 10, no. 3, 2009, pp. 380-391.

[6] S. Theodoridis, K. Koutroumbas, "Pattern Recognition", 2nd edition, Elsevier Academic Press, 2003.

[7] R. C. Bolles, M. A. Fischler, "A RANSAC-Based Approach to Model Fitting and Its Application to Finding Cylinders in Range Data", 1981.

[8] "OpenStreetMap, The Free Wiki World Map", available via http://www.openstreetmap.org/, cited May 2010.

[9] "Proposed features", available via http://wiki.openstreetmap.org/wiki/Proposed_features, cited May 2010.

[10] S. P. Drake, "Converting GPS Coordinates to Navigation Coordinates (ENU)", DSTO Electronics and Surveillance Research Laboratory, Edinburgh, Australia, 2002, http://dspace.dsto.defence.gov.au/dspace/bitstream/1947/3538/1/DSTO-TN-0432.pdf

[11] "Geodetic system", available via http://en.wikipedia.org/wiki/Geodetic_system, cited February 2010.
