Research

Assessing navigational map reading competencies with the location-based GeoGame “GeoGami”

Pages 73-85 | Received 30 Nov 2021, Accepted 08 Mar 2023, Published online: 31 Mar 2023

Abstract

School curricula as well as scientific literature emphasize the need to understand how children navigate with maps. To date, however, there is no empirically validated assessment tool for analyzing navigational map reading competencies that can be used in map-based navigation tasks in the real world. This paper fills this gap by presenting the GeoGami NMR Assessment, a test for measuring children’s navigational map reading (NMR) competencies for educational purposes. Our target group is primary school children aged six to ten. The assessment uses configurational and directional map information to assess the learning progress of individual children regarding their spatial navigational performance. While we evaluated our GeoGami NMR Assessment for two specific locations, our study design, our script for scoring participants’ performances, and the assessment itself can easily be adapted by researchers and teachers for their own locations. Our item analysis evaluates the difficulties of individual tasks. We validate our assessment and demonstrate that the GeoGami NMR Assessment is suitable for distinguishing strong and weak navigational map readers.

Introduction

Orientation and navigation with maps play an important role in people’s everyday lives as well as in professional contexts (Hemmer et al. 2008). Both adults and children differ in how, and how successfully, they use maps. For instance, the literature reveals differences in the navigation strategies of women and men (Dabbs et al. 1998) and in map-based orientation between girls and boys or between younger and older children (Hemmer et al. 2013). Considerable developmental changes in the use of maps were observed between the ages of eight and twelve (Hemmer et al. 2013; Liben & Myers 2007). However, performance in using maps is not fixed by gender or age: Hemmer et al. (2013) studied more than 300 third to fifth graders and concluded that orientating with maps can be fostered via educational training. Their results show a positive correlation between children’s experience in using maps and their success in orientating with them. These outcomes underline the significance of maps in national school curricula and educational guidelines for primary and secondary schools all over the world (e.g. Germany: DGfG 2020; Hong Kong: CDC-HKEAA 2017; UK: Department of Education 2013; USA: NAEP 2020). The scientific literature agrees with these curricula on the relevance of understanding and using maps for geoscience in school education (Liben & Myers 2007; Jones & Lambert 2017).

However, research still lacks an assessment for investigating children’s competencies in navigating with maps that is applicable at different outdoor locations. Existing studies do not fill this gap. For instance, Hemmer et al. (2013) investigated factors influencing children’s map-based orientation competence with a factor-analytically evaluated test, but did not create a test whose set-up can be replicated in other settings. Lobben (2007) created a real-world map navigation exercise with an indoor set-up for validating paper-and-pencil tests of navigational map reading ability with adult participants, but neither validated the real-world exercise itself nor developed a test specifically for children or real-world environments. This paper presents a real-world assessment for measuring children’s competencies in using maps for orientation that is empirically evaluated and applicable at different locations.

Ishikawa and Kastens (2005) stress that methods in geoscience education should be chosen according to the individual abilities of each student. This raises the question of which learning resource can meet such individual needs. One answer can be found in geoinformation (GI) technologies: location-based tools in particular have the potential to automatically detect the user’s map interaction and movement in space and to react individually to them. Authors indicate the need to explore this kind of technology from an educational perspective (Baker et al., 2015). The demand is supported by studies emphasizing people’s preference for digital tools over paper maps for navigation (Hergan & Umek 2017; Hurst & Clough 2013; Speake & Axon 2012), indicating that GI technologies have found their way into people’s lives. However, there is evidence for negative short- and long-term effects of GI tools such as navigation aid systems on people’s spatial knowledge, wayfinding, and orientation (Ishikawa 2019; Field et al. 2011; Münzer et al. 2006). The user’s passivity when following navigation instructions is one explanation for these effects (Ishikawa 2019). Since the tools investigated in these studies were not adjusted for educational purposes, we suspect that GI technologies can also have positive effects through implementations that actively address spatial thinking. This paper presents an assessment for investigating orientation with maps which is based on an educationally adjusted GI tool.

Research question and research design

We developed and evaluated an assessment for NMR competencies of primary school children aged six to ten (Figure 1). The assessment is created for educational purposes. This paper investigates to what extent it is possible to assess spatial NMR competencies. We define “assess” as the pedagogical evaluation of children’s NMR performances. In a pilot study, our assessment was empirically evaluated for two exemplary locations by estimating its reliability and validating it against teacher judgements. The NMR assessment is implemented with our mobile app GeoGami, which provides a platform for running NMR assessment tests.

Figure 1. Design framework for the GeoGami NMR Assessment construction. The assessment development and evaluation are closely interlinked and are to be understood as iterative processes. The intention of the assessment determined the decisions on methods for the development and validation of the instrument.

Additionally, this paper offers a detailed description of our assessment design and provides open access to GeoGami (Bistron et al. 2021) as well as to an evaluation script (Bistron & Schwering 2021b), making the assessment applicable at other locations as well. By running the evaluation script with their own data, readers obtain the NMR scores for their participants and a validation of the assessment for the new location. This paper builds upon the preprint (Bistron & Schwering, 2021a).

Development of the NMR assessment

Theoretical construct “navigational map reading”

In the scientific literature, there are different terms defining map reading for orientation or navigation.

Hemmer et al. (2013) coin the term map-based (spatial) orientation (MBO) competence, which refers to spatial transformation between map and reality. According to Ishikawa and Kastens (2005), three main understandings are involved in a map-reality transformation: the representational correspondence (understanding the meaning of the symbols), the configurational correspondence (finding the relation between a specific object and its representation on the map), and the directional correspondence (understanding the alignment of the map). MBO is defined as a competence, a term that is used broadly in educational contexts: on the one hand, competence or competency refers to the input (people’s skills required to achieve a certain performance) and, on the other hand, it describes the output (result of training) in the sense of a “competent” performance (Hoffmann 1999). Hemmer et al. (2013) formulate a theoretical construct for the MBO competence by dividing it into the ability to locate objects from the map in the real world and vice versa, and the abilities to take correct turns, to point in directions correctly, to estimate distances correctly, and to align the map according to the view direction. Some of these abilities call for map-to-real-world transformations, others for real-world-to-map transformations. The theoretical construct was revised factor-analytically in a study with more than 300 participating children, who were asked to navigate through a city center by following a route marked on a map. The results showed that tasks related to the above-mentioned abilities loaded onto one factor, defined as the MBO competence.

Lobben (2007) uses the term navigational map reading (NMR) to describe navigating with a map. Her definition of NMR is close to the construct of the MBO competence by Hemmer et al. (2013) and the term competency as defined by Hoffmann (1999): associated with NMR tasks, Lobben (2007) defines different spatial abilities referring to people’s aptitude (e.g. innate brain function) and achievements (through training and experience) in making spatial relations between objects in geographic space. In a study with 45 adults, she demonstrated the predictive power of three of these abilities: self-location showed the highest predictive power, followed by route memory and map rotation. The predictive power of these abilities was determined by correlating the participants’ performance in corresponding paper-and-pencil tests with their results in a real-world NMR performance test. Self-location was studied by presenting participants with a map and photos of different locations from the map. Using the photos, the subjects determined locations and view directions on the map. Route memory was investigated by showing the participants routes on maps and asking them to remember these routes by ordering verbal route descriptions. The map rotation ability was tested by presenting the participants with two maps simultaneously and asking them to determine whether the second map was flipped or rotated. The real-world NMR performance test itself included self-location and navigation tasks. It took place in a hallway network of a large building. The task performance was evaluated by considering the number of manual map rotations and hesitancies, the number and duration of stops, as well as the amount of time spent studying the map.

In our research approach (cf. Figure 1, “theoretical construct”), we use the term navigational map reading (NMR) competencies in analogy with the definition of Lobben (2007). In accordance with Hemmer et al. (2013), we use the term competencies instead of abilities since we are interested in children’s trainable performances in NMR tasks.

Technical realization with the location-based geogame GeoGami

We created a mobile app called GeoGami (Bistron et al. 2021; Bartoschek et al., 2018; Bistron et al. 2022) for testing and training navigational map reading competencies. To motivate children to participate in our studies and play our app, we implemented components of location-based game mechanics. GeoGami runs on GPS-enabled mobile devices (e.g. tablets and smartphones). The app allows the user to create map-based rallies to implement NMR trainings or NMR assessments for a particular location. GeoGami automatically collects data on the participants’ movements in space (GPS-based trajectory, speed, and cardinal view direction) and on their interaction with the tablet (tapping on the map, panning, and zooming), as well as timestamps (in particular the beginning and the end of each task) and GPS inaccuracy data, and stores these in an encrypted database. In addition to the spatial information, the app records the performance on the thematic tasks.

The digital map used in GeoGami is based on standard OpenStreetMap material. Streets and paths are shown as colored lines, areas of grass, forests or water are colored in natural colors, houses are shown as polygons in their shape from top view, and objects like trees, statues, benches, bridges, or trash cans are represented with a symbol on the map.

We created tests for two locations for our NMR assessment using GeoGami. GeoGami displays the map of the test area, places the tasks, and automatically records participants’ movement and map interactions.

Task development

Based on our understanding of navigational map reading competence as defined above, we derive tasks to assess children’s NMR competence (cf. steps in Figure 1). Our GeoGami NMR assessment consists of four NMR task types, which we classify into two groups: Location Tasks (12 tasks in total) and Direction Tasks (8 tasks in total) (Figure 2). Location Tasks include self-location tasks (LOC), which ask the users to mark their own location by tapping on the map, and navigation-to-a-flag tasks (LNV), which ask them to navigate to a location indicated by a flag on the map. Direction Tasks include mark-view-direction tasks (DM), which ask the users to mark their own view direction by tapping on the map, and adopt-view-direction tasks (DA), which ask them to turn until they adopt a view direction that is presented on the map.

Figure 2. Self-location (LOC), navigation-to-a-flag (LNV), mark-view-direction (DM) and adopt-view-direction (DA) tasks in the GeoGami NMR assessment (location A) as displayed on the tablet screen. The hand is representing the participant’s tapping on the tablet screen © GeoGami.

We chose tasks that represent the main activities of NMR situations, following existing tests and assessments. In line with the real-world tests of Lobben (2007) and Hemmer et al. (2013), our assessment consists of location and navigation tasks (LOC and LNV tasks). Additionally, we included direction-pointing tasks (Hemmer et al. 2013) in a way that can be detected by mobile devices: instead of pointing with a finger, we ask the participants to turn their body and device in a certain view direction (DA tasks). The existing test instruments mentioned above include not only configurational and directional aspects of NMR, but also representational transformations such as symbol or label reading. Symbol reading is often not the crux in navigating with a map (Lobben 2004). Label reading should not be the crux either, so that reading skills do not influence the NMR performance. Our assessment also represents the configurational and directional correspondences between map and real world as mentioned by Ishikawa and Kastens (2005): LNV and DA tasks call for map-to-real-world transformations and LOC tasks for real-world-to-map transformations. We added DM tasks in order to also include Direction Tasks with real-world-to-map transformations.

In total, there are six LOC, six LNV, four DM, and four DA tasks. These tasks are presented in pairs (twins) of similar tasks with the same task difficulty. The tasks of each task twin are (i) of the same task type, (ii) located at comparable locations (i.e. the locations are comparable with respect to the surrounding objects and the path structures), and (iii) oriented in the same direction (i.e. the main view direction during or toward the tasks is the same). Having two test halves of the same difficulty allows us to divide the assessment into two parts for conducting reliability estimations and for using the test halves as pre- and post-tests.

In order to differentiate between strong and weak performances, tests and assessments require tasks of different difficulties (Moosbrugger & Kelava 2020). Therefore, the task twins in our assessment are designed so that different twin pairs differ in the angle between map alignment (which is fixed to north) and view direction during or toward the task (which is north, south, or west/east). The difficulty of the tasks correlates with this angle. The most difficult tasks, and thus the worst performances, can be observed for 180° angles (Presson 1982). These findings might be explained by the different strategies children use depending on the angle: for instance, reading maps rotated by 180° often leads to a “left-right-reversal” strategy (“objects on my right side are on the left side of the map”) (Aretz & Wickens, 1992), and angles larger than 90° often lead readers to mentally rotate the map. For angles smaller than 90°, people tend to mentally manipulate their own perspective (Kozhevnikov & Hegarty 2001).

In order to avoid participants changing the task difficulty by rotating the map, we did not allow participants to rotate the device. Lobben (2007) interprets manual map rotations as a criterion for a weaker NMR performance.

The tasks in our NMR assessment are placed in a sequence, with Transition Tasks leading from one task to the next. These tasks are not scored, but give all participants the same initial location and orientation regardless of whether they solved the previous task correctly. Moreover, they separate the task locations in order to reduce knowledge transfer from one task to the next. The transition task preceding a LOC task is an arrow-navigation task leading the children to a new location without showing them the map. The transition task preceding an LNV task is also an arrow-navigation task, but the children see their initial location on the map, which disappears when they start the LNV task. The transition tasks preceding DM and DA tasks are arrow-navigation tasks, in the case of DM followed by a task that turns the children to a certain view direction.

All tasks are given in simple, short written form or read out loud by the app. The test area is marked with a yellow line on the map. Participants are allowed to zoom and pan the map. At the beginning of each task, the app zooms out to the whole test area. At the end of each task, it zooms in to the task area (to the location or direction marker given by the task or set by the children) and asks the users to confirm their answer a second time. This procedure ensures the same zoom level for all participants when solving tasks and prevents accidental responses. Location and direction markers are disabled during the assessment and no automatic feedback on the accuracy of the task solution is given. Instead, GeoGami plays a sound after a task is solved and shows the message “answer saved”.

The GeoGami NMR assessment is created for park areas to minimize the influence of urban structures on spatial NMR: the test area needs a branched network of paths and different repetitive objects for orientation (e.g. benches, trash cans, and trees) without street names and house numbers. These characteristics prevent orientation just by reading street names and house numbers or by recognizing unique landmarks. Instead, they trigger NMR that asks for the combination of several spatial hints for orientation and navigation.

The assessment starts with a tutorial about the tablet usage, task types, and symbol meanings to ensure common knowledge among all participants with regard to the representational correspondence between real world and map. The total duration of the assessment (without tutorial) is around 30 to 45 minutes. We did control for time; however, none of the students reached the time limit of 60 minutes.

The following data are recorded by GeoGami for each task: (i) the time until the task was solved, (ii) the performance while solving the task, such as the amount of zooming and panning, and (iii) the error. For the navigation tasks, the app records the distance error between the actual location and the intended location. For the direction tasks, the app records the direction error as the angle difference between the actual and the intended viewing direction.
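The two error measures described above can be illustrated with a minimal Python sketch. It is not taken from GeoGami or from the authors' R evaluation script: the function names, the use of the haversine formula, and the mean Earth radius are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (assumed approximation)


def distance_error_m(lat1, lon1, lat2, lon2):
    """Haversine distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def direction_error_deg(actual_deg, intended_deg):
    """Smallest absolute angle (0 to 180 degrees) between two bearings."""
    diff = abs(actual_deg - intended_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff
```

Taking the smaller of the two possible angles keeps the direction error in the range 0° to 180°, so that a child facing 350° when 10° was intended is 20° off rather than 340°.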

NMR competency score development

The existing NMR test instruments described above are based on direct observations. While current research reveals the great potential of GI technologies for automatically collecting and analyzing spatial trajectory data (Mazimpaka & Timpf 2016), the data collected in the studies on navigational map reading competencies discussed above consist of notes taken by external observers. The potential of GI technologies and their data for assessments has not yet been exploited in the context of map reading. This paper aims to close this gap.

For the NMR competency score development (cf. steps in Figure 1), we wrote an evaluation script that calculates an NMR competency score based on the participants’ performance data (Bistron & Schwering 2021b). From these data, we derive the children’s accuracy and their process in solving NMR tasks (Table 1). The accuracy in solving a task is defined by the deviation (in m or in degrees) from the correct solution. The process data give additional information on the way a user solved a task. We assume that the number of pan and zoom interactions with the map (for LOC tasks), the length of the route walked (for LNV tasks), and the sum of turning movements (for DA tasks) grow with the uncertainty in solving these tasks. These assumptions follow the work of Hergan and Umek (2017) and Lobben (2007), who define similar data for evaluating the NMR performance of their participants, e.g. the number of times a participant goes “one crossroad too far”, wrong turns, stops, spins around their own axis, hesitations, and map rotations, as well as the time needed to solve a task.

Table 1. Accuracy and process data on each task type in the GeoGami Navigational Map Reading assessment.

Using the accuracy data, we score one point for a correct solution. One additional point is awarded if the process of giving the correct answer has the following characteristics: depending on the task type, the task has to be solved with a small number of pan and zoom events, a short route length (in comparison to the shortest correct solution of all participants), or a small sum of all turning angles. The resulting score is the sum of all points achieved. Thus, with 20 tasks, the score scale ranges from 0 to 20 points when only the accuracy data are considered and from 0 to 40 points when both accuracy and process data are considered. When focusing on the 12 Location Tasks only, the scale ranges from 0 to 12 points for the accuracy data only and from 0 to 24 points for the accuracy and process data.

We define thresholds (Table 2) for assigning distances and route lengths, angles, and numbers of pan and zoom events to the different point categories (two, one, or zero points), i.e. for defining a “correct” or “incorrect” and a “confident” or “unconfident” solution. These thresholds result from data collected in test runs in which we simulated different correct and incorrect answers with the help of children and adults. They were verified in the assessment validation described below.
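The point categories can be sketched in a few lines of Python. This is an illustrative reading of the scoring rule, not the authors' R script: every threshold value below is a made-up placeholder (the paper's actual values are in its Table 2), and since the text does not name a process measure for DM tasks, the sketch simply assumes one exists.

```python
# Hypothetical thresholds per task type:
# (largest error still counted as "correct",
#  largest process value still counted as "confident").
THRESHOLDS = {
    "LOC": (20.0, 5.0),    # meters, number of pan/zoom events
    "LNV": (20.0, 1.5),    # meters, route length relative to the shortest
    "DM": (45.0, 5.0),     # degrees, placeholder process measure
    "DA": (45.0, 360.0),   # degrees, summed turning angle
}


def score_task(task_type, error, process, thresholds=THRESHOLDS):
    """Return 0 (incorrect), 1 (correct) or 2 (correct and confident)."""
    max_error, max_process = thresholds[task_type]
    if error > max_error:
        return 0  # incorrect: the process data are not considered
    if process <= max_process:
        return 2  # correct and solved confidently
    return 1      # correct, but with signs of uncertainty


def total_score(results, thresholds=THRESHOLDS, use_process=True):
    """Sum the points over (task_type, error, process) tuples."""
    total = 0
    for task_type, error, process in results:
        points = score_task(task_type, error, process, thresholds)
        total += points if use_process else min(points, 1)
    return total
```

With `use_process=False` each task contributes at most one point, which corresponds to the accuracy-only scale; with the default it contributes up to two points.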

Table 2. Thresholds defining the point categories for correct/incorrect and confident/unconfident solutions in the GeoGami NMR assessment.

The score does not consider absolute deviations (in m or in degrees) from the correct solutions. There are two main reasons: (i) smaller deviations from a correct solution might be induced by bad GPS signals or inaccuracies in marking locations and directions on the map, and (ii) wrong answers with larger deviations are not necessarily worse than wrong answers with smaller ones (e.g. in a LOC task, a child could choose a wrong location 200 m away from the correct solution that shows more similarities to the correct solution than any other location with a smaller deviation).

Assessment validation

For the assessment validation (cf. the design framework in Figure 1, right part), we conducted a pilot study to evaluate the GeoGami NMR assessment for two different locations.

Study population, setting, and materials

Fifty-one primary school students (29 boys and 22 girls) participated in the study. The mean age was 8.33 years (SD: 1.11; range 6 to 10). We recruited the participants via letters handed out in local primary schools. The study was conducted as a summer event during the school vacations. We accepted primary school students who lived nearby and therefore assumed that they were familiar with the environment of the study area. No further assessment of familiarity or time spent outside (e.g. walking, hiking, or riding a bike and thus navigating through the environment) was done. We had no participant dropouts, but technical issues made the data of one further participant unusable. The children were rewarded with a science toy and a certificate of participation.

The study areas were located in two different parks in North Rhine-Westphalia, Germany: the Citizens’ Park in Senden (location A) and the Wildlife Park in Dülmen (location B), each with short distances to walk. Twenty-nine participants took part in Senden and 22 in Dülmen.

Each participant held a tablet (with a 10.2” display) with GeoGami installed. A head-mounted camera recorded the surroundings from the child’s perspective as well as the child’s speech and behavior. A screen recording captured the GeoGami app (incl. digital map, written task instructions, and markers on the map).

Data collection

After the tutorial, the children performed the assessment tasks. The experimenter accompanied them during the assessment. The children were asked to think aloud while explaining their task solutions. They were allowed to take breaks and to ask questions about task instructions and symbols, but did not receive cues about solution strategies or explicit configurational or directional correspondences. All children participated alone, without the influence of other participants. The study was conducted in compliance with the Covid-19 regulations in force at the time. The automatically collected data were scored with the evaluation script (Bistron & Schwering 2021b).

After the experiment, three experts judged each child’s NMR competencies on a scale from zero to ten based on the audio (think-aloud statements) and video material. The experts were student teachers recruited via a job posting that asked for strong navigational skills and experience in working with young school children. The student teachers were paid for this task and signed a declaration of confidentiality. They received instruction on the difficulties children can have in navigating with a map. They had access neither to the GeoGami data nor to the participants’ scores. Instead, they had to recognize correct and incorrect answers just by watching the videos, remembering the correct locations from having completed the assessment themselves, and consulting photos of the correct locations and task solutions. They used the definition of NMR competencies described above and were asked to analyze every child in a holistic way (i.e. without just calculating total scores on the basis of the task solutions). We informed the experts that they were allowed to ignore individual tasks if they thought these were not relevant for evaluating the NMR competencies of the children. The experts made their individual judgements and then agreed on one score for each child by discussing the individual cases.

Data analysis

The assessment was evaluated following the work of Moosbrugger and Kelava (2020). The authors emphasize that a test instrument should be evaluated with respect to its intention. We understand our instrument as an assessment of navigational map reading that should reflect a teacher’s evaluation of students’ NMR competencies. It is supposed to be applicable for educational diagnostics as a basis for the development of NMR trainings. Therefore, it has to distinguish between weak and strong performances and should be designed to be used as a pre- and post-test.

Our assessment tasks should be able to distinguish between weak and strong navigational map readers; hence, we investigate the item-total correlations, the item difficulties, and the score distribution. As a reliability measure, we decided to estimate the split-half reliability in order to evaluate the assessment in the form of two parallel test halves that can be used as pre- and post-tests. For estimating the reliability, some preconditions need to be checked by means of a factor analysis. If this check is not possible (due to a small sample size, as in the case of this study), Moosbrugger and Kelava (2020) recommend estimating the reliability via Cronbach’s Alpha, which gives reasonable results even when the preconditions are not met completely.

Knowing that the assessment validation is affected by the pre-defined thresholds for the point categories, we conducted a sensitivity analysis on different possible threshold combinations and their effect on the resulting assessment validation.

The data and assessment validation are documented in an evaluation script in R (Bistron & Schwering 2021b) to ensure reproducibility and to allow the NMR assessment to be applied in other locations.

Analogous to the test of Hemmer et al. (2013), the tasks (items) of our assessment are expected to be unidimensional, but the number of participants is not sufficient for conducting a factor analysis (Bühner 2011). Therefore, the assessment validation was done for (i) all tasks (assuming one factor behind all tasks) and for (ii) the Location Tasks and Direction Tasks separately (assuming one factor behind each group of tasks), as well as for (a) the accuracy and process data (assuming one factor behind the data for each task) and (b) the accuracy data only.

Results of the assessment validation

Item analysis and score distribution

We determined the item difficulty $P_i$ of each task (item) $i$. As intended, the individual tasks show different difficulty indices, spanning from $P_{\mathrm{LNV4}} = 33$ (hardest task) to $P_{\mathrm{DA4}} = 93$ (easiest task) for location A and from $P_{\mathrm{LNV3}} = 23$ (hardest task) to $P_{\mathrm{DA3}} = 91$ (easiest task) for location B (Table 3). Based on the item difficulty indices and the task types, we built task twins and divided the assessment into two halves.
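As a minimal sketch of how such an index can be computed, the following Python function uses the common definition of the item difficulty as the percentage of achievable points that participants actually obtained. That the authors' script uses exactly this formula is an assumption; the function name and the sample data are hypothetical.

```python
def item_difficulty(points, max_points=1):
    """Item difficulty index in percent: 100 * achieved / achievable points.

    points: one score per participant for a single item.
    High values indicate easy items, low values hard items.
    """
    if not points:
        raise ValueError("at least one participant score is required")
    return 100.0 * sum(points) / (len(points) * max_points)


# Hypothetical accuracy data: one of three children solved the item.
example = item_difficulty([1, 0, 0])
```

Under this definition a hard item such as one with an index of 23 was solved by roughly a quarter of the children, while an index of 93 means nearly all of them solved it.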

Table 3. Item difficulties $P$ for each task, ranging from 23 for the hardest task to 93 for the easiest task. Based on the item difficulty indices and the task types, the GeoGami NMR assessment was divided into test halves with similar tasks (task twins) in each test half.

Within the group of Location Tasks and the group of Direction Tasks, we determined item-total correlations with a part-whole correction for each task. The majority of the items show a highly positive item-total correlation, but we also observe items with a correlation close to zero or with negative coefficients (Table 4). Therefore, we proceeded with the further assessment validation in two different ways: (1) we evaluated the assessment for all items, and (2) we repeated the calculations after excluding tasks with an item-total correlation lower than 0.3 together with their task twin partners.
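A part-whole-corrected item-total correlation correlates each item with the sum of the remaining items, so that the item does not inflate its own correlation. The Python sketch below illustrates the idea with a Pearson correlation; the choice of Pearson, the function names, and the handling of constant items are assumptions and may differ from the authors' R script.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns NaN if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(scores):
    """scores: one row per participant, one column per item."""
    n_items = len(scores[0])
    result = []
    for i in range(n_items):
        item = [row[i] for row in scores]
        rest = [sum(row) - row[i] for row in scores]  # total without item i
        result.append(pearson(item, rest))
    return result


def items_below_cutoff(scores, cutoff=0.3):
    """Indices of items whose corrected correlation is below the cutoff
    (items with an undefined correlation are listed as well)."""
    return [i for i, r in enumerate(corrected_item_total(scores))
            if not r >= cutoff]
```

In the procedure described above, the twin partner of each flagged item would be excluded as well; the sketch leaves that pairing to the caller.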

Table 4. Item-total correlations within the group of location tasks and the group of direction tasks of the GeoGami NMR assessment.

The NMR competency scores span most of the intended score scale, from very low to high scores (Table 5). The interquartile range for the score including all tasks and all data (accuracy and process data) is $\delta_A = 10$ (25% of total scale) with a median of 28 for location A and $\delta_B = 13.75$ (34% of total scale) with a median of 20.5 for location B. For the Location Tasks it is $\delta_A = 8$ (33% of total scale) with a median of 16 and $\delta_B = 8.5$ (35% of total scale) with a median of 10, describing a wider distribution of the score for Location Tasks than for all tasks, especially for location A. The skewness of the score distribution is negative for location A and positive for location B. For location A, the median is always slightly above the mean. When considering accuracy and process data for location B, the opposite is the case.

Table 5. NMR competency score distribution in our GeoGami NMR assessment study.
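The relation between the interquartile range and the share of the score scale can be illustrated with a short sketch. The scale maximum of 40 points for all tasks is an inference from the reported figures (an interquartile range of 10 corresponding to 25% of the scale), and the scores below are hypothetical. Quartile conventions differ between statistics packages, so the values of the evaluation script need not match this method exactly.

```python
from statistics import median, quantiles


def iqr_share(scores, scale_max):
    """Return (median, IQR, IQR as a percentage of the score scale)."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    return median(scores), iqr, 100 * iqr / scale_max


# Hypothetical NMR competency scores on an assumed 0-40 scale
scores = [10, 20, 24, 26, 28, 30, 32, 34, 36]
print(iqr_share(scores, 40))

# Reported values re-expressed as a share of the assumed 40-point scale
print(round(100 * 10 / 40))     # location A
print(round(100 * 13.75 / 40))  # location B
```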

Estimating reliability and determining validity

We estimated Cronbach’s Alpha for the Location and Direction Tasks separately (assuming one factor behind each group of tasks) as well as for all tasks (assuming one factor behind all tasks) (). For Location Tasks, we observed values of 0.78 and higher for location A and of 0.71 and higher for location B. For Direction Tasks, we observed values of 0.66 and higher for location A and low values of 0.42 and lower for location B. For all tasks, we found values of 0.84 and higher for location A and 0.77 and higher for location B.
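For readers who want to reproduce this kind of estimate, Cronbach's Alpha can be sketched from a participants-by-items score table as k/(k-1) · (1 − Σ item variances / variance of the total score). The function name and the data are hypothetical, and the sketch is not taken from the evaluation script.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per participant
    and one column per item."""
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of every item column
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the participants' total scores
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores of six children on three items
scores = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 2, 0],
    [2, 2, 1],
    [2, 2, 0],
]
print(round(cronbach_alpha(scores), 2))
```

Calling the function once per task group (Location Tasks, Direction Tasks, all tasks) corresponds to the one-factor-per-group assumption stated above.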

Additionally, we decided to estimate the split-half reliability (Table 6). We correlated the scorings of the halves via Spearman’s ρ (with p < 0.001) and estimated the split-half reliability with the Spearman-Brown formula. For both locations, the split-half reliability values are higher than 0.82 and, in most cases, higher than the corresponding Cronbach’s Alphas (). The split-half reliability is higher when including all data (accuracy and process data) instead of calculating with accuracy data only.

Table 6. The split-half reliability for both locations of the GeoGami NMR assessment, exceeding the Cronbach’s Alpha in most cases.
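The two steps of the split-half estimate can be sketched as follows: the scores of the two test halves are rank-correlated (Spearman's ρ), and the result is stepped up to full test length with the Spearman-Brown formula 2ρ / (1 + ρ). All function names and data are hypothetical, and the sketch is not the evaluation script itself.

```python
from math import sqrt


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = shared
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


def spearman_brown(r):
    """Step a half-test correlation up to the full test length."""
    return 2 * r / (1 + r)


def split_half_reliability(half_a, half_b):
    """Split-half reliability from the per-child scores of both halves."""
    return spearman_brown(spearman_rho(half_a, half_b))


# Hypothetical scores of six children on the two test halves
half_a = [4, 9, 12, 7, 15, 10]
half_b = [5, 8, 14, 9, 13, 10]
print(round(split_half_reliability(half_a, half_b), 2))
```

As a purely arithmetic illustration of the formula, a correlation of 0.70 between the halves is stepped up to 1.4 / 1.7 ≈ 0.82.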

We correlated the score of our assessment with the scorings of the expert judgements via Spearman’s ρ. The resulting correlation coefficients range from 0.82 to 0.92 (p < 0.001). They are highest when including process data and lowest when dropping tasks with an item-total correlation lower than 0.3 (Table 7).

Table 7. Correlation coefficients for the NMR competency score of the GeoGami NMR assessment and the scoring of the expert judgements. The correlations are higher when process data is included.

Moreover, we conducted a sensitivity analysis on different possible threshold combinations and their effect on the resulting assessment validation. In Table 8 we present five examples showing increases and decreases in reliability and validity, but with consistently high values for both. However, changes in the thresholds can also deteriorate the assessment: details on the complete sensitivity analysis can be found in the corresponding evaluation script (Bistron & Schwering Citation2021b), which allows readers to manipulate and analyze the thresholds on their own.

Table 8. Sensitivity analysis: thresholds and their effect on the GeoGami NMR assessment validation. The five examples in the table show small increases and decreases in reliability and validity, but overall a consistently high value for reliability and validity.
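The general idea of such a sensitivity analysis can be sketched as a grid search: the scoring is repeated for every threshold combination and a reliability estimate is recomputed each time. Everything specific in this sketch is an assumption made for illustration, namely the scoring rule (one point for a correct solution, one further point for an efficient process), the two thresholds (route length relative to the shortest route, number of zoom and pan actions), and the data. The actual thresholds and scoring are defined in the evaluation script.

```python
from itertools import product
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants x items score table."""
    n_items = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


def score_task(correct, route_ratio, zoom_count, max_ratio, max_zooms):
    """Hypothetical rule: 1 point for a correct answer, plus 1 process point
    if the route was short enough and few map interactions were needed."""
    if not correct:
        return 0
    efficient = route_ratio <= max_ratio and zoom_count <= max_zooms
    return 2 if efficient else 1


def sensitivity(attempts, ratio_grid, zoom_grid):
    """Recompute alpha for every threshold combination.

    attempts: per child, a list of (correct, route_ratio, zoom_count) per task
    """
    results = {}
    for max_ratio, max_zooms in product(ratio_grid, zoom_grid):
        table = [[score_task(c, r, z, max_ratio, max_zooms) for c, r, z in child]
                 for child in attempts]
        results[(max_ratio, max_zooms)] = cronbach_alpha(table)
    return results


# Hypothetical process data of five children on three tasks
attempts = [
    [(True, 1.1, 0), (True, 1.2, 1), (True, 1.0, 0)],
    [(True, 1.4, 2), (True, 1.8, 3), (False, 2.5, 5)],
    [(False, 3.0, 6), (True, 2.2, 4), (False, 2.8, 7)],
    [(True, 1.3, 1), (False, 2.0, 2), (True, 1.6, 2)],
    [(False, 2.9, 8), (False, 3.1, 6), (False, 2.7, 5)],
]

for thresholds, alpha in sensitivity(attempts, [1.5, 2.0], [2, 4]).items():
    print(thresholds, round(alpha, 2))
```

The same loop could report a validity coefficient instead of, or in addition to, Cronbach's Alpha.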

Discussion and limitations

Reliability and validity of the assessment

The assessment validation for both locations resulted in satisfactory reliability values as well as adequate validity coefficients. The sensitivity analysis revealed that changes in the thresholds can yield similar results. However, unidimensionality is a precondition for calculating a score or estimating the reliability (Moosbrugger & Kelava Citation2020). We tried to account for this by calculating the reliability for different groups of tasks. However, this needs to be investigated further with a larger data sample. Moreover, since the reported split-half reliability values are higher than the Cronbach’s Alphas, a factor analysis is needed to find explanations.

Measurement range of the assessment

An intermediate assessment difficulty as well as a wide range of item difficulties were found for the GeoGami NMR assessment for both locations A and B. This enables the assessment for both locations to distinguish between strong and weak navigational map readers. Moreover, the assessment includes enough easy tasks to motivate students, but still sufficiently many intermediate and harder ones to distinguish performances among stronger navigational map readers. However, additional tasks of intermediate difficulty would allow it to resolve performance differences even more finely.

Possible differences in the results for the locations and individual tasks can be explained by differences in the characteristics of the task locations: whereas the park area of location A showed more path intersections and more objects on a smaller scale, in the park area of location B distances were larger and landmarks were fewer.

Item quality of the assessment

Several tasks of the assessment show a satisfactory item-total correlation. However, a few show an item-total correlation below 0.3, especially some Direction Tasks and some LNV tasks, predominantly for location B. These results are also reflected in the Cronbach’s Alphas: we observed acceptable Cronbach’s Alphas for both locations, except for those of the Direction Tasks of location B, which are unsatisfactorily low. The low item-total correlations may indicate a low internal consistency of the items: low values would then represent only slight or no differences in the task performances of strong and weak navigational map readers, and negative values would represent better performances for weak than for strong navigational map readers.

The low item-total correlations for task LNV1 illustrate this interpretation: some children passed the flag destination and came back to it, others passed it and chose a wrong destination for the flag. In both groups, several children recognized their mistake afterwards and explained it to the experimenter: since it was the initial task, they were not familiar with the scale and were therefore surprised that the flag was located close to the starting point. This initial-scale problem could be observed for weak and strong navigational map readers equally, which might explain the low item-total correlations. However, low or negative item-total correlations can be induced not only by weak internal consistencies but also by the affiliation of individual tasks to different factors (Moosbrugger & Kelava Citation2020), which underlines the need for a factor analysis.

Moreover, we saw that Direction Tasks were less suitable for distinguishing between strong and weak navigational map readers and show lower reliability values than Location Tasks, especially for location B. Two possible reasons can be identified. First, the Direction Tasks were presented in a package of four tasks (one task for each cardinal direction), which made it easy to derive a task solution from former tasks. Performances in previous tasks may thus have influenced performances in the following tasks. Second, Direction Tasks are possibly also solvable without mentally transforming directions (e.g. just by turning to objects the arrow on the map is pointing at). Performances in such cases do not necessarily say anything about the children’s spatial NMR competencies and, thus, do not necessarily correlate with the performances in other NMR tasks of the assessment.

Usefulness of process data for the assessment

Including process data for determining the NMR competency score led to slight changes in the score itself, the task difficulties, the item-total correlations, Cronbach’s Alpha, split-half reliability, as well as the validity coefficients. What we could not find is a clear improvement or deterioration of our assessment when including process data. We assume that we need to examine for each task individually whether the inclusion improves the assessment or not. Two examples illustrate this assumption:

  1. LNV4A was solved correctly by nearly half of the participants. During the experiment we witnessed that a lot of children solved this task correctly by trial and error. Since the destination of the flag was unique regarding the shape of a nearby object, the children recognized the destination when seeing it in their environment regardless of whether they expected the destination at that location or not. In other words, weak navigational map readers, possibly with a trial-and-error solution, also solved this task correctly. This explains the low item-total correlation for this specific task. The only difference between strong and weak navigational map readers then is the length of the route walked (process data): some children took longer routes than others. A longer route possibly goes along with a trial-and-error solution, which would explain the increase in the task difficulty and item-total correlation when including process data in the scoring.

  2. For LOC4A the opposite is the case: including process data for this specific task led to a decrease in the task difficulty and in the item-total correlation. Possibly, some weak map readers chose the solution by recognizing the unique spot directly on the map without thinking spatially and without checking their solution against other locations for a second time via panning and zooming. Thus, low numbers for panning and zooming would not represent a confident or elaborated solution.

We assume that the process data mainly describes the uncertainty in solving tasks. However, in order to learn more about this aspect, a factor analysis is needed to show that process data defines no own factor itself and can be used additionally to accuracy data in order to evaluate task performances.

Recommended settings of the assessment for other locations

As mentioned in the introduction, our research aims to develop an NMR competence assessment that is applicable to different outdoor locations. While the GeoGami tool used to implement the assessment tasks and the script scoring participants’ performance are location independent, there are certain requirements for the study site when the assessment is adapted by researchers for their own locations. To ensure that other skills such as reading or symbol understanding do not affect the performance, the study site should not contain place name signs (e.g. street names or signs with landmark names) that can be found on the map and used for orientation instead of the spatial layout. The same holds for unique landmarks that can be identified via symbols on the map. The study site should exhibit a certain spatial complexity: a branched network of paths is required to implement orientation tasks toward different cardinal directions. Several different, repetitive objects that are included in the map representation are important to implement orientation tasks based on landmarks.

Conclusion

We developed the GeoGami NMR assessment, an educational tool for assessing NMR competencies of primary school children aged six to ten in a real-world scenario. The assessment makes use of the digital geogame GeoGami (Bistron et al. Citation2021). It focusses on configurational and directional information and therefore helps teachers to understand the spatial navigational performances of children by excluding confounding factors like text and symbol reading skills.

We evaluated the assessment for two exemplary locations by analyzing the items, estimating the reliability, and validating the NMR competency score against an expert judgment. For these two locations and with the interpretation of the score in the sense of a pedagogical evaluation of NMR performances, the assessment is shown to be reliable and valid, i.e. it is able to quantify children’s NMR competencies with a score that reliably represents teachers’ evaluation of these competencies.

All suggested tasks (Location and Direction Tasks) as well as all kinds of data (accuracy and process data) are helpful to learn more about the participants’ NMR competencies. The assessment validation showed that it is sufficient to consider process data only when it improves the item-total correlation of the individual task. Especially tasks that can be solved by searching for a unique location on the map or in the environment (trial-and-error solution) do not assess NMR competencies as accurately as tasks that can only be solved via spatial thinking. Thus, for evaluating the children’s competencies, not only the accuracy in solving a task but also the process of solving it can be important for understanding the children’s thinking behind a solution, a finding that can be applied to many learning situations.

Our results are based on assumptions that need to be confirmed by a factor analysis with a larger sample size. A factor analysis would also help to decide on dropping individual tasks and on the method to estimate reliability.

Implications for classroom teaching

With our assessment, we are able to distinguish between strong and weak navigational map readers. Since it has no norm reference, it is not intended to be used for evaluating the performance of an individual against a norm group. Instead, the assessment can reveal the learning progress of individual children and the differences in children’s learning levels within a group (for a certain location). Teachers can use this knowledge to make instructional and methodological decisions for NMR trainings for a whole class or individual children. Moreover, they can conduct the assessment to evaluate NMR teaching and learning processes by comparing the children’s competencies before and after an intervention (Bistron et al. Citation2022).

Especially in class situations in which multiple children are supervised by only one teacher, the GeoGami NMR assessment can give a reliable overview of individual performances without the need to accompany each child individually in its navigational process. The finding that a digital tool enables the measurement of NMR competencies indicates that it is also worth investigating related learning analytics solutions that automatically react to the children’s task solutions and behavior by giving suitable and individual feedback, instructions, and support.

Implications for navigational map reading research and future work

We offer an assessment design that is applicable at different locations: it is implemented in a location-based geogame which allows the reader to create and conduct it for any location. We created an evaluation script (Bistron & Schwering Citation2021b) that automatically measures and scores the participants’ performances and evaluates the assessment and its items for a new location. Moreover, the evaluation script helps the reader to determine two comparable assessment halves which can be a basis for developing pre- and posttests for learning progress investigations in NMR trainings. Although the assessment can be applied at different locations, a validity check for each location is recommended to ensure an adequate application of the assessment tasks within the new location: disregarding the above requirements for tasks and task locations can lead to undesirable effects on test scores. For example, putting navigation flags next to unique landmarks can lead to good navigational performances for strong and weak navigational map readers alike, since the task can be solved without thinking spatially.

Although the assessment is transferable to different locations, we suggest future work to consider our findings for the development of a location-independent scientific test instrument for measuring NMR competencies. A virtual reality solution might be a suitable approach in order to avoid location-dependent scores and the need for reevaluating the test for new locations. Virtual solutions would also facilitate the creation of “ideal” worlds that do not include unique locations and therefore prevent trial-and-error solutions.

Additional information

Funding

This article has received funding from the programme “Profilbildung 2020,” an initiative of the Ministry of Culture and Science of the State of North Rhine-Westphalia. The sole responsibility for the content of this publication lies with the authors.

References

  • Aretz, A. J., & Wickens, C. D. (1992). The mental rotation of map displays. Human Performance, 5(4), 303–328. https://doi.org/10.1207/s15327043hup0504_3
  • Baker, T. R., Battersby, S., Bednarz, S. W., Bodzin, A. M., Kolvoord, B., Moore, S., Sinton, D., & Uttal, D. (2015). A research agenda for geospatial technologies and learning. Journal of Geography, 114(3), 118–130. https://doi.org/10.1080/00221341.2014.950684
  • Bartoschek, T., Schwering, A., Li, R., Münzer, S., & Carlos, V. (2018). OriGami - A mobile geogame for spatial literacy. In O. Ahlqvist & C. Schlieder (Eds.), Geogames and geoplay (pp. 37–62). Springer. https://doi.org/10.1007/978-3-319-22774-0_3
  • Bistron, J., & Schwering, A. (2021a). Measuring navigational map reading competencies. A pilot study with a location-based GeoGame. Preprint, https://doi.org/10.31234/osf.io/yps2n
  • Bistron, J., & Schwering, A. (2021b). Open reproducible data of the GeoGami NMR assessment for navigational map reading competencies. https://doi.org/10.17605/OSF.IO/BJUS2
  • Bistron, J., Schwering, A., & Erdmann, F. (2021). GeoGami – An educational geogame for training navigational map reading – open source implementation. https://doi.org/10.5281/zenodo.5384903
  • Bistron, J., Schwering, A., Özkaya, I., & Qamaz, Y. (2022). Children’s navigational map reading strategies and implications for educational geogames. In C. Chinn, E. Tan, C. Chan, & Y. Kali (Eds.), Proceedings of the 16th International Conference of the Learning Sciences - ICLS 2022 (pp. 639–646). International Society of the Learning Sciences.
  • Bühner, M. (2011). Einführung in die Test- und Fragebogenkonstruktion [Introduction to the Design of Tests and Questionnaires]. Pearson, Hallbergmoos.
  • CDC-HKEAA. (2017). Geography. Curriculum and Assessment Guide (Secondary 4 – 6). Curriculum Development Council & The Hong Kong Examinations and Assessment Authority. https://www.edb.gov.hk/attachment/en/curriculum-development/kla/pshe/Geog_C&A_Guide_e-Nov_2017_clean_ok.pdf
  • Dabbs, J. M., Jr, Chang, E. L., Strong, R. A., & Milun, R. (1998). Spatial ability, navigation strategy, and geographic knowledge among men and women. Evolution and Human Behavior, 19(2), 89–98. https://doi.org/10.1016/S1090-5138(97)00107-4
  • Department of Education. (2013). National curriculum in England: Geography programmes of study. https://www.gov.uk/government/publications/national-curriculum-in-england-geography-programmes-of-study
  • DGfG – Deutsche Gesellschaft für Geographie. (2020). Bildungsstandards für das Fach Geographie für den Mittleren Schulabschluss mit Aufgabenbeispielen [Educational Standards in Geography for the Intermediate School Certificate with Task Examples]. https://geographiedidaktik.org/bildungsstandards/
  • Field, K., O’Brien, J., & Beale, L. (2011). Paper maps or GPS? Exploring differences in wayfinding behaviour and spatial knowledge acquisition. Proceedings of the 25th International Cartographic Conference.
  • Hemmer, I., Hemmer, M., Obermaier, G., & Uphues, R. (2008). Räumliche Orientierung - Eine empirische Untersuchung zur Relevanz des Kompetenzbereichs aus der Perspektive von Gesellschaft und Experten [Spatial orientation - An empirical study of the relevance of the competence area from the perspective of society and experts]. Geographie Und Ihre Didaktik, 36, 17–32.
  • Hemmer, I., Hemmer, M., Kruschel, K., Neidhardt, E., Obermaier, G., & Uphues, R. (2013). Which children can find a way through a strange town using a streetmap? - Results of an empirical study on children’s orientation competence. International Research in Geographical and Environmental Education, 22(1), 23–40. https://doi.org/10.1080/10382046.2012.759436
  • Hergan, I., & Umek, M. (2017). Comparison of children’s wayfinding, using paper map and mobile navigation. International Research in Geographical and Environmental Education, 26(2), 91–106. https://doi.org/10.1080/10382046.2016.1183935
  • Hoffmann, T. (1999). The meanings of competency. Journal of European Industrial Training, 23(6), 275–285. https://doi.org/10.1108/03090599910284650
  • Hurst, P., & Clough, P. (2013). Will we be lost without paper maps in the digital age? Journal of Information Science, 39(1), 48–60. https://doi.org/10.1177/0165551512470043
  • Ishikawa, T., & Kastens, K. A. (2005). Why some students have trouble with maps and other spatial representations. Journal of Geoscience Education, 53(2), 184–197. https://doi.org/10.5408/1089-9995-53.2.184
  • Ishikawa, T. (2019). Satellite navigation and geospatial awareness: Long-term effects of using navigation tools on wayfinding and spatial orientation. The Professional Geographer, 71(2), 197–209. https://doi.org/10.1080/00330124.2018.1479970
  • Jones, M., & Lambert, D. (Eds.) (2017). Debates in Geography Education (2nd ed.). Routledge. https://doi.org/10.4324/9781315562452
  • Kozhevnikov, M., & Hegarty, M. (2001). A dissociation between object manipulation spatial ability and spatial orientation ability. Memory & Cognition, 29(5), 745–756. https://doi.org/10.3758/BF03200477
  • Liben, L. S., & Myers, L. J. (2007). Developmental changes in children’s understanding of maps: What, when, and how? In J. M. Plumert & J. P. Spencer (Eds.), The emerging spatial mind (pp. 193–218). Psychology Press Ltd. https://doi.org/10.1093/acprof:Oso/9780195189223.003.0009
  • Lobben, A. K. (2004). Tasks, strategies, and cognitive processes associated with navigational map reading: A review perspective. The Professional Geographer, 56(2), 270–281. https://doi.org/10.1111/j.0033-0124.2004.05602010.x
  • Lobben, A. K. (2007). Navigational map reading: Predicting performance and identifying relative influence of map-related abilities. Annals of the Association of American Geographers, 97(1), 64–85. https://doi.org/10.1111/j.1467-8306.2007.00524.x
  • Mazimpaka, J. D., & Timpf, S. (2016). Trajectory data mining: A review of methods and applications. Journal of Spatial Information Science, 13, 61–99. https://doi.org/10.5311/JOSIS.2016.13.263
  • Moosbrugger, H., & Kelava, A. (2020). Testtheorie und Fragebogenkonstruktion [Test theory and questionnaire design]. Springer.
  • Münzer, S., Zimmer, H. D., Schwalm, M., Baus, J., & Aslan, I. (2006). Computer-assisted navigation and the acquisition of route and survey knowledge. Journal of Environmental Psychology, 26(4), 300–308. https://doi.org/10.1016/j.jenvp.2006.08.001
  • NAEP – National Assessment of Educational Progress. (2020). The NAEP geography achievement levels. https://nces.ed.gov/nationsreportcard/geography/achieve.aspx
  • Presson, C. (1982). The development of map-reading skills. Child Development, 53(1), 196–199. https://doi.org/10.2307/1129653
  • Speake, J., & Axon, S. (2012). “I never use ‘maps’ anymore”: Engaging with Sat Nav Technologies and the implications for cartographic literacy and spatial awareness. The Cartographic Journal, 49(4), 326–336. https://doi.org/10.1179/1743277412Y.0000000021