1.1 Collecting social network data
Social network analysis investigates the relationships between social entities (Newman, 2010). It has been used since the mid-1930s, but modern social network analysis emerged in the 1970s [see Freeman (2004) and Scott (2017, Chapter 2) for a history of social network analysis]. Modern social network analysis uses methods from graph theory and depends on relational data, which describe “webs” of relationships between sets of social entities, called actors in social science (Newman, 2010). These networks can be represented as graphs composed of vertices and edges: the vertices correspond to actors (e.g. persons, organizations) and the edges to their relationships (e.g. friendship, business relationship). Two types of social networks can be distinguished: sociocentric and egocentric. Sociocentric networks focus on a set of interrelated actors bound to a specific context (e.g. the social network of students in a specific classroom), while egocentric networks focus on the local network of one actor (e.g. a person’s friendship network) (Carrington et al., 2005). Social networks can also be studied at different levels of analysis: actors may range from people and groups to larger social systems (e.g. villages, schools). In this paper, we concentrate on egocentric networks at the person level (i.e. describing a person’s direct links to her or his peers).
The most widely used method to collect social network data is the survey with self-report questionnaires (Marsden, 1990; Marin and Hampton, 2007). In an egocentric design, a focal person (i.e. ego) is asked to recall the other people to whom he or she is linked (i.e. alters) (Carrington et al., 2005). Survey questionnaires can use “global” items to assess the size (i.e. number of alters), the composition (e.g. friends, family) and other measures of a person’s network. For example, in the Canadian General Social Survey (StatCan, 2017), people are asked how many close friends they have and how often they communicate with them. This approach has limitations: it places heavy cognitive demands on respondents, who need to estimate across all of their alters (Marin and Hampton, 2007), and it does not provide relational data usable in social network analysis. A common alternative is the combination of name generator and name interpreter questionnaires. A name generator is structured around one or multiple questions used to elicit a list of individuals (i.e. alters) to whom a person (i.e. ego) is related. Within this framework, alters can be elicited in four different ways: people with whom the ego is in contact during a specified period (interaction approach), people defined by certain roles in a specific context, such as friends, coworkers or relatives (role-relation approach), people who provide support (exchange approach), or people selected according to the “closeness” or “affective value” of ties (affective approach) (Marin and Hampton, 2007). Name generators therefore assess only part of a person’s social network, and the boundaries of this network depend on the specific question(s) used to generate alters. Afterward, additional information on alters and relations can be assessed via a name interpreter. It can include characteristics of the enumerated people (e.g. age, education), details on ego-alter relationships (e.g. emotional closeness, duration of relations, frequency of contact) and relations between identified alters (e.g. which alters are friends) (Marsden, 1990).
Networks measured with name generators and name interpreters are sometimes referred to as “cognitive” networks because they rely on a person’s interpretation of her or his own social network (REF). They are contrasted with more “objective” measures that can be collected using, for example, proximity sensors (REF), online social networks (REF) and call detail records (REF). Researchers using questionnaires may still seek to approximate operational social ties and not only respondents’ perceptions (REF). Comparisons of social network measures collected through “cognitive” and “operational” approaches show that people are better at recalling intimate relations than weaker ones (Brewer, 2000), and typical social relations than time-specific social interactions (Freeman, 1987). Name interpreters perform better for observable features (e.g. sex, age, ethnicity) than for more subjective features (e.g. political preferences) and, as for name generators, name interpreter data are more valid for close ties than for weak ties (for reviews of name generator validity, see Marsden, 1990 and Marsden, 2005).
1.2 Collecting activity space data
Definition of activity space
People “constantly make geographical decisions and act upon them” (Jackle, 1976, p.1), resulting in mobility patterns that occur in space and time (Chaix et al., 2013). These individual spatial behaviours can be studied through the lens of the activity space concept (Rai et al., 2007). Activity space can be defined as the area in which people move during their day-to-day activities (Albert et al., 2000). It contains all the locations with which an individual is in direct contact during his or her daily activities, the travel patterns between them and the area surrounding these locations (Järv, 2015). The activity space is related to the concept of individual action space, which was developed by American geographers in the 1960s and 1970s to describe all of people’s interactions with their environment, or reactions to it (Golledge and Stimson, 1998). It proposes that people interact with their environment directly by moving within geographical space and indirectly through media and communication devices: the movement and communication components, respectively. Through these interactions, people gather information about, and form subjective appreciations of, all the places that compose their action space. Therefore, the action space of a person comprises all the places with which she or he can potentially interact, and the activity space is the movement component of the action space (Jackle, 1976).
Temporal aspects of activity space
The activity space can be viewed from a temporal perspective. We can first derive probability distributions of doing activities by time period, considering the frequency and regularity with which people participate in specific activities. For example, regularly scheduled activities (e.g. work, school) are more time-specific (e.g. from 9am to 5pm) than shopping trips for items consumed on a regular basis (e.g. grocery shopping) or randomly occurring activities (e.g. an emergency health situation). Activities can also be time-contagious: participating in an activity raises the probability of participating in the same activity afterward (e.g. taking a yoga class). Conversely, shopping for goods results in a zero probability of doing the activity after the item is purchased (e.g. buying bread) (Jackle, 1976). Another temporal perspective is to consider the overall scheduling of activities in a specific period (e.g. 24h). It focuses on the start time, end time and duration of activities, which allows decomposing the time budget that composes a person’s activity space (REF) and is an effective way to observe temporal changes in activity during a person’s life (REF). It also gives information about the sequence in which activities are organized and reveals activities carried out during multipurpose trips (Hanson, 1980). Having discussed some temporal aspects of activity space, we can now explore some spatial aspects.
Spatial aspects of activity space
Perhaps the simplest spatial indicator of individual activity space is the trip distance between the origin and the destination (REF). This measure has been used to study distance decay at the person level, which postulates that the frequency of trips decreases as the distance between the origin and the destination increases. However, the relation between frequency and distance is more complex and is highly related to the nature of the activities (REF), the time period (REF), accessibility and the type of transportation used (REF). One step further in considering the spatial aspects of activity space is to measure the two-dimensional location of destinations, which allows mapping the distribution of activity locations in geographical space. From these maps we can observe the presence of directional bias: the location of activities in particular areas because of people’s preferences and not their distance from the origin (REF). The mobility and spatial distribution of activities can also take different patterns, from being mostly aggregated around the home to being multipolar with different centres of activity (REF). Spatial aspects can also be viewed in terms of constancy: certain activities are more often carried out at the same location, while others are done at multiple locations (Hanson, 1980). Movement patterns between activity locations can also be geolocated, which informs on travel patterns (REF), identifies multipurpose trips (e.g. leaving home to see a friend and stopping on the way at an electronics store to buy a device) and allows assessing environmental exposure during travel (REF).
Measuring activity space with map-based questionnaires
Map-based questionnaires are effective tools to collect information on people’s activity space. Using two-dimensional (REF) or raised-relief maps (REF), people can locate routes, places and areas, generally using geometrical shapes (i.e. points, lines and polygons). They can then provide further information about the identified spatial objects (e.g. frequency of going to places, transport mode, attachment to places, perceived limits of the neighbourhood). Advances in computational performance and geographical information systems have greatly improved our capability to collect, store, analyze and visually represent large amounts of geographic data, and therefore to use map-based surveys in research projects (REF). This method is close to Public Participation Geographic Information Systems (PPGIS) and Volunteered Geographic Information (VGI) systems. PPGIS is a participatory approach to Geographic Information Systems that collects spatial information from people and communities to inform decisions with spatial implications (Brown, 2011) and to promote collective decision-making and community empowerment (REF). VGI systems are user-generated GIS in which geographical data are voluntarily provided by people (e.g. OpenStreetMap) (Goodchild, 2007). Map-based surveys, PPGIS and VGI systems all involve the identification and investigation of locations by volunteer and/or selected participants.
Comparison with other activity space measurement tools
Collecting activity space data with map-based questionnaires has advantages and limits when compared with time-budget recording and tracking devices. “A time budget is a systematic record of a person’s use of time over a given period. It describes the sequence, timing, and duration of the person’s activities, typically for a short period ranging from a single day to a week” (Anderson, 1971, p.353). Time-budget data are generally recorded either from recall, where activities are recalled for a defined time period, or with a diary, where people keep track of their activities in a free-format manner or according to a predefined time schedule (Golledge). These methods are more effective at assessing temporal aspects of activity space, while map-based questionnaires focus more on the spatial aspects. Time-budget methods can therefore be complementary to map-based questionnaires.
Another approach is to use tracking devices linked to Global Navigation Satellite Systems (GNSS) (e.g. the United States’ GPS, the Russian Federation’s GLONASS and the European Union’s Galileo, still under development) (Global Positioning: Technologies and Performance, 20XX). Map-based surveys collect declarative data, while GNSS loggers gather objective data. Tracking devices also collect continuous polylines, which provide relatively reliable data to assess travel patterns and derive travel mode (Bohte and Maat, 20XX), although declarative information on travel can also be collected through a map-based questionnaire. For activity locations, map-based questionnaires assess destinations over a long period of time, while GNSS loggers mostly gather information for a short period. Map-based questionnaire data may therefore provide better estimates of activity locations visited regularly over a long period of time. On the other hand, GNSS loggers gather real-time data, which inform on the time, the duration and the sequence in which activities happen (Chaix et al., 20XX). However, GNSS trackers cannot distinguish different activities that happen at the same location (e.g. community centres with shared space for multiple cultural and sporting activities) without a post-validation process (REF). Finally, map-based questionnaires can collect qualitative information to measure people’s perceptions of places (REF). As with time-budget methods, GNSS loggers and map-based questionnaires provide complementary information on different aspects of individual activity space.
1.3 Linking social network and activity space
Conceptual and empirical relations
Early conceptualizations of human spatial behaviour already discussed relations between people’s social networks and their activity space. Jackle (1976) points out that the action space “specifically draws attention to the individual’s relationships with his social and spatial environment and allows us to examine the pattern in which individuals interact in space.” This proposition highlights that action space (and by extension activity space) relies on people-space interactions, but also on interpersonal interactions, which are a fundamental mechanism of social networks (REF). Human social interactions largely depend on face-to-face contact (REF), and social relations are seen theoretically as motivators of travel, since face-to-face meetings result from a joint decision and depend on other persons’ geographical decisions (Carrasco, 2008; Axhausen et al., 2012). This relation may also be reciprocal: travel behaviours may be important to maintain distant social ties (Larsen, 2006). These propositions are coherent with the idea that a person’s activity space is influenced by the activity spaces of their friends and relatives (Jackle, 1976).
When looking at environmental aspects of activity spaces, most of today’s human activities happen within built environments (REF), which are physical environments built by human actions (Roof and Oleru, 2008), and characteristics of these environments help explain the spatial distribution of human social interactions. Public and community spaces (Heller, Byerts, & Drehmer, 1984; Cantor, 1975; Bernard et al. 2007), workplaces (REF), places of worship (Thomas et al. 1994; Krause 1994) and certain commercial spaces (REF) allow people to meet and engage in social interactions, and it has been argued that these are essential places for creating and maintaining social relationships (Baum and Palmer 2002; Waxman 2006). Furthermore, the social nature of certain places justifies their presence within the built environment (Oldenburg 1989). The amount of vegetation in common outdoor residential spaces is also positively related to composite measures of social interactions with neighbours (Kuo et al. 1998; Kweon, Sullivan, and Wiley 1998), while environments with major transportation infrastructure, graffiti, and neglected or abandoned infrastructure discourage people from interacting with others (Baum and Palmer, 2002).
Despite the growing interest in the interplay between social interactions, spatial behaviours and aspects of the spatial environments in which activities take place (REF), few studies have collected integrated measures of social networks and individual activity space. In transportation science, Larsen et al (2006a) and Carrasco et al (2008) developed two questionnaires to explore how travel behaviours and modern communications explain the spatial dimensions of people’s social networks. Their approaches allow in-depth exploration of people’s socio-spatial environment, either by using an extensive mixed-methods approach, by collecting detailed information on alters’ “closeness” or by considering interactions through multiple means of communication. However, the measured social networks are limited to a small number (6 or 10) of close affective ties, and activity locations are restricted to either places of social episodes or alters’ residences. In a third study, from criminology, Masson et al (2010) combined a name generator and an activity space questionnaire to identify the socio-spatial networks of adolescents. Respondents first identified up to five close personal contacts, and then free-listed their weekly routine activity locations (and their position in space). Social networks and places were linked a posteriori by asking “who in your network is with you at each location?” This approach allows a broader assessment of the activity space by free-listing weekly activity locations, but the collected social networks are also limited to a small number of close affective ties.
The objective of this paper is to present a data collection tool that addresses these limitations. It jointly measures people’s activity space and social networks with fewer restrictions on the information collected and, by simultaneously documenting where and with whom people do their activities, it opens the possibility of analyzing their socio-spatial patterns. In the following section, we begin by describing the structure of the questionnaire, and then we illustrate analysis possibilities using data from the CURHA project.
2 VERITAS SOCIAL: A Map-based questionnaire that georeferences social networks
The questionnaire
We developed VERITAS Social in the context of the CURHA study, an international research project on healthy ageing in contrasting urban contexts (Kestens et al., 2016). A social module was added to the VERITAS questionnaire (Visualization and Evaluation of Route Itineraries, Travel destinations, and Activity Spaces), a map-based questionnaire that uses a Google Maps © Application Programming Interface (API) to let respondents geolocate a series of predetermined spatial information using points, lines and polygons, and then provide further information on the identified spatial objects. VERITAS was used in the RECORD Study to map people’s regular destinations, the routes between locations (Chaix et al., 2012) and the limits of the perceived neighbourhood (REF).
We built the social module on VERITAS’s capability to map activity locations. Respondents are asked to geolocate a list of predefined regular destinations (e.g. do you go to a park at least once a month?) and to provide further information on transportation mode and frequency of visits. Then, each time participants identify an activity location, they are asked whether they usually perform this activity with someone else. This step works as a name generator through which a participant can identify network members. A name interpreter is then used to document, for each alter, the role relation (e.g. friends, family), the affective status (e.g. close, acquaintance), the frequency of interactions, the exchange of supportive behaviour, the length of the relationship, whether the alter lives in the respondent’s neighbourhood and demographic characteristics. Participants can also identify groups of persons that they meet at specific locations (e.g. a community centre choir). In this case, the name interpreter is reduced to the role relation, the frequency of interaction, the number of people in the group and how long the participant has been seeing those people. These recursive steps therefore allow the joint collection of activity space and social network data, which are connected by knowing with whom the activities are usually performed.
At the end of the questionnaire, people can identify additional network members who were not met at activity locations. For the CURHA project, participants were asked with whom they may discuss important matters and by which methods they usually communicate with them (e.g. telephone, mail); measuring other supportive social ties was relevant for this public health research project (Kestens et al., 2016). Other types of ties could be obtained by modifying this question (e.g. remote friends with whom contact is maintained through communication technologies). Finally, participants are asked to define the relations between alters, which allows the calculation of network structural properties (Newman, 2010).
Collected data
The information collected with VERITAS Social can be separated into three data sets. The first contains social network data, the second contains activity space data, and the third contains the relations between both data sets (see figure 1). On the one hand, the composition, the relations between actors and the structural properties of participants’ social networks can be described with social network analysis. The centrality of certain actors, the homogeneity/heterogeneity of the alters based on their sociodemographic characteristics and the density (i.e. the ratio of the number of links to the maximum possible number of links in a network) (Newman, 2010; Carrington et al., 2005) are possible indicators to characterize the resulting social networks. On the other hand, indicators of spatial behaviour and exposure to multiple environments can be derived from the activity space data when linked to a Geographic Information System (see Chaix et al. (2012) for a discussion of possible indicators derived from activity space data). The novel aspect of this questionnaire is that both data sets are linked by measuring with whom participants regularly perform the identified activities (figure 1). This means that network members can be georeferenced through the locations of social interactions, and that activity locations can be characterized by the people who are met at those places.
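As an illustration only, the three linked data sets and the density indicator described above could be sketched in Python as follows. All identifiers, field names and records are hypothetical toy values, not the VERITAS Social database schema or CURHA data. Density is computed here among alters only (ego excluded), which is an assumption of this sketch rather than a convention stated in the text.

```python
from collections import defaultdict

# Data set 1 (hypothetical): social network members (alters) of one participant.
alters = {
    "a1": {"role": "friend"},
    "a2": {"role": "child"},
    "a3": {"role": "spouse"},
    "a4": {"role": "friend"},
}

# Data set 2 (hypothetical): activity locations of the same participant.
locations = {
    "loc1": {"type": "park"},
    "loc2": {"type": "food store"},
    "loc3": {"type": "community centre"},
}

# Data set 3 (hypothetical): who is usually met at which location.
met_at = [("a1", "loc1"), ("a3", "loc2"), ("a1", "loc3"), ("a4", "loc3")]

# Alter-alter relations reported at the end of the questionnaire (hypothetical).
alter_ties = {("a1", "a4"), ("a2", "a3"), ("a1", "a2")}


def density(n_vertices, n_edges):
    """Ratio of observed links to the maximum possible number of links
    in an undirected network without self-loops."""
    max_edges = n_vertices * (n_vertices - 1) // 2
    return n_edges / max_edges if max_edges else 0.0


# Georeference alters through the places where they are met,
# and characterize places by the people met there.
places_of = defaultdict(set)
people_at = defaultdict(set)
for alter_id, loc_id in met_at:
    places_of[alter_id].add(loc_id)
    people_at[loc_id].add(alter_id)

alter_density = density(len(alters), len(alter_ties))  # 3 ties out of 6 possible
```

In this toy case, alter "a2" has no entry in `places_of`: it stands for a network member added at the end of the questionnaire who is not met at any activity location.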
In the next section, we illustrate different analyses that can be done with these data. We ask (1) whether the proportion of activities engaged in with others changes according to gender and type of human settlement (i.e. cities of different scales) and (2) whether types of network members (e.g. children, friends) are randomly distributed across types of activity locations (e.g. leisure and associative activity locations, shopping), and (3) we explore how social network and activity space measures can be combined using bipartite networks. We conduct these analyses with data from the Canadian sample of the CURHA project, an international research project for which we developed VERITAS Social.
3 Illustration of analysis potential
The CURHA research project
CURHA is an international research project with a shared protocol to collect extensive data on mobility, built environment, social networks and healthy ageing outcomes. It is built on two pre-existing cohorts and one developing cohort of adults and elderly people living in contrasting human settlements in France, Canada and Luxembourg (see Kestens et al (2017) for more information on the CURHA project). For the current article, we used the Canadian sample, which is composed of 183 older adults aged 80 to 95 years, 51.4% of whom are female, living in two different areas (i.e. Montreal and Sherbrooke). One inclusion criterion was that all participants had good cognitive function: they all scored at least 17 on the Mini Mental State Examination questionnaire (Cockrell and Folstein, 2002). People were sampled from the NuAge cohort (Gaudreau et al., 2007), an age- and sex-stratified random sample from the Quebec Medicare database (RAMQ, Régie de l’assurance-maladie du Québec) for the regions of Montreal and Sherbrooke in the province of Québec, Canada (see Kestens et al (2017) for more information on the sampling procedure). All participants answered the VERITAS Social questionnaire in spring and autumn 2015.
Relations between social networks and activity space
First, we calculated the proportion of all reported activities that were engaged in with other people and tested whether it changes according to participant gender and living area. We used a two-way ANOVA with type III sums of squares because we had unbalanced categories and were considering the interaction term. We used heteroscedasticity-corrected covariance matrices to obtain unbiased estimates of the standard errors (Long and Ervin, 2000). Type III ANOVAs were calculated with the car package (REF) in R 3.4.1 (CRAN, 2014). We found that the interaction term between gender and living area (i.e. whether the participant’s residence is in Sherbrooke or Montreal) is significant (p = 0.048). We calculated the significance of the effect of living area within each gender by changing the gender reference level in the ANOVA. The difference in mean proportion between Montreal and Sherbrooke is significant for women (Montreal = 39.44% | Sherbrooke = 56.98% | p = 0.0029) but not for men (Montreal = 52.75% | Sherbrooke = 52.45% | p = 0.9642) (figure 2).
Second, to test whether types of network members are randomly distributed across the types of activity locations where they are encountered, we grouped our 29 activity locations into 8 categories (Table 1). We then used multilevel multinomial logistic regression with the type of activity location as a categorical dependent variable, and the numbers of friends, family members and children with whom the activity is engaged in as discrete independent variables. We also used the presence of the spouse as a binary independent variable. Activity locations and network members were nested within participants, since each participant may report multiple activity locations. Therefore, we used participant IDs as a random effect (Table 2). We used the gllamm package (REF) in STATA 14.0 (REF) to fit the multinomial logistic regression with Food stores as the reference category. We found that spouses were more often visited, but less often involved in the hairdresser, health service, other and residence categories. Friends, children and family members were more involved in leisure, religious and associative activities, visiting, residence and shopping. Friends were also less engaged in health services.
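The descriptive step behind the first analysis above (the per-participant proportion of reported activities engaged in with others, then the mean proportion per gender and living area) can be sketched as follows. This is a minimal illustration on invented toy records with hypothetical field names; it does not reproduce the type III ANOVA or the heteroscedasticity-corrected standard errors, which were obtained with the car package in R.

```python
from collections import defaultdict
from statistics import mean


def make_rows(pid, gender, area, flags):
    """Build one hypothetical record per reported activity location."""
    return [
        {"pid": pid, "gender": gender, "area": area, "with_others": flag}
        for flag in flags
    ]


# Toy data (invented): four participants and their reported activities.
activities = (
    make_rows(1, "F", "Montreal", [True, False, False, False])
    + make_rows(2, "F", "Montreal", [True, True, False, False])
    + make_rows(3, "F", "Sherbrooke", [True, True, True, False])
    + make_rows(4, "M", "Montreal", [True, False])
)


def proportion_with_others(rows):
    """Return {pid: ((gender, area), proportion of activities done with others)}."""
    counts = defaultdict(lambda: [0, 0])  # pid -> [n_with_others, n_total]
    groups = {}
    for row in rows:
        counts[row["pid"]][0] += int(row["with_others"])
        counts[row["pid"]][1] += 1
        groups[row["pid"]] = (row["gender"], row["area"])
    return {pid: (groups[pid], w / n) for pid, (w, n) in counts.items()}


def cell_means(per_participant):
    """Average the participant-level proportions within each gender-by-area cell."""
    cells = defaultdict(list)
    for group, proportion in per_participant.values():
        cells[group].append(proportion)
    return {group: mean(values) for group, values in cells.items()}


per_participant = proportion_with_others(activities)
means = cell_means(per_participant)
```

Computing the proportion per participant first, and only then averaging within cells, keeps each participant's weight equal regardless of how many activity locations he or she reported.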
Combining social and spatial data with bipartite networks
Another option to analyze VERITAS Social data is to integrate both data sets using bipartite networks. In graph theory, a bipartite network is composed of vertices that are partitioned into two disjoint sets such that there are no edges between two vertices of the same set (Asratian et al., 1998). Bipartite networks, also called two-mode networks in social science, are a convenient way to represent relations between two different types of objects (Newman, 2010). They have been used to represent different systems such as ecological communities (Feng, 2014), scientific collaboration networks (Peltomäki, 2006) and professional soccer players (Onody and Castro, 2004). We can use bipartite networks to analyze VERITAS Social data, with vertices representing network members and activity locations, and edges representing the relations between both sets of vertices. Bipartite networks allow us (1) to visualize the collected data and (2) to calculate network indicators that integrate both social and spatial information.
To illustrate, we used data from two cases of the CURHA study: an 83-year-old single woman living in the Sherbrooke area (ID = 10943) and an 88-year-old man living in the Montreal area (ID = 15451). First, we used a force-directed drawing algorithm (Kobourov, 2012) to visually represent the bipartite networks of both participants (figure 3). This allows network members and activity locations to be represented according to their relations. Second, we calculated the number of connected components, density, degree distribution, betweenness centrality and community structure to compare the vertices, edges and structural properties of both bipartite networks (see table 2 for a brief description of these measures). Participant 10943 has a smaller bipartite network, with 4 social members and 9 activity locations, while participant 15451 has a bigger one, with 13 social members and 20 activity locations. The first is divided into 3 components, while the second is divided into 2. Both are divisible into 4 modules. The first is more connected (density = 0.278) than the second (density = 0.131). Maximum degree is higher for social members, but the median degree stays relatively low for both social members and activity locations in both networks. Maximum betweenness is also relatively higher for social members than for activity locations in both networks.
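Some of the bipartite indicators mentioned above (density, degree and connected components) can be sketched in plain Python. The toy network below is invented and is not one of the two CURHA cases. The density formula, the number of edges divided by the product of the sizes of the two vertex sets, is an assumption of this sketch: the text does not state which formula was used. It does, however, reproduce the two reported densities if the networks contain 10 and 34 edges respectively, edge counts that are inferred here and not reported in the text.

```python
from collections import deque

# Hypothetical toy bipartite network: social members and activity locations.
members = ["m1", "m2", "m3"]
places = ["p1", "p2", "p3", "p4"]
# Each edge links a member to a place where he or she is usually met.
edges = [("m1", "p1"), ("m1", "p2"), ("m2", "p2"), ("m3", "p3")]


def bipartite_density(n_members, n_places, n_edges):
    """Observed edges over the maximum possible in a bipartite network,
    where edges can only link a member to a place (assumed formula)."""
    return n_edges / (n_members * n_places)


def build_adjacency(vertices, edge_list):
    """Undirected adjacency sets; isolated vertices are kept."""
    adjacency = {v: set() for v in vertices}
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def connected_components(adjacency):
    """Breadth-first search over every unvisited vertex."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), {start}
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


adjacency = build_adjacency(members + places, edges)
degree = {v: len(neighbours) for v, neighbours in adjacency.items()}
components = connected_components(adjacency)
toy_density = bipartite_density(len(members), len(places), len(edges))

# Consistency check against the reported densities, using inferred edge counts.
density_10943 = round(bipartite_density(4, 9, 10), 3)
density_15451 = round(bipartite_density(13, 20, 34), 3)
```

In the toy network, location "p4" is an activity performed alone: it has degree zero and forms its own component, which is one way a participant's bipartite network can split into several components.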
4 Discussion
VERITAS Social is a novel tool that collects integrated data on people’s activity space and social network. It is an extension of an existing map-based questionnaire that was first developed to collect information on people’s activity space (Chaix et al., 2012). A name generator was added to this questionnaire to identify the people who are met at different activity locations, in order to connect people’s social networks with their activity space.
By combining a map-based questionnaire and a name generator/name interpreter, VERITAS Social shares the strengths and limits of both methods. Map-based questionnaires are good at evaluating the spatial aspects of individual activity space, but are less suited to evaluating its temporal aspects than time-budget methods. Map-based questionnaires can also collect activity locations over a long period of time, but the information is self-reported and subject to memory bias. However, for regularly visited locations, Shareck et al (2013) have shown that the VERITAS questionnaire has high convergent validity with 8-day GPS tracking data, which supports using this questionnaire to assess activities that are performed on a weekly basis. For a better estimation of people’s activity space, VERITAS can be used in combination with travel diaries, time-budget questionnaires and/or GNSS loggers (Kestens et al., 2016). For social network measurement, people are better at recalling typical relationships (e.g. friends, family, people you see regularly) than time-specific ones (e.g. people you met Friday morning) (Freeman, 1987), and better at recalling close ties than weaker ones (Marsden, 1990). Therefore, typical and close relations may be overrepresented in social networks collected using VERITAS Social. For the relations between alters, only the ones known by the respondent are identified, since they are reported by her or him. Relations between alters might therefore be overrepresented for alters who are closer to or better known by the respondent.
In comparison with other questionnaires, VERITAS Social allows a broader assessment of integrated social network and activity space data. The types of activity locations collected are defined a priori (e.g. bakery, park), and there is no limit on the number of destinations that a respondent can identify. The respondent can also identify other destinations that are visited according to the same criterion (e.g. at least once a month in CURHA) but that were not listed previously. There is also no limit on the number of identified alters, whether they are met at specific activity locations or contacted through other means of communication (e.g. telephone, internet).
The VERITAS questionnaire is built as a web application linked to a remote SQL database to facilitate data collection. Reported destinations are directly geocoded and usable in a Geographic Information System (GIS), and the Google API allows respondents to search for addresses directly in the Google © database. Collected data are transferred directly to the SQL database, which minimizes the errors and human time involved in transferring data from paper to computer files (REF). Finally, the web application format allows the questionnaire to be self-administered by people who have access to an internet connection and sufficient information technology skills.
By collecting extensive data on the links between people’s social networks and their activity space, VERITAS Social offers new possibilities for examining the relations between both concepts, and for questioning how people’s socio-spatial environment relates to other outcomes. Using data collected in the CURHA research project on healthy ageing in contrasting urban contexts, we found that the proportion of activities engaged in with other people differs between women living in different urban areas, but remains similar for men. We also found that different types of social relationships are not met in the same activity locations: friends, children and family members seem to share similar spatial patterns, which differ from those of spouses. We also showed that bipartite networks are an effective way to visualize related social network and activity space data. The algorithm that we used in the previous example distributes nodes according to their overall relations in the whole bipartite network. A next step would be to position activity nodes (i.e. places) according to their geographical locations, and then distribute social nodes according to their relations with activity nodes. This would make it possible to consider the spatial distribution of activity locations and to project measured networks onto map layers. Bipartite networks also allow the calculation of indicators that describe these related data sets. Our example on two cases from the CURHA study showed that they differ in terms of the number of components, community structure, distribution of degree and betweenness, and network density. These metrics can later be calculated for the whole study population to measure their distribution and to relate them to participants’ sociodemographic characteristics, to the built environment to which they are exposed and to healthy ageing outcomes. These case studies illustrate the potential of using bipartite networks to analyze data collected with VERITAS Social.
5 Conclusion
VERITAS Social is a new tool to collect data on people’s social networks and activity space, which are related by identifying who is met at each activity location. It combines a map-based questionnaire using a Google Maps © API with a name generator/name interpreter, and therefore shares the advantages and limits of both methods. In comparison with other questionnaires, VERITAS Social collects more extensive data on activity locations and social network members, while considering the relations between both. Consequently, we believe it offers new opportunities to study relations between people’s spatial behaviours and social environment, and to question how a person’s socio-spatial environment might influence, or be influenced by, other outcomes.