Publication discusses newly discovered phenomena of population mobility
The results of the research presented in this publication are based on in-depth technological knowledge of cellular networks, distributed data computing on the Apache Hadoop platform, and other technical and economic disciplines. The book focuses on the population mobility data that become available once this knowledge has been mastered. Our intention was to introduce the basic building blocks to which subsequent analysis can be applied in a simple way. The publication contains several examples of application and describes some previously unknown phenomena, such as the need to consider a dynamic number of residents in the locations. Nevertheless, we intentionally leave it up to the readers to consider how the results might help them fulfil their duties even better.


Figure i
Comparison of measurement results of out-commuters with CZSO data
Figure ii
Selection of the trajectory’s definition point from the point of view of time spent in the station
Figure iii
Overview map of departure basic resident units which had been aggregated into municipalities of the tested segment (green)
Figure iv
Functional territory, reference territory and reference fixed routes
Figure v
Number of residents in selected stations and days
Figure vi
Comparison of various methods for the determination of out-commuters
Figure ix
Graphical representation of the relation matrix: out-commuters in the direction of the fixed route CZSO / VISITS / TRIPS
Figure x
Comparison of the number of out-commuters with out-commuters in the direction of the fixed route
Figure xi
Comparison of the number of out-commuters with out-commuters in the direction of the fixed route at the level of stations
Figure xii
Share of passengers traveling by trains on transport demand
Figure xiii
Out-commuters transport demand potential, carried passengers and CZSO data
Figure xiv
Overall transport demand and number of boarded passengers per days and fixed routes from Monday to Sunday
Figure xvii
Carried passengers between-the-stops on fixed route in the context of total transport's demand potential
Figure xviii
Comparison of the number of train passengers getting on and off on the fixed routes
Figure xix
Number of passengers getting on and off the trains per stations and days
Figure xx
Number of boarding passengers in selected stations and hours during the week.
Figure xxi
Visual comparison of key variables of the fixed route in hours of the week (overview and segment detail)
Figure xxii
Carried passengers between-the-stops on the fixed routes in the context of train connections' capacity
Figure xxiii
Comparing peak hour load with transport demand on the fixed route
Figure xxiv
Peak hour load split according to vehicles and hours of the day
Figure xxv
Unsuitable average capacity of the train vehicle


Data items are described so as to balance the length of the text against the meta-information carried per item (the name itself identifies the data item precisely) while still allowing machine processing (encoding, admissible characters, etc.).

Table 1 provides an overview of the original description of variables (data items) used in the case study.

| Variable group | Abbr. (chap. in method.) | Name of the variable | Variable definition (1) | Variable definition (2) | Variable source (1) (****) |
|---|---|---|---|---|---|
| Residence | RES-cso (2.3.7) | Residents (CZSO) | A resident is any person who has lived for a long time in the place where he has his household, or family. | RESIDENCE AT THE DECISIVE MOMENT (26. 3. 2011). The place in which the participant has actually lived in the long term and in which he has his household, or family. A person's permanent residence is not relevant, nor is the fact that a person spends the major part of the week at a different place due to work or studies. Where the place of residence differs from the address in the header, it should be noted as precisely as possible. | CZSO |
| Out-commuting | OUTC-cso (2.3.7) | Out-commuters (CZSO) | Out-commuters include people commuting to work and students commuting to schools. | In the analysis of out-commuting, it is assumed that the out-commuters include not only employees (excluding working students and apprentices) but also students (including the working ones). | CZSO |
| Out-commuting | NOC-cso (2.3.7) | Non-commuters (CZSO) | Non-commuters are non-working pensioners, people with their own livelihood, homemakers, preschool children, other dependent persons and people with no economic activity established. | Non-commuters include: non-working pensioners (u113-101201); people with their own livelihood (u113-101301); homemakers, pre-school children, other dependent persons (u113-101401); people with no economic activity established (u113-101601). | CZSO |
| Out-commuting | OUTCW-cso_LAU2 (2.3.2) | Out-commuters to work and schools (CZSO) | Out-commuters to work or schools are people residing within the municipality who leave the municipality to work or school in another municipality. | Out-commuters to work or school (CZSO) are people residing within the municipality who leave the municipality to work or school in another municipality. Data only include economically active participants who state the exact destination of their out-commuting. | CZSO |
| In-commuting | INCW-cso_LAU2 (2.3.2) | In-commuters to work and schools (CZSO) | In-commuters to work and schools are people who do not reside within the territory of the municipality (residing in a municipality other than the in-commuting target municipality) and terminate their trip (with the in-commuting destination) in the given municipality. | In-commuters to work and schools (CZSO) are people who do not reside within the territory of the municipality (residing in a municipality other than the in-commuting target municipality) and terminate their trip (with the in-commuting destination) in the given municipality. Data only include economically active participants who state the exact destination of their out-commuting. | CZSO |
| Residence | RES-mn (2.3.7) | Residents in the station according to mobile network data | Residents in the station include people who, on the reviewed day, are in the station early in the morning and late in the evening. This station is considered their place of residence. | On a single reviewed day, the participant was recorded in the place early in the morning (0:00:00 - 04:10:00) and in the evening hours (19:50:00 - 23:59:59). | MN |
| Visits | OUTC_VIS (2.3.2) | Visits of out-commuters residing in the station to destinations outside the station | Visits of out-commuters who left the station of their residence on the reviewed day and spent a defined minimum aggregate time in at least one destination station. The PLACE OF RESIDENCE-DESTINATION relation. | Visits in destinations are made by the residents of the station who leave the station to arrive in the out-commuting destination in a station other than the station of their residence, and spend a pre-defined period of time within 24 hours (00:00:00 – 23:59:59) in one of the destinations. | MN |
| Visits | OUTC_VIS_FRT (2.3.2) | Visits of out-commuters residing in the station to destinations outside of the station in the direction of the fixed route | Visits of out-commuters who left the station of their residence on the reviewed day and spent a defined minimum aggregate time in at least one destination station in the direction of the fixed route. The PLACE OF RESIDENCE-DESTINATION relation. | Visits in destinations are made by the residents of the station who leave the station to arrive in the out-commuting destination in a station other than the station of their residence, and spend a pre-defined period of time within 24 hours (00:00:00 – 23:59:59) in one of the destinations, provided these stations are located in the direction of the selected fixed route. The relation is established by all stops of the bus or train lines. | MN |
| Visits | OUTC_UTRIP (2.3.3) | Unlinked passengers' trips of out-commuting persons who stayed for a minimum defined time outside the station | Trips of out-commuting people who left the station on the reviewed day and spent a defined minimum aggregate time in at least one other station within the given day. The SOURCE-DESTINATION relation. | Trips into destinations with a defined period of time are made by people in the territory of the station who leave the station for another station in which they spend more than a pre-defined period of time within 24 hours (00:00:00 – 23:59:59). | MN |
| Trips (*) | OUTC_UTRIP_FRT (2.3.5) | Unlinked passengers' trips of out-commuting persons who stayed for a minimum defined time outside the station in the direction of the fixed route | Trips of out-commuting people who left the station on the reviewed day and spent a defined minimum aggregate time in at least one other station in the direction of the selected fixed route within the given day. The SOURCE-DESTINATION relation. | Trips into destinations with a defined period of time are made by people in the territory of the station who leave the station for another station in the direction of the selected fixed route in which they spend more than a pre-defined period of time within 24 hours (00:00:00 – 23:59:59). | MN |
| Trips (**) | POT_UTRIP_FRT (2.3.6) | Potential of unlinked passengers' trips of persons who stayed for a minimum defined time outside the station in the direction of the fixed route | Potential of trips of out-commuting people who left the station on the reviewed day and spent a defined minimum aggregate time in at least one different station along the selected fixed route. The ORIGIN-DESTINATION relation. | Potential of trips is the sum of trips in the direction of the fixed route with the assumption of the sequence of destinations defined by the fixed route. | MN |
| Trips | GETON (2.3.7) | Passengers boarding a train vehicle operating on a fixed route of a carrier | Total number of passengers who, on the reviewed days, boarded the carrier's train vehicles on a given fixed route in the station and left in the direction of the next station. | This includes people that have been ascertained by the counting entity (conductor, train manager) in person in the railway station or in a between-the-stops slot, or established by the mobile network, or by the CHECKIN/OUT system. | MN/CAMP |
| Trips | CAP-PL_TRV | The total passenger capacity of a train vehicle operating on a fixed route departing from the stop in the direction of the fixed route (selected day) | Aggregate planned capacity of the carrier's train vehicles setting out on the reviewed day from the station to the next station in the direction of the fixed route. | Total planned fixed route capacity (seating capacity). | CISJR |
| Trips (***) | CAR-PAS_TRV (2.3.6) | Carried passengers on the train vehicle between stops (occupancy) | Aggregate occupancy of train vehicles between the stops. | This includes people that have been ascertained by the counting entity (conductor, train manager) in person in the railway station or in a between-the-stops slot, or established by the mobile network, or by the CHECKIN/OUT system. | MN/CAMP |

(*) Corresponds to the synthetic indicator ‘PASSENGER TRANSPORT’, see for instance Table 4.
(**) When multiplied by the length of the track stretches, it corresponds to the ‘TRANSPORT PERFORMANCE’ indicator, see for instance. Note: Item does not include trips by people who do not fulfil the definition, i.e. did not stay in the station for more than 60 minutes.
(***) When multiplied by the length of the track stretches it corresponds to the ‘RAIL TRANSPORT PERFORMANCE‘ indicator, see for instance.
(****) CISJR = Nomenclature of public transport flow diagram; MN = mobile network; CAMP = census-based campaign; CZSO = Czech Statistical Office
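
The RES-mn definition in the table above lends itself to a short illustration. The following is a minimal sketch, assuming a hypothetical record layout of (participant id, station, timestamp) tuples for one reviewed day. Only the two time windows come from the table; the function names and the data layout are assumptions made for this example and do not describe the actual processing pipeline.

```python
from datetime import datetime, time

# Time windows taken from the RES-mn definition in the table above.
MORNING = (time(0, 0, 0), time(4, 10, 0))
EVENING = (time(19, 50, 0), time(23, 59, 59))


def in_window(t, window):
    """Return True if time-of-day t lies inside the inclusive window."""
    start, end = window
    return start <= t <= end


def residents_by_station(records):
    """Count residents per station for one reviewed day.

    records: iterable of (participant_id, station, datetime) tuples
    (a hypothetical layout chosen for this sketch).
    A participant counts as a resident of a station if recorded there
    both in the morning window and in the evening window.
    """
    morning, evening = set(), set()
    for pid, station, ts in records:
        if in_window(ts.time(), MORNING):
            morning.add((pid, station))
        if in_window(ts.time(), EVENING):
            evening.add((pid, station))
    counts = {}
    for pid, station in morning & evening:
        counts[station] = counts.get(station, 0) + 1
    return counts
```

Because the count is rebuilt for every reviewed day, the number of residents per station can differ from day to day, which is the dynamic behaviour the publication points out.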



Figure i illustrates the differences between individual types of data items: the segment (904 municipalities) on the left and the stations (23 municipalities) on the right. The hierarchy of the items as defined by the methodology is clearly shown. The first and second columns differ in opposite directions, with UTRIP_60 > RES for the segment and UTRIP_60 < RES for the stations. The reason is that the segment consists of a random selection of municipalities (including small ones), where out-commuting outside the municipality is more frequent, whereas mid-sized and large municipalities prevail among the stations and mobility within these municipalities is not considered out-commuting. The methodology applied defines out-commuting as crossing the borders of a municipality. This situation is clearly illustrated by municipalities such as Pardubice, Kolín, Karlovy Vary or Beroun.

Figure i Comparison of measurement results of out-commuters with CZSO data

The selection of a suitable criterion for the time spent in the in-commuting destination and outside home is based on the fact that the number of residents and out-commuters changes from day to day and between regions, while the structure based on the profile of time spent outside home remains almost the same, as demonstrated by Figure ii.
The set of out-commuters is defined as the participants who spend more than 60 minutes outside their municipality of residence. The reasons are both methodological, reflecting the significance of the mobility for the territory, and technical, reflecting the frequency with which the mobile terminal communicates with the network.
For each group defined, the methodology describes whether and how the other time-classified groups of participants (180 min. and 300 min.) can be determined by a simple calculation based on a previously known share.
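
The 60-minute criterion and the share-based calculation can be sketched as follows. This is an illustration only: the dictionary of minutes spent outside the municipality, the function names and the share value are hypothetical and are not taken from the study.

```python
def count_out_commuters(minutes_outside, threshold_min=60):
    """Count participants who spent more than threshold_min minutes
    outside their municipality of residence on the reviewed day."""
    return sum(1 for m in minutes_outside.values() if m > threshold_min)


def estimate_from_share(base_count, share):
    """Estimate the size of a longer-stay group (e.g. 180 or 300 min)
    from the 60-minute count and a previously known share of it."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return round(base_count * share)


# Hypothetical minutes spent outside the home municipality.
minutes_outside = {"p1": 30, "p2": 75, "p3": 200, "p4": 480, "p5": 60}
base = count_out_commuters(minutes_outside)  # 60-minute criterion

# Illustrative share only; a real value would come from the stable
# time profile described above.
est_180 = estimate_from_share(1000, share=0.25)
```

The shortcut relies on the time profile being nearly constant across days and regions: only the 60-minute count has to be measured, and the other groups follow from it.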

Figure ii Selection of the trajectory’s definition point from the point of view of time spent in the station

The tested segment was defined from all territories with available population mobility information (Figure iii), and selected municipalities served by particular fixed routes and trains were used to draft the methodology (Figure iv). The basic territorial unit for monitoring transport relations is the region defined by the boundaries of individual municipalities. Interpretation restrictions apply to the examples provided. They stem from the limits of the tested data sets, which failed to provide comprehensive underlying materials for solving the task; this concerns Prague, Brno and Pilsen. Further restrictions arise from the licence conditions of the data sets used. These restrictions relate only to the examples of application and have no impact on the procedures defined by the methodology. Data from other sources (SLDB, census, etc.) are complete.

Table 1 Fixed route abbreviations


Figure iii Overview map of departure basic resident units which had been aggregated into municipalities of the tested segment (green)



Figure iv Functional territory, reference territory and reference fixed routes



This section defines the analysed territory and summarises the situation from the point of view of the overall mobility, number of residents and out-commuters.
The average number of residents in the stations, excluding the statutory cities, over 7 days (Monday to Sunday) is 233,000. The minimum was 195,100 (Saturday) and the maximum 287,123 (Thursday); the variation over the analysed days is 92,100 inhabitants. The right-hand part of Figure v shows the number of residents in the stations on the suburban route; the left-hand part shows the total number of residents per day in all stations of the route.
The number of out-commuters differs substantially depending on the determination method applied (Figure vi). The examples document the overall absolute values for particular stations and days, for trips and visits, and in the direction of the route, complemented with the number of residents in the station. More detailed information (at the level of PM fixed route stations) is illustrated by Figure v.
Where the variation coefficient equals 0, the number of residents of the particular place is the same over the entire reviewed period. The variation coefficient is determined as the ratio of the standard deviation to the absolute value of the mean (average). As this is a ratio indicator, it fluctuates more in municipalities with a smaller number of inhabitants (monitoring). Examples include Potůčky, Pernink or Horní Blatná. Figure viii shows the average variation coefficient of the number of residents for the day groups Tue, Wed, Thu and Sat, Sun, which is approximately 21% and 28% respectively. The average and the spread should, however, be assessed in view of the representation of size groups of the reviewed municipalities.
Where the variation coefficient of the out-commuters correlates with the data for the residents, this could be considered duplicate information. For the purpose of comparison, or rather correlation with the existing CZSO data about usual residents and out-commuters, the best match is the average for Tue-Thu, or Sat-Sun.
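
The variation coefficient described above can be computed in a few lines. This sketch assumes the population standard deviation; the text does not say which variant was used, so that choice, like the sample numbers, is an assumption made for illustration.

```python
from statistics import mean, pstdev


def variation_coefficient(daily_residents):
    """Ratio of the standard deviation to the absolute mean.

    Uses the population standard deviation (pstdev); the source does
    not say which variant was used, so this is an assumption.
    """
    avg = mean(daily_residents)
    if avg == 0:
        raise ValueError("mean is zero; coefficient is undefined")
    return pstdev(daily_residents) / abs(avg)


# Hypothetical daily resident counts (Tue, Wed, Thu) for a large
# and a small municipality.
large = [50_000, 50_500, 49_500]
small = [400, 520, 280]

cv_large = variation_coefficient(large)
cv_small = variation_coefficient(small)
constant = variation_coefficient([1200, 1200, 1200])  # no change at all
```

In this made-up example a change of about a hundred residents barely moves the coefficient of the large municipality but dominates that of the small one, which is the size effect noted in the text.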
Figure ix is a concrete representation of the application of the methodology to the PM fixed route. It shows the difference between the data established by CZSO (first value in row), the values for relations of visits (second value in row) and the sequence relations of trips (third value in row). Fixed route PM_BE-PH_1 is shown in blue, fixed route PM_BE-PH_2 in red. The values are averages for Tue, Wed and Thu, with the trajectory definition point set to a stay in the station of at least 60 minutes. The difference between the CZSO methodology and this methodology is clearly illustrated by Figure xiii.
Figure x demonstrates the development of the number of out-commuting trips and the number of out-commuting trips in the direction of fixed routes. The number of out-commuters in the direction of the fixed route has to be charted on the secondary axis, in a different scale. Consider the trend in the volume of overall transport demand over the days and the trend in transport demand in the direction of the fixed route. The trends for the PM_BE-PH_1, PM_PH-BE_2 and R_PL-PH_1 routes are dependent: the notional line linking the chart's vertices has a concave shape. The trend of the R_PH_PL_2 route behaves slightly differently. The trends for the EC_BR-PH_2 and EC_PH-BR_1 routes behave differently, although the shape of the overall transport demand does not differ from other relations. Possible interpretation: a change in the demand for transport performance on a particular fixed route depends on certain types of relations and on factors other than a mere change in the volume of total transport performance demand (mobility). Figure xi demonstrates the situation at the level of individual stations of the reviewed fixed routes.


Figure v Number of residents in selected stations and days

Figure vi Comparison of various methods for the determination of out-commuters

Figure vii Fluctuations of the number of residents in the stations on Tue, Wed and Thu

Figure viii Fluctuation of the number of residents in the stations according to day groups


Figure ix Graphical representation of the relation matrix: out-commuters in the direction of the fixed route CZSO / VISITS / TRIPS


Figure x Comparison of the number of out-commuters with out-commuters in the direction of the fixed route


Figure xi Comparison of the number of out-commuters with out-commuters in the direction of the fixed route at the level of stations





This section analyses performance (the number of passengers transported and its trend over time) in relation to total transport demand, the so-called modal split. This makes it possible to analyse the reasons behind the changes, primarily exogenous ones. Does the number of transported passengers rise or fall as a result of a change in the overall number of passengers commuting on a particular fixed route (a change in overall demand)? Does railway transport in general have the same share on all operated fixed routes and market relations, provided other conditions remain unchanged?
The knowledge of conditions in the territory is represented by the knowledge of the so-called 100% transport demand on a particular fixed route. The key question that this information answers is whether the number of transported passengers on individual fixed routes correlates with the level and dynamics of the total transport demand of the given relation. Answering this question is important for further steps which result in business-operation measures. If differences occur, can we justify them? If so, can we influence them? Where business-operation measures have been implemented, what is the outcome?
Figure xii (Share of passengers traveling by trains on transport demand), top set, recaps the so-called modal split for individual fixed routes: total passenger transport demand and the transport demand realised on the railway (analysed fixed route). The first three charts on the left show the absolute values; the shares are shown in the remaining two charts in the row. The bottom set shows the transport performance expressed by the number of passengers transported. The first two charts on the left show the absolute values; the shares are shown in the last chart in the row. The share of rail in transport demand is higher on weekends, with the exception of the PM_PH-BE_2 fixed route for the ‘out-commuting visitors’ category, which could be interpreted as residents. In this case, taking into account the conditions on the route and within the territory, one could conclude that on weekends Prague residents use this fixed route less than the residents of other stations. The EC_PH-BR_1 fixed route has a significant share, exceeding 70% of transport demand and 100% of transport performance on weekends. To understand it, one should carefully consider the methodology applied, in particular the fact that only trips with a trajectory definition point are included. This fact affects the results on all routes, in particular where the route or a station on it serves as a node or a transit point. For this particular fixed route, this concerns Prague, Pardubice and Česká Třebová (see Figure xii or Figure xiv). The restriction of the analysed data sets plays only a marginal role.
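
The share of rail in transport demand discussed above is a simple ratio. The sketch below shows the calculation with hypothetical daily figures; the function name and the numbers are assumptions for illustration and do not reproduce the values in Figure xii.

```python
def rail_share(carried_passengers, total_demand):
    """Share of rail in total transport demand for one route and day."""
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    return carried_passengers / total_demand


# Hypothetical daily figures for one fixed route:
# day -> (passengers carried by rail, total transport demand).
week = {
    "Tue": (4_000, 20_000),
    "Sat": (2_400, 8_000),
}
shares = {day: rail_share(c, d) for day, (c, d) in week.items()}
```

The result is deliberately not capped at 1. The text reports a share above 100% of transport performance for EC_PH-BR_1 on weekends, and a cap would hide exactly that effect of the trajectory definition point.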
Figure xiii (Out-commuters transport demand potential, carried passengers and CZSO data), top set, compares the overall potential transport performance in terms of passengers transported for individual fixed routes, using the daily averages (Tue, Wed, Thu and Sat, Sun). The bottom set shows the equivalent of transport performance in terms of passengers transported as determined by CZSO statistical censuses: out-commuting and in-commuting direction flows to employment and school. It shows why such statistics should not be used as transport statistics.
Figure xiv (Overall transport demand and number of boarded passengers per days and fixed routes from Monday to Sunday) shows, for the analysed fixed routes, the changes over the days in overall transport demand: “all” transport network participants’ trips; residents’ demand, second chart from the left; visits, third chart from the left; and boarded passengers, first chart on the left. The fourth chart from the left shows the ratio of checked-in passengers to all network participants, and the fifth from the left shows the ratio of checked-in passengers to residents. The first row shows EC, the second PM and the third R. These formats are crucial for determining and specifying the set of passengers who do not use trains on the fixed routes. Analysing the set can provide answers to questions such as “which passengers make up the peaks” or “how big is the group of participants who could use railway transport but do not”. These groups are clearly identifiable at first sight due to the concave vs. flat behaviour of the changes in overall transport demand or checked-in passengers on PM fixed routes, and due to the concave vs. convex behaviour of changes in overall transport demand or checked-in passengers on the R_PH_PL_2 fixed route.
Figure xvii shows potential and realised transport performance. In accordance with the methodology, it takes into account the number of passengers who boarded and got off, or started and ended their trip, in individual stations on the route. The chart clearly illustrates that the PM fixed route serves the potential equally, the EC fixed route serves the potential only partially, and the R fixed route serves the potential selectively.


Figure xii Share of passengers traveling by trains on transport demand

Figure xiii Out-commuters transport demand potential, carried passengers and CZSO data

Figure xiv Overall transport demand and number of boarded passengers per days and fixed routes from Monday to Sunday

Figure xv Overall transport demand, residents' demand and number of boarded passengers per stations and days


Figure xvi Overall transport demand, residents' demand and number of boarded passengers per days and stations


Figure xvii Carried passengers between-the-stops on fixed route in the context of total transport's demand potential





This section enables a detailed analysis of the conditions on the route from the point of view of individual vehicles, stations and short time stretches (hours).
Figure xviii (Comparison of the number of train passengers getting on and off on the fixed routes) characterises the fixed route in terms of daily cycle service. The PM route serves mostly the daily cycle (evenly distributed number of boarding passengers in both directions), the R route serves it only partially (mostly evenly distributed number of boarding passengers in both directions) and the EC route serves the daily cycle the least (different number of boarding passengers in both directions).
Figure xix (Number of passengers getting on and off the trains per stations and days) documents the dominance of the stations on individual fixed routes. In the case of EC, the departure and end stations dominate. In the case of PM, a single destination station strongly dominates; in the case of R, the destination and departure stations strongly dominate.
Figure xx (Number of boarding passengers in selected stations and hours during the week) shows the detail of two stations on the PM route, Beroun in the first row and Černošice in the second row, in terms of the ratio of all passengers to boarded passengers in the morning rush hour. In the case of Beroun, the chart clearly shows a lower share of rail in transport demand between 6 am and 8 am compared to other morning hours. In the case of Černošice, no unambiguous morning pattern in the share of railway transport could be identified.
Figure xxi shows the hourly view of the situation analysed in the comments to Figure xiv (Overall transport demand and number of boarded passengers per days and fixed routes from Monday to Sunday).
Figure xxii (Carried passengers between-the-stops on the fixed routes in the context of train connections' capacity) shows the total vehicle capacity (in grey) and the number of passengers transported (in blue). The chart clearly illustrates the carrier's efforts to respond to demand by offering transport capacities and the fact that the current supply adequately covers demand.
Figure xxiii (Comparing peak hour load with transport demand on the fixed route) shows that the capacity offered does not cover demand (bottom row), together with the total number of transported passengers and total transport demand (top row). It is clear that with the same or similar demand volume, both overall and in the number of passengers transported, the capacity on certain days is not sufficient. This format is crucial for answering some research questions, in particular whether a better organisation of transport service, higher awareness of transport network participants or an improved business policy would make it possible to exploit the available capacity better. In theory, the situation could be solved by distributing the load between two vehicles, the effect of which is shown in Figure xxiv (Peak hour load split according to vehicles and hours of the day).
Figure xxv (Unsuitable average capacity of the train vehicle) is based on the capacity/occupancy analysis, taking into account the average values. The example of the PM_BE-PH_1 fixed route clearly shows that the offered capacity is probably being reduced (by removing one suburban train module) one vehicle too early to ensure seating capacity for all passengers.
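
The capacity/occupancy comparison behind Figures xxii to xxv can be sketched as a check of carried passengers against planned seating capacity. The hourly values and names below are hypothetical and serve only to illustrate the comparison; in the table's terms they stand in for CAR-PAS_TRV and CAP-PL_TRV.

```python
def overloaded_slots(occupancy, capacity):
    """Return the time slots in which carried passengers exceed the
    planned seating capacity, with the shortfall in seats."""
    shortfall = {}
    for slot, carried in occupancy.items():
        seats = capacity[slot]
        if carried > seats:
            shortfall[slot] = carried - seats
    return shortfall


# Hypothetical hourly values for one between-the-stops stretch.
occupancy = {"06:00": 310, "07:00": 520, "08:00": 450, "09:00": 260}
# Capacity is halved from 08:00, e.g. by removing one train module.
capacity = {"06:00": 620, "07:00": 620, "08:00": 310, "09:00": 310}

result = overloaded_slots(occupancy, capacity)
```

In this made-up case only the 08:00 slot is short of seats, which mirrors the observation that capacity may be reduced one vehicle too early.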



Figure xviii Comparison of the number of train passengers getting on and off on the fixed routes

Figure xix Number of passengers getting on and off the trains per stations and days


Figure xx Number of boarding passengers in selected stations and hours during the week.


Figure xxi Visual comparison of key variables of the fixed route in hours of the week (overview and segment detail)


Figure xxii Carried passengers between-the-stops on the fixed routes in the context of train connections' capacity


Figure xxiii Comparing peak hour load with transport demand on the fixed route


Figure xxiv Peak hour load split according to vehicles and hours of the day


Figure xxv Unsuitable average capacity of the train vehicle






modata-method | Copyright © 2016 VŠB-TU Ostrava. All Rights Reserved. | Contact: miroslav.voznak@vsb.cz