Addition of Application and Analysis Guide #1086
Conversation
…e calculation as it's not necessary
Thanks @JoeJimFlood, job well done. Following up on what I sent you earlier this week in reference to this TOD scenario:
- How are the (increased) willingness and propensity to ride transit reflected in this TOD scenario, from a travel behavior sensitivity perspective? Perhaps in the utility expression calculations? In tour mode choice and trip mode choice utility functions?
- How does this TOD scenario determine the impact of TOD on Single-Occupant Vehicle trips, in terms of significance?
Thanks for the comments @guyrousseau:
- The scenario currently doesn't describe editing any of the utility expressions, but as the residents of the new development follow characteristics of TOD it could be expected that they'd be more likely to take transit, etc.
- I can add that to the metrics that are calculated.
|
Hi Joe - great example, would be something to work in "Barbenheimer" but that may be pushing it. I had a couple of thoughts on potential extensions to the guide:
|
@mmilkovits Thanks for the comments! I'm now thinking of maybe sneaking a covert reference or two into it...
|
|
|
||
| ## Introduction | ||
|
|
||
| Many contemporary urban planners encourage developers to build denser housing, particularly around transit stops. Planners will naturally want to gauge the impact of such a development on their jurisdiction's transportation system, particularly on metrics such as VMT (and, by extension, greenhouse gas emissions) and transit boardings (and, by extension, farebox revenue). To demonstrate this, we will analyze a hypothetical development in the San Diego region. The development will add 2000 households and 1000 retail jobs in the vicinity of the Grossmont Station on the Green and Orange Lines of San Diego's light rail system, where there is an existing auto-oriented shopping mall. The guide will show how to change the ActivitySim inputs, how to run the test, and how to calculate key metrics such as VMT and changes in mode share. |
- If this is intended to be a hypothetical example (as later text makes clear), then I would drop references to specific places. For example, just say, "we will be analyzing a hypothetical mixed use development near transit"
|
|
||
| ## Setting Up the Scenario | ||
|
|
||
| Three input files need to be changed in order to run this test: the land use file (landuse.csv) and the files defining the synthetic population (households.csv and persons.csv). While updating the land use file may seem straightforward, it is easy to overlook necessary changes, which could result in the model understating the impact of the change. A modeler needs to edit not only the household and employment fields in the study area but also any field derived from those fields. For example, every new household will have at least one person in it, so the population field will need to be updated as well (along with population density if that is present). If the total population is to be kept the same, households will need to be removed outside of the study area as well. |
- Is the file landuse.csv or land_use.csv?
- Perhaps clarify: "If the total population is to be kept the same in the entire regional modeling area"
- In general, I think it would be good to identify the specific fields that need to be updated
- The subsequent discussion deals with the synthetic population first, and then the land use, so perhaps reflect this order in this intro paragraph
- I'm a little confused by why the intro paragraph starts to get into the weeds a bit (e.g., talking about updating density in the LU field, and then discussing how density is buffer-based in the SANDAG model)
|
|
||
| Because activity-based models use synthetic populations, those input files will need to be updated to reflect the different distribution in the population. There are multiple ways to do this. The ActivitySim consortium maintains the PopulationSim population synthesis software, which includes a `repop` mode that can be used to add households to an existing synthetic population. This demonstration will show how to do this, though any user is welcome to add the additional households in whatever way works best for them (such as through a script). |
- Clarify what is meant by "the different distribution in the population" - is this referring to the geographic distribution? The demographic distribution(s)?
- I find it slightly confusing to say "any user is welcome to add the additional households in whatever way works best for them (such as through a script)". I understand the challenge that there could be any number of ways to adjust a population, but rather than just say "through a script", maybe lay out some options like, "you can repop, you can shift locations of existing synth pop, you can generate an entirely new pop"
|
|
||
| 1. The first thing to do would be to update the new synthetic population. This can be done using PopulationSim's repop mode, which adds additional population on top of existing PopulationSim outputs (a pipeline file is needed). A more detailed explanation of PopulationSim's repop mode can be found in [PopulationSim's documentation](https://activitysim.github.io/populationsim/application_configuration.html#configuring-settings-file-for-repop-mode), but this guide will briefly provide some examples of how PopulationSim can be configured for this particular scenario. | ||
|
|
||
| First, the `run_list` in the settings file needs to be adjusted to have PopulationSim run the repop steps: |
- Mention the specific name of the settings file. Although individual implementations may vary in terms of the file names, field names, etc, to the extent possible try to make explicit.
| - write_tables.repop | ||
| ``` | ||
|
|
||
| Next, `repop_control_file_name: repop_controls.csv` should be added to the settings file. This tells PopulationSim which file in the configs directory (configs\repop_controls.csv) defines the controls for the repop run. The following configuration controls for characteristics of the population within a TOD area. For example, TOD is more likely to attract smaller households, and those households are more likely to include workers, more likely to be headed by younger adults, and less likely to have children than the general population. |
Does the user need to first create the repop_controls.csv file? How does the configuration below reflect the anticipated TOD population?
| ``` | ||
|
|
||
| | target | geography | seed_table | importance | control_field | expression | |
Maybe explain why there are no entries in this table for HHSize_3 or greater, and no age controls for populations <18 or >54 (how does that work exactly?)
| | Age_35to44 | mgra | persons | 100000 | Age_35to44 | (persons.AGEP >= 35) & (persons.AGEP <= 44) | | ||
| | Age_45to54 | mgra | persons | 100000 | Age_45to54 | (persons.AGEP >= 45) & (persons.AGEP <= 54) | | ||
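To illustrate what the `expression` column does, here is a minimal sketch that evaluates the two age expressions above with pandas. It assumes a hypothetical output persons table that carries an `mgra` column alongside `AGEP`; the sample rows are invented, and Python's `eval` is used only for illustration, not as a description of how PopulationSim evaluates expressions internally.

```python
import pandas as pd

# Hypothetical persons table (invented rows); AGEP is the field used in the controls.
persons = pd.DataFrame({
    "mgra": [579, 579, 579, 4502, 4502],
    "AGEP": [36, 44, 50, 45, 62],
})

# Expressions copied from the controls table above.
controls = {
    "Age_35to44": "(persons.AGEP >= 35) & (persons.AGEP <= 44)",
    "Age_45to54": "(persons.AGEP >= 45) & (persons.AGEP <= 54)",
}

# Each expression yields a boolean mask; summing it by zone counts matching persons.
counts = pd.DataFrame({
    name: eval(expr, {"persons": persons}).groupby(persons["mgra"]).sum()
    for name, expr in controls.items()
})
print(counts)
```

In this toy table the 62-year-old matches neither expression, so that person is not counted toward either age control. Counts like these can be compared against the targets in the control total file after a repop run.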
|
|
||
| The totals in each of the zones are then defined in the control total file, which can be defined in the settings file as follows: |
What is the name of the control total file? Where is it located? What is the difference between a filename and a tablename?
| | 579 | 500 | 250 | 150 | 50 | 240 | 120 | 400 | 150 | 200 | 200 | 150 | | ||
| | 4502 | 500 | 250 | 150 | 50 | 240 | 120 | 400 | 150 | 200 | 200 | 150 | | ||
|
|
||
| The output synthetic population files then need to be placed in the `data` directory for the ActivitySim run. |
Meaning households.csv and persons.csv? The more explicit you are, the better.
|
|
||
| 2. The land use file now needs to be updated to reflect the updated population. These lines of code update the household and population values within the land use file (assuming that the synthetic population exists as data frames called `households` and `persons` and the land use file is a data frame called `land_use`). |
Maybe you should provide instructions on how someone can access the required data frames?
| land_use["pop"] = persons.groupby("home_zone_id").count()["person_id"] | ||
| del persons["home_zone_id"] | ||
| ``` | ||
| Further variables will need to be edited as well. For example, the SANDAG example contains variables on the number of housing units in each MAZ, including those that are vacant (so this will be higher than the number of households). There are also variables for population density. It may initially seem that one could calculate them simply by dividing the updated population by the total area. While some models may use population density calculated in that way, the population density variables in the SANDAG model are actually the population within a buffer and are calculated via a preprocessing step that's external to ActivitySim (which should be rerun in this particular scenario). One should pay close attention to how each variable is defined to reduce the risk of a misunderstanding causing incorrect model results. |
I think you should explicitly list the fields that need to be updated given this example (acknowledging that other models may have different data items available, different naming conventions, etc).
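As a rough end-to-end sketch of the household and population update in step 2, the following builds small stand-in data frames and recomputes the counts by zone. The tables here are invented; in practice they would be read with `pd.read_csv` from the files in the `data` directory. The `hh` field name and the zone-indexed `land_use` frame are assumptions, so check them against your own land use file. Reindexing with `fill_value=0` keeps zones without any residents at zero rather than leaving them as missing values.

```python
import pandas as pd

# Stand-in inputs (invented); real runs would load these with pd.read_csv.
land_use = pd.DataFrame(
    {"hh": [10, 0, 5], "pop": [25, 0, 12]},
    index=pd.Index([1, 2, 3], name="zone_id"),
)
households = pd.DataFrame({"household_id": [1, 2, 3], "home_zone_id": [1, 1, 3]})
persons = pd.DataFrame({"person_id": [1, 2, 3, 4], "household_id": [1, 1, 2, 3]})

# Attach each person's home zone from their household.
persons = persons.merge(
    households[["household_id", "home_zone_id"]], on="household_id", how="left"
)

# Count households and persons by home zone.
hh_counts = households.groupby("home_zone_id")["household_id"].count()
pop_counts = persons.groupby("home_zone_id")["person_id"].count()

# Align to the land use index; zones with no residents get 0 instead of NaN.
land_use["hh"] = hh_counts.reindex(land_use.index, fill_value=0)
land_use["pop"] = pop_counts.reindex(land_use.index, fill_value=0)
print(land_use)
```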
|
|
||
| 3. The land use data needs to be adjusted again for the retail jobs. This is more straightforward than adjusting the population, as ActivitySim models don't typically have a synthetic set of establishments. As this particular scenario adds 1000 retail jobs to the TOD development, the values of the retail and total employment fields simply need to be updated for the zones within the study area: | ||
| ``` | ||
| # Adjust employment |
What is this block of code supposed to be applied to? Again, I think you should err on the side of being as explicit as possible.
| uv run activitysim run -c configs\common -c configs\resident -d data_full -o output --ext extensions | ||
| ``` | ||
|
|
||
| ## Analyzing the Results |
This list of metrics feels a bit brief, given the scope that "Ideally, the Application & Analysis Guide should also provide guidance on potential analysis metrics as well as appropriate levels of aggregation."
|
|
||
| The following code blocks demonstrate how to calculate key metrics from the model outputs. They all assume that the ActivitySim output files will be read in as a data frame where the name will be the same as the file name but without the prefix or the file extension (e.g. final_trips.csv will be read as trips). | ||
|
|
||
| ### Vehicle Miles Traveled |
- Other metrics may be of interest, such as VMT / capita. Part of the goal of these guides is to identify a set of metrics that are relevant for the tests, and some guidance on how to interpret them, so that the guide can help both model users as well as managers understand what they should be looking at.
- A key part of any metric calculation is the comparison to something else - to contextualize the metrics. Eg. Is per capita VMT greater or less than in other areas of the region? What is the percentage increase in aggregate VMT in the subarea as well as in the overall modeled area?
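One way the per capita comparison suggested above could look is sketched below. Everything in it is an assumption made for illustration: the `distance` column (in miles), the mode labels, the occupancy factors, and the sample rows. The zone IDs 579 and 4502 are the two zones that appear in the control totals. The metric is resident VMT attributed to the home zone, not VMT on links inside the subarea.

```python
import pandas as pd

# Invented sample outputs; column names and mode labels are assumptions.
households = pd.DataFrame({"household_id": [1, 2, 3], "home_zone_id": [579, 4502, 1]})
persons = pd.DataFrame({"person_id": [1, 2, 3, 4, 5], "household_id": [1, 1, 2, 3, 3]})
trips = pd.DataFrame({
    "household_id": [1, 1, 2, 3, 3],
    "trip_mode": ["DRIVEALONE", "SHARED2", "WALK_TRANSIT", "SHARED3", "DRIVEALONE"],
    "distance": [2.0, 8.0, 5.0, 7.0, 10.0],
})

# Assumed average vehicle occupancy by auto mode; non-auto modes contribute no VMT.
occupancy = {"DRIVEALONE": 1.0, "SHARED2": 2.0, "SHARED3": 3.5}
trips["vmt"] = (trips["distance"] / trips["trip_mode"].map(occupancy)).fillna(0.0)

study_zones = [579, 4502]

def vmt_per_capita(trips, persons, households, zones=None):
    """Resident VMT per person, optionally limited to households in `zones`."""
    if zones is not None:
        in_zones = households["home_zone_id"].isin(zones)
        hh_ids = households.loc[in_zones, "household_id"]
        trips = trips[trips["household_id"].isin(hh_ids)]
        persons = persons[persons["household_id"].isin(hh_ids)]
    return trips["vmt"].sum() / len(persons)

study = vmt_per_capita(trips, persons, households, study_zones)
region = vmt_per_capita(trips, persons, households)
print(f"study area: {study:.2f}, region: {region:.2f}")
```

Reporting the study area value next to the regional value gives the context the comment asks for. The same function could be applied to baseline and build outputs to get the percentage change in aggregate VMT.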
| ``` | ||
| mode_share = trips["trip_mode"].value_counts(normalize = True) | ||
| ``` | ||
| This returns the share of trips that use each mode, expressed as a fraction between 0 and 1. However, a single localized development won't move the regional needle much, so it may be hard to tell whether there was an impact. The following metric computes the *tour* mode share to work for households living in zones close to the transit stop, which should show a much larger difference from the baseline (there are no households in the study area in the baseline run, so the baseline mode share would be undefined). |
I wouldn't think the point of this example would be to demonstrate how it moves the needle regionally, but rather, is transit mode share higher in the new development than in the region, which would be a step in the right direction.
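A comparison along those lines could be sketched as follows, placing the work tour mode share of study area households next to the regional share. The `primary_purpose`, `tour_mode`, and `home_zone_id` column names, the mode labels, and the sample rows are assumptions for illustration. The zone IDs are the two that appear in the control totals.

```python
import pandas as pd

# Invented sample outputs; column names and mode labels are assumptions.
households = pd.DataFrame({"household_id": [1, 2, 3], "home_zone_id": [579, 4502, 1]})
tours = pd.DataFrame({
    "household_id": [1, 1, 2, 2, 3, 3, 3],
    "primary_purpose": ["work", "shopping", "work", "work", "work", "work", "school"],
    "tour_mode": ["WALK_TRANSIT", "WALK", "WALK_TRANSIT", "DRIVEALONE",
                  "DRIVEALONE", "DRIVEALONE", "WALK"],
})

study_zones = [579, 4502]

# Keep work tours and attach the home zone of the household making each tour.
work_tours = tours[tours["primary_purpose"] == "work"].merge(
    households[["household_id", "home_zone_id"]], on="household_id", how="left"
)
in_study = work_tours["home_zone_id"].isin(study_zones)

# Side-by-side mode shares; modes absent from one group show as 0.
comparison = pd.DataFrame({
    "study_area": work_tours.loc[in_study, "tour_mode"].value_counts(normalize=True),
    "region": work_tours["tour_mode"].value_counts(normalize=True),
}).fillna(0.0)
print(comparison)
```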
| ``` | ||
| auto_ownership = households["auto_ownership"].value_counts(normalize = True) | ||
| ``` | ||
| However, auto ownership has the same issue as mode share: the TOD development will barely move the needle on the regional auto ownership rates. Therefore, a similar calculation would need to be done: |
Clarify that this similar calculation is to calculate this measure for the station area? (to compare to the region?)
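Read that way, the station area calculation might look like the sketch below, which compares the auto ownership distribution and the mean autos per household in the study zones against the region. The `home_zone_id` column on the households table and the sample rows are assumptions. The zone IDs are the two that appear in the control totals.

```python
import pandas as pd

# Invented sample of the households output; column names are assumptions.
households = pd.DataFrame({
    "household_id": [1, 2, 3, 4, 5],
    "home_zone_id": [579, 579, 4502, 1, 1],
    "auto_ownership": [0, 1, 1, 2, 2],
})

study_zones = [579, 4502]
in_study = households["home_zone_id"].isin(study_zones)

# Distribution of autos per household, station area next to region.
auto_ownership = pd.DataFrame({
    "study_area": households.loc[in_study, "auto_ownership"].value_counts(normalize=True),
    "region": households["auto_ownership"].value_counts(normalize=True),
}).fillna(0.0).sort_index()

# Mean autos per household as a single summary number for each group.
mean_autos_study = households.loc[in_study, "auto_ownership"].mean()
mean_autos_region = households["auto_ownership"].mean()
print(auto_ownership)
```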
This pull request adds the first iteration of the ActivitySim Application and Analysis guide. It includes guides on how to perform three scenarios that are under the existing user guide: