2 - Where to spend leisure time in NYC?

This project explores the spatial distribution of leisure spaces in New York City, including indoor spaces such as museums, art galleries, and theatres, and outdoor spaces such as parks and open streets. It first aims to identify patterns in how these spaces are distributed across boroughs and neighborhoods, and in relation to population and median household income. It then builds interactive visualizations that help both NYC residents and tourists find information about leisure spaces more easily.

Imports

Code
import altair as alt
import geopandas as gpd
import hvplot.pandas
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
import datetime
import math

%matplotlib inline

NYC Base Maps

Tracts, Neighborhood, and Borough

nyc_tracts = pd.read_csv("data/2022 Census Tracts.csv")
nyc_tracts['geometry'] = gpd.GeoSeries.from_wkt(nyc_tracts['the_geom'])
geo_tracts = gpd.GeoDataFrame(nyc_tracts, geometry='geometry')
geo_tracts = geo_tracts.set_crs(epsg=4326)
tracts = geo_tracts[['BoroName', 'CT2020', 'BoroCT2020', 'NTAName', 'Shape_Area','geometry','GEOID']].copy()
tracts_clean = tracts[['BoroName', 'CT2020', 'NTAName', 'geometry','GEOID']]
Boro_NTA = tracts_clean[['BoroName', 'NTAName']].drop_duplicates(subset = "NTAName")

Dataset Selection & Setup

After browsing the NYC Open Data portal, I selected art galleries, museums, libraries, theatres, and parks as common leisure spaces. In addition, I explored data on Open Streets, a program implemented in many global cities, including New York City, that transforms roads into public spaces for cultural and other events on particular days (mostly weekends).

This step brings in all datasets, cleans them, and aggregates the different types of leisure spaces. The aggregated dataframe is then spatially joined with census tracts, which carry neighborhood and borough names, ready for further investigation of their geospatial distribution.

Similar data wrangling is performed on parks. Parks come as polygons rather than points, which can cause problems in spatial joins when one park falls in two or more tracts or neighborhoods. Hence, each park's centroid replaces its polygon geometry for further analysis.

The Open Streets data comes with more detailed information on the approved times for each street. The days of the week (e.g. Monday) and times of day (e.g. 7:30) at which each street opens and closes are rearranged by melting and pivoting.

Additionally, 2020 Census Data on population and median household income are brought in for analysis.

Indoor Leisure Spaces: Art Galleries, Museums, Libraries, and Theatres

Code
# establish geodataframe
art_galleries = gpd.read_file("data/art galleries.geojson")
geo_art_galleries = gpd.GeoDataFrame(art_galleries, geometry='geometry')
geo_art_galleries = geo_art_galleries.set_crs(epsg=4326)

# add type (work on a copy so the assignments below do not trigger SettingWithCopyWarning)
art_galleries_clean = geo_art_galleries[['name','zip','address1','geometry']].copy()
art_galleries_clean["Type"] = "Art Galleries"
art_galleries_clean = art_galleries_clean.rename(
    columns={"address1": "Address", "name": "Name", "zip": "Zip"})
Code
# Museums 

# establish geodataframe
museums = gpd.read_file("data/museums.geojson")
geo_museums= gpd.GeoDataFrame(museums, geometry='geometry')
geo_museums = geo_museums.set_crs(epsg=4326)

# add type (work on a copy so the assignments below do not trigger SettingWithCopyWarning)
# note: "adress1" is the column name as spelled in the dataset
museums_clean = geo_museums[['name','zip','adress1','geometry']].copy()
museums_clean["Type"] = "Museums"
museums_clean = museums_clean.rename(
    columns={"adress1": "Address", "name": "Name", "zip": "Zip"})
museums_clean
Name Zip Address geometry Type
0 Alexander Hamilton U.S. Custom House 10004.0 1 Bowling Grn POINT (-74.01376 40.70382) Museums
1 Alice Austen House Museum 10305.0 2 Hylan Blvd POINT (-74.06303 40.61512) Museums
2 American Academy of Arts and Letters 10032.0 633 W. 155th St. POINT (-73.94730 40.83385) Museums
3 American Folk Art Museum 10019.0 45 West 53rd Street POINT (-73.97810 40.76162) Museums
4 American Immigration History Center 0.0 Ellis Island POINT (-74.03968 40.69906) Museums
... ... ... ... ... ...
125 American Sephardi Federation / Sephardic House 10011.0 15 W. 16th St. POINT (-73.99389 40.73808) Museums
126 YIVO Institute for Jewish Research 10011.0 15 W. 16th St. POINT (-73.99379 40.73796) Museums
127 American Jewish Historical Society 10011.0 15 W. 16th St. POINT (-73.99393 40.73802) Museums
128 Yeshiva University Museum 10011.0 15 W. 16th St. POINT (-73.99382 40.73805) Museums
129 Center For Jewish History 10011.0 15 W. 16th St. POINT (-73.99387 40.73799) Museums

130 rows × 5 columns

Code
# Libraries

# establish geodataframe
library = gpd.read_file("data/libraries.geojson")
geo_library = gpd.GeoDataFrame(library, geometry='geometry')
geo_library= geo_library.set_crs(epsg=4326)

# add type (work on a copy so the assignments below do not trigger SettingWithCopyWarning)
library_clean = geo_library[['name','zip','streetname','geometry']].copy()
library_clean["Type"] = "Libraries"
library_clean = library_clean.rename(
    columns={"streetname": "Address", "name": "Name", "zip": "Zip"})
library_clean
Name Zip Address geometry Type
0 115th Street 10026 West 115th Street POINT (-73.95353 40.80298) Libraries
1 125th Street 10035 East 125th Street POINT (-73.93485 40.80302) Libraries
2 53rd Street 10019 West 53rd Street POINT (-73.97736 40.76081) Libraries
3 58th Street 10022 East 58th Street POINT (-73.96938 40.76219) Libraries
4 67th Street 10065 East 67th Street POINT (-73.95955 40.76492) Libraries
... ... ... ... ... ...
211 Sunnyside 11104 Greenpoint Avenue POINT (-73.92167 40.74085) Libraries
212 Whitestone 11357 14 Road POINT (-73.81070 40.78854) Libraries
213 Windsor Park 11364 Bell Boulevard POINT (-73.75562 40.73450) Libraries
214 Woodhaven 11421 Forest Parkway POINT (-73.86146 40.69453) Libraries
215 Woodside 11377 Skillman Avenue POINT (-73.90979 40.74534) Libraries

216 rows × 5 columns

Code
# Theaters
# establish geodataframe
theaters = gpd.read_file("data/Theaters.geojson")
geo_theaters = gpd.GeoDataFrame(theaters, geometry='geometry')
geo_theaters= geo_theaters.set_crs(epsg=4326)

# add type (work on a copy so the assignments below do not trigger SettingWithCopyWarning)
theaters_clean = geo_theaters[['name','zip','address1','geometry']].copy()
theaters_clean["Type"] = "Theatres"
theaters_clean = theaters_clean.rename(
    columns={"address1": "Address", "name": "Name", "zip": "Zip"})
theaters_clean
Name Zip Address geometry Type
0 45th Street Theater 10036.0 354 West 45th Street POINT (-73.99062 40.75985) Theatres
1 47th Street Theater 10036.0 304 West 47th Street POINT (-73.98811 40.76047) Theatres
2 59E59 10022.0 59 East 59th Street POINT (-73.97038 40.76340) Theatres
3 Acorn Theater 10036.0 410 West 42nd Street POINT (-73.99332 40.75854) Theatres
4 Al Hirschfeld Theater 10036.0 302 W 45th Street POINT (-73.98921 40.75926) Theatres
... ... ... ... ... ...
112 Westside Theater 10036.0 407 W 43rd St POINT (-73.99255 40.75953) Theatres
113 Wings Theatre 10014.0 154 Christopher St POINT (-74.00889 40.73240) Theatres
114 Winter Garden Theatre 10019.0 1634 Broadway POINT (-73.98348 40.76152) Theatres
115 York Theatre 10022.0 619 Lexington Ave POINT (-73.96998 40.75836) Theatres
116 Delacorte Theater 0.0 Central Park - Mid-Park at 80th Street POINT (-73.96882 40.78018) Theatres

117 rows × 5 columns

Aggregate all types of indoor leisure space

Code
frames = [art_galleries_clean, museums_clean, library_clean, theaters_clean]

total_indoor = pd.concat(frames)

total_indoor
Name Zip Address geometry Type
0 O'reilly William & Co Ltd 10021.0 52 E 76th St POINT (-73.96273 40.77380) Art Galleries
1 Organization of Independent Artists - Gallery 402 10013.0 19 Hudson St. POINT (-74.00939 40.71647) Art Galleries
2 Owen Gallery 10021.0 19 E 75th St POINT (-73.96435 40.77400) Art Galleries
3 P P O W Gallerie 10001.0 511 W 25th St POINT (-74.00389 40.74959) Art Galleries
4 P P O W Inc 10013.0 476 Broome St POINT (-74.00176 40.72291) Art Galleries
... ... ... ... ... ...
112 Westside Theater 10036.0 407 W 43rd St POINT (-73.99255 40.75953) Theatres
113 Wings Theatre 10014.0 154 Christopher St POINT (-74.00889 40.73240) Theatres
114 Winter Garden Theatre 10019.0 1634 Broadway POINT (-73.98348 40.76152) Theatres
115 York Theatre 10022.0 619 Lexington Ave POINT (-73.96998 40.75836) Theatres
116 Delacorte Theater 0.0 Central Park - Mid-Park at 80th Street POINT (-73.96882 40.78018) Theatres

1380 rows × 5 columns
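
The combined table keeps each source table's original row labels, so labels such as 112 to 116 occur more than once in its index. A minimal sketch, using two small hypothetical frames in place of the cleaned tables, of how `ignore_index=True` produces a fresh, unique index instead:

```python
import pandas as pd

# Hypothetical stand-ins for two of the cleaned tables.
museums_demo = pd.DataFrame({"Name": ["Museum A", "Museum B"], "Type": "Museums"})
theaters_demo = pd.DataFrame({"Name": ["Theater A"], "Type": "Theatres"})

# Default concat keeps the original labels, so label 0 appears twice.
stacked = pd.concat([museums_demo, theaters_demo])

# ignore_index=True relabels the rows 0..n-1.
stacked_reset = pd.concat([museums_demo, theaters_demo], ignore_index=True)

print(stacked.index.tolist())
print(stacked_reset.index.tolist())
```

A unique index is not required for the spatial join, but it makes later label-based selection less error-prone.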

Spatial Join

Code
geo_total_indoor = gpd.sjoin(
    total_indoor,  # Indoor leisure spaces (points)
    tracts_clean.to_crs(total_indoor.crs),  # Census tracts (in the same CRS)
    predicate="within",
    how="left",
)
geo_total_indoor.head()
Name Zip Address geometry Type index_right BoroName CT2020 NTAName GEOID
0 O'reilly William & Co Ltd 10021.0 52 E 76th St POINT (-73.96273 40.77380) Art Galleries 84.0 Manhattan 13000.0 Upper East Side-Carnegie Hill 3.606101e+10
1 Organization of Independent Artists - Gallery 402 10013.0 19 Hudson St. POINT (-74.00939 40.71647) Art Galleries 17.0 Manhattan 3900.0 Tribeca-Civic Center 3.606100e+10
2 Owen Gallery 10021.0 19 E 75th St POINT (-73.96435 40.77400) Art Galleries 84.0 Manhattan 13000.0 Upper East Side-Carnegie Hill 3.606101e+10
3 P P O W Gallerie 10001.0 511 W 25th St POINT (-74.00389 40.74959) Art Galleries 1134.0 Manhattan 9901.0 Chelsea-Hudson Yards 3.606101e+10
4 P P O W Inc 10013.0 476 Broome St POINT (-74.00176 40.72291) Art Galleries 1156.0 Manhattan 4900.0 SoHo-Little Italy-Hudson Square 3.606100e+10

Outdoor Leisure Space: Parks

Code
# Parks
# establish geodataframe
parks = gpd.read_file("data/Parks Properties.geojson")
geo_parks = gpd.GeoDataFrame(parks, geometry='geometry')
geo_parks= geo_parks.set_crs(epsg=4326)

# add type (work on a copy so later assignments do not trigger SettingWithCopyWarning)
parks_clean = geo_parks[['signname','geometry','acres']].copy()
parks_clean["Type"] = "Parks"
parks_clean = parks_clean.rename(columns={"signname": "Name"})

# keep the original polygons before the geometry is replaced by centroids
parks_clean['new_geom'] = parks_clean['geometry']
Code

# Many parks are multi-polygons that can span several tracts, so each park's centroid is used for a cleaner spatial join.
parks_clean['geometry'] = parks_clean['geometry'].centroid

geo_parks_clean = gpd.sjoin(
    parks_clean,  # Park centroids (points)
    tracts_clean.to_crs(parks_clean.crs),  # Census tracts (in the same CRS)
    predicate="within",
    how="left",
)
geo_parks_clean
/var/folders/q3/y0zpvj752qg3_3nvpkx6v2300000gn/T/ipykernel_80527/3082874007.py:2: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

  parks_clean['geometry'] = parks_clean['geometry'].centroid
Name geometry acres Type new_geom index_right BoroName CT2020 NTAName GEOID
0 Inwood Hill Park POINT (-73.92544 40.87257) 196.398 Parks MULTIPOLYGON (((-73.92093 40.86999, -73.92145 ... 2191.0 Manhattan 29700.0 Inwood Hill Park 3.606103e+10
1 Challenge Playground POINT (-73.72796 40.75662) 2.035 Parks MULTIPOLYGON (((-73.72738 40.75605, -73.72783 ... 936.0 Queens 152902.0 Douglaston-Little Neck 3.608115e+10
2 Sunset Cove Park POINT (-73.82300 40.59853) 9.375 Parks MULTIPOLYGON (((-73.82218 40.59892, -73.82221 ... 1129.0 Queens 107201.0 Breezy Point-Belle Harbor-Rockaway Park-Broad ... 3.608111e+10
3 Grand Central Parkway Extension POINT (-73.85317 40.75316) 249.389 Parks MULTIPOLYGON (((-73.85875 40.76741, -73.85976 ... 625.0 Queens 39902.0 North Corona 3.608104e+10
4 Idlewild Park POINT (-73.75229 40.65043) 180.85 Parks MULTIPOLYGON (((-73.75809 40.65427, -73.75845 ... 1013.0 Queens 66404.0 Springfield Gardens (South)-Brookville 3.608107e+10
... ... ... ... ... ... ... ... ... ... ...
2040 Maria Hernandez Park POINT (-73.92386 40.70317) 6.873 Parks MULTIPOLYGON (((-73.92251 40.70351, -73.92381 ... 1729.0 Brooklyn 42900.0 Bushwick (West) 3.604704e+10
2041 Crotona Parkway Malls POINT (-73.88477 40.84405) 8.75 Parks MULTIPOLYGON (((-73.88496 40.84470, -73.88496 ... 336.0 Bronx 36300.0 West Farms 3.600504e+10
2042 Park POINT (-73.89807 40.84408) 0.511 Parks MULTIPOLYGON (((-73.89759 40.84410, -73.89773 ... 1229.0 Bronx 16500.0 Claremont Village-Claremont (East) 3.600502e+10
2043 Cunningham Park POINT (-73.76880 40.73382) 358.0 Parks MULTIPOLYGON (((-73.77466 40.72442, -73.77439 ... 2241.0 Queens 128300.0 Cunningham Park 3.608113e+10
2044 Roberto Clemente Ballfield POINT (-73.96767 40.70635) 1.93 Parks MULTIPOLYGON (((-73.96761 40.70581, -73.96735 ... 2220.0 Brooklyn 54500.0 South Williamsburg 3.604705e+10

2045 rows × 10 columns
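
Centroids computed directly on longitude/latitude coordinates can be slightly off, and GeoPandas warns when this is done. A minimal sketch of one way around it, assuming EPSG:2263 (the New York State Plane CRS already referenced for the Open Streets data) is an acceptable planar CRS: project, take the centroid, then convert back. The polygon and the names `park_demo` and `park_points` are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical rectangle in Manhattan, in longitude/latitude (EPSG:4326).
park_demo = gpd.GeoDataFrame(
    {"Name": ["Demo Park"]},
    geometry=[Polygon([(-73.98, 40.76), (-73.95, 40.76),
                       (-73.95, 40.80), (-73.98, 40.80)])],
    crs="EPSG:4326",
)

# Project to a planar CRS, take the centroid there, then return to EPSG:4326.
centroids = park_demo.to_crs(epsg=2263).centroid.to_crs(epsg=4326)

# Replace the polygon geometry with the centroid points.
park_points = park_demo.copy()
park_points["geometry"] = centroids
print(park_points.geometry.iloc[0])
```

For parks of this size the difference from the unprojected centroid is small, so the tract assignment is unlikely to change for most parks; the projected version mainly avoids the warning and is safer for very large or irregular polygons.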

Outdoor Leisure Space: Open Streets

Code
open_streets = gpd.read_file("data/Open Streets Locations.geojson")
geo_open_streets = gpd.GeoDataFrame(open_streets, geometry='geometry', crs=2263)
geo_open_streets = open_streets.to_crs(epsg=4326)

open_streets_clean = geo_open_streets[['appronstre','apprtostre','apprdayswe','boroughname','reviewstat','shape_stle','geometry']]
Code
join_test = gpd.sjoin(
    open_streets_clean,  # Open street segments (lines)
    tracts_clean.to_crs(open_streets_clean.crs),  # Census tracts (in the same CRS)
    predicate="within",
    how="left",
)
join_test.head()
appronstre apprtostre apprdayswe boroughname reviewstat shape_stle geometry index_right BoroName CT2020 NTAName GEOID
0 DEISIUS STREET STECHER STREET mon,tue,wed,thu,fri Staten Island approvedFullSchools 264.932398036 MULTILINESTRING ((-74.18738 40.53028, -74.1882... 1318.0 Staten Island 17600.0 Annadale-Huguenot-Prince's Bay-Woodrow 3.608502e+10
1 SUFFOLK AVENUE HAROLD STREET fri,sat,sun Staten Island approvedFull 313.087821487 MULTILINESTRING ((-74.12784 40.60288, -74.1277... 1323.0 Staten Island 18703.0 Todt Hill-Emerson Hill-Lighthouse Hill-Manor H... 3.608502e+10
2 SUFFOLK AVENUE HAROLD STREET fri,sat,sun Staten Island approvedFull 142.063500219 MULTILINESTRING ((-74.12772 40.60202, -74.1276... 1323.0 Staten Island 18703.0 Todt Hill-Emerson Hill-Lighthouse Hill-Manor H... 3.608502e+10
3 VERMONT COURT SUFFOLK AVENUE fri,sat,sun Staten Island approvedFull 421.392020366 MULTILINESTRING ((-74.12620 40.60209, -74.1277... 1323.0 Staten Island 18703.0 Todt Hill-Emerson Hill-Lighthouse Hill-Manor H... 3.608502e+10
4 9 STREET ROSE AVENUE fri,sat,sun Staten Island approvedFull 448.103939286 MULTILINESTRING ((-74.11481 40.57316, -74.1157... 1300.0 Staten Island 13400.0 New Dorp-Midland Beach 3.608501e+10
Code
join_test['monday'] = join_test['apprdayswe'].str.count('mon')
join_test['tuesday'] = join_test['apprdayswe'].str.count('tue')
join_test['wednesday'] = join_test['apprdayswe'].str.count('wed')
join_test['thursday'] = join_test['apprdayswe'].str.count('thu')
join_test['friday'] = join_test['apprdayswe'].str.count('fri')
join_test['saturday'] = join_test['apprdayswe'].str.count('sat')
join_test['sunday'] = join_test['apprdayswe'].str.count('sun')
join_test['dayscount'] = join_test[['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday']].sum(axis=1)

join_test
appronstre apprtostre apprdayswe boroughname reviewstat shape_stle geometry index_right BoroName CT2020 NTAName GEOID monday tuesday wednesday thursday friday saturday sunday dayscount
0 DEISIUS STREET STECHER STREET mon,tue,wed,thu,fri Staten Island approvedFullSchools 264.932398036 MULTILINESTRING ((-74.18738 40.53028, -74.1882... 1318.0 Staten Island 17600.0 Annadale-Huguenot-Prince's Bay-Woodrow 3.608502e+10 1 1 1 1 1 0 0 5
1 SUFFOLK AVENUE HAROLD STREET fri,sat,sun Staten Island approvedFull 313.087821487 MULTILINESTRING ((-74.12784 40.60288, -74.1277... 1323.0 Staten Island 18703.0 Todt Hill-Emerson Hill-Lighthouse Hill-Manor H... 3.608502e+10 0 0 0 0 1 1 1 3
2 SUFFOLK AVENUE HAROLD STREET fri,sat,sun Staten Island approvedFull 142.063500219 MULTILINESTRING ((-74.12772 40.60202, -74.1276... 1323.0 Staten Island 18703.0 Todt Hill-Emerson Hill-Lighthouse Hill-Manor H... 3.608502e+10 0 0 0 0 1 1 1 3
3 VERMONT COURT SUFFOLK AVENUE fri,sat,sun Staten Island approvedFull 421.392020366 MULTILINESTRING ((-74.12620 40.60209, -74.1277... 1323.0 Staten Island 18703.0 Todt Hill-Emerson Hill-Lighthouse Hill-Manor H... 3.608502e+10 0 0 0 0 1 1 1 3
4 9 STREET ROSE AVENUE fri,sat,sun Staten Island approvedFull 448.103939286 MULTILINESTRING ((-74.11481 40.57316, -74.1157... 1300.0 Staten Island 13400.0 New Dorp-Midland Beach 3.608501e+10 0 0 0 0 1 1 1 3
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
361 BECK STREET AVENUE ST JOHN wed Bronx approvedFull 619.314479576 MULTILINESTRING ((-73.90222 40.81449, -73.9014... 200.0 Bronx 8300.0 Longwood 3.600501e+10 0 0 1 0 0 0 0 1
362 NEWKIRK AVENUE EAST 17 STREET sun Brooklyn approvedFull 284.786787549 MULTILINESTRING ((-73.96422 40.63510, -73.9641... 1807.0 Brooklyn 52000.0 Flatbush (West)-Ditmas Park-Parkville 3.604705e+10 0 0 0 0 0 0 1 1
363 18 STREET 4 AVENUE mon,tue,wed,thu,fri Brooklyn approvedFullSchools 768.660222317 MULTILINESTRING ((-73.99201 40.66323, -73.9916... 2248.0 Brooklyn 14300.0 Sunset Park (West) 3.604701e+10 1 1 1 1 1 0 0 5
364 34 AVENUE JUNCTION BOULEVARD mon,tue,wed,thu,fri,sat,sun Queens approvedLimited 272.995599601 MULTILINESTRING ((-73.89716 40.75246, -73.8970... 2168.0 Queens 29100.0 Jackson Heights 3.608103e+10 1 1 1 1 1 1 1 7
365 DECATUR STREET SARATOGA AVENUE sat Brooklyn approvedLimited 769.964464111 MULTILINESTRING ((-73.92001 40.68299, -73.9176... 1682.0 Brooklyn 37700.0 Bedford-Stuyvesant (East) 3.604704e+10 0 0 0 0 0 1 0 1

366 rows × 20 columns
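
The seven `str.count` lines can also be written as a single step. A minimal sketch, assuming the values are comma-separated three-letter tokens with no surrounding spaces, as in the rows shown; the series `days` is hypothetical sample data in the same format as `apprdayswe`:

```python
import pandas as pd

# Hypothetical rows in the same comma-separated format as `apprdayswe`.
days = pd.Series(["mon,tue,wed,thu,fri", "fri,sat,sun", "sat"])

# One 0/1 indicator column per day token found in the data.
day_flags = days.str.get_dummies(sep=",")

# Put the columns in calendar order; days absent from the data become 0.
order = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]
day_flags = day_flags.reindex(columns=order, fill_value=0)

# Number of approved days per street segment.
day_flags["dayscount"] = day_flags.sum(axis=1)
print(day_flags)
```

Unlike `str.count`, which counts substring occurrences, `str.get_dummies` splits on the separator, so each token is matched as a whole.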

Code
open_streets_days = join_test[['appronstre', 'BoroName', 'monday', 'tuesday', 'wednesday','thursday', 'friday', 'saturday', 'sunday']]
open_streets_days
appronstre BoroName monday tuesday wednesday thursday friday saturday sunday
0 DEISIUS STREET Staten Island 1 1 1 1 1 0 0
1 SUFFOLK AVENUE Staten Island 0 0 0 0 1 1 1
2 SUFFOLK AVENUE Staten Island 0 0 0 0 1 1 1
3 VERMONT COURT Staten Island 0 0 0 0 1 1 1
4 9 STREET Staten Island 0 0 0 0 1 1 1
... ... ... ... ... ... ... ... ... ...
361 BECK STREET Bronx 0 0 1 0 0 0 0
362 NEWKIRK AVENUE Brooklyn 0 0 0 0 0 0 1
363 18 STREET Brooklyn 1 1 1 1 1 0 0
364 34 AVENUE Queens 1 1 1 1 1 1 1
365 DECATUR STREET Brooklyn 0 0 0 0 0 1 0

366 rows × 9 columns

Census Data

Code
Pop_2020 = gpd.read_file("data/NYC_tracts_2020.geojson")
Pop_2020 = Pop_2020[['GEOID', 'estimate']]
Pop_2020['GEOID']=Pop_2020['GEOID'].astype(int)

tracts_pop = tracts_clean.merge(Pop_2020, on='GEOID', how='left') 
neighbor_pop = tracts_pop.groupby(['NTAName']).sum(['estimate']).reset_index()
Code
income = gpd.read_file("data/NYC_Income.geojson")
geo_income = gpd.GeoDataFrame(income, geometry='geometry')
geo_income = geo_income.set_crs(epsg=4326)
geo_income['GEOID']=geo_income['GEOID'].astype('int')

tracts_income = tracts_clean.merge(geo_income.drop(columns='geometry'), on='GEOID', how='left').dropna()
neighbor_income = tracts_income.groupby(['NTAName', 'BoroName']).median(['estimate']).reset_index()
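
A note on the aggregation: the list passed to `sum([...])` and `median([...])` is not a column selector, and the neighborhood tables printed later in this notebook do show `CT2020` and `GEOID` aggregated alongside `estimate`. A minimal sketch of selecting the column before aggregating, on hypothetical tract rows, so that identifier columns are left out:

```python
import pandas as pd

# Hypothetical tract-level rows.
tracts_demo = pd.DataFrame({
    "NTAName": ["A", "A", "B"],
    "GEOID": [36061000100, 36061000200, 36047000100],
    "estimate": [1000, 2500, 4000],
})

# Select the column first, so identifiers such as GEOID are not summed.
pop_by_nta = tracts_demo.groupby("NTAName", as_index=False)["estimate"].sum()

# The same pattern with a median, e.g. for household income.
median_by_nta = tracts_demo.groupby("NTAName", as_index=False)["estimate"].median()

print(pop_by_nta)
print(median_by_nta)
```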

Chart I: Matplotlib – Parks in Neighborhoods

I first investigate the distribution of parks across neighborhoods and boroughs in relation to population. On the one hand, a higher population means more people need public green space for leisure. On the other hand, less populated neighborhoods tend to have more room for parks. Since the count of parks does not reflect how much space is available, I use acreage rather than counts for this analysis.

To investigate, I used Matplotlib, which is well suited to simple scatterplots that reveal a linear relationship, if there is one.

Code
geo_parks_clean['acres'] = geo_parks_clean['acres'].astype(float)
parks_acres_neighborhood = geo_parks_clean.groupby('NTAName').sum(numeric_only=True).drop(columns=['index_right','CT2020']).reset_index()
parks_acres_pop = neighbor_pop.merge(parks_acres_neighborhood, on='NTAName', how='left').dropna()
parks_pop = parks_acres_pop.merge(Boro_NTA, on='NTAName', how='left').dropna()
parks_pop['acres'].describe()
count     237.000000
mean      105.945812
std       224.643587
min         0.005000
25%         7.651000
50%        20.653000
75%        85.527000
max      1930.636136
Name: acres, dtype: float64
After a quick look at the park data, I noticed some extreme outliers: very large parks that serve not only adjacent neighborhoods but the whole city. I excluded those outliers, keeping neighborhoods with between 7 and 86 acres of parkland (roughly the interquartile range in the summary above), to get a better sense of how much park acreage a typical neighborhood gets.
Code
parks_pop_filtered = parks_pop.loc[(parks_pop['acres'] < 86) & (parks_pop['acres'] > 7) & (parks_pop['estimate'] >0)]
parks_pop_filtered
NTAName CT2020 GEOID_x estimate acres GEOID_y BoroName
5 Astoria (East)-Woodside (North) 241600 505134241600 34825.0 8.642 2.886482e+11 Queens
6 Astoria (North)-Ditmars-Steinway 214102 613377214102 47134.0 11.530 4.690532e+11 Queens
10 Barren Island-Floyd Bennett Field 70202 36047070202 26.0 64.665 3.604707e+10 Brooklyn
11 Bath Beach 235800 396517235800 32716.0 21.398 1.081411e+11 Brooklyn
15 Bedford Park 448115 396055448115 55702.0 23.917 1.800252e+11 Bronx
... ... ... ... ... ... ... ...
225 West Farms 136300 180025136300 18206.0 13.540 3.960554e+11 Bronx
230 Whitestone-Beechhurst 703300 252567703300 28353.0 29.086 2.164866e+11 Queens
232 Williamsburg 695100 468611695100 59410.0 81.282 1.045365e+12 Brooklyn
233 Windsor Terrace-South Slope 363808 288376363808 25442.0 8.408 4.325650e+11 Brooklyn
235 Woodside 406004 505134406004 45417.0 10.217 5.412154e+11 Queens

113 rows × 7 columns
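
The cut-offs 7 and 86 sit close to the 25% and 75% quartiles reported by `describe()`. Assuming that is how they were chosen, the thresholds could be computed from the data instead of hard-coded. A minimal sketch on hypothetical acreage values (the real column would be `parks_pop['acres']`):

```python
import pandas as pd

# Hypothetical acreage values standing in for parks_pop['acres'].
acres = pd.Series([0.5, 4.0, 8.0, 20.0, 60.0, 90.0, 400.0, 1900.0])

# First and third quartiles of the distribution.
q1, q3 = acres.quantile([0.25, 0.75])

# Keep only the values strictly between the two quartiles.
middle = acres[(acres > q1) & (acres < q3)]
print(q1, q3, middle.tolist())
```

Note that this keeps only the middle half of the neighborhoods, so it removes small-acreage neighborhoods as well as the very large parks.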

Code

color_map = {"Bronx": "#550527", "Brooklyn": "#688E26", "Manhattan": "#FAA613", "Queens": "#F44708", "Staten Island": "#A10702"}

fig, ax = plt.subplots(figsize=(11,6))

for BoroName, group_df in parks_pop_filtered.groupby("BoroName"):
    
    ax.scatter(
        group_df["estimate"],
        group_df["acres"],
        marker="P",
        label=BoroName,
        color=color_map[BoroName],
        alpha=0.75,
        zorder=10
    )

ax.legend(loc="best")
ax.set(
    title = "Park Space (acres) Relative to Population in Neighborhoods, by Boroughs",
    xlabel = "Population in each neighborhood (2020 Census)",
    ylabel = "Parks in Acres")
ax.grid(True)

plt.show()

This chart does not suggest a strong linear relationship between population and park acreage. Overall, Manhattan, Queens and Brooklyn host more large parks, but some neighborhoods are particularly underserved, with very high population and low park acreage (points towards the lower right of the chart).
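
The visual impression of a weak linear relationship can be checked with a correlation coefficient. A minimal sketch using Pearson's r on hypothetical values; in the notebook the inputs would be `parks_pop_filtered['estimate']` and `parks_pop_filtered['acres']`, and the number printed here says nothing about the real data:

```python
import pandas as pd

# Hypothetical neighborhood values, not the project's data.
demo = pd.DataFrame({
    "estimate": [10000, 20000, 30000, 40000, 50000],
    "acres": [30.0, 12.0, 45.0, 9.0, 28.0],
})

# Pearson's r: values near 0 indicate little linear association.
r = demo["estimate"].corr(demo["acres"])
print(round(r, 3))
```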

Chart II: Seaborn (x2)

Looking at the indoor leisure space and open street data, I use seaborn to create two types of charts that suit the nature of each dataset.

The first is a grouped bar chart. As with the park distribution, I want a quick view of the number of each type of leisure space in each borough, to identify any general spatial patterns or inequalities. A grouped bar chart is good at revealing such patterns.

The second is a heatmap for open street data. The heatmap explores the number of open streets approved in different boroughs on different days of the week.

Indoor Leisure Spaces: Grouped Bar Charts

Code
indoor_clean = geo_total_indoor.groupby(['Type','BoroName']).count().reset_index().drop(columns=['Zip','Address','geometry','index_right','CT2020','NTAName']).pivot(index='Type',columns='BoroName', values='Name').fillna(0).reset_index()
indoor_melt = indoor_clean.melt(id_vars='Type', value_vars=['Bronx','Brooklyn','Manhattan', 'Queens', 'Staten Island'])
indoor_melt
Type BoroName value
0 Art Galleries Bronx 6.0
1 Libraries Bronx 35.0
2 Museums Bronx 8.0
3 Theatres Bronx 0.0
4 Art Galleries Brooklyn 61.0
5 Libraries Brooklyn 59.0
6 Museums Brooklyn 12.0
7 Theatres Brooklyn 0.0
8 Art Galleries Manhattan 823.0
9 Libraries Manhattan 44.0
10 Museums Manhattan 87.0
11 Theatres Manhattan 115.0
12 Art Galleries Queens 24.0
13 Libraries Queens 65.0
14 Museums Queens 12.0
15 Theatres Queens 2.0
16 Art Galleries Staten Island 3.0
17 Libraries Staten Island 13.0
18 Museums Staten Island 9.0
19 Theatres Staten Island 0.0
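
The group, count, pivot and melt chain that produces this table can be expressed more compactly with `pd.crosstab`. A minimal sketch on hypothetical rows (one row per leisure space), offered as an alternative rather than as the notebook's method:

```python
import pandas as pd

# Hypothetical joined rows: one row per leisure space.
spaces = pd.DataFrame({
    "Type": ["Museums", "Museums", "Libraries", "Theatres"],
    "BoroName": ["Bronx", "Manhattan", "Bronx", "Manhattan"],
})

# Wide table of counts; combinations that never occur become 0.
counts_wide = pd.crosstab(spaces["Type"], spaces["BoroName"])

# Long form, one row per (Type, BoroName), as seaborn's `hue` expects.
counts_long = counts_wide.reset_index().melt(id_vars="Type", value_name="value")
print(counts_long)
```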
Code
sns.set_theme(style="whitegrid")

color_map = ["#550527", "#688E26", "#FAA613", "#F44708", "#A10702"]
sns.set_palette(color_map)

sns.catplot(
    data=indoor_melt, kind="bar",
    x="Type", 
    y="value", 
    hue="BoroName",
    aspect=2, 
    alpha=1
).set_axis_labels(
    "Type of Leisure Space", "Counts"
).set(title="Distribution of 4 Types of Leisure Spaces in Each Borough")

Days and Location of Open Streets in NYC: A Heatmap

Code
#heatmap for open_streets: borough x Days of the week 
open_streets_days_melt= open_streets_days.melt(id_vars=['appronstre','BoroName'], value_vars=['monday','tuesday','wednesday', 'thursday', 'friday','saturday','sunday'])
appronstre BoroName variable value
0 DEISIUS STREET Staten Island monday 1
1 SUFFOLK AVENUE Staten Island monday 0
2 SUFFOLK AVENUE Staten Island monday 0
3 VERMONT COURT Staten Island monday 0
4 9 STREET Staten Island monday 0
... ... ... ... ...
2557 BECK STREET Bronx sunday 0
2558 NEWKIRK AVENUE Brooklyn sunday 1
2559 18 STREET Brooklyn sunday 0
2560 34 AVENUE Queens sunday 1
2561 DECATUR STREET Brooklyn sunday 0

2562 rows × 4 columns

Code
open_streets_seaborn = open_streets_days_melt.groupby(['variable','BoroName']).sum(numeric_only=True).reset_index()

# sort the order of day from monday to sunday 

order=['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday']
open_streets_seaborn['variable'] = pd.Categorical(open_streets_seaborn['variable'], categories=order, ordered=True)
open_streets_seaborn = open_streets_seaborn.sort_values(by='variable')
Code
import seaborn as sns

sns.set_theme()

# Pivot to wide form: boroughs as rows, days of the week as columns

open_street_heatmap = (
    open_streets_seaborn
    .pivot(index="BoroName", columns="variable", values="value")
)

# Draw a heatmap with the numeric values in each cell
f, ax = plt.subplots(figsize=(9, 6))

ax = sns.heatmap(open_street_heatmap, annot=True, linewidths=.5)
ax.set(xlabel="Days in a week", ylabel="Borough", title="Number of Streets Open on Specific Days")

Both charts show that Manhattan has a disproportionate number of art galleries and open streets compared with the other boroughs. Museums and theatres are also predominantly located in Manhattan. Interestingly, libraries are distributed more evenly. Overall, leisure spaces are disproportionately abundant and diverse in Manhattan, followed by Brooklyn and Queens, leaving the Bronx and Staten Island with fewer such resources.
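
One way to make "disproportionate" concrete is to compute each borough's share of the citywide total for every type of space. The sketch below shows that calculation on hypothetical counts; the numbers are invented for illustration and are not results from this project.

```python
import pandas as pd

# Hypothetical counts, invented for illustration only.
counts = pd.DataFrame({
    "BoroName": ["Manhattan", "Brooklyn", "Manhattan", "Brooklyn"],
    "Type": ["Art Galleries", "Art Galleries", "Libraries", "Libraries"],
    "Count": [80, 20, 30, 30],
})

# Share of each type's total that falls in each borough.
type_totals = counts.groupby("Type")["Count"].transform("sum")
counts["Share"] = counts["Count"] / type_totals

# One row per borough, one column per type.
shares = counts.pivot(index="BoroName", columns="Type", values="Share")
print(shares.round(2))
```

Comparing these shares with each borough's share of population would show whether a type is concentrated beyond what population alone would suggest.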

Chart III: Altair Charts (x3)

To explore indoor leisure spaces further, I created one chart of their distribution in relation to median household income, to see whether it reflects any socio-economic inequality. I then created a map to visualize this relationship, which can also help locate different kinds of leisure spaces in the city.

My third chart is an interactive bar chart of open streets in Brooklyn, where the audience can choose a day of the week to see all open streets approved for that day and the times they open and close. This could potentially be developed into a tool for residents and tourists to track open streets.

Brush Selection: Indoor Space and Income

Code
neighbor_indoor = geo_total_indoor.groupby(['NTAName','Type']).count().reset_index().drop(['Zip', 'Address', 'geometry','index_right','BoroName', 'CT2020'], axis=1)
neighbor_indoor = neighbor_indoor.rename(columns={'Name': 'Count'})
neighbor_income_indoor = neighbor_income.merge(neighbor_indoor, on='NTAName', how='left')

neighbor_income_indoor_pivot=neighbor_income_indoor.pivot(index=['NTAName', 'BoroName','estimate'], columns="Type", values="Count").reset_index().fillna(0)
neighbor_income_indoor_pivot_1 = neighbor_income_indoor_pivot.drop(neighbor_income_indoor_pivot.columns[[3]],axis=1)
income_indoor_pivot = neighbor_income_indoor_pivot_1.melt(id_vars=["NTAName", "estimate", "BoroName"], value_vars=["Art Galleries", "Libraries", "Museums", "Theatres"], var_name="Types",value_name="Count")
income_indoor_pivot_filtered = income_indoor_pivot.loc[(income_indoor_pivot['Count'] <20)]
income_indoor_pivot_filtered = income_indoor_pivot_filtered.rename(columns={'estimate': 'Median Household Income'})
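
The pivot, fillna, and melt round trip in this cell gives every neighborhood a row for every type, with a count of 0 where a neighborhood has no spaces of that type. A minimal sketch of the idea on a hypothetical two-neighborhood table (the names and counts are invented):

```python
import pandas as pd

# Hypothetical long table: rows exist only where a neighborhood has venues.
toy = pd.DataFrame({
    "NTAName": ["North", "North", "South"],
    "Type": ["Libraries", "Museums", "Libraries"],
    "Count": [2, 1, 3],
})

# Pivot to wide so each type becomes a column; missing pairs become 0.
wide = (
    toy.pivot(index="NTAName", columns="Type", values="Count")
    .fillna(0)
    .reset_index()
)

# Melt back to long: the (South, Museums) pair now exists with Count 0.
filled = wide.melt(
    id_vars="NTAName",
    value_vars=["Libraries", "Museums"],
    var_name="Types",
    value_name="Count",
)
print(filled)
```

Without this step, neighborhoods with no venues of a given type would be missing from that type's scatter plot rather than shown at zero.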
Code
brush = alt.selection_interval()

Brush_Chart = (
alt.Chart(income_indoor_pivot_filtered)
   .mark_point()
   .encode(
       x=alt.X("Median Household Income:Q", scale=alt.Scale(zero=False)),
       y=alt.Y("Count:Q", scale=alt.Scale(zero=False)),
       color=alt.condition(brush, "BoroName:N", alt.value("lightgray")),
       tooltip=["NTAName","BoroName:N", "Median Household Income:Q", "Count:Q"])
   .add_params(brush)
   .properties(width=200, height=200)
   .facet(column="Types:N")
)

Brush_Chart

Map: Income and Indoor Leisure Spaces - Relationships

Code
NTA = pd.read_csv("data/2020 Neighborhood.csv")
NTA['geometry'] = gpd.GeoSeries.from_wkt(NTA['the_geom'])
NTA_geo = gpd.GeoDataFrame(NTA, geometry='geometry', crs=4326)
NTA_geo = NTA_geo.to_crs(epsg=2263)
geo_total_indoor = geo_total_indoor.to_crs(epsg=2263)

geo_NTA = NTA_geo[['NTAName', 'geometry']]

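# Attach income estimates to tracts, then take the neighborhood-level median of the numeric columns
tracts_income_1 = tracts_clean.merge(geo_income.drop(columns='geometry'), on='GEOID', how='left').dropna()
neighbor_income_1 = tracts_income_1.groupby(['NTAName', 'BoroName']).median(numeric_only=True).reset_index()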

NTA_income = geo_NTA.merge(neighbor_income_1, on='NTAName', how='left')
Code
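# Work on a copy so the added columns do not modify geo_total_indoor
geo_total_indoor_1 = geo_total_indoor.copy()

# x/y are projected EPSG:2263 coordinates; they are stored as lon/lat for the chart encoding
geo_total_indoor_1['lon'] = geo_total_indoor_1['geometry'].x
geo_total_indoor_1['lat'] = geo_total_indoor_1['geometry'].y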
Code
Income = (
    alt.Chart(NTA_income)
    .mark_geoshape(stroke="white")
    .encode(
        tooltip=["NTAName:N", "estimate:Q", "moe:Q"],
        color=alt.Color("estimate:Q", scale=alt.Scale(scheme="greys")),
    )
    # Important! Otherwise altair will try to re-project your data
    .project(type="identity", reflectY=True)
    .properties(width=1000, height=800).interactive()
)

IndoorSpaces = (
    alt.Chart(geo_total_indoor_1)
    .mark_circle(size=10)
    .encode(tooltip=['Name','Type','Address'],
           longitude="lon", latitude="lat",
           color=alt.Color('Type:N', scale=alt.Scale(scheme="lightmulti"))
         ).project(type="identity", reflectY=True)
)



map_1 = Income + IndoorSpaces
map_1

As in the borough comparison, libraries appear to be the type of leisure space least tied to income. Art galleries and museums are present in some low- to middle-income neighborhoods, but higher-income neighborhoods have a higher density of them. Theatres are mostly located in middle- to higher-income neighborhoods. The same trend is clearly visible on the map. This spatial distribution means that residents of low- to middle-income neighborhoods may have to travel farther to reach these spaces.
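
The visual impression from the scatter plots could be checked with a rank correlation between income and count for each type. The sketch below computes a Spearman-style correlation as the Pearson correlation of ranks. It is not part of the original analysis; the toy incomes and counts are invented, and `rank_correlation` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical neighborhoods; incomes and counts are invented for illustration.
toy = pd.DataFrame({
    "Median Household Income": [40_000, 60_000, 90_000, 120_000] * 2,
    "Types": ["Libraries"] * 4 + ["Theatres"] * 4,
    "Count": [2, 1, 1, 2, 0, 1, 3, 6],
})


def rank_correlation(group: pd.DataFrame) -> float:
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    income_rank = group["Median Household Income"].rank()
    count_rank = group["Count"].rank()
    return income_rank.corr(count_rank)


# One coefficient per type of space.
by_type = {name: rank_correlation(group) for name, group in toy.groupby("Types")}
print(by_type)
```

In this toy table the theatre counts rise with income (correlation 1) and the library counts do not (correlation 0). On the real data, a value near zero for libraries and a clearly positive value for theatres would support the reading above.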

Open Street Time

Code
open_csv = pd.read_csv("data/Open Streets CSV.csv")
open_csv['geometry'] = gpd.GeoSeries.from_wkt(open_csv['the Geom'])
open_csv_geo = gpd.GeoDataFrame(open_csv, geometry='geometry', crs=2263)
open_csv_geo = open_csv_geo.to_crs(epsg=4326)

open_7days_time = open_csv_geo.drop(['apprDaysWe','Object ID', 'Organization Name', 'Approved From Street', 'Approved To Street', 'apprStartD', 'apprEndDat', 'Shape_STLe', 'segmentidt', 'segmentidf', 'lionversion', 'the Geom'], axis=1)
open_7days_time = open_7days_time.drop_duplicates(subset = "Approved On Street")
open_7days_time_melt = open_7days_time.melt(id_vars=['Approved On Street', 'Borough Name'], value_vars=['Approved Monday Open', 'Approved Monday Close', 'Approved Tuesday Open', 'Approved Tuesday Close', 'Approved Wednesday Open', 'Approved Wednesday Close', 'Approved Thursday Open', 'Approved Thursday Close', 'Approved Friday Open', 'Approved Friday Close', 'Approved Saturday Open', 'Approved Saturday Close', 'Approved Sunday Open', 'Approved Sunday Close']).dropna()
#open_7days_time_melt['value'] =  pd.to_datetime(open_7days_time_melt['value']).dt.time
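# Copy the Brooklyn subset so that columns can be added to it later without a SettingWithCopyWarning
open_time_Brooklyn = open_7days_time_melt[open_7days_time_melt['Borough Name'] == 'Brooklyn'].copy()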
open_time_Brooklyn
      Approved On Street  Borough Name  variable               value
6     RIDGE BOULEVARD     Brooklyn      Approved Monday Open   10:00
9     82 STREET           Brooklyn      Approved Monday Open   08:30
10    48 STREET           Brooklyn      Approved Monday Open   13:30
11    43 STREET           Brooklyn      Approved Monday Open   09:30
12    ALBEMARLE ROAD      Brooklyn      Approved Monday Open   08:00
...   ...                 ...           ...                    ...
2282  JEFFERSON AVENUE    Brooklyn      Approved Sunday Close  21:00
2287  SHARON STREET       Brooklyn      Approved Sunday Close  20:00
2288  TROUTMAN STREET     Brooklyn      Approved Sunday Close  22:00
2289  RANDOLPH STREET     Brooklyn      Approved Sunday Close  23:00
2354  LEXINGTON AVENUE    Brooklyn      Approved Sunday Close  20:00

378 rows × 4 columns

Code
# Split the melted column name into an Open/Close flag and a weekday
open_time_Brooklyn['time'] = open_time_Brooklyn['variable'].str.extract('(Open|Close)', expand=False)
open_time_Brooklyn['dayweek'] = open_time_Brooklyn['variable'].str.extract('(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)', expand=False)
open_time_Brooklyn= open_time_Brooklyn.pivot(index=['Approved On Street', 'dayweek'], columns="time", values="value").reset_index()

order=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
open_time_Brooklyn['dayweek'] = pd.Categorical(open_time_Brooklyn['dayweek'], categories=order, ordered=True)
open_time_Brooklyn = open_time_Brooklyn.sort_values(by='dayweek').reset_index()

open_time_Brooklyn
time  index  Approved On Street  dayweek  Close  Open
0     139    SUMMIT STREET       Monday   14:30  12:30
1     151    UNDERHILL AVENUE    Monday   20:00  08:00
2     28     82 STREET           Monday   15:30  08:30
3     143    THATFORD AVENUE     Monday   18:00  11:00
4     33     AITKEN PLACE        Monday   15:00  11:00
...   ...    ...                 ...      ...    ...
184   147    TOMPKINS AVENUE     Sunday   20:00  11:00
185   51     BEVERLEY ROAD       Sunday   18:00  10:00
186   112    RANDOLPH STREET     Sunday   23:00  12:00
187   10     4 STREET            Sunday   22:00  08:00
188   157    VANDERBILT AVENUE   Sunday   23:00  11:00

189 rows × 5 columns
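
The Open and Close columns above are plain strings such as 08:30. If the length of each approved window is needed, the strings can be parsed with the standard library. This is a minimal sketch with a hypothetical helper, `minutes_open`, which assumes the closing time falls on the same day as the opening time.

```python
from datetime import datetime


def minutes_open(open_str: str, close_str: str) -> int:
    """Length of an approved window in minutes, given 'HH:MM' strings.

    Assumes the window does not cross midnight.
    """
    fmt = "%H:%M"
    delta = datetime.strptime(close_str, fmt) - datetime.strptime(open_str, fmt)
    return int(delta.total_seconds() // 60)


# Two rows copied from the table above.
summit = minutes_open("12:30", "14:30")      # SUMMIT STREET, Monday
underhill = minutes_open("08:00", "20:00")   # UNDERHILL AVENUE, Monday
print(summit, underhill)
```

For these two rows the windows are 120 and 720 minutes long.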

Code
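# Click a weekday in the legend to show its streets (shift-click adds more days).
# selection_point and add_params replace the deprecated selection_multi and add_selection.
selection = alt.selection_point(fields=['dayweek'])
color = alt.condition(selection,
                      alt.Color('dayweek:N', legend=None,
                                scale=alt.Scale(scheme='category10')),
                      alt.value('lightgray'))

# Hide bars for weekdays that are not selected
opacity = alt.condition(selection,
                        alt.value(1), alt.value(0))

# One bar per street, spanning its approved opening and closing times
bar = alt.Chart(open_time_Brooklyn).mark_bar().encode(
        x='Open',
        x2='Close',
        y='Approved On Street',
        color=color,
        opacity=opacity,
        tooltip=['Open', 'Close', 'Approved On Street', 'dayweek']).properties(
        width=500,
        height=1000).interactive()

# A clickable weekday list that acts as the legend and drives the selection
legend = alt.Chart(open_time_Brooklyn).mark_bar().encode(
    y=alt.Y('dayweek:N', axis=alt.Axis(orient='right')),
    color=color
).add_params(
    selection
)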

A_Chart = bar | legend
A_Chart

This interactive bar chart of Brooklyn open streets serves as a pilot that can be adapted to data for all five boroughs. Comparing opening and closing times shows that many streets are approved as open streets with later opening and closing times on weekends. The chart could be improved by adding the status of each street (i.e., whether it is approved to be closed fully, partially, or only on school days), which is important for visitors as well.
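
To adapt the chart to other boroughs, the Brooklyn-specific reshaping could be wrapped in a function that takes the borough name as a parameter. The sketch below assumes a melted table with the same column names as `open_7days_time_melt`. The helper `open_close_by_borough` and the toy rows are hypothetical.

```python
import pandas as pd

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def open_close_by_borough(melted: pd.DataFrame, borough: str) -> pd.DataFrame:
    """Hypothetical helper: one row per (street, weekday) with Open/Close columns."""
    sub = melted.loc[melted["Borough Name"] == borough].copy()
    # Split names like 'Approved Monday Open' into a weekday and an Open/Close flag.
    sub["time"] = sub["variable"].str.extract("(Open|Close)", expand=False)
    sub["dayweek"] = sub["variable"].str.extract(
        "(" + "|".join(WEEKDAYS) + ")", expand=False
    )
    table = sub.pivot(
        index=["Approved On Street", "dayweek"], columns="time", values="value"
    ).reset_index()
    # Order the rows Monday to Sunday rather than alphabetically.
    table["dayweek"] = pd.Categorical(table["dayweek"], categories=WEEKDAYS, ordered=True)
    return table.sort_values("dayweek").reset_index(drop=True)


# Hypothetical rows shaped like open_7days_time_melt.
toy = pd.DataFrame({
    "Approved On Street": ["X STREET"] * 4 + ["Y AVENUE"] * 2,
    "Borough Name": ["Queens"] * 4 + ["Bronx"] * 2,
    "variable": [
        "Approved Sunday Open", "Approved Sunday Close",
        "Approved Monday Open", "Approved Monday Close",
        "Approved Sunday Open", "Approved Sunday Close",
    ],
    "value": ["10:00", "18:00", "09:00", "15:00", "11:00", "17:00"],
})

queens = open_close_by_borough(toy, "Queens")
print(queens)
```

The returned table has the same layout as the Brooklyn table, so the same chart code could be reused for any borough.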

Dashboard

The dashboard below is an upgrade of the faceted scatter plot of indoor spaces and income. It makes it easier to explore the spatial distribution of indoor spaces of a certain type, or in neighborhoods with a certain income level. Selecting the points with higher counts shows that the neighborhoods with the highest density of the identified indoor leisure spaces are mostly located in Manhattan.

Code
brush = alt.selection_interval()

points = (
    alt.Chart(income_indoor_pivot_filtered)
    .mark_point()
    .encode(
        x=alt.X("Median Household Income:Q", scale=alt.Scale(zero=False)),
        y=alt.Y("Count:Q", scale=alt.Scale(zero=False)),
        color=alt.condition(brush, "BoroName:N", alt.value("lightgray")),
        tooltip=["NTAName", "BoroName:N", "Median Household Income:Q", "Count:Q"],
    )
    .add_params(brush)
    .properties(width=200, height=200)
    .facet(column="Types:N")
)


bars = alt.Chart(income_indoor_pivot_filtered).mark_bar().encode(
    y='BoroName:N',
    color='BoroName:N',
    x='Count:Q'
).transform_filter(
    brush
)

Dashboard = points & bars
Dashboard
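
In tabular terms, the brush keeps the points inside a rectangle of income and count values, and the bar chart then stacks the remaining counts by borough. The sketch below mirrors that logic in pandas on hypothetical rows. `brushed_totals` is an illustrative helper and not part of Altair, and the values are invented.

```python
import pandas as pd

# Hypothetical rows shaped like income_indoor_pivot_filtered.
toy = pd.DataFrame({
    "NTAName": ["N1", "N2", "N3", "N4"],
    "BoroName": ["Manhattan", "Manhattan", "Brooklyn", "Queens"],
    "Median Household Income": [150_000, 110_000, 70_000, 65_000],
    "Count": [12, 9, 3, 1],
})


def brushed_totals(df: pd.DataFrame, income_range, count_range) -> pd.Series:
    """Sum Count per borough for the points inside a rectangular selection."""
    low_income, high_income = income_range
    low_count, high_count = count_range
    inside = (
        df["Median Household Income"].between(low_income, high_income)
        & df["Count"].between(low_count, high_count)
    )
    return df.loc[inside].groupby("BoroName")["Count"].sum()


# Emulate brushing only the high-count points.
high_counts = brushed_totals(toy, (0, 200_000), (8, 20))
print(high_counts)
```

With these toy rows, only the two Manhattan neighborhoods fall inside the selection, which is the pattern the dashboard text describes for the real data.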