About the project

Motivation and problem statement

The many health risks posed by lead toxicity, and their outsized impact on young children, are well described elsewhere. This project has three primary goals:

  1. Help equip Malden decisionmakers to understand where risks are the highest, so that the allocation of the limited resources set aside for lead service line replacement can be informed by actual public health risk, given the data available.
  2. Help quantify the public health impact of applying additional resources to lead service line replacement in Malden, MA. (In other words, we'd like to answer the question: "What could be accomplished with an additional X% of funding devoted to remediation efforts?")
  3. Following work by Potash et al. (2020), demonstrate a broadly applicable and relatively simple methodology to evaluate and inform lead mitigation efforts in other geographic communities and with other toxicity modalities (e.g. lead paint).


There are several data sets involved, each of which answers a particular question:

  • Which parcels in Malden have lead service lines? The City of Malden maintains an ArcGIS web application with periodically updated data on what type of service lines exist at each residential parcel in the city. The web application is available here. There is also a GIS web application which has a clean list of Malden property parcels, available here.
  • Which parcels with lead service lines have children living in the house? Anonymized enrollment data from Malden public schools connects student ages to street addresses. These street addresses must be geocoded so that they can be cleaned up and connected with a specific parcel from the city parcels data. This data is updated once per year in the fall after the academic year rolls have been finalized.
  • What street segments contain parcels with children and lead lines? The City of Malden maintains a GIS shapefile called "Malden centerlines" that splits long streets into chunks that are roughly the size of a public works project. This data set changes from time to time, and should be updated periodically.

Once we have access to the datasets above, we can overlay the parcels that have both children and lead service lines onto the street segments that the City considers reasonably sized for a public works project.
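The overlay step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual pipeline: the record layout, the field names (`segment`, `has_lead`, `parcel`, `grade`) and the sample values are all hypothetical, and the sketch assumes that geocoding has already matched each student address to a parcel and each parcel to a street segment. Treating lead as a single yes/no flag per parcel is also a simplification.

```python
from collections import defaultdict

# Hypothetical, made-up records. The real inputs would come from the City's
# service line and parcel data and from geocoded school enrollment data.
parcels = {
    "P1": {"segment": "MAIN ST 1", "has_lead": True},
    "P2": {"segment": "MAIN ST 1", "has_lead": False},
    "P3": {"segment": "ELM ST 2", "has_lead": True},
}
students = [
    {"parcel": "P1", "grade": 0},
    {"parcel": "P1", "grade": 3},
    {"parcel": "P2", "grade": 5},
    {"parcel": "P3", "grade": -1},
    {"parcel": "P9", "grade": 2},  # address that could not be matched to a parcel
]


def children_at_lead_parcels_by_segment(parcels, students):
    """Count students living at parcels with lead, grouped by street segment."""
    counts = defaultdict(int)
    for student in students:
        parcel = parcels.get(student["parcel"])
        # Skip students whose address did not match a known parcel,
        # and students living at parcels without a lead service line.
        if parcel is None or not parcel["has_lead"]:
            continue
        counts[parcel["segment"]] += 1
    return dict(counts)


result = children_at_lead_parcels_by_segment(parcels, students)
print(result)  # {'MAIN ST 1': 2, 'ELM ST 2': 1}
```

Students who cannot be matched to a parcel are silently dropped here, which mirrors one of the reasons a child may be missing from the data.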

Assumptions and limitations

Here are some assumptions involved in the project, and some simplifications and limitations that result:

  • Exposure years. To calculate years of exposure, we will assume that a child will live at a certain address until the age of 18. That means that a five-year-old child living at a parcel with a lead service line is assigned 13 exposure years (18 minus 5). This is not perfect, because the family could move to a different address or the service line could be replaced in that time.
  • Age from school grade. Data from schools was anonymized and did not contain birthdays, only school grade. Assuming that most United States schoolchildren start kindergarten at age 5, we calculate age as the grade plus five, where grade is 0 for kindergarten, -1 for pre-K, and -2 for preschool.
  • Location of school children, not infants. There is no simple way to associate infants with particular houses. In theory, birth certificates could be viewed or somehow digitized, but these are only a snapshot of the time when the child was born and are not updated. Therefore, we miss out on some of the most vulnerable children: those who are not yet in preschool. On the other hand, the best predictor of where babies might live is where young children already live, since their presence indicates parents of childbearing age.
  • Accuracy of lead service line data. The service line data is imperfect: the records for certain parcels may not have been updated in a long time, or a line may have been replaced without the dataset being updated. Over time, this data tends to improve as the city replaces lines or conducts physical inspections. For now, some parcels in the Malden data are marked with status UNKNOWN; these tend to be identified as the city visits more parcels. Here are the counts as of September 2021:
    (columns: PRIVATE side material; rows: material on the other side of the line)

    PRIVATE       brass  cast iron  copper  ductile iron  galvanized  lead  pvc  steel  unknown
    cast iron         0         14      67             0           0     5    0      2       66
    copper            5          6   5,516             0           5   975    2     19    1,275
    ductile iron      0          0      10            32           0     0    0      0       21
    galvanized        0          0       0             0           0     0    0      0        3
    lead              1          0   1,415             0           1   267    0      4    1,049
    unknown           0          1      32             0           0     7    0      0      282
  • Accuracy of street segments. Not all street segments are the same size, and they might not be the ideal size for a lead service line replacement project. In the table, we can normalize out length by giving 'exposure years per 100 yards' to try to make apples-to-apples comparisons.
  • Partial replacements. In recent years, it has been observed that replacing only one side (city or private) can be worse than leaving the pipes alone. This is because the service lines tend to build up a protective layer of "scaling" on the inside; disruption of built-up scaling in the pipes may allow more lead to leach into the water. The city can't just unilaterally decide to replace all the city side lines; it must consider which owners can be persuaded to replace their private side lines at the same time. Homeowners in Malden are required by local ordinance to replace lead service lines before selling their house, but without a forcing function it can be difficult to persuade owners to spend upwards of $2,000 to have their private side line replaced. Recent work by Clean Water Action in Chelsea, MA demonstrates that offering to pay for replacement of the private side could help expedite the process of getting owners to replace private side lines.
  • Street disruptions. The City tries to be a good steward of road quality, and given limited resources the DPW is understandably reluctant to tear up streets which have been recently paved.
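The exposure-year arithmetic described in the list above (age from grade, years remaining until 18, and normalization by segment length) can be written out as a minimal Python sketch. The function names and the sample segment are hypothetical, and clamping negative results to zero is an added assumption rather than something stated above.

```python
KINDERGARTEN_AGE = 5  # assumed age at the start of kindergarten (grade 0)
ADULT_AGE = 18        # exposure is counted until this age


def age_from_grade(grade):
    """Estimate age from school grade: 0 = kindergarten, -1 = pre-K, -2 = preschool."""
    return grade + KINDERGARTEN_AGE


def exposure_years(grade):
    """Years a child is assumed to remain at the address, until age 18."""
    # Clamping at zero is an assumption made for this sketch.
    return max(0, ADULT_AGE - age_from_grade(grade))


def exposure_years_per_100_yards(grades, segment_length_yards):
    """Total exposure years on a street segment, normalized by its length."""
    total = sum(exposure_years(g) for g in grades)
    return 100 * total / segment_length_yards


# A kindergartener (grade 0) is treated as a five year old: 18 - 5 = 13 years.
print(exposure_years(0))  # 13

# Hypothetical 250-yard segment with three children at lead parcels:
# 13 + 10 + 14 = 37 exposure years, or 14.8 per 100 yards.
print(exposure_years_per_100_yards([0, 3, -1], 250))  # 14.8
```

Normalizing by length keeps a long segment with many parcels from automatically outranking a short segment with a denser concentration of exposed children.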

Future work

Given the assumptions and limitations above, many extensions to this work are possible. For example:

  • More school-age children could be added to the dataset. There are children who live in Malden that are not students enrolled at participating schools. The voluntary participation of other private, charter, or parochial schools could greatly add to the resolution of this data.
  • Infants can be added to the dataset. As discussed above, no data exists on which parcels are home to babies that are not old enough to be enrolled in preschool. Electronic surveys, mailed questionnaires, or door-knocking to solicit voluntary data from parents of infant children would increase the resolution of this data.
  • Differentiate the severity of child exposure years. How much worse is a given unit of lead exposure at age 5 as opposed to age 17? Currently, we treat both of these years of exposure as equally bad. Finding an absolute (in health terms) or relative (in arbitrary units) measure of the comparative severity of exposure at different ages could help re-weight exposure years as a function of a child's age. We could simply pick an arbitrary decay curve, but basing the decay on a thorough review of the literature would be a more principled and data-informed approach.
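To make the re-weighting idea in the last item concrete, here is a minimal Python sketch that weights each future year of exposure by the child's age in that year. The exponential form and the `decay_rate` value are arbitrary placeholders chosen only for illustration; they are not drawn from the literature, and a principled version would replace `age_weight` with a curve based on published evidence.

```python
import math


def age_weight(age, decay_rate=0.1):
    """Relative severity of one year of exposure at a given age (arbitrary units).

    The exponential decay is a placeholder assumption, not an empirical result.
    """
    return math.exp(-decay_rate * age)


def weighted_exposure_years(current_age, decay_rate=0.1, adult_age=18):
    """Sum the age-dependent weights over every year from now until adulthood."""
    return sum(age_weight(age, decay_rate) for age in range(current_age, adult_age))


# With no decay, every year counts equally and the result matches the
# unweighted exposure years used elsewhere in the project (18 - 5 = 13).
print(weighted_exposure_years(5, decay_rate=0.0))  # 13.0

# With decay, a year of exposure at age 5 counts for more than one at age 17.
print(age_weight(5) > age_weight(17))  # True
```

Setting `decay_rate` to zero recovers the current method, so the weighting can be introduced without changing how the unweighted results are computed.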

Why is my child or other children on my street missing from the data?

There are a few reasons why this might be:

  • Only kids living at parcels with lead are counted here. Houses with no lead service lines are not relevant to the project.
  • Not every school in Malden has agreed to provide data, and some children that live in Malden attend school elsewhere.
  • Even if data was shared, not every child's address could be connected with a parcel in the lead data set. See above for more detail.
  • Some of the Malden street segments aren't perfect. If you find an error, please feel free to contact me with a description of what is wrong so that the City GIS office can be notified.

Who made this?

This is an independent research project by Isaac Slavitt, a data scientist and Malden resident. Please feel free to contact me at .


This project would not have been possible without the cooperation of the City of Malden, particularly Maria Luise in the office of Mayor Gary Christenson for greenlighting the research and Steve Fama in the GIS office for providing many data sets and insight into city recordkeeping. Malden City Councillors Steve Winslow and Ryan O'Malley have been strong proponents of this work and energetic advocates for informed public health promotion and infrastructure modernization in the City of Malden.

References

  • Abernethy, J., Chojnacki, A., Farahi, A., Schwartz, E., Webb, J., 2018. ActiveRemediation: The search for lead pipes in Flint, Michigan. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Association for Computing Machinery, New York, NY, USA, pp. 5–14.
  • Advocate, 2021. City announces effort to increase removal of lead services lines in 2022. The Malden Advocate. Available at https://publizr.com/advocatenewsma/malden-advocate-10-slash-15-slash-21?html=true#/3/ (accessed 31 December 2021).
  • Beher, J., Possingham, H.P., Hoobin, S., Dougall, C., Klein, C., 2016. Prioritising catchment management projects to improve marine water quality. Environmental Science & Policy 59, 35–43.
  • Christenson, G., 2022. Mayor Christenson's 2022 State of the City Address. Malden Access Television, via YouTube (at timestamp 31:25). Available at https://youtu.be/4hUeEGPewZ8?t=1885 (accessed 31 December 2021).
  • Clean Water Action, 2019. Chelsea Leaders Receive MWRA Grant for Removing Lead Service Lines. LSLR Collaborative.
  • Costa, F., Murta, L., Ribeiro, C.C., 2014. Applying software engineering techniques in the development and management of linear and integer programming applications. International Transactions in Operational Research 21, 6, 1001–1030.
  • Dantzig, G.B., 1957. Discrete-variable extremum problems. Operations Research 5, 2, 266–288.
  • Eiselt, H., Marianov, V., 2014. Multicriteria decision making under uncertainty: a visual approach. International Transactions in Operational Research 21, 4, 525–540.
  • Gabrilska, M., 2021. Malden uses data to prioritize removal of lead-lined service pipes. Massachusetts Municipal Association. Available at https://www.mma.org/malden-uses-data-to-prioritize-removal-of-lead-lined-service-pipes/ (accessed 31 December 2021).
  • Goldberg, D.W., 2017. Geocoding. In D. Richardson, N. Castree, M.F. Goodchild, A. Kobayashi, W. Liu and R.A. Marston (eds) International Encyclopedia of Geography: People, the Earth, Environment and Technology. Wiley, Hoboken, NJ.
  • Goovaerts, P., 2017. How geostatistics can help you find lead and galvanized water service lines: the case of Flint, MI. Science of The Total Environment 599–600, 1552–1563.
  • Hu, H., Shih, R., Rothenberg, S., Schwartz, B.S., 2007. The epidemiology of lead toxicity in adults: measuring dose and consideration of other methodologic issues. Environmental Health Perspectives 115, 3, 455–462.
  • Joseph, L.N., Maloney, R.F., Possingham, H.P., 2009. Optimal allocation of resources among threatened species: a project prioritization protocol. Conservation Biology 23, 2, 328–338.
  • Kellerer, H., Pferschy, U., Pisinger, D., 2004. Knapsack Problems. Springer, Berlin.
  • Ko, A., Morton, D.P., Popova, E., Hess, S.M., Kee, E., Richards, D., 2009. Prioritizing project selection. The Engineering Economist 54, 4, 267–297.
  • Lockitch, G., 1993. Perspectives on lead toxicity. Clinical Biochemistry 26, 5, 371–381.
  • Martello, S., Toth, P., 1990. Knapsack problems: algorithms and computer implementations. John Wiley & Sons, Hoboken, NJ.
  • Maturana, S., Eterovic, Y., 1995. Vehicle routing and production planning decision support systems: designing graphical user interfaces. International Transactions in Operational Research 2, 3, 233.
  • O'Malley, R., 2017. Lead pipes: The cost of kicking the can down the road. Public Health Post. Available at https://www.publichealthpost.org/viewpoints/the-cost-of-lead-pipes/ (accessed 31 December 2021).
  • Pannell, D.J., 2015. Ranking Projects for Water-Sensitive Cities. Working Papers 204263, University of Western Australia, School of Agricultural and Resource Economics.
  • Pannell, D.J., Gibson, F.L., 2016. Environmental cost of using poor decision metrics to prioritize environmental projects: cost of poor decision metrics. Conservation Biology 30, 2, 382–391.
  • Potash, E., Ghani, R., Walsh, J., Jorgensen, E., Lohff, C., Prachand, N., Mansour, R., 2020. Validation of a machine learning model to predict childhood lead poisoning. JAMA Network Open 3, 9, e2012734.
  • Rocheleau, M., 2016. Lead water pipes still a concern in Boston area. The Boston Globe. Available at https://archive.today/XlIxm (accessed 31 December 2021).
  • Sahni, S., 1975. Approximate algorithms for the 0/1 knapsack problem. Journal of the ACM 22, 1, 115–124.
  • Senju, S., Toyoda, Y., 1968. An approach to linear programming with 0–1 variables. Management Science 15, 4, B196.
  • Trueman, B.F., Camara, E., Gagnon, G.A., 2016. Evaluating the effects of full and partial lead service line replacement on lead levels in drinking water. Environmental Science & Technology 50, 14, 7389–7396.
  • US ATSDR, 2020. Toxicological Profile for Lead. Technical Report CAS #7439-92-1, Agency for Toxic Substances and Disease Registry (ATSDR). US Department of Health and Human Services.
  • US CDC, 2012. Low Level Lead Exposure Harms Children: a Renewed Call for Primary Prevention. Committee report, Advisory Committee on Childhood Lead Poisoning Prevention. US Centers for Disease Control and Prevention.
  • US Census, 2021a. America's families and living arrangements: 2021. U.S. Census Bureau.
  • US Census, 2021b. Tiger line shapefiles and tiger line files technical documentation. U.S. Census Bureau.
  • US EPA, 2021. Stronger protections from lead in drinking water: next steps for the lead and copper rule. U.S. Environmental Protection Agency.
  • Whitmyre, G., Driver, J., 2005. Exposure assessment. In P. Wexler (ed.) Encyclopedia of Toxicology (2nd edn). Elsevier, Amsterdam, pp. 303–306.
  • Zartarian, V., Xue, J., Tornero-Velez, R., Brown, J., 2017. Children's lead exposure: a multimedia modeling analysis to guide public health decision-making. Environmental Health Perspectives 125, 9.