CV

Education

  • Aug 2016 – Feb 2023

    New York City, NY

    PhD
    City University of New York, The Graduate Center
    Sociology
  • Aug 2014 – May 2016

    Bronx, NY

    MA
    Fordham University
    Sociology
  • Aug 2006 – May 2010

    San Diego, CA

    BA
    Point Loma Nazarene University
    Spanish Literature and Language

Experience

  • Jan 2026 – Present

    Remote

    Consultant – NEPA Clean Energy Review Analysis
    Clean Air Task Force
    • Built a BERT+LLM pipeline to extract and classify records from 120,000+ NEPA documents, enabling CATF to analyze clean energy review timelines at scale
    • Integrated Federal Register API data, targeted web scraping, and ML/regex classifiers to categorize 20,000+ projects by technology, review process, and capacity
    • Scoped research questions collaboratively with CATF and delivered phased findings across six structured deliverables mapped to specific policy priorities
    • Published findings via public project website and HuggingFace document browser
  • Jul 2023 – Present

    Berkeley, CA

    Postdoctoral Researcher and Lecturer
    University of California, Berkeley
    • Led cross-functional research team to develop, maintain, and analyze eviction database (150M+ records) using SQL
    • Applied R-based machine learning, multilevel regression, and synthetic control methods to study impact of rental assistance on eviction filings
    • Coordinated with UPenn partners, managed research process, and led report writing
    • Presented findings and strategic recommendations to federal housing policy stakeholders
    • Taught machine learning, natural language processing, and causal inference to PhD students
    • Led 32 weekly labs in R and Python libraries (tidyverse, sklearn, TensorFlow)
    • Supervised 8 student projects on predictive modeling and decision-making
  • Sep 2022 – Jun 2023

    New York City, NY

    Data Scientist
    NYC Department of Housing Preservation & Development
    • Collaborated with US Census research team to establish sample survey research frame for Housing Vacancy Survey
    • Addressed Housing Vacancy Survey measurement error by creating post-stratification weights
    • Developed methodology with survey weights to determine where to build affordable housing, presented to non-technical stakeholders, and persuaded Commissioner to adopt methodology
  • Jan 2021 – Dec 2022

    New York City, NY

    Research Scientist, Professor Van Tran
    City University of New York, The Graduate Center
    • Built US Census dataset to study ethno-racial neighborhood integration in metro New York
    • Fitted multinomial logistic regression models and interpreted marginal effects plots in R
    • Wrote methodological sections, created descriptive plots in R, and managed project development on GitHub
  • Oct 2018 – May 2021

    New York City, NY

    Research Scientist, Professor Paul Attewell
    City University of New York, The Graduate Center
    • Developed research projects to study divergence of benefits for groups with different educational attainment
    • Managed 4 large national, longitudinal, and cross-sectional datasets used in analyses
    • Used data mining, HLM, OLS, and logistic regression techniques to understand the impact of educational attainment on mid-life labor market earnings
  • Sep 2016 – Jan 2019

    New York City, NY

    UX Researcher
    CUNY, Center for Urban Research
    • Designed, executed, and led focus group research to identify key labor market trends across 10 different projects
    • Distilled findings into actionable insights and presented recommendations to clients
  • Nov 2014 – Aug 2016

    New York City, NY

    UX Researcher
    Empirical Creative
    • Designed and led mock jury trials (focus groups) to identify favorable juror characteristics and attitudes
    • Conducted strategic research to understand user characteristics that shape case opinions and what facts change decisions
    • Translated insights into actionable recommendations: trial strategy, user profiles, and visual graphics
  • Jan 2020 – Dec 2021

    Remote

    Data Scientist – Superdiversity in Metro New York Project
    Max Planck Institute
    • Managed end-to-end project development of interactive Superdiversity Website and Teaching Tool
    • Built and analyzed 6 cross-sectional and longitudinal databases exploring changing diversity in Metro New York
    • Supervised design team and ensured project aligned with stakeholder vision

Selected Projects

  • 2026–present
    NEPA Clean Energy Environmental Review Analysis
    • Trained BERT classifiers and LLM adjudication layer to extract review dates and process types from 120,000+ unstructured federal documents
    • Integrated Federal Register API, web scraping, and ML/regex classifiers to categorize 20,000+ projects by technology, capacity, and review type
    • Published findings via a public project website and an interactive HuggingFace document browser, making results accessible to policymakers and researchers
  • 2025–2026
    Voter Turnout in New York City
    • Used Bayesian Improved Surname Geocoding (BISG) to impute race/ethnicity from surnames in 4.6 million NYC voter registration records
    • Applied multilevel regression and poststratification (MrP) to impute educational attainment and build survey poststratification weights
    • Applied RAG techniques and fine-tuned open-source LLMs to extract insights from voter file records
  • 2025
    SF Residential Inspection Risk
    • Linked fire incidents, building violations, inspection records, and parcel tax data at the parcel level using APN identifiers
    • Built composite risk scores and k-means spatial clusters to prioritize inspections for SFFD and the Department of Building Inspection
    • Deployed two interactive Shiny dashboards, one for SFFD and one for DBI
  • 2020–2021
    Superdiversity in Metro New York
  • 2017–2019
    Housing Literacy
    • Developed online tool annotating NYC rent regulation legal documents for rent-stabilized tenants and housing advocates

Publications

Skills

  • Programming
    R, Python, SQL, Stata, Git & GitHub, Linux
  • Quantitative Methods
    Spatial econometrics, machine learning, generalized linear regression, natural language processing, survey analysis, LLMs and GenAI, A/B testing
  • Qualitative Methods
    Focus groups, interviews, stakeholder analysis, survey design, user perception studies
  • Languages
    English (native), Spanish (fluent)