TextXD 2019 Program

Program Overview

Day 1 | Tuesday, Dec. 3 | Training workshops | Spieker Forum at Chou Hall
Day 2 | Wednesday, Dec. 4 | Talks and posters | Spieker Forum at Chou Hall
Day 3 | Thursday, Dec. 5 | Talks and posters | Spieker Forum at Chou Hall
Day 4 | Friday, Dec. 6 | Collaboration and coding | BIDS (190 Doe)

Day 1: Tuesday, December 3rd (Workshops)

Location: Spieker Forum at Chou Hall

These workshops will generally be interactive coding sessions using Jupyter notebooks, so we strongly recommend bringing a laptop with a working installation of Anaconda / Python. No prior experience with text analysis is assumed.

9:30am | Welcome | Claudia von Vacano | D-Lab
9:40am | Text as Data Introduction | Jaren Haber | UC Berkeley, Sociology
10:35am | Web APIs and Scraping | Geoff Bacon | UC Berkeley, Linguistics
11:30am | Coffee Break
11:45am | Topic modeling | Ilya Akdemir | UC Berkeley, Law
1:40pm | Word embeddings | Alina Arseniev-Koehler | UCLA, Sociology
2:45pm | Supervised machine learning | Caroline Le Pennec-Caldichoury | UC Berkeley, Economics
3:45pm | Coffee Break
4pm | Deep learning | Dima Lituiev | UC San Francisco, Bakar Computational Health Sciences Institute

Day 2: Wednesday, December 4th (Talks)

Location: Spieker Forum at Chou Hall

9:30am | Welcome | Heather Haveman | UC Berkeley, Sociology & Business
9:40am | Keynote | Chris Potts | Stanford University, Linguistics
10:30am | Session 1 - Psychological Threads
“I come before you a changed man”: Historical Changes in the Vocabulary of Parole Release Decisions | Isaac Dalke | UC Berkeley, Sociology
“The words of trauma” - Text Analysis of the effect of World War II on Salinger’s literature | Anat Talmon, Chen Edelsburg, Nimrod Talmon | Stanford University, Psychology and Tel Aviv University
11:15am | Coffee Break
11:30am | Session 2 - Policy
Gender Stereotypes in Professor-Student Interactions | Zachary Bleemer | UC Berkeley, Economics
State-level racial attitudes and adverse birth outcomes: applying natural language processing to Twitter data to quantify state context for pregnant women | Thu Nguyen | UC San Francisco, Epidemiology & Biostatistics
NLP approaches to detecting behavioral failures in sustainable transportation infrastructure | Omar Isaac Asensio | Georgia Institute of Technology, Public Policy
12:30pm | Lunch + Poster session
Exploratory Expansion of Accounting Word Lists using Word-Embedding Models on SEC Filings | Brian Chivers
Title TBD | Raquel Coelho
Refugee Education: A Survey of Topics and Trends in Newswires and Press Releases, 2009 to 2018 | Seungah Lee
Who is cuing whom? The dual process of shaping knowledge gap in climate change communication | Yijyun Lin
The Limits of Interest: Capture, Financialization, or Contestation in the Politics of Rule-Making on Derivatives | Konrad Posch
Predicting Semantic Fluency Using Large-scale Language Corpora | Zhihao Zhang
Applying natural language processing algorithms to detect behavioral failures in emerging electric vehicle infrastructure | Sooji Ha
1:30pm | Keynote: Towards Universal Language Understanding | Yunyao Li | IBM, Scalable Knowledge Intelligence
2:15pm | Session 3 - Theory and Methods
Interpreting and improving NLP models via disentangled interpretations | Chandan Singh | UC Berkeley, Computer Science
Cross-domain classification | Barea Sinno | University of Texas at Austin, Ohio State University
Automated methods enable direct computation on phenotypic descriptions for novel candidate gene prediction | Ian Braun | Iowa State University, Computational Biology
3:15pm | Coffee Break
3:30pm | Session 4 - Politics
Detecting Meaningful Multi-word Expressions in Political Text | Kenneth Benoit | London School of Economics, Methodology
Who speaks for Women in the Indian Parliament? | Saloni Bhogale | Ashoka University, Trivedi Centre
Sentiment is Not Stance: Target-Aware Classification for Political Text Analysis | Samuel E. Bestvater, Burt Monroe | The Pennsylvania State University, Political Science
4:30pm | Keynote | Justin Grimmer | Stanford University, Political Science
5:30pm | Reception - Berkeley Institute for Data Science (190 Doe Library)

Day 3: Thursday, December 5th (Talks)

Location: Spieker Forum at Chou Hall

9:40am | Keynote | Kathleen Carley | Carnegie Mellon University, Computer Science
10:30am | Session 5 - Innovation
Quantifying Innovation with BERT: Linguistic Prescience and Firm Stock Returns | Paul Vicinanza | Stanford University, Graduate School of Business
Identifying (Dis)Continuities in Ed Tech’s Discourse of Invention | Sebastian Muñoz-Najar Galvez | Stanford University, Graduate School of Education
11:15am | Coffee Break
11:30am | Session 6 - Public Health
NLP for conversational dialog | Orianna DeMasi | UC Davis, Computer Science
#Vape: Measuring E-cigarette Influence on Instagram with Deep Learning and Text Analysis | Julia Vassey | UC Berkeley, Public Health
No More Silence: Monitoring Bias with Word2Vec | Lauren Kaplan | UC San Francisco, Medicine
12:30pm | Lunch + Poster session
Natural Language Processing for Materials Discovery and Design | John Dagdelen
Teaching machine synthesis: collecting dataset of “codified synthesis recipes” extracted from millions of publications | Olga Kononova
A Transparent and Adaptable Method to Extract Colonoscopy and Pathology Data Using Natural Language Processing | Liyan Liu
Understanding emerging forms of cannabis use through online communities | Meredith Meacham
Making Sense of Clinical Trial Descriptions: A Text Analysis Approach | Munif Ishad Mujib
Impacts of the Arts | Gabriel Harp
FrameNet and Natural Language Processing | Miriam R L Petruck, Collin Baker
1:30pm | Session 7 - Lightning Talks
Hidden Political Dynasties in China: Analyzing Chinese Baby Names as Ultra-Short Political Text Data | Tao Li | University of Macau, Government & Public Administration
Are both policemen and policewomen police officers? The gender connotations of gender-fair language | Alina Arseniev-Koehler | UCLA, Sociology
Uses of the Machine-learning Protest Event Database System | Alex Hanna | Google, ML Fairness
A pipeline for analyzing Akkadian texts | Aleksi Sahala | University of Helsinki, Linguistics
Summer Institute in Computational Social Science in the San Francisco Bay Area: Computational Social Science for Social Good | Jaren Haber and Jae Yeon Kim | UC Berkeley
2pm | Session 8 - Biomedical
Application of text mining methods to identify lupus nephritis from electronic health records | Milena Gianfrancesco | UC San Francisco, Medicine
Unstructured Text Analysis in Electronic Health Records to Characterize Sepsis Presentation | Meghana Bhimarao | Kaiser Permanente, Division of Research
Extracting patient-reported functional status and disease activity information from electronic health records | Tome Eftimov | Stanford University, Biomedical Data Science
Natural language processing for automated rapid cancer ascertainment | Liyan Liu | Kaiser Permanente, Division of Research
3:15pm | Coffee Break
3:30pm | Session 9 - News and Media
“Downloading” the news: Reproducible access to text as data | Cody Hennesy | University of Minnesota, Libraries
Media Attention and Bureaucratic Responsiveness | Aaron Erlich | McGill University, Political Science
Using Text Data as Alternative | Jae Yeon Kim | UC Berkeley, Political Science
4:30pm | Keynote | Brandon Stewart | Princeton University, Sociology
5:30pm | Reception - Tap Haus, 2518 Durant Ave

Day 4: Friday, December 6th (Collaboration)

Location: Berkeley Institute for Data Science (190 Doe Library)

Theme: Text Analysis for Social Good

Day 4 will be held at BIDS and will include a hackathon component as well as parallel breakout sessions for discussing major issues in text analysis / NLP. The hackathon will feature multiple projects with associated datasets and starter Jupyter notebooks. Participants will form teams and apply text analysis methods of their choice, potentially leading to future research collaborations. Breakout sessions will open with introductory presentations, followed by facilitated discussions that lead to summary recommendations on the chosen topic.

Time | Topic | Breakout session(s)
9:30am | Welcome - David Mongeau, BIDS
9:40am | Project introductions
10am | Coding / collaboration | Pedagogy of Text Analysis - Evan Muzzall
11am | Coffee Break
11:15am | Coding / collaboration | Text Analysis for Social Good
1:30pm | Coding / collaboration | TextXD 2020 priorities
3:00pm | Coffee Break
3:15pm | Coding / collaboration
4:00pm | Report back & conference close